Simon Willison’s Weblog

Subscribe
Atom feed for sql-injection

16 posts tagged “sql-injection”

2025

Every time I get into an online conversation about prompt injection it's inevitable that someone will argue that a mitigation which works 99% of the time is still worthwhile because there's no such thing as a security fix that is 100% guaranteed to work.

I don't think that's true.

If I use parameterized SQL queries my systems are 100% protected against SQL injection attacks.

If I make a mistake applying those and someone reports it to me I can fix that mistake and now I'm back up to 100%.

If our measures against SQL injection were only 99% effective none of our digital activities involving relational databases would be safe.

I don't think it is unreasonable to want a security fix that, when applied correctly, works 100% of the time.

(I first argued a version of this back in September 2022 in You can’t solve AI security problems with more AI.)

# 16th June 2025, 11:54 pm / sql-injection, security, prompt-injection

[NAME AVAILABLE ON REQUEST FROM COMPANIES HOUSE]. I just noticed that the legendary company name ; DROP TABLE "COMPANIES";-- LTD is now listed as [NAME AVAILABLE ON REQUEST FROM COMPANIES HOUSE] on the UK government Companies House website.

For background, see No, I didn't try to break Companies House by culprit Sam Pizzey.

# 9th April 2025, 4:52 pm / sql, sql-injection

I Went To SQL Injection Court (via) Thomas Ptacek talks about his ongoing involvement as an expert witness in an Illinois legal battle lead by Matt Chapman over whether a SQL schema (e.g. for the CANVAS parking ticket database) should be accessible to Freedom of Information (FOIA) requests against the Illinois state government.

They eventually lost in the Illinois Supreme Court, but there's still hope in the shape of IL SB0226, a proposed bill that would amend the FOIA act to ensure "that the public body shall provide a sufficient description of the structures of all databases under the control of the public body to allow a requester to request the public body to perform specific database queries".

Thomas posted this comment on Hacker News:

Permit me a PSA about local politics: engaging in national politics is bleak and dispiriting, like being a gnat bouncing off the glass plate window of a skyscraper. Local politics is, by contrast, extremely responsive. I've gotten things done --- including a law passed --- in my spare time and at practically no expense (drastically unlike national politics).

# 25th February 2025, 10:45 pm / data-journalism, databases, government, law, politics, sql, sql-injection, thomas-ptacek

2024

SQL injection-like attack on LLMs with special tokens. Andrej Karpathy explains something that's been confusing me for the best part of a year:

The decision by LLM tokenizers to parse special tokens in the input string (<s>, <|endoftext|>, etc.), while convenient looking, leads to footguns at best and LLM security vulnerabilities at worst, equivalent to SQL injection attacks.

LLMs frequently expect you to feed them text that is templated like this:

<|user|>\nCan you introduce yourself<|end|>\n<|assistant|>

But what happens if the text you are processing includes one of those weird sequences of characters, like <|assistant|>? Stuff can definitely break in very unexpected ways.

LLMs generally reserve special token integer identifiers for these, which means that it should be possible to avoid this scenario by encoding the special token as that ID (for example 32001 for <|assistant|> in the Phi-3-mini-4k-instruct vocabulary) while that same sequence of characters in untrusted text is encoded as a longer sequence of smaller tokens.

Many implementations fail to do this! Thanks to Andrej I've learned that modern releases of Hugging Face transformers have a split_special_tokens=True parameter (added in 4.32.0 in August 2023) that can handle it. Here's an example:

>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")
>>> tokenizer.encode("<|assistant|>")
[32001]
>>> tokenizer.encode("<|assistant|>", split_special_tokens=True)
[529, 29989, 465, 22137, 29989, 29958]

A better option is to use the apply_chat_template() method, which should correctly handle this for you (though I'd like to see confirmation of that).

# 20th August 2024, 10:01 pm / security, sql-injection, transformers, ai, andrej-karpathy, prompt-injection, generative-ai, llms, tokenization

SQL Injection Isn’t Dead: Smuggling Queries at the Protocol Level (via) PDF slides from a presentation by Paul Gerste at DEF CON 32. It turns out some databases have vulnerabilities in their binary protocols that can be exploited by carefully crafted SQL queries.

Paul demonstrates an attack against PostgreSQL (which works in some but not all of the PostgreSQL client libraries) which uses a message size overflow, by embedding a string longer than 4GB (2**32 bytes) which overflows the maximum length of a string in the underlying protocol and writes data to the subsequent value. He then shows a similar attack against MongoDB.

The current way to protect against these attacks is to ensure a size limit on incoming requests. This can be more difficult than you may expect - Paul points out that alternative paths such as WebSockets might bypass limits that are in place for regular HTTP requests, plus some servers may apply limits before decompression, allowing an attacker to send a compressed payload that is larger than the configured limit.

How Web Apps Handle Large Payloads. Potential bypasses: - Unprotected endpoints - Compression - WebSockets (highlighted) - Alternate body types - Incrementation.  Next to WebSockets:  - Compression support - Large message size - Many filters don't apply

# 12th August 2024, 3:36 pm / http, mongodb, postgresql, security, sql-injection, websockets

2023

Bard now helps you code (via) Google have enabled Bard’s code generation abilities—these were previously only available through jailbreaking. It’s pretty good—I got it to write me code to download a CSV file and insert it into a SQLite database—though when I challenged it to protect against SQL injection it hallucinated a non-existent “cursor.prepare()” method. Generated code can be exported to a Colab notebook with a click.

# 21st April 2023, 3:32 pm / google, sql-injection, ai, generative-ai, bard, llms

2022

Prompt injection attacks against GPT-3

Visit Prompt injection attacks against GPT-3

Riley Goodside, yesterday:

[... 1,457 words]

SOC2 is about the security of the company, not the company’s products. A SOC2 audit would tell you something about whether the customer support team could pop a shell on production machines; it wouldn’t tell you anything about whether an attacker could pop a shell with a SQL Injection vulnerability.

Thomas Ptacek

# 7th July 2022, 8:31 pm / security, sql-injection, thomas-ptacek, fly

2020

Pysa: An open source static analysis tool to detect and prevent security issues in Python code (via) Interesting new static analysis tool for auditing Python for security vulnerabilities—things like SQL injection and os.execute() calls. Built by Facebook and tested extensively on Instagram, a multi-million line Django application.

# 7th August 2020, 8:50 pm / django, facebook, python, security, sql-injection, staticanalysis

2012

How are websites hacked to have their content defaced? How can I prevent such attacks on my website?

There are countless ways in which a website could be defaced—way too many for a single Quora answer!

[... 266 words]

What are the best practices to avoid XSS and SQL Injections attacks (platform agnostic)?

Input validation is, in my opinion, a red herring. Sure—if you ask the user for an integer or date you should make sure they entered one before attempting to save it anywhere or use it for processing, but injection attacks often involve text fields (e.g. names, or comments posted on Quora) and validating those on input is a recipe for banning “Tim O’Reilly” from ever creating a proper profile on your site!

[... 316 words]

2009

For those who haven't heard the story the details were pulled from a Christian dating site db.singles.org which had a query parameter injection vulnerability. The vulnerability allowed you to navigate to a person's profile by entering the user id and skipping authentication. Once you got there the change password form had the passwords in plain text. Someone wrote a scraper and now the entire database is on Mediafire and contains thousands of email/password combinations.

rossriley on Hacker News

# 23rd August 2009, 10:10 am / passwords, security, sql-injection

2008

How one site dealt with SQL injection attack (via) Horrifying story of developer incompetence from Autoweb: “The contractor had no idea how to find and fix the Web page vulnerability that allowed the SQL injection attack code to execute successfully.”

# 2nd May 2008, 9:01 pm / autoweb, incompetence, security, sql-injection

Mass Attack FAQ. Thousands of IIS Web servers have been infected with an automated mass XSS attack, not through a specific IIS vulnerability but using a universal XSS SQL query that targets SQL Server and modifies every text field to add the attack JavaScript. If an app has even a single SQL injection hole (and many do) it is likely to be compromised.

# 26th April 2008, 9:12 am / iis, massattack, security, sql, sql-injection, sqlserver, xss

2002

OWASP Security guide

The Open Web Application Security Project (OWASP) have a free guide to building secure web applications, which covers a large range of common problems such as cross site scripting and SQL injection vulnerabilities. The report is a 60 page PDF and although I haven’t had time to go through it yet it looks like an excellent read.

PHP immune to SQL injection attacks

An interesting thread on SitePoint about SQL injection attacks. One of the points brought up is that PHP is by default virtually immune to injection attacks thanks to magic quotes (discussed here yesterday).