Simon Willison’s Weblog

Subscribe

Saturday, 6th January 2024

Adversarial Machine Learning: A Taxonomy and Terminology of Attacks and Mitigations (via) NIST—the National Institute of Standards and Technology, a US government agency, released a 106 page report on attacks against modern machine learning models, mostly covering LLMs.

Prompt injection gets two whole sections, one on direct prompt injection (which incorporates jailbreaking as well, which they misclassify as a subset of prompt injection) and one on indirect prompt injection.

They talk a little bit about mitigations, but for both classes of attack conclude: “Unfortunately, there is no comprehensive or foolproof solution for protecting models against adversarial prompting, and future work will need to be dedicated to investigating suggested defenses for their efficacy.”

# 4:08 am / ai, prompt-injection, generative-ai, llms

Microsoft Research relicense Phi-2 as MIT (via) Phi-2 was already an interesting model—really strong results for its size—made available under a non-commercial research license. It just got significantly more interesting: Microsoft relicensed it as MIT open source.

# 6:06 am / microsoft, open-source, mitlicense, ai, generative-ai, llms, phi

GPT in 500 lines of SQL (via) Utterly brilliant piece of PostgreSQL hackery by Alex Bolenok, who implements a full GPT-2 style language model in SQL on top of pg_vector. The final inference query is 498 lines long!

# 10:55 pm / postgresql, sql, ai, generative-ai, llms