22nd July 2024 - Link Blog
Breaking Instruction Hierarchy in OpenAI's gpt-4o-mini. Johann Rehberger digs further into GPT-4o's "instruction hierarchy" protection and finds that it has little impact at all on common prompt injection approaches.
I spent some time this weekend to get a better intuition about
gpt-4o-minimodel and instruction hierarchy, and the conclusion is that system instructions are still not a security boundary.From a security engineering perspective nothing has changed: Do not depend on system instructions alone to secure a system, protect data or control automatic invocation of sensitive tools.
Recent articles
- Initial impressions of Claude Fable 5 - 9th June 2026
- Running Python code in a sandbox with MicroPython and WASM - 6th June 2026
- Claude Opus 4.8: "a modest but tangible improvement" - 28th May 2026