LLM SVG Generation Benchmark (via) Here's a delightful project by Tom Gally, inspired by my pelican SVG benchmark. He asked Claude to help create more prompts of the form Generate an SVG of [A] [doing] [B] and then ran 30 creative prompts against 9 frontier models - prompts like "an octopus operating a pipe organ" or "a starfish driving a bulldozer".
Here are some for "butterfly inspecting a steam engine":

And for "sloth steering an excavator":

It's worth browsing the whole collection, which gives a really good overall indication of which models are the best at SVG art.
Recent articles
- Claude Opus 4.5, and why evaluating new LLMs is increasingly difficult - 24th November 2025
- sqlite-utils 4.0a1 has several (minor) backwards incompatible changes - 24th November 2025
- Olmo 3 is a fully open LLM - 22nd November 2025