Simon Willison’s Weblog

Subscribe

Pelicans on a bicycle. I decided to roll out my own LLM benchmark: how well can different models render an SVG of a pelican riding a bicycle?

I chose that because a) I like pelicans and b) I'm pretty sure there aren't any pelican on a bicycle SVG files floating around (yet) that might have already been sucked into the training data.

My prompt:

Generate an SVG of a pelican riding a bicycle

I've run it through 16 models so far - from OpenAI, Anthropic, Google Gemini and Meta (Llama running on Cerebras), all using my LLM CLI utility. Here's my (Claude assisted) Bash script: generate-svgs.sh

Here's Claude 3.5 Sonnet (2024-06-20) and Claude 3.5 Sonnet (2024-10-22):

Gemini 1.5 Flash 001 and Gemini 1.5 Flash 002:

GPT-4o mini and GPT-4o:

o1-mini and o1-preview:

Cerebras Llama 3.1 70B and Llama 3.1 8B:

And a special mention for Gemini 1.5 Flash 8B:

The rest of them are linked from the README.