Simon Willison’s Weblog

Subscribe
Atom feed for pelican-riding-a-bicycle

61 posts tagged “pelican-riding-a-bicycle”

My benchmark for LLMs: "Generate an SVG of a pelican riding a bicycle".

"User might be a kid playing with words" according to Qwen3-4B-Thinking.

2024

Pelicans on a bicycle. I decided to roll out my own LLM benchmark: how well can different models render an SVG of a pelican riding a bicycle?

I chose that because a) I like pelicans and b) I'm pretty sure there aren't any pelican on a bicycle SVG files floating around (yet) that might have already been sucked into the training data.

My prompt:

Generate an SVG of a pelican riding a bicycle

I've run it through 16 models so far - from OpenAI, Anthropic, Google Gemini and Meta (Llama running on Cerebras), all using my LLM CLI utility. Here's my (Claude assisted) Bash script: generate-svgs.sh

Here's Claude 3.5 Sonnet (2024-06-20) and Claude 3.5 Sonnet (2024-10-22):

Gemini 1.5 Flash 001 and Gemini 1.5 Flash 002:

GPT-4o mini and GPT-4o:

o1-mini and o1-preview:

Cerebras Llama 3.1 70B and Llama 3.1 8B:

And a special mention for Gemini 1.5 Flash 8B:

The rest of them are linked from the README.

# 25th October 2024, 11:56 pm / svg, ai, openai, generative-ai, llama, llms, llm, anthropic, gemini, cerebras, pelican-riding-a-bicycle