Notes on Google’s Gemma 3
12th March 2025
Google’s Gemma team released an impressive new model today (under their not-open-source Gemma license). Gemma 3 comes in four sizes—1B, 4B, 12B, and 27B—and while 1B is text-only, the three larger models are all multi-modal for vision:
Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles context windows up to 128k tokens, understands over 140 languages, and offers improved math, reasoning, and chat capabilities, including structured outputs and function calling.
Here’s the Gemma 3 Technical Report PDF, which makes the big claim that they find “Gemma3-27B-IT comparable to Gemini-1.5-Pro across benchmarks”.
I tried out the largest model using the latest Ollama—this is the second time I’ve spotted a major model release partnering with Ollama on launch day, the first being Mistral Small 3.
I ran this (after upgrading Ollama through their menu icon upgrade option):
ollama pull gemma3:27b
That pulled 17GB of model weights. I’ve been trying it out using LLM and llm-ollama:
llm install llm-ollama
llm -m gemma3:27b 'Build a single page HTML+CSS+JavaScript UI that gives me a large textarea for writing in which constantly saves what I have entered to localStorage (restoring when I reload the page) and displays a word counter'
That was a replay of a prompt I ran against Claude Artifacts a few months ago. Here’s what Gemma built, and the full chat transcript. It’s a simple example but it worked just right.
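For reference, the core behaviour that prompt asks for fits in a few lines of JavaScript. This is my own minimal sketch, not Gemma's actual output (the storage key and element selectors are made up):

```javascript
// Autosave a textarea to localStorage and show a live word count.
const STORAGE_KEY = "scratchpad"; // hypothetical key, not from the transcript

// Pure helper so the counting logic is easy to test on its own.
function wordCount(text) {
  return text.trim().split(/\s+/).filter(Boolean).length;
}

// DOM wiring, guarded so the file can also load outside a browser.
if (typeof document !== "undefined") {
  const textarea = document.querySelector("textarea");
  const counter = document.querySelector("#counter");
  // Restore any previously saved draft on page load.
  textarea.value = localStorage.getItem(STORAGE_KEY) || "";
  counter.textContent = `${wordCount(textarea.value)} words`;
  textarea.addEventListener("input", () => {
    localStorage.setItem(STORAGE_KEY, textarea.value);
    counter.textContent = `${wordCount(textarea.value)} words`;
  });
}
```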
Let’s try out the vision mode, by asking it to describe this photo:
llm -m gemma3:27b describe -a https://static.simonwillison.net/static/2025/two-pelicans.jpg
Here’s the full response:
Overall Impression:
The image features two brown pelicans in flight against a solid, bright blue background. The composition is dynamic, capturing the birds mid-wingbeat.
Details:
- Pelicans: Both pelicans are a rich, dark brown color. They have large, pale yellow-orange beaks and pouches. One pelican is shown with its wings fully extended, appearing to be gliding or soaring. The other is captured with its wings in a more dynamic, upward-sweeping motion.
- Background: The background is a flat, uniform shade of bright blue, which makes the pelicans stand out.
- Composition: The birds are positioned in opposite directions, creating a sense of movement and balance. The image is well-lit, highlighting the texture of the pelicans’ feathers.
Overall, the image is a striking and visually appealing depiction of these majestic birds in flight.
Is it right to say “the birds are positioned in opposite directions”? I thought that was a mistake at first, but I’ll give it a pass since their wings are positioned in opposite directions to each other.
Something I’ve been curious about recently is longer context support: how well can a local model on my laptop deal with summarization or data extraction tasks against longer pieces of text?
I decided to try my Hacker News summarize script using Gemma, against the thread there discussing the Gemma 3 technical paper.
First I did a quick token count using ttok (which uses the OpenAI tokenizer, but counts are usually similar across models):
curl 'https://hn.algolia.com/api/v1/items/43340491' | ttok
This returned 22,260—well within Gemma’s documented limits but still a healthy number considering just last year most models topped out at 4,000 or 8,000.
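If you don't have a tokenizer to hand, a common rule of thumb is that OpenAI-style tokenizers average roughly four characters per English token, so dividing the character count by four gives a ballpark figure (ttok gives you the exact count). A sketch of that heuristic:

```javascript
// Ballpark token estimate: ~4 characters per token for English text.
// This is a rough heuristic, not a real tokenizer.
function estimateTokens(text, charsPerToken = 4) {
  return Math.ceil(text.length / charsPerToken);
}
```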
I ran my script like this:
hn-summary.sh 43340491 -m gemma3:27b
It did a pretty good job! Here’s the full prompt and response. The one big miss is that it ignored my instructions to include illustrative quotes—I don’t know if modifying the prompt will fix that but it’s disappointing that it didn’t handle that well, given how important direct quotes are for building confidence in RAG-style responses.
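A script like hn-summary.sh needs to flatten the nested comment tree from the Algolia items API into plain text before piping it to the model. The `text` and `children` field names below match the Algolia response; the walk itself is my own illustration, not the actual script:

```javascript
// Recursively walk an Algolia Hacker News item, yielding indented comment text.
function flattenComments(item, depth = 0) {
  const lines = [];
  if (item.text) lines.push("  ".repeat(depth) + item.text);
  for (const child of item.children || []) {
    lines.push(...flattenComments(child, depth + 1));
  }
  return lines;
}

// Tiny stand-in for the real response from
// https://hn.algolia.com/api/v1/items/43340491
const thread = {
  title: "The Gemma 3 Technical Report",
  children: [
    { text: "Impressive benchmarks.", children: [
      { text: "Agreed, especially at 27B.", children: [] },
    ]},
  ],
};
console.log(flattenComments(thread).join("\n"));
```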
Here’s what I got for “Generate an SVG of a pelican riding a bicycle”:
llm -m gemma3:27b 'Generate an SVG of a pelican riding a bicycle'
You can also try out the new Gemma in Google AI Studio, and via their API. I added support for it to llm-gemini 0.15, though sadly it appears vision mode doesn’t work with that API-hosted model yet.
llm install -U llm-gemini
llm keys set gemini
# paste key here
llm -m gemma-3-27b-it 'five facts about pelicans of interest to skunks'
Here’s what I got. I’m not sure how pricing works for that hosted model.
Gemma 3 is also already available through MLX-VLM—here’s the MLX model collection—but I haven’t tried that version yet.