4th September 2025 - Link Blog
Introducing EmbeddingGemma. Brand new open weights (under the slightly janky Gemma license) 308M parameter embedding model from Google:
Based on the Gemma 3 architecture, EmbeddingGemma is trained on 100+ languages and is small enough to run on less than 200MB of RAM with quantization.
It's available via sentence-transformers, llama.cpp, MLX, Ollama, LMStudio and more.
As usual for these smaller models there's a Transformers.js demo (via) that runs directly in the browser (in Chrome variants) - Semantic Galaxy loads a ~400MB model and then lets you run embeddings against hundreds of text sentences, map them in a 2D space and run similarity searches to zoom to points within that space.

Recent articles
- Notes on the xAI/Anthropic data center deal - 7th May 2026
- Live blog: Code w/ Claude 2026 - 6th May 2026
- Vibe coding and agentic engineering are getting closer than I'd like - 6th May 2026