experimental-phi3-webgpu (via) Run Microsoft’s excellent Phi-3 model directly in your browser, using WebGPU so didn’t work in Firefox for me, just in Chrome.
It fetches around 2.1GB of data into the browser cache on first run, but then gave me decent quality responses to my prompts running at an impressive 21 tokens a second (M2, 64GB).
I think Phi-3 is the highest quality model of this size, so it’s a really good fit for running in a browser like this.
Recent articles
- First impressions of Claude Cowork, Anthropic's general agent - 12th January 2026
- My answers to the questions I posed about porting open source code with LLMs - 11th January 2026
- Fly's new Sprites.dev addresses both developer sandboxes and API sandboxes at the same time - 9th January 2026