9th May 2024 - Link Blog
experimental-phi3-webgpu (via) Run Microsoft’s excellent Phi-3 model directly in your browser, using WebGPU so didn’t work in Firefox for me, just in Chrome.
It fetches around 2.1GB of data into the browser cache on first run, but then gave me decent quality responses to my prompts running at an impressive 21 tokens a second (M2, 64GB).
I think Phi-3 is the highest quality model of this size, so it’s a really good fit for running in a browser like this.
Recent articles
- I vibe coded my dream macOS presentation app - 25th February 2026
- Writing about Agentic Engineering Patterns - 23rd February 2026
- Adding TILs, releases, museums, tools and research to my blog - 20th February 2026