Simon Willison’s Weblog

Tuesday, 2nd June 2026

My latest sandboxing experiment: This alpha package bundles a lightly customized WASM build of MicroPython with a wrapper to execute code in it via wasmtime.

I really like how you can paste a large volume of text into claude.ai (or the Claude desktop/mobile apps) and it will detect it as a large paste and turn it into a file attachment instead.

I decided to have Codex desktop build me a version of that as a prototype.

You can also open files directly - including images which will be shown as thumbnails - or drag files onto the textarea.
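The large-paste behaviour described above can be sketched as a small decision function. This is only an illustration of the idea, not how claude.ai or the prototype actually does it: the thresholds, the `Attachment` type, the `handle_paste` name and the `pasted-N.txt` filenames are all assumptions made up for this example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical thresholds: the real apps' limits are not stated in the post.
MAX_INLINE_CHARS = 4000
MAX_INLINE_LINES = 100


@dataclass
class Attachment:
    filename: str
    content: str


def handle_paste(text: str, existing: int = 0) -> Tuple[str, Optional[Attachment]]:
    """Return (inline_text, attachment) for a paste event.

    Small pastes come back as inline text with no attachment. Large pastes
    come back as an empty inline string plus an Attachment holding the text.
    `existing` is the number of attachments already on the message.
    """
    line_count = text.count("\n") + 1
    is_large = len(text) > MAX_INLINE_CHARS or line_count > MAX_INLINE_LINES
    if not is_large:
        return text, None
    # Large paste: keep it out of the textarea and attach it as a file instead.
    return "", Attachment(filename=f"pasted-{existing + 1}.txt", content=text)
```

In a browser this logic would sit inside a `paste` event handler on the textarea; keeping the decision in a pure function makes the thresholds easy to tune and test.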

Sighting 11:17 AM — California Brown Pelican, in Fort Mason, CA, US

I'm at the Microsoft Build conference today, held at Fort Mason in San Francisco. There are California Brown Pelicans diving into the water directly behind the venue!

Fixes for some limitations that emerged while I was trying to use this to build datasette-agent-micropython.

I want Datasette Agent to be able to generate and execute Python code safely. This alpha is looking very promising: GPT-5.5 has so far failed to break out of the sandbox!

Microsoft announced two new text LLMs this morning - MAI-Thinking-1 (reasoning, 35B parameters, available to "select early partners") and MAI-Code-1-Flash (5B parameters, "purpose-built for GitHub Copilot and VS Code to deliver high performance and lower cost [...] rolling out to GitHub Copilot individual users in Visual Studio Code"). I've not been able to try either of them just yet.

It's very interesting to see Microsoft releasing models with such low parameter counts, especially given how expensive larger models are to access right now. They claim MAI-Thinking-1 "is preferred to Sonnet 4.6 in our blind human side-by-side evaluations", which is impressive for a 35B model seeing as I frequently run models larger than that on my own laptop.

Also of note:

We trained [MAI-Thinking-1] from the ground up on enterprise grade, clean and commercially licensed data, without distillation from third-party models.

And for MAI-Code-1-Flash as well:

It is built end-to-end by Microsoft using clean and appropriately licensed data.

I would very much like to learn more about this "appropriately licensed" data! Could these be the first generally useful code-specialist models that didn't train on an unlicensed dump of the web?

# 10:21 pm / microsoft, ai, generative-ai, llms, llm-release

Monday, 1st June 2026
