21st July 2025
An AI tool that gets gold on the IMO is obviously immensely impressive. Does it mean math is “solved”? Is an AI-generated proof of the Riemann hypothesis clearly on the horizon? Obviously not.
Worth keeping timescales in mind here: IMO competitors spend an average of 1.5 hrs on each problem. High-quality math research, by contrast, takes month or years.
What are the obstructions to AI performing high-quality autonomous math research? I don’t claim to know for sure, but I think they include many of the same obstructions that prevent it from doing many jobs: Long context, long-term planning, consistency, unclear rewards, lack of training data, etc.
It’s possible that some or all of these will be solved soon (or have been solved) but I think it’s worth being cautious about over-indexing on recent (amazing) progress.
— Daniel Litt, Assistant Professor of mathematics, University of Toronto
Recent articles
- LLM 0.32a0 is a major backwards-compatible refactor - 29th April 2026
- Tracking the history of the now-deceased OpenAI Microsoft AGI clause - 27th April 2026
- DeepSeek V4 - almost on the frontier, a fraction of the price - 24th April 2026