20th February 2026
Long running agentic products like Claude Code are made feasible by prompt caching which allows us to reuse computation from previous roundtrips and significantly decrease latency and cost. [...]
At Claude Code, we build our entire harness around prompt caching. A high prompt cache hit rate decreases costs and helps us create more generous rate limits for our subscription plans, so we run alerts on our prompt cache hit rate and declare SEVs if they're too low.
Recent articles
- Two new Showboat tools: Chartroom and datasette-showboat - 17th February 2026
- Deep Blue - 15th February 2026
- The evolution of OpenAI's mission statement - 13th February 2026