Simon Willison’s Weblog

8th September 2025

I recently spoke with the CTO of a popular AI note-taking app who told me something surprising: they spend twice as much on vector search as they do on OpenAI API calls. Think about that for a second. Running the retrieval layer costs them more than paying for the LLM itself.

James Luan, Engineering architect of Milvus

This is a quotation collected by Simon Willison, posted on 8th September 2025.