Examples are the #1 thing I recommend people use in their prompts because they work so well. The problem is that adding tons of examples increases your API costs and latency. Prompt caching fixes this. You can now add tons of examples to every prompt and create an alternative to a model finetuned on your task with basically zero cost/latency increase. […]
This works even better with smaller models. You can generate tons of examples (test case + solution) with 3.5 Sonnet and then use those examples to create a few-shot prompt for Haiku.
Recent articles
- AI assisted search-based research actually works now - 21st April 2025
- Maybe Meta's Llama claims to be open source because of the EU AI act - 19th April 2025
- Image segmentation using Gemini 2.5 - 18th April 2025