Reservoir Sampling (via) Yet another outstanding interactive essay by Sam Rose (previously), this time explaining how reservoir sampling can be used to select a "fair" random sample when you don't know how many options there are and don't want to accumulate them before making a selection.
Reservoir sampling is one of my favourite algorithms, and I've been wanting to write about it for years now. It allows you to solve a problem that at first seems impossible, in a way that is both elegant and efficient.
I appreciate that Sam starts the article with "No math notation, I promise." Lots of delightful widgets to interact with here, all of which help build an intuitive understanding of the underlying algorithm.
Sam shows how this algorithm can be applied to the real-world problem of sampling log files when incoming logs threaten to overwhelm a log aggregator.
The dog illustration is commissioned art and the MIT-licensed code is available on GitHub.
Recent articles
- Trying out llama.cpp's new vision support - 10th May 2025
- Saying "hi" to Microsoft's Phi-4-reasoning - 6th May 2025
- Feed a video to a vision LLM as a sequence of JPEG frames on the CLI (also LLM 0.25) - 5th May 2025