A plan for spam
16th August 2002
Paul Graham: A Plan for Spam. Paul suggests using content based filters that learn from users specifically marking messages as spam or legitimate mail. The system then picks emails apart looking for commmon terms (in both the body and the header of the message) that can then be used later on to identify spam messages. He claims his test have let through only 5 per 1000 spams, with 0 false positives
. Impressive stuff, and great reading for the excellent explanations of some advanced alogithmic and statistical techniques.
More recent articles
- Highlights from my appearance on the Data Renegades podcast with CL Kao and Dori Wilson - 26th November 2025
- Claude Opus 4.5, and why evaluating new LLMs is increasingly difficult - 24th November 2025
- sqlite-utils 4.0a1 has several (minor) backwards incompatible changes - 24th November 2025