22nd September 2007 - Link Blog
robots.txt Adventure. Interesting notes from crawling 4.6 million robots.txt, including 69 different ways in which the word “disallow” can be mis-spelled.
Recent articles
- Tracking the history of the now-deceased OpenAI Microsoft AGI clause - 27th April 2026
- DeepSeek V4 - almost on the frontier, a fraction of the price - 24th April 2026
- Extract PDF text in your browser with LiteParse for the web - 23rd April 2026