If you're a startup running your own crawlers to gather data for whatever purpose, you should try really hard not to make the world a worse place by driving up costs for the sites you are scraping.
There's really no excuse for crawling Wikipedia ("65% of our most expensive traffic comes from bots") when they offer a comprehensive collection of bulk download options.
Do better!
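To make the alternative concrete, here's a minimal sketch of what reading a bulk dump can look like once you've downloaded it, assuming a bz2-compressed MediaWiki-style XML export where each `<page>` element contains a `<title>`. The function name `iter_page_titles` and the tiny sample document (including its namespace) are my own stand-ins for illustration, not anything Wikipedia publishes.

```python
import bz2
import io
import xml.etree.ElementTree as ET
from typing import BinaryIO, Iterator


def iter_page_titles(compressed: BinaryIO) -> Iterator[str]:
    """Yield page titles from a bz2-compressed XML export stream.

    Hypothetical helper: assumes <page> elements that each contain a <title>.
    """
    with bz2.open(compressed, "rb") as xml_stream:
        # iterparse reads incrementally, so the whole dump is never loaded at once.
        for _event, elem in ET.iterparse(xml_stream, events=("end",)):
            # Tags may carry a namespace like "{...}page"; compare the local name only.
            if elem.tag.rsplit("}", 1)[-1] != "page":
                continue
            for child in elem:
                if child.tag.rsplit("}", 1)[-1] == "title":
                    yield child.text or ""
                    break
            # Drop the page's parsed children to keep memory use low.
            elem.clear()


if __name__ == "__main__":
    # Tiny stand-in document; a real dump would be opened from a local file instead.
    sample = (
        b'<mediawiki xmlns="http://example.org/ns">'
        b"<page><title>Alpha</title></page>"
        b"<page><title>Beta</title></page>"
        b"</mediawiki>"
    )
    for title in iter_page_titles(io.BytesIO(bz2.compress(sample))):
        print(title)
```

With a real dump you'd pass `open(path, "rb")` for a file you downloaded once, which costs the site a single static file transfer rather than millions of rendered page requests.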