scrapely. Neat twist on a screen scraping library: this one lets you “train it” by feeding it examples of URLs paired with a dictionary of the data you would like to have extracted from that URL, then uses an instance-based learning algorithm to run against new URLs. Slightly confusing name, since it’s maintained by the scrapy team but is a totally independent project from the scrapy web crawling framework.
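
To make the "train by example" idea concrete, here is a toy sketch of the general approach. It is not scrapely's API or its actual algorithm: the `ToyScraper` class, its `context` parameter and the sample HTML are all invented for illustration, and it works on HTML strings rather than URLs so that it runs without a network. It remembers the text surrounding each example value, then looks for the same surroundings in a new page.

```python
class ToyScraper:
    """Hypothetical illustration only; not scrapely's API or algorithm."""

    def __init__(self, context=20):
        # Number of characters of surrounding markup to remember per field.
        self.context = context
        # One dict per training example, mapping field -> (prefix, suffix).
        self.templates = []

    def train(self, html, data):
        """Record the markup immediately around each example value."""
        template = {}
        for field, value in data.items():
            start = html.find(value)
            if start == -1:
                continue  # example value does not appear in this page
            end = start + len(value)
            prefix = html[max(0, start - self.context):start]
            suffix = html[end:end + self.context]
            if prefix and suffix:
                template[field] = (prefix, suffix)
        self.templates.append(template)

    def scrape(self, html):
        """Extract fields from a new page using the remembered contexts."""
        result = {}
        for template in self.templates:
            for field, (prefix, suffix) in template.items():
                if field in result:
                    continue
                start = html.find(prefix)
                if start == -1:
                    continue
                start += len(prefix)
                end = html.find(suffix, start)
                if end == -1:
                    continue
                result[field] = html[start:end]
        return result


# Made-up pages that share a layout but differ in content.
train_html = '<html><h1 class="title">Blue Widget</h1><span class="price">$10</span></html>'
new_html = '<html><h1 class="title">Red Gadget</h1><span class="price">$25</span></html>'

scraper = ToyScraper()
scraper.train(train_html, {"name": "Blue Widget", "price": "$10"})
result = scraper.scrape(new_html)
print(result)  # {'name': 'Red Gadget', 'price': '$25'}
```

A real library has to cope with pages whose markup shifts between examples, which is where learning from several training instances comes in; this sketch only handles pages that share the exact surrounding markup.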