Fastest way to turn HTML into text in Python (via) A light benchmark of the new-to-me selectolax Python library shows it performing extremely well for tasks such as extracting just the text from an HTML string, after first manipulating the DOM. selectolax is a Python binding over the Modest and Lexbor HTML parsing engines, which are written in no-outside-dependency C.
Recent articles
- Nano Banana Pro aka gemini-3-pro-image-preview is the best available image generation model - 20th November 2025
- How I automate my Substack newsletter with content from my blog - 19th November 2025
- Trying out Gemini 3 Pro with audio transcription and a new pelican benchmark - 18th November 2025