Tool: OCR PDFs and images directly in your browser

30th March 2024

Extract text from PDF documents and images using optical character recognition (OCR) directly in your browser. The tool uses Tesseract.js for text recognition and PDF.js to handle multi-page PDF files. It supports multiple languages and accepts image formats including JPEG, PNG, and GIF. All processing happens locally in your browser, and no files are transmitted to external servers.
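The description above implies a simple pipeline: work out whether the file is a PDF or an image, render each PDF page to an image, then run OCR on each image. Here is a minimal TypeScript sketch of that flow. It is not the tool's actual source. The names `classifyFile`, `OcrDeps` and `ocrAllPages` are hypothetical, and the PDF.js and Tesseract.js calls are passed in as plain functions so the page loop can be read (and run) without a browser.

```typescript
// Hypothetical helper: decide how to handle an uploaded file from its MIME type.
type FileKind = "pdf" | "image" | "unsupported";

function classifyFile(mimeType: string): FileKind {
  if (mimeType === "application/pdf") return "pdf";
  const imageTypes = ["image/jpeg", "image/png", "image/gif"];
  return imageTypes.indexOf(mimeType) !== -1 ? "image" : "unsupported";
}

// Hypothetical shape for the injected steps. In a browser these would wrap
// PDF.js (getPage, renderPage) and Tesseract.js (recognize).
interface OcrDeps<Page, Image> {
  numPages: number;
  getPage(pageNumber: number): Promise<Page>;
  renderPage(page: Page): Promise<Image>;
  recognize(image: Image): Promise<string>;
}

// OCR every page in order and return one string of text per page.
async function ocrAllPages<Page, Image>(
  deps: OcrDeps<Page, Image>
): Promise<string[]> {
  const texts: string[] = [];
  // PDF.js page numbers start at 1, not 0.
  for (let n = 1; n <= deps.numPages; n++) {
    const page = await deps.getPage(n);
    const image = await deps.renderPage(page);
    texts.push(await deps.recognize(image));
  }
  return texts;
}

// Browser wiring (not executed here), assuming `pdfjsLib` and `Tesseract`
// are already loaded as globals and `file` came from a file input:
//
//   const pdf = await pdfjsLib.getDocument({ data: await file.arrayBuffer() }).promise;
//   const texts = await ocrAllPages({
//     numPages: pdf.numPages,
//     getPage: (n) => pdf.getPage(n),
//     renderPage: async (page) => {
//       const viewport = page.getViewport({ scale: 2 }); // scale 2 is an arbitrary choice
//       const canvas = document.createElement("canvas");
//       canvas.width = viewport.width;
//       canvas.height = viewport.height;
//       await page.render({ canvasContext: canvas.getContext("2d"), viewport }).promise;
//       return canvas;
//     },
//     recognize: async (canvas) => (await Tesseract.recognize(canvas, "eng")).data.text,
//   });
```

For a plain image there is no rendering step: the file can be handed to `Tesseract.recognize` directly. Because both libraries run in the page itself, nothing in this flow needs a network request once the scripts and language data have loaded.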

Posted 30th March 2024 at 4:34 pm

Simon Willison’s Weblog
