Simon Willison’s Weblog

1 item tagged “pdfminer”

2008

PDFMiner. A useful-looking PDF parsing library in Python that can produce an XML representation of the text and style information in a PDF document.

3rd August 2008, 3:29 pm / pdf, pdfminer, python, screenscraping, xml