Convert PDF Document to XML

gawati/pdf-to-xml

This is a fork of the pdfminer tool, focused on extracting semantic XML from OCR-ed PDFs. It extracts PDF content page by page and marks up words and lines with distinct tags.
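
To make the page, line, and word nesting concrete, here is a minimal sketch using only Python's standard library. It does not call the fork or pdfminer itself. The element names (`document`, `page`, `line`, `word`), the `number` attribute, and the helper `pages_to_xml` are assumptions chosen for illustration, not the tool's actual output schema.

```python
import xml.etree.ElementTree as ET


def pages_to_xml(pages):
    """Build an XML tree from already-extracted text.

    `pages` is a list of pages; each page is a list of lines;
    each line is a list of word strings. Tag names are hypothetical.
    """
    root = ET.Element("document")
    for page_no, lines in enumerate(pages, start=1):
        # One <page> element per page, numbered from 1.
        page_el = ET.SubElement(root, "page", number=str(page_no))
        for line in lines:
            # One <line> element per text line on the page.
            line_el = ET.SubElement(page_el, "line")
            for word in line:
                # One <word> element per word in the line.
                word_el = ET.SubElement(line_el, "word")
                word_el.text = word
    return root


# Made-up sample input: two pages of extracted words.
sample = [
    [["Convert", "PDF"], ["to", "XML"]],
    [["Second", "page"]],
]

root = pages_to_xml(sample)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

Keeping words as separate elements inside lines lets downstream code query at either granularity, for example `root.findall("./page/line")` for lines or `root.iter("word")` for every word.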
