I'm using "pdftotext -bbox file.pdf" to convert a PDF file into HTML.
Here's a sample line from the output:
<word xMin="351.852025" yMin="42.548936" xMax="365.689478" yMax="47.681498">foo</word>
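To show what that output gives me to work with, here is a minimal sketch of reading such a line in Python. It assumes the standard library's xml.etree.ElementTree and that each word element is well-formed XML; parse_word is just a hypothetical helper name, not part of pdftotext. Only the text and the four coordinates come out of it, nothing about the font.

```python
import xml.etree.ElementTree as ET

# The sample line from the -bbox output shown above.
SAMPLE = ('<word xMin="351.852025" yMin="42.548936" '
          'xMax="365.689478" yMax="47.681498">foo</word>')


def parse_word(fragment):
    """Return the text and bounding box of one <word> element (hypothetical helper)."""
    elem = ET.fromstring(fragment)
    # The four attributes are the only per-word metadata in the sample line.
    box = {key: float(elem.get(key)) for key in ("xMin", "yMin", "xMax", "yMax")}
    return {"text": elem.text, **box}


word = parse_word(SAMPLE)
width = word["xMax"] - word["xMin"]
height = word["yMax"] - word["yMin"]
print(word["text"], width, height)
```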
Is there a way to get font information for every word like:
- font family, e.g. Verdana
- style, i.e. none, bold, italic
- size, e.g. font size 9
I'm interested in knowing if either the poppler or xpdf version of pdftotext can do this.