On this site, Adobe writes about converting PDF to HTML using pdfkit. They use the pdfkit.from_pdf(...) method.
This script uses the ‘pdfkit’ library to convert the PDF file to HTML. To use this script, you will need to install the ‘pdfkit’ library...
When I try to use this method, I get the following error:
Traceback (most recent call last):
  File "C:\TestPdfToHtml\script.py", line 7, in <module>
    html_file = pdfkit.from_pdf(pdf_file, "my_html_file.html")
                ^^^^^^^^^^^^^^^
AttributeError: module 'pdfkit' has no attribute 'from_pdf'. Did you mean: 'from_url'?
How can I resolve this problem?
Below is the full script:
import pdfkit
# Read the PDF file
pdf_file = open('test2.pdf', 'rb')
# Convert the PDF to HTML
html_file = pdfkit.from_pdf(pdf_file, "my_html_file.html")
# Close the PDF file
pdf_file.close()
What does the documentation for pdfkit say? – Uralaltaic
[...] convert from pdf to html. – Iveson
I don't see that pdfkit has a from_pdf() function. A thing you can try is seeing if its from_file() function (which exists) happens to open a PDF, something I would not bet on. – Glassware
In pdfkit there are only 3 functions, from_url/from_file/from_string, and the docs say nothing about converting PDF to HTML; maybe Adobe is trolling... – Daveta
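To check the point made in the comments for yourself, you can ask Python which from_* functions the installed module actually exposes. This is only a minimal sketch: list_from_functions is a hypothetical helper name (not part of pdfkit), and it assumes the converters are plain module-level functions, which is what the "Did you mean: 'from_url'?" hint in the traceback suggests.

```python
import importlib
import inspect


def list_from_functions(module_name):
    """Return the sorted names of module-level functions named from_*.

    Hypothetical helper for illustration; returns None when the
    module cannot be imported (for example, it is not installed).
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return sorted(
        name
        for name, obj in inspect.getmembers(module, inspect.isfunction)
        if name.startswith("from_")
    )


if __name__ == "__main__":
    names = list_from_functions("pdfkit")
    if names is None:
        print("pdfkit is not installed")
    else:
        print("pdfkit exposes:", names)
        print("from_pdf available:", "from_pdf" in names)
```

If from_pdf is not in the printed list, the call in the script cannot work no matter how the PDF file is opened, and a different library is needed for PDF-to-HTML conversion.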