I am using pisa, which is an HTML to PDF conversion library for Python.
Does there exist the same thing for a Word document: an HTML to .doc conversion library for Python?
You could use win32com from the pywin32 extensions for Windows to let MS Word do the conversion for you. This requires Word to be installed. A simple example:
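import os
import win32com.client

# Word resolves relative paths against its own working directory, so pass absolute paths.
word = win32com.client.Dispatch('Word.Application')
doc = word.Documents.Add(os.path.abspath('example.html'))
doc.SaveAs(os.path.abspath('example.doc'), FileFormat=0)  # 0 = wdFormatDocument (.doc)
doc.Close()
word.Quit()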
I am not aware of a module that performs this conversion directly.
In case anybody else lands here trying to convert the other way around (Word document to HTML), the above code works, but you need to change the FileFormat value. The available values are listed here:
http://msdn.microsoft.com/en-us/library/ff839952.aspx
Example: filtered HTML (wdFormatFilteredHTML) is 10, instead of 0 (wdFormatDocument).
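To make the FileFormat point concrete, here is a minimal sketch of a helper that takes the target format by name. The function name `convert_with_word` and the `WD_SAVE_FORMATS` table are made up for illustration and are not part of pywin32. The numeric values are WdSaveFormat constants from the Word object model. It assumes Windows with MS Word and pywin32 installed, and it uses `Documents.Open` rather than `Documents.Add` to load the source file.

```python
import os

# WdSaveFormat constants from the Word object model.
# The dictionary itself is an illustrative helper, not a pywin32 API.
WD_SAVE_FORMATS = {
    "doc": 0,             # wdFormatDocument
    "html": 8,            # wdFormatHTML
    "filtered_html": 10,  # wdFormatFilteredHTML
}


def convert_with_word(src, dst, target):
    """Open src in MS Word and save it to dst in the named target format."""
    # Look the format up first so a bad name fails before Word is started.
    file_format = WD_SAVE_FORMATS[target]

    # Imported lazily: win32com only exists on Windows with pywin32 installed.
    import win32com.client

    word = win32com.client.Dispatch("Word.Application")
    try:
        # Word resolves relative paths against its own working directory,
        # so both paths are made absolute.
        doc = word.Documents.Open(os.path.abspath(src))
        try:
            doc.SaveAs(os.path.abspath(dst), FileFormat=file_format)
        finally:
            doc.Close()
    finally:
        # Always shut Word down, even if the conversion failed.
        word.Quit()
```

With this, converting a Word document to filtered HTML would be `convert_with_word("example.doc", "example.html", "filtered_html")`, and HTML to .doc would be `convert_with_word("example.html", "example.doc", "doc")`.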
Update for Python 3.x: the htmldocx package converts HTML to .docx:
from htmldocx import HtmlToDocx
new_parser = HtmlToDocx()
new_parser.parse_html_file("html_filename", "docx_filename")
# File extensions are not needed, but tolerated