I have an XML document with embedded HTML content that I am trying to convert to an RTF file. The XML elements contain <li>, <p>, <b> and other HTML markup, which I would like carried over into the generated RTF.
Here is what works as of now:
- Fetch XML tag content as string (containing HTML tags for line breaks, paragraph breaks, and lists)
- Write the XML tag content to an RTF file.
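For reference, the two working steps look roughly like the sketch below. The element name `description`, the sample input, and the helper names `inner_markup` and `write_rtf` are made up for illustration; my real document is structured differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical input; the real element names and content differ.
SAMPLE = (
    "<doc><description><p>Intro</p>"
    "<ol><li>One</li><li>Two</li></ol></description></doc>"
)

def inner_markup(elem):
    """Return the element's content as a string, keeping nested HTML tags."""
    parts = [elem.text or ""]
    for child in elem:
        # tostring() serializes the child together with its tail text
        parts.append(ET.tostring(child, encoding="unicode"))
    return "".join(parts)

def write_rtf(path, text):
    """Wrap the text in a minimal RTF document and write it to disk."""
    with open(path, "w", encoding="ascii", errors="replace") as fh:
        fh.write(r"{\rtf1\ansi " + text + "}")

root = ET.fromstring(SAMPLE)
html = inner_markup(root.find("description"))
```

Passing `html` to `write_rtf` at this point puts the tags into the RTF file verbatim, which is exactly the problem.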
I am using Python for the conversion, with ElementTree (to parse the input XML) and PyRTF-NG (to generate the RTF output), a library that handles tables and other special formatting. At the moment, I have everything I need except the rendering of the HTML markup (i.e. translating HTML formatting tags into actual RTF formatting). To clarify: if my RTF converter encounters <ol><li> tags, it should create an ordered list in the RTF, instead of just writing the literal <ol><li> tags into the RTF.
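To illustrate the kind of translation I am after, here is a rough sketch that uses the standard library's `html.parser` and writes raw RTF control words by hand instead of going through PyRTF-NG. The names `HtmlToRtf` and `html_to_rtf` are hypothetical. It is a simplification under several assumptions: list numbers are written as literal text rather than as real RTF list definitions, only a handful of tags are handled, and non-ASCII characters are not escaped.

```python
from html.parser import HTMLParser

class HtmlToRtf(HTMLParser):
    """Translate a small subset of HTML tags into raw RTF control words."""

    def __init__(self):
        super().__init__()
        self.out = []
        self.lists = []  # stack of counters: int for <ol>, None for <ul>

    def handle_starttag(self, tag, attrs):
        if tag == "b":
            self.out.append(r"{\b ")       # open a bold group
        elif tag == "br":
            self.out.append(r"\line ")     # line break
        elif tag == "ol":
            self.lists.append(0)
        elif tag == "ul":
            self.lists.append(None)
        elif tag == "li" and self.lists:
            if self.lists[-1] is None:
                label = r"\bullet"
            else:
                self.lists[-1] += 1
                label = f"{self.lists[-1]}."
            self.out.append(label + r"\tab ")

    def handle_endtag(self, tag):
        if tag == "b":
            self.out.append("}")           # close the bold group
        elif tag in ("p", "li"):
            self.out.append(r"\par ")      # end of paragraph
        elif tag in ("ol", "ul") and self.lists:
            self.lists.pop()

    def handle_data(self, data):
        # Escape the characters that are special in RTF (backslash first)
        for ch in "\\{}":
            data = data.replace(ch, "\\" + ch)
        self.out.append(data)

def html_to_rtf(html):
    parser = HtmlToRtf()
    parser.feed(html)
    parser.close()
    return r"{\rtf1\ansi " + "".join(parser.out) + "}"
```

With this, `<ol><li>One</li><li>Two</li></ol>` comes out as numbered paragraphs rather than literal tags, but it is not a true RTF list, which is why I am hoping an existing library already covers this.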
Does anyone know whether Python has any built-in functionality that would let me do this, or of any other Python libraries that could complete the full conversion to RTF?
Thanks!