I have Unicode text in my HTML page, and it displays correctly as HTML. But when I convert the page to PDF using xhtml2pdf, the Unicode characters are rendered as solid black square boxes. Is there some setting for Unicode other than the UTF-8 encoding? I don't think it's a Unicode problem.
# convert HTML to PDF
pisaStatus = pisa.CreatePDF(
    StringIO(sourceHtml.encode('utf-8')),
    dest=resultFile)
Complete Python code:
# -*- coding: utf-8 -*-
from xhtml2pdf import pisa
from StringIO import StringIO
source = """<html>
<style>
@font-face {
font-family: Preeti;
src: url("preeti.ttf");
}
body {
font-family: Preeti;
}
</style>
<body>
This is a test <br/>
सरल
</body>
</html>"""
# Utility function
def convertHtmlToPdf(source):
    # in-memory buffer that receives the generated PDF
    pdf = StringIO()
    pisaStatus = pisa.CreatePDF(StringIO(source.encode('utf-8')), pdf)
    # pisaStatus.err is non-zero if errors occurred during conversion
    print "Success: ", pisaStatus.err
    return pdf

# Main program
if __name__=="__main__":
    print pisa.showLogging()
    pdf = convertHtmlToPdf(source)
    fd = open("test.pdf", "w+b")
    fd.write(pdf.getvalue())
    fd.close()
Do I even need to include the @font-face rule?
Is sourceHtml actually a unicode string? Because normally, an HTML file is in some 8-bit encoding, and calling encode('utf-8') on a str that's already UTF-8 (or, worse, something like Latin-1) isn't going to help. – Yalta
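To make the point in the comment above concrete, here is a minimal sketch of decoding byte input exactly once before encoding it to UTF-8. The helper name `to_utf8_stream` and its `declared_encoding` parameter are hypothetical, not part of xhtml2pdf; the sketch assumes you know (or can find out) which encoding the HTML bytes were saved in. It uses `io.BytesIO` rather than `StringIO` so that it should behave the same on Python 2 and Python 3.

```python
# -*- coding: utf-8 -*-
import io


def to_utf8_stream(html, declared_encoding="utf-8"):
    """Return a binary stream of UTF-8 bytes for `html` (hypothetical helper).

    `declared_encoding` is an assumption: the encoding the HTML bytes were
    saved in. It is only used when `html` arrives as raw bytes.
    """
    if isinstance(html, bytes):
        # Raw bytes (a Python 2 `str`): decode them once into text first,
        # instead of calling encode() on something that is already encoded.
        html = html.decode(declared_encoding)
    # Text (a Python 2 `unicode`): encoding to UTF-8 is now well defined.
    return io.BytesIO(html.encode("utf-8"))


# Usage: both text and Latin-1 bytes end up as the same kind of UTF-8 stream.
text_stream = to_utf8_stream(u"सरल")
latin1_stream = to_utf8_stream(u"café".encode("latin-1"), "latin-1")
```

The resulting stream could then be passed where the question passes `StringIO(sourceHtml.encode('utf-8'))`. This only rules out a double-encoding problem; it does not by itself guarantee that the chosen font contains glyphs for the characters.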