Render HTML to PDF in Django site
M

10

131

For my Django-powered site, I am looking for an easy way to convert dynamic HTML pages to PDF.

The pages include HTML and charts from the Google Visualization API (which is JavaScript-based, but including those charts is a must).

Moorings answered 4/9, 2009 at 5:54 Comment(6)
Django documentation is deep and covers a lot. Did you have any problems with the method suggested there? http://docs.djangoproject.com/en/dev/howto/outputting-pdf/Stopgap
This doesn't actually answer the question. That documentation is on how to render a PDF natively, not from rendered HTML.Thinia
I guess the right thing to do is to make browsers produce the PDF, because they are the only ones doing proper HTML/CSS/JS rendering. See this question: https://mcmap.net/q/45949/-how-to-use-the-browser-39-s-chrome-firefox-html-css-js-rendering-engine-to-produce-pdf/39998Cacao
This question is off-topic at SO, but on-topic in softwarerecs.SE. See How can I convert HTML with CSS to PDF?.Overweary
Try using wkhtmltopdf: learnbatta.com/blog/…Dawna
You can follow this: reza-ta.medium.com/…Malta
F
227

Try the solution from Reportlab.

Download it and install it as usual with python setup.py install

You will also need to install the following modules: xhtml2pdf, html5lib, pypdf with easy_install.

Here is a usage example:

First define this function:

import cStringIO as StringIO
from xhtml2pdf import pisa
from django.template.loader import get_template
from django.template import Context
from django.http import HttpResponse
from cgi import escape


def render_to_pdf(template_src, context_dict):
    template = get_template(template_src)
    context = Context(context_dict)
    html  = template.render(context)
    result = StringIO.StringIO()

    pdf = pisa.pisaDocument(StringIO.StringIO(html.encode("ISO-8859-1")), result)
    if not pdf.err:
        return HttpResponse(result.getvalue(), content_type='application/pdf')
    return HttpResponse('We had some errors<pre>%s</pre>' % escape(html))

Then you can use it like this:

def myview(request):
    #Retrieve data or whatever you need
    return render_to_pdf(
            'mytemplate.html',
            {
                'pagesize':'A4',
                'mylist': results,
            }
        )

The template:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
    <head>
        <title>My Title</title>
        <style type="text/css">
            @page {
                size: {{ pagesize }};
                margin: 1cm;
                @frame footer {
                    -pdf-frame-content: footerContent;
                    bottom: 0cm;
                    margin-left: 9cm;
                    margin-right: 9cm;
                    height: 1cm;
                }
            }
        </style>
    </head>
    <body>
        <div>
            {% for item in mylist %}
                RENDER MY CONTENT
            {% endfor %}
        </div>
        <div id="footerContent">
            {%block page_foot%}
                Page <pdf:pagenumber>
            {%endblock%}
        </div>
    </body>
</html>
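
If the template references stylesheets or images by URL (for example under /static/), xhtml2pdf has to be told where those files live on disk. One way to do that, sketched here under assumptions, is a link_callback passed to pisa.CreatePDF. The helper name make_link_callback and the example prefixes and directories are hypothetical; they stand in for Django's STATIC_URL, STATIC_ROOT, MEDIA_URL and MEDIA_ROOT settings.

```python
import os


def make_link_callback(static_url, static_root, media_url, media_root):
    """Build a callback that maps template URLs to files on disk.

    All four arguments are assumptions standing in for Django's
    STATIC_URL, STATIC_ROOT, MEDIA_URL and MEDIA_ROOT settings.
    """
    prefixes = ((static_url, static_root), (media_url, media_root))

    def link_callback(uri, rel):
        # xhtml2pdf calls this with the URI found in the HTML and the
        # base location of the document (unused in this sketch).
        for url_prefix, root in prefixes:
            if uri.startswith(url_prefix):
                relative = uri[len(url_prefix):]
                return os.path.join(root, *relative.split("/"))
        # Anything else (absolute http URLs, data URIs) is left untouched.
        return uri

    return link_callback


callback = make_link_callback("/static/", "/srv/static", "/media/", "/srv/media")
print(callback("/static/css/report.css", None))
# In a view, the callback would be handed to xhtml2pdf, for example:
#     pisa.CreatePDF(html, dest=result, link_callback=callback)
```

The path mapping is kept separate from any Django import so it can be checked on its own before wiring it into render_to_pdf.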
Functionalism answered 4/9, 2009 at 7:12 Comment(21)
+1 I've been using this solution for a year and it's great. PISA can even create barcodes with a simple tag, plus much more. And it's easy.Renter
Man, reportlab is pita to install on windows 7 64bit, python2.7 64bit. Still trying...Full
Thanks to this great site with precompiled 64-bit distribution of common python libs I was able to get things going: lfd.uci.edu/~gohlke/pythonlibsFull
It would be nice if there was some reference on how to include media like CSS/images. This guy had the same question as me: #2180458Full
@drozzy, to use CSS I had to include all the css code IN the Html HEAD, inside a <style> tag.Overcast
Doesn't seem to run Javascript.Catanddog
Any suggestion on how Javascript call can be incorporated?Midpoint
Is it able to generate multicolored PDFs with complete CSS ?Barr
pisa is now distributed as xhtml2pdfRosalia
Can't seem to find PisaDocument API, htmltopdf.org/doc/pisa-en.html seems to be down as well. Any other sources of documentation?Kutz
How can I force a next page?Kutz
how would I disable footer from page 1?Kutz
This works great, but I cannot seem to render HTML with an <ol></ol> tag that uses list-style-type: upper-alpha;, I cannot get the OL to use letters instead of numbers when rendered to PDF using Pisa and cStringIO. It renders fine when using Django's render_to_response, but not when I render to PDF. Any ideas on how to get this method to respect the alphabetic Ordered List (ol) type?Kellner
I guess the right thing to do is to make browsers produce the PDF, because they are the only ones doing proper HTML/JS/CSS rendering. See this question: https://mcmap.net/q/45949/-how-to-use-the-browser-39-s-chrome-firefox-html-css-js-rendering-engine-to-produce-pdf/39998Cacao
In the section "First define this function:" of the answer above, how can I get only the PDF file instead of a response? I want to save the PDF file to my file system.Rufe
It works only for static HTML content. What about dynamic content (charts, graphs) generated by JavaScript?Bawd
Does it support svg graphics embedded within html?Poitiers
In Python 3, besides converting cStringIO.StringIO to io.StringIO, we must define result as result = io.BytesIO() instead of result = StringIO.Colier
@Johan It looks like only the experimental version of xhtml2pdf supports Python 3 for now. See github.com/xhtml2pdf/xhtml2pdf#installation. So you need to reinstall: pip uninstall xhtml2pdf, then pip install --pre xhtml2pdf. Also follow @Colier's advice to work correctly with StringIO in Python 3.Ankh
@Johan But after some tries today I still have several problems with xhtml2pdf on Python 3; maybe they are related to the unstable release. I have now switched to pdfkit: maketips.net/tip/72/html-to-pdf-in-django. It works perfectly on Python 3, supports styles and images without any problems, and is based on the lightweight wkhtmltopdf binary.Ankh
How can I download multiple files with this approach?Melton
Y
15

Try wkhtmltopdf with either of the following wrappers:

django-wkhtmltopdf or python-pdfkit

This worked great for me. It supports JavaScript, CSS, and anything else a WebKit browser supports.

For a more detailed tutorial, please see this blog post
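
As a rough sketch of the python-pdfkit route, assuming the wkhtmltopdf binary is installed and on the PATH: the helper names below are hypothetical, and the option keys are wkhtmltopdf's long flags without the leading dashes. The javascript-delay value is only a starting guess for pages whose charts are drawn client-side.

```python
def build_wkhtmltopdf_options(js_delay_ms=1000, page_size="A4"):
    """Return a pdfkit options dict; keys mirror wkhtmltopdf's long flags."""
    return {
        "page-size": page_size,
        "encoding": "UTF-8",
        # Wait this many milliseconds so client-side charts can finish drawing.
        "javascript-delay": str(js_delay_ms),
    }


def html_to_pdf_bytes(html, options=None):
    """Render an HTML string to PDF bytes (needs pdfkit and wkhtmltopdf)."""
    import pdfkit  # imported lazily so the options helper works without it

    # Passing False as the output path makes pdfkit return the PDF as bytes.
    return pdfkit.from_string(html, False, options=options or build_wkhtmltopdf_options())


print(build_wkhtmltopdf_options(js_delay_ms=2000))
```

In a Django view, the bytes returned by html_to_pdf_bytes could be wrapped in HttpResponse(pdf_bytes, content_type='application/pdf').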

Yaya answered 11/12, 2014 at 9:45 Comment(13)
How about SVG embedded within HTML, is that also supported?Poitiers
@mmatt Yes, it supports SVG. See this #12396041 and this github.com/wkhtmltopdf/wkhtmltopdf/issues/1964Yaya
Just be careful, webkit does not support everything chrome/firefox does: webkit.org/statusPoitiers
django-wkhtmltopdf did wonders for me! Also be sure to turn off all the animations your JavaScript/charting engine does.Poitiers
@Poitiers it did not support my simple bar-chart JS. I got lots of errors. Can you help me with it?Camise
@ManishOjha simple bar chart is based on canvas, not SVG.Yaya
@sam did you check the blog post?Yaya
@Yaya thanks for the advice. The graph loaded, but not completely. I have set 'javascript-delay': 1000. What would be the best way?Camise
@ManishOjha did you try increasing the delay?Yaya
@Yaya yes, I tried setting 'javascript-delay': 2000, but it throws the error "returned non-zero exit status -11".Camise
@ManishOjha I can't comment without more information. The generated HTML file can be found in the appdata or temp folder. Please open the file in a browser to check for errors. Also try running with the ignore-errors flag added.Yaya
@Yaya can you reply here #47327778Camise
@jithin, yes I did check the blog post. I'm having issues with rendering the CSS and an SVG, and with making the PDF longer than 1 page.Salpinx
A
12

https://github.com/nigma/django-easy-pdf

Template:

{% extends "easy_pdf/base.html" %}

{% block content %}
    <div id="content">
        <h1>Hi there!</h1>
    </div>
{% endblock %}

View:

from easy_pdf.views import PDFTemplateView

class HelloPDFView(PDFTemplateView):
    template_name = "hello.html"

If you want to use django-easy-pdf on Python 3, check the solution suggested here.

Asafetida answered 12/9, 2014 at 3:55 Comment(3)
This is the easiest one to implement of the options I have tried so far. For my needs (generating a pdf report from an html version) this just simply works. Thanks!Pleura
@alejoss You should use inline styles instead of CSS.Wenzel
This solution may not work out of the box with Django 3.0, because django.utils.six was removed and easy_pdf depends on it.Mylor
A
11

I just whipped this up for CBV. Not used in production but generates a PDF for me. Probably needs work for the error reporting side of things but does the trick so far.

import StringIO
from cgi import escape
from xhtml2pdf import pisa
from django.http import HttpResponse
from django.template.response import TemplateResponse
from django.views.generic import TemplateView

class PDFTemplateResponse(TemplateResponse):

    def generate_pdf(self, retval):

        html = self.content

        result = StringIO.StringIO()
        rendering = pisa.pisaDocument(StringIO.StringIO(html.encode("ISO-8859-1")), result)

        if rendering.err:
            return HttpResponse('We had some errors<pre>%s</pre>' % escape(html))
        else:
            self.content = result.getvalue()

    def __init__(self, *args, **kwargs):
        super(PDFTemplateResponse, self).__init__(*args, mimetype='application/pdf', **kwargs)
        self.add_post_render_callback(self.generate_pdf)


class PDFTemplateView(TemplateView):
    response_class = PDFTemplateResponse

Used like:

class MyPdfView(PDFTemplateView):
    template_name = 'things/pdf.html'
Anastasiaanastasie answered 19/12, 2012 at 21:11 Comment(2)
This worked almost straight away for me. The only change was to replace html.encode("ISO-8859-1") with html.decode("utf-8")Music
I've changed the code as @Music mentioned and additionally had to add a line to the class PDFTemplateView: content_type = "application/pdf"Chilpancingo
Y
6

I tried the top answer in this thread and it didn't work on Python 3.8, so I had to make the following changes (for anyone working on Python 3.8):

import io 
from xhtml2pdf import pisa
from django.http import HttpResponse
from html import escape

from django.template.loader import render_to_string

def render_to_pdf(template_src, context_dict):
    html = render_to_string(template_src, context_dict)
    result = io.BytesIO()

    pdf = pisa.pisaDocument(io.BytesIO(html.encode("utf-8")), result)
    if not pdf.err:
        return HttpResponse(result.getvalue(), content_type='application/pdf')
    return HttpResponse('We had some errors<pre>%s</pre>' % escape(html))

I had to change cgi to html, since cgi.escape is deprecated (and removed in Python 3.8), and I replaced StringIO with io.BytesIO(). For the rendering, I used render_to_string instead of converting the dict to a Context, which was throwing an error.
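
The two standard-library changes can be checked in isolation. This is only an illustration of the behaviour, using a made-up byte string in place of real PDF output:

```python
import html
import io

# html.escape replaces cgi.escape; unlike the old function it also
# escapes double and single quotes by default.
escaped = html.escape('<p class="x">')
print(escaped)

# A PDF is binary data, so the output buffer must accept bytes.
binary_buffer = io.BytesIO()
binary_buffer.write(b"%PDF-1.4")

text_buffer = io.StringIO()
try:
    text_buffer.write(b"%PDF-1.4")
    text_write_failed = False
except TypeError:
    # io.StringIO only accepts str, so writing bytes to it is rejected.
    text_write_failed = True

print(binary_buffer.getvalue(), text_write_failed)
```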

Yolk answered 15/7, 2022 at 21:29 Comment(0)
L
3

After trying to get this to work for too many hours, I finally found this: https://github.com/vierno/django-xhtml2pdf

It's a fork of https://github.com/chrisglass/django-xhtml2pdf that provides a mixin for a generic class-based view. I used it like this:

    # views.py
    from django_xhtml2pdf.views import PdfMixin
    class GroupPDFGenerate(PdfMixin, DetailView):
        model = PeerGroupSignIn
        template_name = 'groups/pdf.html'

    # templates/groups/pdf.html
    <html>
    <head>
    <style>
    @page { your xhtml2pdf pisa PDF parameters }
    </style>
    </head>
    <body>
        <div id="header_content"> (this is defined in the style section)
            <h1>{{ peergroupsignin.this_group_title }}</h1>
            ...

Use the model name you defined in your view, in all lowercase, when populating the template fields. Because it's a generic class-based view (GCBV), you can just call '.as_view()' on it in your urls.py:

    # urls.py (using url namespaces defined in the main urls.py file)
    url(
        regex=r"^(?P<pk>\d+)/generate_pdf/$",
        view=views.GroupPDFGenerate.as_view(),
        name="generate_pdf",
       ),
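
The lowercase naming rule mentioned above can be illustrated without Django. The class below is only a stand-in for the real model, and default_context_name is a hypothetical helper that mimics how a DetailView picks the default template variable name:

```python
class PeerGroupSignIn:
    """Stand-in for the Django model used in the answer (not a real model)."""


def default_context_name(model_class):
    # Mirrors the rule described above: the class name in all lowercase.
    return model_class.__name__.lower()


print(default_context_name(PeerGroupSignIn))  # peergroupsignin
```

If that name is awkward in templates, Django's generic views also accept a context_object_name attribute on the view, and the instance is available as object as well.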
Locksmith answered 3/4, 2015 at 22:48 Comment(0)
I
2

You can use the iReport editor to define the layout, and publish the report to a JasperReports server. After publishing, you can invoke the REST API to get the results.

Here is a test of the functionality:

from django.test import TestCase
from x_reports_jasper.models import JasperServerClient

"""
    Tests the integration with the Jasper server through REST
"""
class TestJasperServerClient(TestCase):

    # define required objects for tests
    def setUp(self):

        # load the connection to remote server
        try:

            self.j_url = "http://127.0.0.1:8080/jasperserver"
            self.j_user = "jasperadmin"
            self.j_pass = "jasperadmin"

            self.client = JasperServerClient.create_client(self.j_url,self.j_user,self.j_pass)

        except Exception, e:
            # on error the test cannot run, because its prerequisites are not met
            raise

    # test exception when server data is invalid
    def test_login_to_invalid_address_should_raise(self):
        self.assertRaises(Exception,JasperServerClient.create_client, "http://127.0.0.1:9090/jasperserver",self.j_user,self.j_pass)

    # test execute existent report in server
    def test_get_report(self):

        r_resource_path = "/reports/<PathToPublishedReport>"
        r_format = "pdf"
        r_params = {'PARAM_TO_REPORT':"1",}

        #resource_meta = client.load_resource_metadata( rep_resource_path )

        [uuid,out_mime,out_data] = self.client.generate_report(r_resource_path,r_format,r_params)
        self.assertIsNotNone(uuid)

And here is an example of the invocation implementation:

from django.db import models
import requests
import sys
from xml.etree import ElementTree
import logging 

# module logger definition
logger = logging.getLogger(__name__)

# Create your models here.
class JasperServerClient(models.Manager):

    def __handle_exception(self, exception_root, exception_id, exec_info ):
        type, value, traceback = exec_info
        raise JasperServerClientError(exception_root, exception_id), None, traceback

    # 01: REPORT-METADATA 
    #   get resource description to generate the report
    def __handle_report_metadata(self, rep_resourcepath):

        l_path_base_resource = "/rest/resource"
        l_path = self.j_url + l_path_base_resource
        logger.info( "metadata (begin) [path=%s%s]"  %( l_path ,rep_resourcepath) )

        resource_response = None
        try:
            resource_response = requests.get( "%s%s" %( l_path ,rep_resourcepath) , cookies = self.login_response.cookies)

        except Exception, e:
            self.__handle_exception(e, "REPORT_METADATA:CALL_ERROR", sys.exc_info())

        resource_response_dom = None
        try:
            # parse to dom and set parameters
            logger.debug( " - response [data=%s]"  %( resource_response.text) )
            resource_response_dom = ElementTree.fromstring(resource_response.text)

            datum = "" 
            for node in resource_response_dom.getiterator():
                datum = "%s<br />%s - %s" % (datum, node.tag, node.text)
            logger.debug( " - response [xml=%s]"  %( datum ) )

            #
            self.resource_response_payload= resource_response.text
            logger.info( "metadata (end) ")
        except Exception, e:
            logger.error( "metadata (error) [%s]" % (e))
            self.__handle_exception(e, "REPORT_METADATA:PARSE_ERROR", sys.exc_info())


    # 02: REPORT-PARAMS 
    def __add_report_params(self, metadata_text, params ):
        if(type(params) != dict):
            raise TypeError("Invalid parameters to report")
        else:
            logger.info( "add-params (begin) []" )
            #copy parameters
            l_params = {}
            for k,v in params.items():
                l_params[k]=v
            # get the payload metadata
            metadata_dom = ElementTree.fromstring(metadata_text)
            # add attributes to payload metadata
            root = metadata_dom #('report'):

            for k,v in l_params.items():
                param_dom_element = ElementTree.Element('parameter')
                param_dom_element.attrib["name"] = k
                param_dom_element.text = v
                root.append(param_dom_element)

            #
            metadata_modified_text =ElementTree.tostring(metadata_dom, encoding='utf8', method='xml')
            logger.info( "add-params (end) [payload-xml=%s]" %( metadata_modified_text )  )
            return metadata_modified_text



    # 03: REPORT-REQUEST-CALL 
    #   call to generate the report
    def __handle_report_request(self, rep_resourcepath, rep_format, rep_params):

        # add parameters
        self.resource_response_payload = self.__add_report_params(self.resource_response_payload,rep_params)

        # send report request

        l_path_base_genreport = "/rest/report"
        l_path = self.j_url + l_path_base_genreport
        logger.info( "report-request (begin) [path=%s%s]"  %( l_path ,rep_resourcepath) )

        genreport_response = None
        try:
            genreport_response = requests.put( "%s%s?RUN_OUTPUT_FORMAT=%s" %(l_path,rep_resourcepath,rep_format),data=self.resource_response_payload, cookies = self.login_response.cookies )
            logger.info( " - send-operation-result [value=%s]"  %( genreport_response.text) )
        except Exception,e:
            self.__handle_exception(e, "REPORT_REQUEST:CALL_ERROR", sys.exc_info())


        # parse the uuid of the requested report
        genreport_response_dom = None

        try:
            genreport_response_dom = ElementTree.fromstring(genreport_response.text)

            for node in genreport_response_dom.findall("uuid"):
                datum = "%s" % (node.text)

            genreport_uuid = datum      

            for node in genreport_response_dom.findall("file/[@type]"):
                datum = "%s" % (node.text)
            genreport_mime = datum

            logger.info( "report-request (end) [uuid=%s,mime=%s]"  %( genreport_uuid, genreport_mime) )

            return [genreport_uuid,genreport_mime]
        except Exception,e:
            self.__handle_exception(e, "REPORT_REQUEST:PARSE_ERROR", sys.exc_info())

    # 04: REPORT-RETRIEVE RESULTS 
    def __handle_report_reply(self, genreport_uuid ):


        l_path_base_getresult = "/rest/report"
        l_path = self.j_url + l_path_base_getresult 
        logger.info( "report-reply (begin) [uuid=%s,path=%s]"  %( genreport_uuid,l_path) )

        getresult_response = requests.get( "%s%s/%s?file=report" %(self.j_url,l_path_base_getresult,genreport_uuid),data=self.resource_response_payload, cookies = self.login_response.cookies )
        l_result_header_mime =getresult_response.headers['Content-Type']

        logger.info( "report-reply (end) [uuid=%s,mime=%s]"  %( genreport_uuid, l_result_header_mime) )
        return [l_result_header_mime, getresult_response.content]

    # public methods ---------------------------------------    

    # tries to authenticate with jasperserver through REST
    def login(self, j_url, j_user,j_pass):
        self.j_url= j_url

        l_path_base_auth = "/rest/login"
        l_path = self.j_url + l_path_base_auth

        logger.info( "login (begin) [path=%s]"  %( l_path) )

        try:
            self.login_response = requests.post(l_path , params = {
                    'j_username':j_user,
                    'j_password':j_pass
                })                  

            if( requests.codes.ok != self.login_response.status_code ):
                self.login_response.raise_for_status()

            logger.info( "login (end)" )
            return True
            # see http://blog.ianbicking.org/2007/09/12/re-raising-exceptions/

        except Exception, e:
            logger.error("login (error) [e=%s]" % e )
            self.__handle_exception(e, "LOGIN:CALL_ERROR",sys.exc_info())
            #raise

    def generate_report(self, rep_resourcepath,rep_format,rep_params):
        self.__handle_report_metadata(rep_resourcepath)
        [uuid,mime] = self.__handle_report_request(rep_resourcepath, rep_format,rep_params)
        # TODO: how to handle async?
        [out_mime,out_data] = self.__handle_report_reply(uuid)
        return [uuid,out_mime,out_data]

    @staticmethod
    def create_client(j_url, j_user, j_pass):
        client = JasperServerClient()
        login_res = client.login( j_url, j_user, j_pass )
        return client


class JasperServerClientError(Exception):

    def __init__(self,exception_root,reason_id,reason_message=None):
        super(JasperServerClientError, self).__init__(str(reason_message))
        self.code = reason_id 
        self.description = str(exception_root) + " " + str(reason_message)
    def __str__(self):
        return self.code + " " + self.description
Irenics answered 24/9, 2013 at 19:15 Comment(0)
R
1

Here is the code I use to generate a PDF from an HTML template with WeasyPrint:

    import os

    from weasyprint import HTML

    from django.template import Template, Context
    from django.http import HttpResponse 


    def generate_pdf(self, report_id):

            # Render HTML into memory and get the template firstly
            template_file_loc = os.path.join(os.path.dirname(__file__), os.pardir, 'templates', 'the_template_pdf_generator.html')
            template_contents = read_all_as_str(template_file_loc)
            render_template = Template(template_contents)

            #rendering_map is the dict for params in the template 
            render_definition = Context(rendering_map)
            render_output = render_template.render(render_definition)

            # Using Rendered HTML to generate PDF
            response = HttpResponse(content_type='application/pdf')
            response['Content-Disposition'] = 'attachment; filename=%s-%s-%s.pdf' % \
                                              ('topic-test','topic-test', '2018-05-04')
            # Generate PDF
            pdf_doc = HTML(string=render_output).render()
            pdf_doc.pages[0].height = pdf_doc.pages[0]._page_box.children[0].children[
                0].height  # Make PDF file as single page file 
            pdf_doc.write_pdf(response)
            return response

    def read_all_as_str(file_loc, read_method='r'):
        # Return the file contents, or a placeholder string if the file is missing
        if os.path.isfile(file_loc):
            with open(file_loc, read_method) as handler:
                return handler.read()
        else:
            return 'file not exist'
Rexanne answered 17/6, 2018 at 11:30 Comment(0)
A
1
  • This is for Django >=3
  • This code converts an HTML template to a PDF file for any page. For example: post/1/new1, post/2/new2
  • The PDF file name is the last part of the URL. For example, for post/2/new2 the file name is new2

First install xhtml2pdf

pip install xhtml2pdf

urls.py

from .views import generatePdf as GeneratePdf
from django.urls import re_path
urlpatterns = [
#...
re_path(r'^pdf/(?P<cid>[0-9]+)/(?P<value>[a-zA-Z0-9 :._-]+)/$', GeneratePdf, name='pdf'),
#...
]

views.py

from django.http import HttpResponse
from .utils import render_to_pdf
# pdf
def generatePdf(request,cid,value):
    # render_to_pdf returns an HttpResponse on success and None on failure
    pdf = render_to_pdf('myappname/pdf/your.html',cid)
    if pdf is None:
        return HttpResponse('Error while generating the PDF', status=500)
    return pdf

utils.py

from io import BytesIO #A stream implementation using an in-memory bytes buffer
                       # It inherits BufferedIOBase

from django.http import HttpResponse
from django.template.loader import get_template

#pisa is a html2pdf converter using the ReportLab Toolkit,
#the HTML5lib and pyPdf.

from xhtml2pdf import pisa  
#define render_to_pdf() function
from .models import myappname
from django.shortcuts import get_object_or_404


def render_to_pdf(template_src, cid, context_dict=None):
    template = get_template(template_src)
    node = get_object_or_404(myappname, id=cid)
    # Merge any extra context with the object being rendered
    context = dict(context_dict or {}, node=node)
    html = template.render(context)
    result = BytesIO()

    #This part will create the pdf.
    pdf = pisa.pisaDocument(BytesIO(html.encode("ISO-8859-1")), result)
    if not pdf.err:
        return HttpResponse(result.getvalue(), content_type='application/pdf')
    return None

Structure:

myappname/
      |___views.py
      |___urls.py
      |___utils.py
      |___templates/myappname/your.html
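
The bullets above note that the file name comes from the last URL segment. If you would rather set the name explicitly, a minimal sketch is to build a Content-Disposition header from the request path. Both helper names are hypothetical and are not part of Django or xhtml2pdf.

```python
def pdf_filename_from_path(path):
    """Return '<last path segment>.pdf' for a URL path such as '/pdf/2/new2/'."""
    segments = [part for part in path.split("/") if part]
    name = segments[-1] if segments else "document"
    return name + ".pdf"


def content_disposition(path, as_attachment=False):
    # "inline" asks the browser to display the PDF, "attachment" to download it.
    kind = "attachment" if as_attachment else "inline"
    return '%s; filename="%s"' % (kind, pdf_filename_from_path(path))


print(content_disposition("/pdf/2/new2/"))  # inline; filename="new2.pdf"
```

In the view, this could be applied to the response before returning it, for example response['Content-Disposition'] = content_disposition(request.path).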
Arbitrage answered 19/8, 2021 at 21:9 Comment(0)
F
0

If you have context data along with CSS and JS in your HTML template, then pdfkit is a good option.

In your code you can use it like this:

import os
from django.template.loader import get_template
import pdfkit
from django.conf import settings

context={....}
template = get_template('reports/products.html')
html_string = template.render(context)
pdfkit.from_string(html_string, os.path.join(settings.BASE_DIR, "media", 'products_report-%s.pdf'%(id)))

In your HTML you can link external or internal CSS and JS; it will generate a good-quality PDF.

Fraase answered 17/7, 2019 at 12:1 Comment(0)
