HTML to PDF with Node.js

18

94

I'm looking to create a printable PDF version of my website's pages. Something like express.render(), only rendering the page as a PDF.

Does anyone know a Node module that does that?

If not, how would you go about implementing one? I've seen some methods that use a headless browser like phantom.js, but I'm not sure what the flow is.

Amish answered 27/1, 2013 at 20:50 Comment(2)
I hope this is still relevant: there is now a site, simpe.li, which has some predefined templates that you can pick and use. It could be useful in some situations.Dreeda
Support for phantom has ceased, so use this solution at your own risk!Rau
90

Extending upon Mustafa's answer.

A) Install http://phantomjs.org/ and then

B) install the phantom node module https://github.com/amir20/phantomjs-node

C) Here is an example of rendering a PDF

var phantom = require('phantom');   

phantom.create().then(function(ph) {
    ph.createPage().then(function(page) {
        page.open("http://www.google.com").then(function(status) {
            page.render('google.pdf').then(function() {
                console.log('Page Rendered');
                ph.exit();
            });
        });
    });
});

Output of the PDF:

EDIT: Silent printing that PDF

java -jar pdfbox-app-2.0.2.jar PrintPDF -silentPrint C:\print_mypdf.pdf

Hawserlaid answered 20/4, 2013 at 20:43 Comment(7)
Does this also load the CSS? When I render a page, the text is shown but there is no CSS.Trisect
One issue with this solution is that you will not get clickable links from the webpage. It is the same as taking a screenshot and embedding the image into a PDF. If that works for you, then this is a great solution.Gutierrez
This module phantomjs-node does not exist on NPM; use npm install phantom@2 -S for Node versions less than 5.0, or npm install phantom -S for Node version 5.0 or greater.Sazerac
When I convert HTML to PDF, there are 4-5 pages in the HTML. I want to use a page break between two pages. This is the URL I want to convert to PDF: "ishtech.xyz//web/#/reports_view?StartDate=11/14/…"Robert
It works well, but I have a Google Map and a spider chart that are not displayed well; the spider chart has an animation. Can we set a timeout for creating the PDF, or wait until the whole page has loaded?Yawn
can this work with invalid URLs (for example, google chrome extensions result pages such as (chrome-extension://particularExtensionID/templates/gtar.html#!/report))? What can I use to convert that entire page to pdf?Pendulous
PhantomJS is no longer an active projectPryor
24

Phantom.js is a headless WebKit browser that loads any web page and renders it in memory. Although you cannot see the page, its Screen Capture feature lets you export the current view as PNG, PDF, JPEG, or GIF. Have a look at this example from the phantom.js documentation

Bergman answered 27/1, 2013 at 21:7 Comment(0)
19

Try using Puppeteer to create a PDF from HTML

Example from here https://github.com/chuongtrh/html_to_pdf

Or https://github.com/GoogleChrome/puppeteer
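
As a rough sketch of what the Puppeteer route can look like: this assumes the puppeteer package is installed from npm, and the helper names buildPdfOptions and urlToPdf as well as the output file name are made up for illustration. It opens a URL in headless Chrome and writes it out with page.pdf().

```javascript
// Build the options object passed to page.pdf().
// `path`, `format` and `printBackground` are standard Puppeteer PDF options.
function buildPdfOptions(outputPath, format) {
    return {
        path: outputPath,          // where the PDF is written
        format: format || 'A4',    // paper size, A4 unless told otherwise
        printBackground: true      // include CSS backgrounds
    };
}

// Open a URL in headless Chrome and save it as a PDF.
async function urlToPdf(url, outputPath) {
    // Required lazily so that only this function needs puppeteer installed
    const puppeteer = require('puppeteer');
    const browser = await puppeteer.launch();
    try {
        const page = await browser.newPage();
        // Wait until network activity settles so late-loading assets are rendered
        await page.goto(url, { waitUntil: 'networkidle0' });
        await page.pdf(buildPdfOptions(outputPath));
    } finally {
        // Always close the browser, even if navigation or rendering fails
        await browser.close();
    }
}

// Example call (hypothetical file name):
// urlToPdf('http://www.google.com', 'google.pdf').then(() => console.log('Page Rendered'));
```

Unlike a screenshot-style render, page.pdf() goes through the browser's print pipeline, so print media queries apply.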

Durden answered 3/10, 2018 at 11:55 Comment(2)
Puppeteer makes more sense than phantom now, as the latter has been deprecated and the former has a much better and more stable API.Sensory
Puppeteer is the only way to create PDF from HTML, using modern markup.Frisbee
18

If you want to export HTML to PDF, you have many options, even without Node.

Option 1: Have a button on your HTML page that calls the window.print() function and use the browser's native HTML-to-PDF printing. Use media queries to make your HTML page look good as a PDF. You can also use the beforeprint and afterprint events to make changes to your page before it prints.

Option 2: html2canvas or rasterizeHTML. Convert your HTML to a canvas, call toDataURL() on the canvas object to get the image, then use a JavaScript library like jsPDF to add that image to a PDF file. The disadvantage of this approach is that the PDF's text is not editable or selectable, because the page is embedded as an image. If you want data extracted from the PDF, there are different ways to do that.

Option 3: @Jozzhard's answer
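
Option 1 could be wired up roughly like this. It is only a sketch: installPrintHooks is a made-up helper name, and it takes the window object as a parameter so the same code can be exercised outside a browser. The beforeprint and afterprint events and window.print() are standard browser APIs.

```javascript
// Attach print hooks to a window-like object.
// `win` is expected to expose addEventListener() and print(), as browsers do.
function installPrintHooks(win, onBefore, onAfter) {
    // Fired before the print dialog opens: adjust the page for paper
    win.addEventListener('beforeprint', onBefore);
    // Fired after the dialog closes: restore the on-screen layout
    win.addEventListener('afterprint', onAfter);
    // Return a function suitable as a button click handler
    return function printPage() {
        win.print();
    };
}

// In a browser (the element id "print-btn" and class "printing" are hypothetical):
// const printPage = installPrintHooks(window,
//     () => document.body.classList.add('printing'),
//     () => document.body.classList.remove('printing'));
// document.getElementById('print-btn').addEventListener('click', printPage);
```

Purely visual tweaks are usually easier to keep in an @media print stylesheet; the events are for changes that need script.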

Bergman answered 9/8, 2014 at 2:46 Comment(1)
Which browsers have a built-in html to pdf option? I can only see it in Chrome at this point.Orville
12

The best solution I found is html-pdf. It's simple and works with large HTML documents.

https://www.npmjs.com/package/html-pdf

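It's as simple as that:

    var pdf = require('html-pdf');

    // html is a string of HTML; options is an object such as { format: 'A4' }
    pdf.create(html, options).toFile('./pdfname.pdf', function(err, res) {
        if (err) {
          console.log(err);
        }
    });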

NOTE:

This package has been deprecated

Author message: Please migrate your projects to a newer library like puppeteer

Dita answered 31/5, 2016 at 12:54 Comment(7)
Absolutely awesome. It works with external URLs too if you combine it with requestify.Sazerac
Does it take the CSS into account? The classes?Psychodiagnostics
@gabodev77, yes it does.Chewy
Does it support the style tag or not?Laroy
FYI - this package hasn't been updated since 2017 and has a critical vulnerability npmjs.com/advisories/1095 Probably best to go with another option :)Repay
The library is deprecated. Please update/remove this answer.Vanegas
github.com/ultimateakash/puppeteer-html-pdfMiffy
8

Package

I used html-pdf

It is easy to use and lets you not only save the PDF as a file, but also pipe the PDF content to a WriteStream (so I could stream it directly to Google Storage to save my reports there).

Using css + images

It takes CSS into account. The only problem I faced was that it ignored my images. The solution I found was to replace the URL in the src attribute value with base64 data, e.g.

<img src="data:image/png;base64,iVBOR...kSuQmCC">

You can do this in your own code or use one of the online converters, e.g. https://www.base64-image.de/
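
If you want to do the conversion in code, something along these lines should work. It is a minimal sketch using only Node's built-in fs and path modules; the helper name imageToDataUri, the MIME table and the example path are my own assumptions, not part of html-pdf.

```javascript
const fs = require('fs');
const path = require('path');

// Map common image extensions to MIME types (extend as needed)
const MIME_TYPES = {
    '.png': 'image/png',
    '.jpg': 'image/jpeg',
    '.jpeg': 'image/jpeg',
    '.gif': 'image/gif',
    '.svg': 'image/svg+xml'
};

// Read an image file and return it as a base64 data URI
function imageToDataUri(filePath) {
    const ext = path.extname(filePath).toLowerCase();
    const mime = MIME_TYPES[ext];
    if (!mime) {
        throw new Error('Unsupported image type: ' + ext);
    }
    const base64 = fs.readFileSync(filePath).toString('base64');
    return 'data:' + mime + ';base64,' + base64;
}

// Usage (hypothetical path):
// const tag = '<img src="' + imageToDataUri('./static/logo.png') + '">';
```

Keep in mind that base64 makes the HTML roughly a third larger than the original image bytes, so this suits logos and small graphics better than large photos.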

Compile valid html code from html fragment + css

  1. I had to get a fragment of my HTML document (I just applied the .html() method to a jQuery selector).
  2. Then I read the content of the relevant CSS file.

Using these two values (stored in the variables html and css respectively), I compiled a valid HTML document using a template string

var htmlContent = `
<!DOCTYPE html>
<html>
  <head>
    <style>
      ${css}
    </style>
  </head>
  <body id=direct-sellers-bill>
    ${html}
  </body>
</html>`

and passed it to create method of html-pdf.

Nowlin answered 2/10, 2017 at 11:4 Comment(3)
Can html-pdf download from invalid urls, such as from Google Chrome extension / gtar.html pages?Pendulous
How do you expect any system to get anything from an invalid URL?Maeda
An image can be loaded from a file, just a correct location has to be set with file:// prefix. So, you say in template <img src="static/logo.png">, then before converting, prepare template by prefixing const html = htmlOrig.replace(new RegExp('<img src="', 'g'), `<img src="${base}`);Carvel
6

Create PDF from External URL

Here's an adaptation of the previous answers which utilizes html-pdf, but also combines it with requestify so it works with an external URL:

Install your dependencies

npm i -S html-pdf requestify

Then, create the script:

//MakePDF.js

var pdf = require('html-pdf');
var requestify = require('requestify');
var externalURL= 'http://www.google.com';

requestify.get(externalURL).then(function (response) {
   // Get the raw HTML response body
   var html = response.body; 
   var config = {format: 'A4'}; // or format: 'letter' - see https://github.com/marcbachmann/node-html-pdf#options

   // Create the PDF
   pdf.create(html, config).toFile('pathtooutput/generated.pdf', function (err, res) {
      if (err) return console.log(err);
      console.log(res); // { filename: '/pathtooutput/generated.pdf' }
   });
});

Then you just run from the command line:

node MakePDF.js

Watch your beautiful, pixel-perfect PDF be created for you (for free!)

Sazerac answered 12/10, 2016 at 21:47 Comment(5)
There's an issue that causes html-pdf to succeed at making the PDF only sometimes - github.com/marcbachmann/node-html-pdf/issues/181Sazerac
How would you render the created PDF directly to the browser without having to store the file first?Wristwatch
Using a binary stream it could be done. Theoretically it doesn't get saved, just piped directly to the browser. Although working with node, I could only get it to work by first saving the temporary pdf, then getting the binary stream, downloading the binary stream, then deleting the temporary pdf.Sazerac
I am getting an error from html-pdf - ReferenceError: Can't find variable $. Could this be happening because the page I am loading has javascript that needs to execute? Any ideas would be helpful.Atterbury
@TetraDev: I need to restrict the output to a 1-page PDF; what changes are needed?Unwrap
6

For those who don't want to install PhantomJS along with an instance of Chrome/Firefox on their server, or who want to avoid it because the PhantomJS project is currently suspended, here's an alternative.

You can hand the conversion off to an API. Many exist, and they vary, but what you'll get is a reliable service with up-to-date features (I'm thinking CSS3, web fonts, SVG, and Canvas support).

For instance, with PDFShift (disclaimer, I'm the founder), you can do this simply by using the request package:

const request = require('request')
request.post(
    'https://api.pdfshift.io/v2/convert/',
    {
        'auth': {'user': 'your_api_key'},
        'json': {'source': 'https://www.google.com'},
        'encoding': null
    },
    (error, response, body) => {
        if (error || response === undefined) {
            // Network failure or no response from the server
            return console.error('Invalid response from the server.', error)
        }
        if (response.statusCode === 200) {
            // Do what you want with `body`, which contains the binary PDF,
            // like returning it to the client, or saving it as a file locally or on AWS S3
            return
        }

        // Handle any errors that might have occurred
    }
);
Isotropic answered 18/3, 2019 at 11:6 Comment(0)
1

Use html-pdf

var fs = require('fs');
var pdf = require('html-pdf');
var html = fs.readFileSync('./test/businesscard.html', 'utf8');
var options = { format: 'Letter' };

pdf.create(html, options).toFile('./businesscard.pdf', function(err, res) {
  if (err) return console.log(err);
  console.log(res); // { filename: '/app/businesscard.pdf' } 
});
Bourke answered 10/3, 2017 at 5:38 Comment(0)
1
const fs = require('fs')
const path = require('path')
const utils = require('util')
const puppeteer = require('puppeteer')
const hb = require('handlebars')
const readFile = utils.promisify(fs.readFile)

async function getTemplateHtml() {

    console.log("Loading template file in memory")
    try {
        const invoicePath = path.resolve("./invoice.html");
        return await readFile(invoicePath, 'utf8');
    } catch (err) {
        return Promise.reject("Could not load html template");
    }
}


async function generatePdf() {

    let data = {};

    getTemplateHtml()
        .then(async (res) => {
            // Now we have the html code of our template in res object
            // you can check by logging it on console
            // console.log(res)

            console.log("Compiling the template with handlebars")
            const template = hb.compile(res, { strict: true });
            // we have compiled our template with handlebars
            const result = template(data);
            // We can use this to add dynamic data to our handlebars template at run time, from a database or API as needed. You can read the official docs to learn more: https://handlebarsjs.com/
            const html = result;

            // we are using headless mode 
            const browser = await puppeteer.launch();
            const page = await browser.newPage()

            // We set the page content as the generated html by handlebars
            await page.setContent(html)

            // We use the pdf function to generate the PDF in the same folder as this file.
            await page.pdf({ path: 'invoice.pdf', format: 'A4' })

            await browser.close();
            console.log("PDF Generated")

        })
        .catch(err => {
            console.error(err)
        });
}

generatePdf();
Gerdi answered 30/4, 2021 at 11:49 Comment(0)
0

In case you arrive here looking for a way to make PDF from view templates in Express, a colleague and I made express-template-to-pdf

which allows you to generate PDF from whatever templates you're using in Express - Pug, Nunjucks, whatever.

It depends on html-pdf and is written to use in your routes just like you use res.render:

const pdfRenderer = require('@ministryofjustice/express-template-to-pdf')

app.set('views', path.join(__dirname, 'views'))
app.set('view engine', 'pug')

app.use(pdfRenderer())

If you've used res.render then using it should look obvious:

app.use('/pdf', (req, res) => {
    res.renderPDF('helloWorld', { message: 'Hello World!' });
})

You can pass options through to html-pdf to control the PDF document page size etc

Merely building on the excellent work of others.

Overall answered 25/6, 2019 at 8:38 Comment(0)
0

In my view, the best way to do this is via an API so that you do not add a large and complex dependency into your app that runs unmanaged code, that needs to be frequently updated.

Here is a simple way to do this, which is free for 800 requests/month:

var CloudmersiveConvertApiClient = require('cloudmersive-convert-api-client');
var defaultClient = CloudmersiveConvertApiClient.ApiClient.instance;

// Configure API key authorization: Apikey
var Apikey = defaultClient.authentications['Apikey'];
Apikey.apiKey = 'YOUR API KEY';



var apiInstance = new CloudmersiveConvertApiClient.ConvertWebApi();

var input = new CloudmersiveConvertApiClient.HtmlToPdfRequest(); // HtmlToPdfRequest | HTML to PDF request parameters
input.Html = "<b>Hello, world!</b>";


var callback = function(error, data, response) {
  if (error) {
    console.error(error);
  } else {
    console.log('API called successfully. Returned data: ' + data);
  }
};
apiInstance.convertWebHtmlToPdf(input, callback);

With the above approach you can also install the API on-premises or on your own infrastructure if you prefer.

Tenement answered 6/9, 2020 at 0:31 Comment(0)
0

In addition to @Jozzhart's answer, you can make a local HTML file, serve it with Express, and use phantom to make a PDF from it, something like this:

const exp = require('express');
const app = exp();
const pth = require("path");
const phantom = require('phantom');
const ip = require("ip");

const PORT = 3000;
const PDF_SOURCE = "index"; //index.html
const PDF_OUTPUT = "out"; //out.pdf

const source = pth.join(__dirname, "", `${PDF_SOURCE}.html`);
const output = pth.join(__dirname, "", `${PDF_OUTPUT}.pdf`);

app.use("/" + PDF_SOURCE, exp.static(source));
app.use("/" + PDF_OUTPUT, exp.static(output));

app.listen(PORT);

let makePDF = async (fn) => {
    let local = `http://${ip.address()}:${PORT}/${PDF_SOURCE}`;
    phantom.create().then((ph) => {
        ph.createPage().then((page) => {
            page.open(local).then(() =>
                page.render(output).then(() => { ph.exit(); fn() })
            );
        });
    });
}

makePDF(() => {
    console.log("PDF Created From Local File");
    console.log("PDF is downloadable from link:");
    console.log(`http://${ip.address()}:${PORT}/${PDF_OUTPUT}`);
});

and index.html can be anything:

<h1>PDF HEAD</h1>
<a href="#">LINK</a>

result:

Afterpiece answered 17/12, 2020 at 19:30 Comment(0)
0

https://www.npmjs.com/package/dynamic-html-pdf

I use dynamic-html-pdf; it is simple and can also pass dynamic variables to the HTML.

var fs = require('fs');
var pdf = require('dynamic-html-pdf');

var html = fs.readFileSync('./uploads/your-html-tpl.html', 'utf8');
var options = {
    format: "A4",
    orientation: "portrait"
    // border: "10mm"
};
var document = {
    type: 'file',     // 'file' or 'buffer'
    template: html,
    context: {
       'your_key':'your_values'
    },
    path: '/pdf/1.pdf'   // pdf save path
};

pdf.create(document, options)
.then(res => {
    console.log(res)
}).catch(error => {
    console.error(error)
});

In the HTML you can use {{your_key}}

Depopulate answered 26/5, 2021 at 13:37 Comment(0)
0

I've written the hpdf lib for generating PDFs from HTML or a URL. It supports a configurable pool of headless browsers (as resources) in the background.

import fs from 'fs';
import { PdfGenerator } from './src';

const start = async () => {
    const generator = new PdfGenerator({
        min: 3,
        max: 10,
    });

    const helloWorld = await generator.generatePDF('<html lang="html">Hello World!</html>');
    const github = await generator.generatePDF(new URL('https://github.com/frimuchkov/hpdf'));

    await fs.promises.writeFile('./helloWorld.pdf', helloWorld);
    await fs.promises.writeFile('./github.pdf', github);

    await generator.stop();
}
Oceangoing answered 12/12, 2022 at 12:13 Comment(0)
0

I wanted to add to this since I did not see an option for creating PDFs from Liquid templates yet; the solution also works with normal HTML or URLs.

Let's say this is our HTML template. It could be anything really, but note that the code includes double curly braces. The key inside the braces will be looked up in the liquid_data parameter of the request and replaced by the value.

<html>
  <body>
    <h1>{{heading}}</h1>
    <img src="{{img_url}}"/>
  </body>
</html>

The corresponding liquid_data object looks like this:

{
  "heading":"Hi Stackoverflow!",
  "img_url":"https://stackoverflow.design/assets/img/logos/so/logo-stackoverflow.svg"
}
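
To make the lookup-and-replace step concrete, here is a simplified stand-in for what happens to the placeholders. This is not the real Liquid engine and not pdfEndpoint's code: fillPlaceholders is a hypothetical helper that handles only plain {{key}} lookups, with none of Liquid's filters or tags.

```javascript
// Replace each {{key}} in the template with the matching value from `data`.
// Unknown keys are replaced with an empty string.
function fillPlaceholders(template, data) {
    return template.replace(/\{\{\s*(\w+)\s*\}\}/g, function (match, key) {
        return Object.prototype.hasOwnProperty.call(data, key) ? String(data[key]) : '';
    });
}

const template = '<html><body><h1>{{heading}}</h1><img src="{{img_url}}"/></body></html>';
const liquidData = {
    heading: 'Hi Stackoverflow!',
    img_url: 'https://stackoverflow.design/assets/img/logos/so/logo-stackoverflow.svg'
};

// Prints the template with both placeholders filled in
console.log(fillPlaceholders(template, liquidData));
```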

This is the example I want to create a PDF for. Using pdfEndpoint and the Playground, creating a PDF from the template above is very simple.

const axios = require("axios");

const options = {
  method: "POST",
  url: "https://api.pdfendpoint.com/v1/convert",
  headers: {
    "Content-Type": "application/json",
    "Authorization": "Bearer SIGN-UP-FOR-KEY"
  },
  data: {
    "delivery_mode": "json",
    "page_size": "A4",
    "margin_top": "1cm",
    "margin_bottom": "1cm",
    "margin_left": "1cm",
    "margin_right": "1cm",
    "orientation": "vertical",
    "html": "<html><body> <h1>{{heading}}</h1> <img src=\"{{img_url}}\"/>      </body>\</html>",
    "parse_liquid": true,
    "liquid_data": "{  \"heading\":\"Hi Stackoverflow!\",  \"img_url\":\"https://stackoverflow.design/assets/img/logos/so/logo-stackoverflow.svg\"}"
  }
};

axios.request(options).then(function (response) {
  console.log(response.data);
}).catch(function (error) {
  console.error(error);
});

The service will then return a rendered PDF.

Cluj answered 23/2, 2023 at 9:9 Comment(0)
0

After experimenting with various packages, I found that Puppeteer was the most suitable, and this is what worked for me.

My Controller.js

        const fs = require("fs");
        const path = require("path");
        const ejs = require("ejs");
        const AWS = require("aws-sdk");
        // if you want to import additional methods from another js file and pass them to ejs
        const ejs_helpers = require("../views/report/helpers");
        const puppeteer = require("puppeteer");

        // S3 client used for the optional upload below
        const s3 = new AWS.S3();

        const generatePdfFromEjs = async (req, res) => {
            const filePathName = path.resolve("public/views/report/home.ejs");
            const htmlString = fs.readFileSync(filePathName).toString();
            // using puppeteer
            const browser = await puppeteer.launch();
            const [page] = await browser.pages();
            const additionalData = {};
            const html = await ejs.render(htmlString, { helpers: ejs_helpers, ...additionalData });
            await page.setContent(html);
            const pdf = await page.pdf({ format: "A4", printBackground: true });
            // close the browser once the PDF has been generated
            await browser.close();
            res.contentType("application/pdf");
            res.setHeader("Content-Disposition", "attachment; filename=report.pdf");
            // can upload to an S3 bucket if needed
            const uploadParams = {
                Bucket: "XYZ",
                Key: "filename.pdf",
                Body: pdf,
            };
            s3.upload(uploadParams, (err, data) => {
                if (err) {
                    console.log("error", err);
                    return;
                }
                console.log(data.Location);
            });
            res.send(pdf);
        };

My Home.ejs

<!DOCTYPE html>
<html lang="en">

<head>
  <meta charset="UTF-8">
  <meta http-equiv="X-UA-Compatible" content="IE=edge">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <style>
    #print-report .introFooter {
      background-image: url("https://www.w3schools.com/html/pic_trulli.jpg") !important; /* adding !important is mandatory; only then will it take effect */
      background-repeat: no-repeat !important;
      background-size: cover !important;
      padding: 0 20px 25px 20px;
      display: flex;
      flex-direction: column;
      justify-content: center;
      align-items: center;
    }
  </style>
</head>

<body>
  <div id="print-report">
    <div class="introFooter">
      <div class="container">
        <div class="row">
          <div class="col-xs-4"><span class="idHeader">Date :</span>Test</div>
          <div class="col-xs-4"><span class="idHeader">Date :</span>Test</div>
          <div class="col-xs-4"><span class="idHeader">Date :</span>Test</div>
        </div>
      </div>
      <div ><img class="brandDivider" src="https://www.w3schools.com/html/pic_trulli.jpg" alt="img"></div>
      <div>Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book.</div>
    </div>
    <%- include('./file.ejs', {data:{}}) %>
  </div>
</body>

</html>
Dowie answered 31/1, 2024 at 17:53 Comment(0)
-1

You can also use the pdf-creator-node package

Package URL - https://www.npmjs.com/package/pdf-creator-node

Firebug answered 23/10, 2019 at 18:13 Comment(0)
