Puppeteer PDF Title and Author (metadata)

After all my searches and code digging didn't help, I'm asking here for a hint:

How, using Puppeteer PDF generation, do I set the metadata of the file (specifically title and author)?

I've tried setting meta tags in my HTML, but they were not written to the file's metadata.

Caddaric answered 3/7, 2018 at 11:5 Comment(1)
Try showing the code in your file. - Rigney

Puppeteer does not come with built-in functionality to edit or write metadata to a PDF.

Instead, you can install the exiftool command line utility to edit the metadata of PDFs generated with Puppeteer (the command below is for Debian/Ubuntu):

sudo apt install libimage-exiftool-perl

Then you can use the Node.js child_process.exec() function to call the command line utility from your program after the PDF has been generated:

'use strict';

const puppeteer = require('puppeteer');
const exec = require('util').promisify(require('child_process').exec);

const execute = async command => {
  const {stdout, stderr} = await exec(command);

  console.log((stderr || stdout).trim());
};

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  await page.goto('https://example.com/');

  // Write the PDF to disk, then set its metadata at the same path.
  const pdfPath = 'example.pdf';

  await page.pdf({
    path: pdfPath,
  });

  await execute(`exiftool -title="Example PDF" -author="John Doe" ${pdfPath}`);

  await browser.close();
})();
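
Because exec() passes the whole string through a shell, a title or author containing quotes or other shell characters could break the command. A minimal sketch of an alternative, assuming exiftool is on the PATH: child_process.execFile() takes the arguments as an array and bypasses the shell. The helper names buildExiftoolArgs and setPdfMetadata are hypothetical, and -overwrite_original is an optional exiftool flag that skips the "_original" backup copy.

```javascript
'use strict';

const { execFile } = require('child_process');
const { promisify } = require('util');

const execFileAsync = promisify(execFile);

// Hypothetical helper: turn a metadata object into exiftool arguments.
// Each "-Tag=value" is one argv entry, so no shell quoting is needed.
const buildExiftoolArgs = (metadata, pdfPath) => [
  ...Object.entries(metadata).map(([tag, value]) => `-${tag}=${value}`),
  '-overwrite_original', // do not keep a "<file>_original" backup
  pdfPath,
];

// Hypothetical helper: run exiftool on an already generated PDF.
// Assumes the exiftool binary is installed and on the PATH.
const setPdfMetadata = async (pdfPath, metadata) => {
  const { stdout } = await execFileAsync('exiftool', buildExiftoolArgs(metadata, pdfPath));
  return stdout.trim();
};
```

After page.pdf({path: 'example.pdf'}) has finished, this could be called as await setPdfMetadata('example.pdf', {Title: 'Example PDF', Author: 'John Doe'}).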
Ricci answered 23/7, 2018 at 22:2 Comment(0)

The accepted answer is right: for now, Puppeteer doesn't support setting PDF metadata. I just wanted to offer a solution that uses a Node package, pdf-lib, instead of a native tool.

You need to:

  • generate the PDF with Puppeteer
  • use the returned buffer to load a pdf-lib PDFDocument
  • set the metadata you want
  • send (and/or save) the resulting document

import puppeteer from 'puppeteer'
import { PDFDocument } from 'pdf-lib'
import fs from 'fs'

// Generate the PDF as usual with Puppeteer
const browser = await puppeteer.launch()
const page = await browser.newPage()
await page.setContent(`Some html`)
const puppeteerPdf = await page.pdf()
await browser.close()

// Give the buffer to pdf-lib and set the metadata
const pdfDoc = await PDFDocument.load(puppeteerPdf)
pdfDoc.setTitle('A title')
pdfDoc.setAuthor('An author')
const pdfBytes = await pdfDoc.save()

// Write to disk
await fs.promises.writeFile('path/to/file.pdf', pdfBytes)
// Or send via HTTP (assumes `res` is an Express-style response object)
res.send(Buffer.from(pdfBytes))
Danikadanila answered 20/6, 2021 at 16:9 Comment(1)
An important note: setting the filename is not possible. Chrome always takes it from the URL. For example, for the URL example.com/file.pdf the filename will be file.pdf, but when the URL has no extension, e.g. example.com/somehash, the filename will unfortunately be somehash.pdf, so the URL should always end with the desired filename. Puppeteer and pdf-lib have no option to set the filename. By the way, the header Content-Disposition: inline; filename=myfile.pdf is ignored, while Content-Disposition: attachment; filename=myfile.pdf works, but then the user cannot view the PDF inline in the browser and must save it to disk. - Graff
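
The two header variants mentioned in this comment could be built with a small helper. This is only a sketch: contentDisposition is a hypothetical name, and non-ASCII characters, quotes, and backslashes are simply replaced with underscores rather than encoded.

```javascript
// Hypothetical helper: build a Content-Disposition value for a PDF response.
// "attachment" forces a download under the given name; "inline" lets the
// browser display the PDF, but per the comment the name may then be ignored.
const contentDisposition = (filename, download = true) => {
  const type = download ? 'attachment' : 'inline';
  // Keep the quoted value simple: printable ASCII only, no quotes or backslashes.
  const safeName = filename.replace(/[^\x20-\x7e]/g, '_').replace(/["\\]/g, '_');
  return `${type}; filename="${safeName}"`;
};
```

With a Node.js response object this might be used as res.setHeader('Content-Disposition', contentDisposition('myfile.pdf')) before sending the PDF bytes.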

I have not found a way to set the author on the generated PDF. However, I was able to set the title by adding a <title> tag to my HTML content.

import puppeteer from 'puppeteer';
import { writeFile } from 'fs/promises';
// create simple html template but set the title tag
const template = `<!DOCTYPE html>
<html>
  <head>
    <title>Hello World Example PDF</title>
  </head>
  <body>
    <div>Hello World</div>
  </body>
</html>`;

const browser = await puppeteer.launch({
  headless: true,
});

const page = await browser.newPage();
await page.setContent(template);
const pdfBuffer = await page.pdf();
// Close the browser so the Node process can exit
await browser.close();
await writeFile('/path/to/pdf.pdf', pdfBuffer);

The resulting PDF file has the title "Hello World Example PDF".
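
If the title comes from user data, it should be HTML-escaped before it goes into the template. A small sketch, where escapeHtml and htmlWithTitle are hypothetical helper names and the body markup is assumed to be trusted HTML already:

```javascript
// Hypothetical helper: escape the characters that are special in HTML text.
const escapeHtml = (text) =>
  String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');

// Hypothetical helper: wrap trusted body markup in a document whose <title>
// is what ends up as the title of the generated PDF, as described above.
const htmlWithTitle = (title, bodyHtml) => `<!DOCTYPE html>
<html>
  <head>
    <title>${escapeHtml(title)}</title>
  </head>
  <body>
    ${bodyHtml}
  </body>
</html>`;
```

The result can be passed straight to page.setContent(), for example page.setContent(htmlWithTitle('Q&A <draft>', '<div>Hello World</div>')).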

Medicine answered 17/7, 2024 at 21:45 Comment(0)
