How to do mail merge on top of a PDF?
S

11

13

I often get a PDF from our designer (built in Adobe InDesign) which is supposed to be sent out to thousands of people.

I've got the list with all the people, and it's easy to do a mail merge in OpenOffice.org. However, OpenOffice.org doesn't support advanced PDFs like this one. I just want to output some text onto each page and print it out.

Here's how I do it now: print out 6,000 copies of the PDF, then feed all of them through the printer again and print the name, address and other information on top. But that's expensive.

Sadly, I can't convert the PDF to an image and use that in OpenOffice.org, because it grinds the computer to a halt. It also takes an extremely long time to send this job to the printer.

So, is there an easy way to do this mail merge (preferably in Python) without paying for third-party closed-source solutions?

Seethe answered 10/12, 2008 at 15:38 Comment(1)
This is an offline PDF mail merge tool that is open source but not really free: github.com/plainlab/plainmerge. Disclaimer: I am the author. – Sanious
T
7

Now I've made an account. I fixed it by using the ingenious pdftk.

In my quest I had totally overlooked the "background" and "overlay" features. My solution was this:

pdftk names.pdf background boat_background.pdf output out.pdf

You can easily create names.pdf with Python's reportlab or a similar PDF-creation script. It's best to do that in code: creating 6k pages took several hours in LibreOffice/OpenOffice, while it took just a few seconds using Python.
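A minimal sketch of driving that pdftk step from Python, assuming pdftk is installed and on the PATH. The function names `background_command` and `stamp` are made up for illustration; only the `pdftk ... background ... output ...` syntax comes from the command above.

```python
import shutil
import subprocess


def background_command(overlay_pdf, background_pdf, output_pdf):
    """Build the pdftk call that puts background_pdf behind the pages of overlay_pdf."""
    return ["pdftk", overlay_pdf, "background", background_pdf,
            "output", output_pdf]


def stamp(overlay_pdf, background_pdf, output_pdf):
    """Run pdftk; raises CalledProcessError if pdftk exits with an error."""
    if shutil.which("pdftk") is None:
        raise RuntimeError("pdftk was not found on the PATH")
    subprocess.run(background_command(overlay_pdf, background_pdf, output_pdf),
                   check=True)


# Show the command that stamp() would run for the file names used above.
print(" ".join(background_command("names.pdf", "boat_background.pdf", "out.pdf")))
```

Note that pdftk's `background` operation takes only the first page of the background PDF and applies it to every page of the input, which is what you want for a one-page design.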

Threadbare answered 28/9, 2009 at 1:32 Comment(5)
I'm upvoting because this is the solution to another problem I have, not forms related, but rather 'stamping' text onto a PDF. – Sepia
The link is down. – Leveroni
Thanks @stoic, I'm removing the link because it basically just detailed all my failed attempts and why they failed (well, some almost worked). I think I also wrote and showed a script to make the names.pdf file, but you can do that easily with Python. I tried LibreOffice first but it took 4+ hours; Python took 4 seconds. – Threadbare
Can you provide examples of names.pdf and boat_background.pdf? This seems like a great solution, but I'm having a bit of trouble figuring out the specifics. I'm thinking examples of both files would help. – Hobnail
Hey @Hobnail - my website used to have examples, but I haven't set it up again. I did add the basic script I wrote in another answer now though: https://mcmap.net/q/373450/-how-to-do-mail-merge-on-top-of-a-pdf – Threadbare
D
2

You could look at a PDF library like iText. If you have some programming knowledge and a bit of time, you could write some code that adds the contact information to the PDFs.

Donielle answered 10/12, 2008 at 15:48 Comment(0)
W
2

There are two much simpler and cheaper solutions.

First, you can do your mail merge directly in InDesign using Data Merge, a utility added to InDesign way back in CS. Export or save your names in CSV format, import the data into an InDesign template, and then drop your name, address and similar fields into the layout. Press Go. It will create a new document with all the finished letters, or you can go right to the printer.
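For the CSV step, a minimal Python sketch, assuming your address list is already available as a list of dictionaries. The field names and sample rows are made up for illustration. The first row holds the field names, which is how Data Merge identifies the fields you place in the layout.

```python
import csv
import io


def write_merge_csv(rows, fieldnames, out):
    """Write a header row of field names followed by one record per recipient."""
    writer = csv.DictWriter(out, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical sample data; real input would come from your address list.
recipients = [
    {"name": "Ada Example", "address": "1 Sample Street", "city": "Oslo"},
    {"name": "Bob Example", "address": "2 Sample Street", "city": "Bergen"},
]

# Written to an in-memory buffer here; pass an open file to save to disk.
buffer = io.StringIO()
write_merge_csv(recipients, ["name", "address", "city"], buffer)
print(buffer.getvalue())
```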

OR, you can export your data to an XML file and create a dynamic layout using XML placeholders in InDesign.

The book A Designer's Guide to Adobe InDesign and XML will teach you how to do this, or you can check out the Lynda.com videos for Dynamic workflows with InDesign and XML.

Very easy to do.

If you want separate PDF files for the mail merge, you can output one long PDF with all the names in it and then use Extract in Acrobat Pro to split it into separate PDF files.

Wilcox answered 5/7, 2012 at 23:34 Comment(1)
Cheaper? Not in cost at least, because pdftk is free. It's also very fast. In the end I've been using reportlab pdfgen plus pdftk to do this job. Where it took hours before, it now takes mere seconds to mail-merge a 50,000-row CSV file on top of a PDF. :-) – Threadbare
S
1

If you cannot get the template in any format other than PDF, a simple ad-hoc solution would be to

  • convert the PDF into an image
  • put the image in the background of your (OpenOffice.org) document
  • position mail merge fields on top of the image
  • do the mail merge and print
Suprarenal answered 10/12, 2008 at 15:49 Comment(2)
The print job gets HUGE and it will never finish. – Seethe
Well, then the easiest would be to ask your designer to provide the template in a viable format (or to get a copy of Adobe Acrobat Professional; I think there should be some way to convert the PDF into a form). It seems the ROI of delivering 6000 letters should justify such an investment. – Suprarenal
I
1

Probably the best way would be to generate another PDF with the missing text, and overlay one PDF over the other. A quick Google found this link showing how to do it in Acrobat, and I'm sure there are other methods as well.

http://forums.macrumors.com/showthread.php?t=508226

Inerney answered 10/12, 2008 at 16:36 Comment(0)
G
1

For a no-mess, no-fuss solution, use iText to simply add the text to the pdf. For example, you can do the following to add text to a pdf document once loaded:

PdfContentByte cb= ...;
cb.BeginText();
cb.SetFontAndSize(font, fontSize);
float x = ...;
float y = ...;
cb.SetTextMatrix(x, y);
cb.ShowText(fieldValue);
cb.EndText();    

From there on, save it as a different file, and print it.

However, I've found that form fields are the way to go with pdf document generation from templates.

If you have a template with form fields (added with Adobe Acrobat), you have two choices:

  • Create an FDF file, which is essentially a list of values for the fields on the form. An FDF is a simple text document that references the original document, so that when you open the PDF, it loads with the field values supplied by the FDF.
  • Alternatively, load the template with a library like iText / iTextSharp, fill the form fields in code, and save it as a separate PDF.

A sample FDF file looks like this (stolen from Planet PDF):

%FDF-1.2
%âãÏÓ
1 0 obj
<<
/FDF
 <<
 /F(Example PDF Form.pdf)
 /Fields[
  <<
  /T(myTextField)
  /V(myTextField default value)
  >>
  ]
 >>
>>
endobj
trailer
<< /Root 1 0 R >>
%%EOF

Because of the simple format and the small size of the FDF, this is the preferred approach, and the approach should work well in any language.
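A minimal sketch of generating such a file from Python with no libraries. The names `escape_pdf_string` and `make_fdf` are made up for illustration, and it assumes all field values fit in Latin-1; other text would need UTF-16BE with a byte-order mark, which this sketch does not handle.

```python
def escape_pdf_string(text):
    """Escape the characters that are special inside a PDF literal string."""
    return (text.replace("\\", "\\\\")
                .replace("(", "\\(")
                .replace(")", "\\)"))


def make_fdf(pdf_name, fields):
    """Return FDF bytes that set `fields` (field name -> value) for `pdf_name`."""
    entries = "\n".join(
        "<< /T ({}) /V ({}) >>".format(escape_pdf_string(name),
                                       escape_pdf_string(value))
        for name, value in fields.items()
    )
    body = (
        "%FDF-1.2\n"
        "1 0 obj\n"
        "<< /FDF << /F ({}) /Fields [\n{}\n] >> >>\n"
        "endobj\n"
        "trailer\n"
        "<< /Root 1 0 R >>\n"
        "%%EOF\n"
    ).format(escape_pdf_string(pdf_name), entries)
    # Assumption: Latin-1 covers every character in the values.
    return body.encode("latin-1")


# Hypothetical field value; the form and field names match the sample above.
sample = make_fdf("Example PDF Form.pdf", {"myTextField": "Jane Doe"})
print(sample.decode("latin-1"))
```

One FDF describes the values for one filled copy of the form, so a mail merge would call something like this once per recipient.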

As for filling the fields programmatically, you can use iText in the following way :

PdfAcroForm acroForm = writer.AcroForm;
acroForm.Put(new PdfName(fieldInfo.Name), new PdfString(fieldInfo.Value));
Gynecologist answered 10/12, 2008 at 17:17 Comment(1)
But can I put 6000 names into one FDF, or do I have to make 6000 FDF files and then output 6000 5MiB PDF files (which would be HUGE and take forever)? – Seethe
G
1

What about using a variable-data program such as XMPie for Adobe InDesign? It's a plug-in that should be able to reference your list of people (I think it might have to be a list in Excel, though).

Guyon answered 17/12, 2011 at 21:15 Comment(0)
S
1

One easy way would be to create a fillable PDF form from the original document in Acrobat and do a mail merge with the form and a CSV.

PDF mail merges are relatively easy to do with Python and pdftk. fdfgen (pip install fdfgen) is a Python library that creates an FDF from a Python list, so you can save the Excel sheet as a CSV, make sure that each CSV header matches the name of the PDF form field you want to fill with that column, and do something like this:

import csv
import os
import subprocess

from fdfgen import forge_fdf

PDF_FORM = 'path/to/form.pdf'
CSV_DATA = 'path/to/data.csv'

# Read every row up front; the CSV headers must match the PDF form field names
with open(CSV_DATA, newline='') as infile:
    rows = list(csv.DictReader(infile))

for row in rows:
    # Create the FDF; the CSV needs a 'filename' column to name the output
    filename = row['filename']
    fdf_data = list(row.items())
    fdf = forge_fdf(fdf_data_strings=fdf_data)
    with open(filename + '.fdf', 'wb') as fdf_file:
        fdf_file.write(fdf)

    # Use pdftk to create a filled, flattened PDF file
    cmds = ['pdftk', PDF_FORM, 'fill_form', filename + '.fdf',
            'output', filename + '.pdf', 'flatten', 'dont_ask']
    subprocess.run(cmds, check=True)  # raise if pdftk fails
    os.remove(filename + '.fdf')  # the FDF is only a temporary file

I've encountered this problem often enough that I wrote my own free solution, PdfZero. PdfZero has a mail merge feature that merges spreadsheets with PDF forms. You will still need to create a PDF form, but you can upload the form and CSV to PdfZero, select which form fields should be filled with which columns, define a naming convention for each filled PDF using the CSV data if needed, and batch-generate the filled PDFs.

DISCLAIMER: I wrote PdfZero

Speciality answered 19/7, 2019 at 15:28 Comment(0)
T
1

Someone asked for specifics. I didn't want to sully my top answer with it, because you can do it how you like (and just knowing pdftk is up to it should give people the idea).

But here's a script I used ages ago:

csv_to_pdf.py

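#!/usr/bin/env python3
# This makes one PDF page per name in the CSV file
# csv_to_pdf.py <CSV_FILE>

import csv
import os
import sys
from reportlab.pdfgen.canvas import Canvas
from reportlab.lib.units import cm

csv_path = sys.argv[1]
# Write the PDF next to the CSV, swapping only the file extension
outname = os.path.splitext(csv_path)[0] + ".pdf"
pdf = Canvas(outname)

with open(csv_path, newline="") as csv_file:
    in_db = csv.reader(csv_file)
    next(in_db)  # skip the header row

    for i, rad in enumerate(in_db, start=1):
        pdf.setFontSize(11)
        adr = rad[1]  # the address is in the second column

        tekst = pdf.beginText(2*cm, 26*cm)

        for a in adr.split('\n'):
            if not a.strip():
                continue  # skip blank lines
            if a[-1] == ',':
                a = a[:-1]  # drop a trailing comma
            tekst.textLine(a)
        pdf.drawText(tekst)
        pdf.showPage()  # finish this page; the next row starts a new one

        if i % 1000 == 0:
            print(i)  # progress report
pdf.save()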

Once you've run this, you have a file with thousands of pages, each containing only a name and address. Now you can put the fancy PDF behind all of them as a background:

pdftk <YOUR_NEW_PDF_FILE.pdf> background <DESIGNED_FILE.pdf> output <MERGED.pdf>
Threadbare answered 31/7, 2019 at 16:53 Comment(0)
V
0

You can use InDesign's Data Merge function, or you can keep doing what you've been doing: print the designed part of the job, then print the mail merge on top of it with Word or OpenOffice. But also look into finding a company that can do variable-data offset printing or dynamic publishing. It might be a little more expensive up front, but it can save a bundle when it comes to time, testing, even packaging and mailing.

Vassalize answered 18/5, 2015 at 5:34 Comment(0)
L
0

Disclaimer: I'm the author of this tool.

I ran into this issue enough times that I built a free online tool for it: https://pdfbatchfill.com/

It assumes a PDF form as a template and uses that along with CSV form data to generate a single PDF or individual PDFs in a zip file.

Li answered 25/1, 2017 at 0:10 Comment(0)
