How to crop PDF margins using pdftk and /MediaBox

I used pdftk to uncompress a PDF and then opened it in a text editor.
I want to edit the /MediaBox entry, which in my case is

/MediaBox [0 0 612 792]

I would like to reduce the margins, for instance

/MediaBox [100 0 512 792]

Unfortunately, this doesn't work. I can change the 0 to a 2 or a 9, but I cannot put 100 there, for instance.

Any idea why?

Feldspar answered 15/3, 2011 at 5:24 Comment(1)
This is not a programming question; it should be moved to another site of the network. - Deboer

Use sed to replace every occurrence:

sed 's/MediaBox \[0 0 612 792\]/MediaBox [100 0 512 792]/g' <in.pdf >out.pdf

Or use podofobox (part of the PoDoFo utilities), which does not need the PDF streams to be uncompressed first (as the pdftk approach does):

podofobox in.pdf out.pdf media 10000 0 51200 79200

As you can see, podofobox takes the MediaBox values multiplied by 100, because it works in hundredths of a unit. Simply append two zeroes (00) to each nonzero value you read in the /MediaBox field.
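The scaling can be scripted rather than typed by hand. This is only a minimal sketch, assuming (as stated above) that podofobox wants each value times 100 and that the values are whole numbers. The function name to_podofobox_units is made up for illustration, and the final line only prints the command instead of running it.

```shell
# Multiply each whole-number value by 100 (hypothetical helper).
to_podofobox_units() {
  out=""
  for v in "$@"; do
    out="$out $((v * 100))"
  done
  # Drop the leading space before printing.
  printf '%s\n' "${out# }"
}

# Values taken from the question's desired /MediaBox.
scaled=$(to_podofobox_units 100 0 512 792)

# Print the resulting command line; run it yourself once it looks right.
echo "podofobox in.pdf out.pdf media $scaled"
```

Fractional values would need different handling, since shell arithmetic is integer only.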

Textualism answered 6/4, 2012 at 13:27 Comment(0)

The string 100 has two more characters in it than 0. When you use a text editor and add characters, the file gets longer. Replacing the 0 with 9 or 2 or any other single digit keeps the length the same, which is why that works fine. A text editor can in principle be used to edit a PDF, but it is not simple, because you have to respect the internal structure of the file. The xref table, located near the end of a PDF, tells the reader the exact byte offset of each object. It has to be updated whenever the length or location of anything changes.

The reason the manual method above using pdftk doesn't work is that you are adding two bytes in the middle of the file. That shifts every object after the edit, so the offsets in the xref table no longer match. If you manually update all the xref offsets, this will work, but it is potentially very tedious. Using sed or any other text-editing tool will not solve the problem. podofo does the xref calculation for you.
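To see the shift concretely, here is a minimal shell sketch. It assumes a POSIX shell with sed and awk. The file contents are a made-up stand-in for an uncompressed PDF (not a valid one), and the helper name offset_of_obj2 is hypothetical.

```shell
# Build a tiny text file that mimics the object layout of an uncompressed PDF.
tmp=$(mktemp -d)
printf '%s\n' \
  '%PDF-1.4' \
  '1 0 obj' \
  '<< /Type /Page /MediaBox [0 0 612 792] >>' \
  'endobj' \
  '2 0 obj' \
  '<< /Type /Catalog >>' \
  'endobj' > "$tmp/before.txt"

# The edit from the question: the new box is two bytes longer than the old one.
sed 's/MediaBox \[0 0 612 792\]/MediaBox [100 0 512 792]/' \
  "$tmp/before.txt" > "$tmp/after.txt"

# Byte offset of the line "2 0 obj": sum the length of every earlier line
# plus one byte for each newline (the sample is plain ASCII).
offset_of_obj2() {
  awk 'BEGIN { n = 0 } /^2 0 obj/ { print n; exit } { n += length($0) + 1 }' "$1"
}

before=$(offset_of_obj2 "$tmp/before.txt")
after=$(offset_of_obj2 "$tmp/after.txt")

echo "object 2 offset before edit: $before"
echo "object 2 offset after edit:  $after"
echo "shift in bytes: $((after - before))"

rm -rf "$tmp"
```

The second object moves by two bytes, so an xref entry recorded for it before the edit would now point at the wrong place.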

Raylenerayless answered 25/2, 2013 at 17:3 Comment(2)
1. What do you mean by "adding two bytes in the center of the file", and what is the xref table? 2. So what do you suggest? - Feldspar
I recommend doing what @Dingo and Dr Gorb already suggested, which is to use software or code that is designed to manipulate PDFs. - Raylenerayless

There are better ways to change the margins of a PDF:

Hope you have found an answer to that since posting :-)

Tachistoscope answered 6/4, 2012 at 8:29 Comment(1)
I have tried the last one, Ghostscript (9.10), and it didn't work for me. On the other hand, podofobox in the accepted answer does work. - Cathern