How can I edit/modify/replace text in an existing PDF file? [duplicate]

I am working on my final-year project: a website where users can read PDFs. I am adding features such as converting currency amounts into the user's local currency. I am using Flask and PyMuPDF, but I don't know how to modify text in a PDF. Can anyone help me with this?

I heard here that PyMuPDF or pypdf can do this, but I couldn't find a way to replace text.

Necroscopy answered 8/2, 2023 at 14:36 Comment(0)

Using PyMuPDF's redaction facility is probably the most suitable way to do this. The approach:

  1. Identify the location of the text to replace
  2. Erase the text and replace it using redactions

Care must be taken to use a font that matches the original, and to handle new text that is longer or shorter than the original.

import fitz  # PyMuPDF

doc = fitz.open("myfile.pdf")
number = 0  # 0-based page number; 0 (the first page) is just an example
page = doc[number]
# suppose you want to replace all occurrences of some text
disliked = "delete this"
better = "better text"
hits = page.search_for(disliked)  # list of rectangles where to replace

for rect in hits:
    # further parameters (e.g. fill, text_color) can be passed as well
    page.add_redact_annot(rect, better, fontname="helv", fontsize=11,
                          align=fitz.TEXT_ALIGN_CENTER)

# apply_redactions (not apply_annots) erases the old text and inserts the new
page.apply_redactions(images=fitz.PDF_REDACT_IMAGE_NONE)  # don't touch images
doc.save("replaced.pdf", garbage=3, deflate=True)

This works well for short text and moderate quality expectations.
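
A replacement that is longer than the original may not fit into the rectangle found by the search. The following is a minimal sketch of one way to check this up front: it shrinks the font size until the text fits on one line. The helper names `fitting_fontsize` and `helv_width`, the 4-point minimum and the 0.5-point step are illustrative assumptions, not part of PyMuPDF; only `fitz.get_text_length` is a library call. Recent PyMuPDF versions may already shrink redaction text on their own, so this only makes the check explicit.

```python
def fitting_fontsize(text, box_width, measure, fontsize=11.0, min_size=4.0, step=0.5):
    """Return the largest size <= fontsize at which text fits in box_width.

    measure(text, size) must return the width of text in points.
    Returns None if the text does not fit even at min_size.
    Only a single line is considered; line wrapping is ignored.
    """
    size = fontsize
    while size >= min_size:
        if measure(text, size) <= box_width:
            return size
        size -= step
    return None


def helv_width(text, size):
    """Width of text in Helvetica as measured by PyMuPDF (imported lazily)."""
    import fitz  # PyMuPDF
    return fitz.get_text_length(text, fontname="helv", fontsize=size)
```

Inside the loop over the hits, `fitting_fontsize(better, rect.width, helv_width)` would then give a value to pass as `fontsize`, or None when the text is too long for the rectangle.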

With some more effort, the original font properties, color, font size, etc. can be identified to produce a close-to-perfect result.
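
As a rough sketch of that extra effort, the font name, size and color of the text under a hit rectangle can be read from the structure returned by `page.get_text("dict")`. The helpers `span_style_at` and `srgb_to_rgb` are hypothetical names introduced here for illustration; they assume the usual layout of that dictionary (blocks, lines, spans with `bbox`, `font`, `size` and an integer sRGB `color`).

```python
def span_style_at(page_dict, rect):
    """Return (font, size, color) of the first text span overlapping rect.

    page_dict is the result of page.get_text("dict"); rect is
    (x0, y0, x1, y1) or a fitz.Rect. Returns None if nothing overlaps.
    """
    x0, y0, x1, y1 = rect
    for block in page_dict.get("blocks", []):
        for line in block.get("lines", []):  # image blocks have no "lines"
            for span in line["spans"]:
                sx0, sy0, sx1, sy1 = span["bbox"]
                if sx0 < x1 and sx1 > x0 and sy0 < y1 and sy1 > y0:
                    return span["font"], span["size"], span["color"]
    return None


def srgb_to_rgb(color):
    """Convert an integer sRGB color to an (r, g, b) tuple of floats in 0..1."""
    return ((color >> 16) & 255) / 255, ((color >> 8) & 255) / 255, (color & 255) / 255
```

The size and converted color can be passed to `add_redact_annot` as `fontsize` and `text_color`. The original font name usually cannot be reused directly, because redaction text is limited to a small set of built-in fonts, so a similar-looking one such as "helv" has to be chosen.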

Instantaneity answered 8/2, 2023 at 15:17 Comment(1)
It seems that apply_annots should be replaced by apply_redactions. - Cruiserweight