iTextSharp PDF: Reading highlighted text (highlight annotations) using C#

Asked 28/4, 2014 at 13:31 Answered 28/4, 2014 at 13:53

I am developing a C# WinForms application that converts the contents of a PDF to text. All the required content is extracted except for the highlighted text in the PDF. Could someone provide a working sample that extracts the highlighted text? I am using iTextSharp.dll in the project.

Bypath answered 28/4, 2014 at 13:31 Comment(1)

Are you talking about annotations? You need to be more clear. Annotations are elements that aren't part of the content stream of a page. They are always added on top of the page and have their own appearance stream. You can list them in a separate panel in Adobe Reader. Are we talking about that kind of content? – Unexperienced 28/4, 2014 at 13:34

Assuming that you're talking about comments, please try this:

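// reader is an open PdfReader; pageFrom and pageTo are 1-based page numbers
for (int i = pageFrom; i <= pageTo; i++)
{
    PdfDictionary page = reader.GetPageN(i);
    PdfArray annots = page.GetAsArray(iTextSharp.text.pdf.PdfName.ANNOTS);
    if (annots == null)
        continue; // this page has no annotations

    foreach (PdfObject annot in annots.ArrayList)
    {
        // entries are usually indirect references, so resolve them first
        PdfDictionary annotation = (PdfDictionary)PdfReader.GetPdfObject(annot);
        PdfString contents = annotation.GetAsString(PdfName.CONTENTS);
        if (contents == null)
            continue; // annotation carries no comment text

        // now use the String value of contents
        string comment = contents.ToUnicodeString();
    }
}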

This is written from memory (I'm a Java developer, not a C# developer).

Unexperienced answered 28/4, 2014 at 13:53 Comment(2)

You can use PdfArray quadPoints = annotation.GetAsArray(PdfName.QUADPOINTS); to get the quad points of the annotation. – Feints 20/12, 2022 at 9:14
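
To illustrate what can be done with those quad points: a highlight's QuadPoints array holds eight numbers (four x/y corner pairs) for each highlighted line, and a common first step is to reduce each group to an axis-aligned bounding box. The following is a minimal sketch in plain C# with no iTextSharp dependency; the class name QuadPointBounds and the method ToBoundingBoxes are hypothetical names chosen for this example, and it assumes the values have already been read out of the PdfArray as floats. It takes the min/max of the corners rather than relying on a particular corner order, since the order written by PDF producers varies in practice.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of iTextSharp.
public static class QuadPointBounds
{
    // Converts a flat QuadPoints list (8 numbers per quadrilateral:
    // x1 y1 x2 y2 x3 y3 x4 y4) into bounding boxes of the form
    // { lowerLeftX, lowerLeftY, upperRightX, upperRightY }.
    public static List<float[]> ToBoundingBoxes(IList<float> quadPoints)
    {
        var boxes = new List<float[]>();

        // Step through complete groups of 8 values; a trailing partial group is ignored.
        for (int i = 0; i + 7 < quadPoints.Count; i += 8)
        {
            float llx = float.MaxValue, lly = float.MaxValue;
            float urx = float.MinValue, ury = float.MinValue;

            // Take min/max over the four corners so corner order does not matter.
            for (int j = 0; j < 8; j += 2)
            {
                float x = quadPoints[i + j];
                float y = quadPoints[i + j + 1];
                llx = Math.Min(llx, x);
                lly = Math.Min(lly, y);
                urx = Math.Max(urx, x);
                ury = Math.Max(ury, y);
            }

            boxes.Add(new[] { llx, lly, urx, ury });
        }

        return boxes;
    }
}
```

Each resulting box is in the page's default user space, so it can be used as a region when filtering the page's text.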

This code will read the comment associated with the annotation. It won't return the text that is highlighted. – Nomination 27/8 at 8:27
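
One possible way to get at the highlighted text itself is to combine the two ideas from this thread: read each highlight annotation's QuadPoints, turn every quad into a rectangle, and run a region-filtered text extraction over that rectangle. The following is an untested sketch assuming iTextSharp 5.x and its iTextSharp.text.pdf.parser namespace; the class name HighlightTextReader and the method GetHighlightedText are hypothetical names for this example. It also assumes the page is not rotated and that the quad points are in the same coordinate space the text extractor reports. The region filter works on whole text chunks, so characters just outside the highlight (or from a tightly spaced neighbouring line) may be included.

```csharp
using System;
using System.Collections.Generic;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

// Hypothetical helper; a sketch, not a definitive implementation.
public static class HighlightTextReader
{
    public static List<string> GetHighlightedText(PdfReader reader, int pageNumber)
    {
        var results = new List<string>();
        PdfDictionary page = reader.GetPageN(pageNumber);
        PdfArray annots = page.GetAsArray(PdfName.ANNOTS);
        if (annots == null)
            return results; // page has no annotations

        for (int a = 0; a < annots.Size; a++)
        {
            PdfDictionary annotation = annots.GetAsDict(a);
            if (annotation == null)
                continue;

            // Only look at highlight annotations.
            if (!PdfName.HIGHLIGHT.Equals(annotation.GetAsName(PdfName.SUBTYPE)))
                continue;

            PdfArray quads = annotation.GetAsArray(PdfName.QUADPOINTS);
            if (quads == null)
                continue;

            // Each group of 8 numbers describes one highlighted line.
            for (int i = 0; i + 7 < quads.Size; i += 8)
            {
                float llx = float.MaxValue, lly = float.MaxValue;
                float urx = float.MinValue, ury = float.MinValue;
                for (int j = 0; j < 8; j += 2)
                {
                    float x = quads.GetAsNumber(i + j).FloatValue;
                    float y = quads.GetAsNumber(i + j + 1).FloatValue;
                    llx = Math.Min(llx, x);
                    lly = Math.Min(lly, y);
                    urx = Math.Max(urx, x);
                    ury = Math.Max(ury, y);
                }

                // Extract only the text that falls inside this rectangle.
                var region = new iTextSharp.text.Rectangle(llx, lly, urx, ury);
                var filter = new RegionTextRenderFilter(region);
                ITextExtractionStrategy strategy = new FilteredTextRenderListener(
                    new LocationTextExtractionStrategy(), filter);
                results.Add(PdfTextExtractor.GetTextFromPage(reader, pageNumber, strategy));
            }
        }

        return results;
    }
}
```

Calling it as HighlightTextReader.GetHighlightedText(reader, i) inside the page loop from the answer would return one string per highlighted line on page i. Re-parsing the page once per quad is simple but slow on heavily annotated pages, so collecting all the rectangles first and parsing once may be preferable.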
