Does PDF file contain iref stream?

I am still struggling to read data from a PDF file.
I use PDFsharp. How can I check whether a file contains an iref stream without calling the Open method? Open throws an exception if the file contains an iref stream.

Parquet answered 8/10, 2012 at 13:0 Comment(2)
I have no idea about PDF#, but my solution would be to open the PDF and catch the specific exception. - Peterkin
A new version of PDFsharp is available. It is still a beta version (1.50.4000-beta3b), but it solves the issue. You can download it from NuGet: nuget.org/packages/PdfSharp/1.50.4000-beta3b - Cinchonize
J
20

There is a known workaround that also lets you open PDF files that contain an iref stream; you can find the complete thread about it here.

Just to summarize the solution:

  1. Download and include the iTextSharp 4.1.6 library.
  2. Paste the following code into a code file in your project:


using System;
using System.IO;

namespace PdfSharp.Pdf.IO
{
    static public class CompatiblePdfReader
    {
        /// <summary>
        /// uses itextsharp 4.1.6 to convert any pdf to 1.4 compatible pdf, called instead of PdfReader.open
        /// </summary>
        static public PdfDocument Open(string pdfPath)
        {
            // File.ReadAllBytes reads the whole file; a single Stream.Read call
            // is not guaranteed to fill the buffer.
            return Open(File.ReadAllBytes(pdfPath));
        }

        /// <summary>
        /// uses itextsharp 4.1.6 to convert any pdf to 1.4 compatible pdf, called instead of PdfReader.open
        /// </summary>
        static public PdfDocument Open(byte[] fileArray)
        {
            return Open(new MemoryStream(fileArray));
        }

        /// <summary>
        /// uses itextsharp 4.1.6 to convert any pdf to 1.4 compatible pdf, called instead of PdfReader.open
        /// </summary>
        static public PdfDocument Open(MemoryStream sourceStream)
        {
            PdfDocument outDoc;
            sourceStream.Position = 0;

            try
            {
                outDoc = PdfReader.Open(sourceStream, PdfDocumentOpenMode.Import);
            }
            catch (PdfReaderException)
            {
                //workaround if pdfsharp doesn't support this pdf
                sourceStream.Position = 0;
                var outputStream = new MemoryStream();
                var reader = new iTextSharp.text.pdf.PdfReader(sourceStream);
                var pdfStamper = new iTextSharp.text.pdf.PdfStamper(reader, outputStream) {FormFlattening = true};
                pdfStamper.Writer.SetPdfVersion(iTextSharp.text.pdf.PdfWriter.PDF_VERSION_1_4);
                pdfStamper.Writer.CloseStream = false;
                pdfStamper.Close();

                outDoc = PdfReader.Open(outputStream, PdfDocumentOpenMode.Import);
            }

            return outDoc;
        }
    }
}
  3. Change all your calls to PdfReader.Open to CompatiblePdfReader.Open.

It works like a charm for me, hope this helps you.

Jonette answered 27/9, 2013 at 13:26 Comment(5)
Until you get the exception "PdfReader not opened with owner password". I still have not overcome this one. I think you can use an older (pre-4.0.4) version of iTextSharp. - Dunnite
Wish I could upvote this more times! So simple when you know how!! - Sacramentalist
This was a life saver! I needed to merge generated PDFs with some provided PDFs that I didn't have control over. This solution let me use both and merge them without throwing out the incompatible ones. - Donothing
The answer looks good. However, it is worth noting that iTextSharp uses the AGPL, so for most commercial operations a license fee will be payable. - Chlorite
@StuartMoore the linked version is under MPL / LGPLv2 - Jonette
B
10

PDFsharp 1.32 and earlier did not support iref streams.

Since December 2015 we have PDFsharp 1.50 with support for iref streams.
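
If you are stuck on an older PDFsharp and want to answer the original question (detecting an iref stream without calling Open), here is a minimal sketch. It relies on the PDF file layout rather than on any PDFsharp API: the `startxref` keyword near the end of the file gives the byte offset of the last cross-reference section, which begins either with the keyword `xref` (classic table) or with an object header such as `12 0 obj` (cross-reference stream). `XrefProbe` and `UsesXrefStream` are hypothetical names, and the 1024-byte tail window is an assumption.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper; not part of PDFsharp or iTextSharp.
public static class XrefProbe
{
    // Returns true if the last cross-reference section is a stream,
    // false if it is a classic "xref" table.
    public static bool UsesXrefStream(byte[] pdf)
    {
        // ISO-8859-1 maps each byte to one char, so string indexes match byte offsets.
        Encoding latin1 = Encoding.GetEncoding("ISO-8859-1");

        // "startxref" sits near the end of the file, so only the tail is scanned.
        int tailStart = Math.Max(0, pdf.Length - 1024);
        string tail = latin1.GetString(pdf, tailStart, pdf.Length - tailStart);
        int keyword = tail.LastIndexOf("startxref", StringComparison.Ordinal);
        if (keyword < 0)
            throw new InvalidDataException("No startxref keyword found.");

        // Parse the decimal byte offset that follows the keyword.
        int i = keyword + "startxref".Length;
        while (i < tail.Length && char.IsWhiteSpace(tail[i])) i++;
        long offset = 0;
        int digits = 0;
        while (i < tail.Length && tail[i] >= '0' && tail[i] <= '9')
        {
            offset = offset * 10 + (tail[i] - '0');
            i++;
            digits++;
        }
        if (digits == 0 || digits > 10 || offset >= pdf.Length)
            throw new InvalidDataException("Invalid startxref offset.");

        // Skip any whitespace at the target, then look at what starts there.
        int start = (int)offset;
        while (start < pdf.Length && char.IsWhiteSpace((char)pdf[start])) start++;
        string head = latin1.GetString(pdf, start, Math.Min(4, pdf.Length - start));

        if (head == "xref") return false;                        // classic table
        if (head.Length > 0 && char.IsDigit(head[0])) return true; // "N G obj" header
        throw new InvalidDataException("Unrecognized cross-reference section.");
    }
}
```

You could call it as `XrefProbe.UsesXrefStream(File.ReadAllBytes(path))` before deciding whether to pass the file to PdfReader.Open. Treat it as a heuristic: it only inspects the most recent cross-reference section, and it does not try to handle hybrid-reference or damaged files.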

Bibliotaph answered 22/4, 2016 at 13:38 Comment(2)
As of writing this, you need to select the pre-release version from NuGet... - Geer
1.50.3638-beta has another issue: PdfReader.Open() hangs. It is not fixed and is still present in version 1.50.4000-beta3b, which is currently the latest. See the thread "Bug: PdfReader.Open() (PDFsharp 1.5)". - Pertinacious
M
0

This is a late reply, but it might be useful.

I am in the same situation (a C# project using PDFsharp). I have a PowerShell script that skips files with an iref stream while merging, so the exception is not thrown.

Function Merge-PDF {
    Param($path, $filename)

    $output = New-Object PdfSharp.Pdf.PdfDocument
    $PdfReader = [PdfSharp.Pdf.IO.PdfReader]
    $PdfDocumentOpenMode = [PdfSharp.Pdf.IO.PdfDocumentOpenMode]

    foreach($i in (gci $path *.pdf -Recurse)) {
        try {
            # $input is a PowerShell automatic variable, so use a different name
            $inputDoc = $PdfReader::Open($i.FullName, $PdfDocumentOpenMode::Import)
        }
        catch {
            # Skip files that PDFsharp cannot open (e.g. files with an iref stream)
            continue
        }
        $inputDoc.Pages | %{$output.AddPage($_)}
    }

    $output.Save($filename)
}

Merge-PDF -path c:\reports -filename c:\reports\zzFull_deck.pdf

I will post the C# equivalent of the above function later.

Meliamelic answered 25/8, 2013 at 7:9 Comment(2)
And he never returned. - Mitchum
Handling PdfSharp.Pdf.IO.PdfReaderException with an empty catch block did the trick. - Meliamelic
M
-1

The workaround is to catch PdfSharp.Pdf.IO.PdfReaderException and skip the files that cause it.

PdfDocument inputPDFDocument = new PdfDocument();
try
{
    inputPDFDocument = PdfReader.Open(pdfFile, PdfDocumentOpenMode.Import);
}
catch (PdfSharp.Pdf.IO.PdfReaderException)
{
    // Deliberately empty: skip files that PDFsharp cannot open (e.g. files with an iref stream).
}
Meliamelic answered 25/8, 2013 at 8:34 Comment(2)
Just ignore them? Like just ignoring some of my job requirements? Yeah, sounds like a fix. - Boult
@Boult That was indeed the case. Check Vive la deraison's answer for the solution in newer versions of PDFsharp. - Meliamelic
