Executing "pdftk my-pdf-form.pdf dump_data_fields" shows nothing

I am using pdftk with an editable PDF. According to the documentation, the dump_data_fields operation should list the fields of the form, but when I run it nothing is printed.

I use this command (Windows): pdftk my-pdf-form.pdf dump_data_fields

I am using the PDFtk Server edition.

Documentation: https://www.pdflabs.com/docs/pdftk-man-page/

The PDF is definitely editable: it has fields that can be filled in with Adobe's PDF viewer.

Musso answered 20/5, 2015 at 12:24

The problem was that the PDF had been created with Adobe LiveCycle Designer and saved as "Adobe Dynamic XML Form". The solution is to save the file as "Adobe Static PDF Form" instead. Apparently pdftk cannot deal with dynamic LiveCycle files.

Musso answered 25/5, 2015 at 7:2

I thought the accepted answer might have been my solution, but it turned out the PDF document I was working with actually didn't have form fields set up. If the document looks like a form, but the form fields are not greyed out, then no fields will be detected.

The only way I could solve this was to open the document in Acrobat Pro and add the fields via its form tool. Then pdftk worked fine.
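Both causes mentioned so far (a LiveCycle form, or no form fields at all) can sometimes be told apart before reaching for Acrobat. The following is a rough sketch in Python, not something the answers above used: it only searches the raw bytes of the file for the `/AcroForm` and `/XFA` name tokens. The function names are made up for illustration, and the check is a heuristic. It can miss markers stored in compressed object streams, and `/XFA` only hints at a LiveCycle origin; it does not by itself distinguish a dynamic form from a static one.

```python
from pathlib import Path


def probe_form_markers(data: bytes) -> dict:
    """Report which form-related name tokens occur in raw PDF bytes.

    Heuristic only: markers inside compressed object streams are not seen.
    """
    return {
        "acroform": b"/AcroForm" in data,  # interactive form dictionary present
        "xfa": b"/XFA" in data,            # XFA (LiveCycle) data present
    }


def probe_pdf(path: str) -> dict:
    """Read a PDF from disk and probe it (hypothetical helper)."""
    return probe_form_markers(Path(path).read_bytes())
```

If `acroform` comes back false, the document probably has no fields for pdftk to report; if `xfa` comes back true, re-saving the form from LiveCycle Designer as described above is worth trying first.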

Halle answered 4/8, 2017 at 21:30

If you are facing the OP's problem on Windows, follow the instructions below.

1- Open the GUI PDFtk program. (You may also use the CLI if you wish.)

2- Click on the "Add PDF..." button and select your fillable PDF file.

3- Scroll down to the bottom of the GUI PDFtk window and click on "Create PDF..." without adding or changing any settings.

4- Save the new fillable PDF file under a new name in a directory of your choice.

5- Finally, run the Windows version of the dump_data_fields command from cmd. Notice that "output" is used instead of ">". The command has the form pdftk new-form.pdf dump_data_fields output fields.txt, where new-form.pdf stands for the file saved in step 4.

6- Open the text file "fields.txt", and you will see the field names.
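Once fields.txt exists, the field names can be pulled out of it programmatically. The sketch below is in Python and assumes the usual dump_data_fields layout, in which each field is a group of "Key: value" lines introduced by a "---" line. The function names and the sample text are made up for illustration; they do not come from the answer above.

```python
def parse_dump_data_fields(text: str) -> list:
    """Split pdftk dump_data_fields output into one dict per field."""
    fields = []
    current = {}
    for line in text.splitlines():
        if line.strip() == "---":
            # A separator line starts a new field record.
            if current:
                fields.append(current)
            current = {}
            continue
        key, sep, value = line.partition(":")
        if sep:
            # Keys may repeat (e.g. FieldStateOption), so values are kept in lists.
            current.setdefault(key.strip(), []).append(value.strip())
    if current:
        fields.append(current)
    return fields


def field_names(text: str) -> list:
    """Return the FieldName value of every parsed field."""
    return [f["FieldName"][0] for f in parse_dump_data_fields(text) if "FieldName" in f]


# Hypothetical sample in the shape pdftk typically produces.
SAMPLE = """---
FieldType: Text
FieldName: first_name
FieldFlags: 0
FieldJustification: Left
---
FieldType: Button
FieldName: subscribe
FieldFlags: 0
FieldJustification: Left
FieldStateOption: Off
FieldStateOption: Yes
"""
```

To use it on the file from step 5, read the file into a string first, for example `field_names(open("fields.txt").read())`.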

Vespid answered 14/7, 2018 at 23:40

I don't know if this helps but I wrote some C# code to count the data fields in a document. Please see the following functions.

  1. Here we pass the path to a file, and the function counts the total number of fields in the document.

    // Requires: using System; using System.Diagnostics;
    public int countDataFields(string inputFile)
    {
        int fieldCount = 0;

        using (Process newProcess = new Process())
        {
            // Quote the path so that file names containing spaces are passed as one argument.
            string arguments = "\"" + inputFile + "\" dump_data_fields";
            newProcess.StartInfo = new ProcessStartInfo("pdftk", arguments);
            // Only standard output is redirected; redirecting stderr without
            // reading it can block the child process once its buffer fills.
            newProcess.StartInfo.RedirectStandardOutput = true;
            newProcess.StartInfo.UseShellExecute = false;
            newProcess.StartInfo.CreateNoWindow = false;
            newProcess.Start();

            // pdftk prints several lines per field (FieldType, FieldName, ...),
            // so count only the FieldName lines rather than every line.
            string line;
            while ((line = newProcess.StandardOutput.ReadLine()) != null)
            {
                if (line.StartsWith("FieldName:"))
                {
                    fieldCount++;
                }
            }

            newProcess.WaitForExit();
            Console.WriteLine("Field Counts: " + fieldCount);
        }

        return fieldCount;
    }
    
  2. In case you want to pass the file as a stream through standard input:

    // Requires: using System; using System.Diagnostics; using System.IO;
    public void countDataFieldsWhenFilePassedAsBinaryStream(string file1)
    {
        int fieldCount = 0;

        using (Process newProcess = new Process())
        {
            // "-" tells pdftk to read the PDF from standard input.
            newProcess.StartInfo = new ProcessStartInfo("pdftk", "- dump_data_fields");
            newProcess.StartInfo.UseShellExecute = false;
            newProcess.StartInfo.RedirectStandardInput = true;
            newProcess.StartInfo.RedirectStandardOutput = true;
            newProcess.Start();

            // Copy the file's bytes into pdftk's standard input.
            using (FileStream input = File.OpenRead(file1))
            {
                input.CopyTo(newProcess.StandardInput.BaseStream);
            }

            // Close standard input so that pdftk sees end-of-file.
            newProcess.StandardInput.Close();

            // Read standard output to the end, counting one field per FieldName line
            // (pdftk prints several lines for each field).
            string line;
            while ((line = newProcess.StandardOutput.ReadLine()) != null)
            {
                if (line.StartsWith("FieldName:"))
                {
                    fieldCount++;
                }
            }

            Console.WriteLine(fieldCount);

            newProcess.WaitForExit();
        } // end of using
    } // end of function countDataFieldsWhenFilePassedAsBinaryStream
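
A note on what "counting fields" means here: dump_data_fields prints several lines for every field, so the number of output lines and the number of fields are different quantities. The short Python sketch below (separate from the C# above, with a made-up function name and made-up sample text) shows both numbers for the same input.

```python
def count_lines_and_fields(dump_text: str) -> tuple:
    """Return (number of output lines, number of fields) for a pdftk field dump."""
    lines = dump_text.splitlines()
    # Each field contributes exactly one "FieldName:" line.
    field_count = sum(1 for line in lines if line.startswith("FieldName:"))
    return len(lines), field_count


# Hypothetical dump with two text fields.
SAMPLE_DUMP = (
    "---\n"
    "FieldType: Text\n"
    "FieldName: city\n"
    "FieldFlags: 0\n"
    "FieldJustification: Left\n"
    "---\n"
    "FieldType: Text\n"
    "FieldName: zip\n"
    "FieldFlags: 0\n"
    "FieldJustification: Left\n"
)
```

For this sample the function reports ten lines but only two fields, which is why the field count should be based on the FieldName lines.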
    

Hope this helps :)

Circulation answered 26/8, 2019 at 16:31