How to create a PDF/A from the command line with LibreOffice Draw in headless mode?

LibreOffice Draw allows you to open a non-PDF/A file and export it as a PDF/A-1b or PDF/A-2b file.

A plain PDF export is also possible from the command line. On macOS:

/Applications/LibreOffice.app/Contents/MacOS/soffice --headless \
        --convert-to pdf:draw_pdf_Export \
        --outdir ./pdfout \
        ./input-non-pdfa.pdf

or, on Linux, simply:

libreoffice --headless \
        --convert-to pdf:draw_pdf_Export \
        --outdir ./pdfout \
        ./input-non-pdfa.pdf

On the command line, passing --convert-to pdf:draw_pdf_Export tells LibreOffice to create a PDF and to use the LibreOffice Draw export filter to do so.

Is there also a way to tell LibreOffice to produce a PDF/A document in headless mode?

Mohamed answered 15/8, 2019 at 17:7 Comment(1)
Have a look at #62535817. There is no such CLI option. I guess that uconv fiddles with the user settings file (registrymodifications.xcu), which stores the PDF/A options, before triggering headless LibreOffice, and that file seems to be read by LibreOffice even in headless mode. - Therron

For PDF/A-1 (presumably PDF/A-1b):

soffice --headless --convert-to pdf:"writer_pdf_Export:SelectPdfVersion=1" --outdir outdir input.pdf

Change the value from 1 to 2 for PDF/A-2. For details, see the LibreOffice source code: Common.xcs, pdfexport.cxx and pdffilter.cxx.

Splay answered 7/6, 2021 at 11:15 Comment(5)
Note that although a SelectPdfVersion filter data element existed, it was never possible to set it from the command line prior to version 7.4 (not yet released at the time of writing this comment); see ask.libreoffice.org/t/…, which mentions the enabling commit. So this accepted answer is wrong, and the working syntax in 7.4 will be soffice --convert-to 'pdf:writer_pdf_Export:{"SelectPdfVersion":{"type":"long","value":"1"}}' input.pdf - Maice
@Mike Kaganski How can it be version 7.4? Right now, in August, the stable release is 7.2.7 and the fresh one is 7.3.4. - Intertidal
@Splay Did you find a working solution? I tried yours and Kaganski's; neither works. - Intertidal
@Intertidal If you read my comment, you may note the "not yet released at the time of writing my comment"; 7.4 is in release candidate stage at this moment (and was pre-alpha back in February when I wrote that); its release is due in August (wiki.documentfoundation.org/ReleasePlan/7.4). - Maice
Note that on Windows single quotes don't work; use this syntax instead: soffice --convert-to "pdf:writer_pdf_Export:{\"SelectPdfVersion\":{\"type\":\"long\",\"value\":\"1\"}}" input.pdf - Fill
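The quoting in the comments above is easy to get wrong by hand. As a minimal sketch for POSIX shells, a small helper can assemble the JSON-style argument. The function name pdf_convert_arg is hypothetical, and the syntax is the one quoted in the comments for 7.4 and later, not something verified against every release.

```shell
#!/bin/sh
# Build the --convert-to argument using the JSON filter-option syntax
# quoted in the comments (reported to work from LibreOffice 7.4 on).
# pdf_convert_arg is a hypothetical helper name, not part of LibreOffice.
#   $1: export filter name, e.g. draw_pdf_Export or writer_pdf_Export
#   $2: SelectPdfVersion value, e.g. 1 for PDF/A-1b
pdf_convert_arg() {
    printf 'pdf:%s:{"SelectPdfVersion":{"type":"long","value":"%s"}}' "$1" "$2"
}
```

Assuming soffice is on the PATH, it could then be used as: soffice --headless --convert-to "$(pdf_convert_arg draw_pdf_Export 2)" --outdir ./pdfout ./input-non-pdfa.pdf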
  • Convert pdf to ps
pdftops input.pdf input.ps
  • Then convert ps to pdf/a
gs -dPDFA -dBATCH -dNOPAUSE -dNOOUTERSAVE -dUseCIEColor -sProcessColorModel=DeviceCMYK -sDEVICE=pdfwrite -sPDFACompatibilityPolicy=1 -sOutputFile=input-A.pdf input.ps
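The two steps above can be wrapped in one function that also cleans up the intermediate PostScript file. This is only a sketch: the function name pdf_to_pdfa_gs and the temporary file template are assumptions, and the Ghostscript flags are copied unchanged from the command above.

```shell
#!/bin/sh
# Convert a PDF to PDF/A via an intermediate PostScript file.
# pdf_to_pdfa_gs is a hypothetical wrapper; flags are taken verbatim
# from the answer and may need adjusting for your Ghostscript version.
#   $1: source PDF, $2: destination PDF/A file
pdf_to_pdfa_gs() {
    src=$1
    dst=$2
    ps_tmp=$(mktemp "${TMPDIR:-/tmp}/pdfa.XXXXXX") || return 1
    # Step 1: PDF -> PostScript, step 2: PostScript -> PDF/A
    pdftops "$src" "$ps_tmp" &&
        gs -dPDFA -dBATCH -dNOPAUSE -dNOOUTERSAVE -dUseCIEColor \
           -sProcessColorModel=DeviceCMYK -sDEVICE=pdfwrite \
           -sPDFACompatibilityPolicy=1 -sOutputFile="$dst" "$ps_tmp"
    status=$?
    rm -f "$ps_tmp"   # remove the intermediate file in every case
    return $status
}
```

Usage would be pdf_to_pdfa_gs input.pdf input-A.pdf; the return status is that of the first step that failed, or 0 if both succeeded.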
Erhart answered 1/6, 2022 at 9:17 Comment(0)

After several tries with version 7.4, where formatting issues in the PDF/A output led to a "non-PDF/A" result in online PDF/A validators, I upgraded to LibreOffice 7.6.4.1, the current version at the time.

I managed to convert from docx / pdf to PDF/A-3b under Windows 11 using a simple .bat file, and the result validated successfully as PDF/A on multiple websites.

Here is the code I used:

REM Change to the directory where the source file is located
cd C:\Users\Username\Documents

REM Convert the input file to PDF/A-3b and write it to the given output directory.
REM The SelectPdfVersion value can be changed from 1 to 3.
"C:\Program Files\LibreOffice\program\swriter.exe" --headless --convert-to pdf:"writer_pdf_Export:SelectPdfVersion=3" inputfile.pdf --outdir "C:\Users\Username\Documents\pdfa"

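If you want to process a batch of files, you can wrap the conversion in a FOR loop like this (in a .bat file the loop variable must be a single letter, here %%f; change *.pdf to the file type you want to convert):

FOR %%f IN (*.pdf) DO (
   "C:\Program Files\LibreOffice\program\swriter.exe" --headless --convert-to pdf:"writer_pdf_Export:SelectPdfVersion=3" "%%f" --outdir "C:\Users\Username\Documents\pdfa"
)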
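On Linux or macOS the same batch idea might look roughly like the sketch below. The function name convert_dir_to_pdfa is hypothetical, soffice is assumed to be on the PATH, and the option string uses the JSON syntax and draw_pdf_Export filter quoted elsewhere in this thread rather than the exact syntax of this answer.

```shell
#!/bin/sh
# Convert every PDF in a directory to PDF/A with headless LibreOffice.
# convert_dir_to_pdfa is a hypothetical helper; soffice must be on PATH.
#   $1: input directory, $2: output directory
#   $3: SelectPdfVersion value (defaults to 3, i.e. PDF/A-3b)
convert_dir_to_pdfa() {
    in_dir=$1
    out_dir=$2
    version=${3:-3}
    for f in "$in_dir"/*.pdf; do
        # Skip the unexpanded pattern when no file matches
        [ -e "$f" ] || continue
        soffice --headless \
            --convert-to "pdf:draw_pdf_Export:{\"SelectPdfVersion\":{\"type\":\"long\",\"value\":\"$version\"}}" \
            --outdir "$out_dir" "$f"
    done
}
```

For example, convert_dir_to_pdfa ./in ./pdfa 2 would request PDF/A-2b for each file. Writing to a separate output directory keeps the originals from being overwritten.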
Squander answered 21/1 at 14:40 Comment(0)

Since at least version 6.3, the correct way to use LibreOffice from the Windows command line has been to invoke soffice.com, the console launcher, rather than soffice.exe. See the notes in the current 7.6 manual:

Search for Run in the Windows Start menu.
Type the following text in the Open text field and click OK.
"{install}\program\soffice.com" {parameter}

Replace {install} with the path to your installation of LibreOffice software (for example, C:\Program Files\LibreOffice). Use soffice.exe instead of soffice.com, when you do not need console (e.g., you do not use command-line interface for headless operations).

So in a batch file, all you need is to prepend the program directory to the path:

set "path=C:\Program Files\LibreOffice\Program;%path%"

A bare soffice command then resolves to soffice.com (with the default PATHEXT, Windows tries .com before .exe), which is the launcher to use for headless conversions through Draw, Calc, etc.

Versions from 7.4 onward can convert PDF to PDF/A-1 or PDF/A-2 from the command line, but PDF/A-3 was not added until later, so use the current 7.6 or newer.

soffice --headless {parameters}

where {parameters} are described at https://help.libreoffice.org/latest/en-US/text/shared/guide/pdf_params.html

Beware: you need the correct mix of outer " and inner \" quotes:

--convert-to "pdf:draw_pdf_Export:{\"SelectPdfVersion\":{\"type\":\"long\",\"value\":\"15\"}}" --outdir "folder name" "input.pdf"

"value" can be one of the following, so for PDF/A-3b use 3:

1: PDF/A-1b
2: PDF/A-2b
3: PDF/A-3b
15: PDF 1.5
16: PDF 1.6
17: PDF 1.7 (same as default = 0)
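Since the numbers are easy to mix up, a small lookup can translate a readable name into the value listed above. This is a sketch for POSIX shells; the function name pdf_version_value and the name spellings it accepts are assumptions for illustration only.

```shell
#!/bin/sh
# Map a readable PDF flavour to the SelectPdfVersion number from the list above.
# pdf_version_value and the accepted spellings are hypothetical conveniences.
pdf_version_value() {
    case "$1" in
        PDF/A-1b) echo 1 ;;
        PDF/A-2b) echo 2 ;;
        PDF/A-3b) echo 3 ;;
        PDF-1.5)  echo 15 ;;
        PDF-1.6)  echo 16 ;;
        PDF-1.7)  echo 17 ;;
        *) echo "unknown PDF flavour: $1" >&2; return 1 ;;
    esac
}
```

An unknown name prints an error and returns a non-zero status, so a calling script can stop before invoking soffice with a wrong value.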

--outdir is needed to avoid the risk of the output PDF overwriting input.pdf. However, in the version I tested the output file was given the extension .'pdf and had to be renamed.

Tested with value = 2:

Standard: PDF/A-2b
ISO Name: ISO 19005-2:2011
Conformance Level: Level b
Validity: PDF/A verification succeeded

Conformance is not always guaranteed, and PDF/A-3 is excessively draconian. My first run with an old file produced a not unusual failure, which could more easily be fixed using Ghostscript.

Name of the processed file: TimeMachine1895.pdf
Validation status: This document has some validation errors
Result: Conformance level applied: PDF/A-3b

The PDF document conformance part does not match the selected validation conformance.
The PDF document conformance level does not match the selected validation conformance.
Document XMP metadata is missing.
Device dependent color space used without matching PDF/A OutputIntent.

However, after a bit of head scratching I realised the conversion had not completed correctly because of a bad mix of " and \" escape characters, and on a re-run I got:

The file is a valid PDF/A-3B document  
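The "Document XMP metadata is missing" error above refers to the PDF/A identification entries (pdfaid:part and pdfaid:conformance) in the file's XMP packet. As a rough local pre-check before uploading to a validator, one can look for them with grep. This is only a heuristic sketch: the function name pdfa_claim is hypothetical, it assumes the metadata is stored uncompressed, and it reports what the file claims without validating anything.

```shell
#!/bin/sh
# Print the PDF/A part and conformance level claimed in a file's XMP metadata.
# Heuristic only: it reads the claim, it does not validate the document.
# pdfa_claim is a hypothetical helper name.
pdfa_claim() {
    # Handles both element form (<pdfaid:part>3<) and attribute form (pdfaid:part="3")
    part=$(LC_ALL=C grep -a -o 'pdfaid:part[>="]*[0-9]' "$1" | head -n 1 | tr -cd '0-9')
    conf=$(LC_ALL=C grep -a -o 'pdfaid:conformance[>="]*[A-Za-z]' "$1" | head -n 1 | sed 's/.*\(.\)$/\1/')
    if [ -n "$part" ]; then
        echo "PDF/A-$part$conf"
    else
        echo "no PDF/A identification found" >&2
        return 1
    fi
}
```

A file that prints nothing here will very likely fail a real validator too, but a printed claim still needs to be confirmed by one.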

Lerma answered 21/1 at 16:14 Comment(0)
