To remove pdfmarks from a file, the best way is to convert the PDF to PostScript and then convert the PostScript result back to PDF.
First, convert the PDF to PostScript:
gs -q -dNOPAUSE -dBATCH -sDEVICE=pswrite -sOutputFile=result.ps pdffilewithpdfmark.pdf
After that, convert the result back to PDF:
gs -q -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=pdffilewithoutpdfmark.pdf result.ps
These two steps completely remove the pdfmarks from the file.
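The two steps above can be wrapped in a small script. This is only a sketch: the function names (roundtrip_commands, strip_pdfmarks) and the gs_exe and ps_device parameters are made up for illustration. It assumes the Ghostscript executable is on the PATH as gs (on Windows it is usually gswin32c or gswin64c). If your Ghostscript build no longer lists pswrite among its devices (check with gs -h), ps2write is the device to try instead.

```python
import subprocess


def roundtrip_commands(src_pdf, tmp_ps, dst_pdf, gs_exe="gs", ps_device="pswrite"):
    """Build the two Ghostscript command lines from the steps above.

    gs_exe and ps_device are illustrative parameters, not Ghostscript options:
    they only select the executable name and the PostScript output device.
    """
    common = [gs_exe, "-q", "-dNOPAUSE", "-dBATCH"]
    # Step 1: PDF -> PostScript (the bookmarks are lost here)
    to_ps = common + [f"-sDEVICE={ps_device}", f"-sOutputFile={tmp_ps}", src_pdf]
    # Step 2: PostScript -> PDF
    to_pdf = common + ["-sDEVICE=pdfwrite", f"-sOutputFile={dst_pdf}", tmp_ps]
    return [to_ps, to_pdf]


def strip_pdfmarks(src_pdf, tmp_ps, dst_pdf, **kwargs):
    """Run both conversions in order; raises CalledProcessError if one fails."""
    for cmd in roundtrip_commands(src_pdf, tmp_ps, dst_pdf, **kwargs):
        subprocess.run(cmd, check=True)
```

For example, strip_pdfmarks("pdffilewithpdfmark.pdf", "result.ps", "pdffilewithoutpdfmark.pdf") runs the same two commands shown above.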
Extracting pdfmarks is another question. I analysed the PDF that Ghostscript produces after merging pdfmarks into a PDF file. The problem is the difference between the pdfmarks we write in the pdfmark.ps file and how they are represented inside the PDF. For example:
You put the following line in a pdfmark.ps file:
[ /Title (chapter 01) /Page 1 /OUT pdfmark
This line adds a bookmark titled "chapter 01" that points to page 1. We merge the PDF file with this pdfmark.ps file using the command
gs -q -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=pdffilewithpdfmark.pdf pdffilewithoutpdfmark.pdf pdfmark.ps
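If you have many bookmarks, the pdfmark.ps file can be generated instead of typed by hand. A minimal sketch, assuming plain ASCII titles; the helper names (ps_string, pdfmark_line, write_pdfmarks) are hypothetical. The only subtle part is that parentheses and backslashes inside a title have to be escaped, because the title is a PostScript string.

```python
def ps_string(text):
    """Escape backslashes and parentheses for a PostScript literal string."""
    # Backslashes must be escaped first, otherwise the escapes added for
    # the parentheses would themselves be doubled.
    return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


def pdfmark_line(title, page):
    """Format one bookmark in the same form as the pdfmark line above."""
    return f"[ /Title ({ps_string(title)}) /Page {page} /OUT pdfmark"


def write_pdfmarks(path, bookmarks):
    """Write one pdfmark line per (title, page) pair.

    Assumption: titles are ASCII; other encodings are not handled here.
    """
    with open(path, "w", encoding="ascii") as fh:
        for title, page in bookmarks:
            fh.write(pdfmark_line(title, page) + "\n")
```

For example, write_pdfmarks("pdfmark.ps", [("chapter 01", 1), ("chapter 02", 15)]) writes two bookmark lines that can be passed to the gs command above.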
Inside pdffilewithpdfmark.pdf, this simple line from pdfmark.ps becomes:
697 0 obj
<< /Title(chapter 01)
/Dest [1 0 R /XYZ null null null]
>>
endobj
In this case you can open the PDF with Notepad or another text editor, extract this part of the file, and edit it into a new pdfmark.ps file. You need two pieces of information to build your pdfmark (title and page), but only the title is easy to identify. The /Dest entry does not hold the page number: "1 0 R" is a reference to the page object.
/Title(chapter 01) "Title of the bookmark"
/Dest [1 "Object the bookmark points to, not the page number!"
You can get the bookmark titles with this simple cmd command:
findstr /s /i /o /c:"Title" pdfwithpdfmarks.pdf 1>>bookmarks_title.ps
This command prints every line where "Title" is found and appends it to bookmarks_title.ps.
Try the command without redirecting to a file to see the output:
findstr /s /i /o /c:"Title" pdfwithpdfmarks.pdf
A lot of other information comes along with these strings, so you need to filter out what you want in order to build the pdfmark statements for pdfmark.ps. You will need to enter the page numbers manually. After that, merge the PDF file without pdfmarks (first tip above) with the edited bookmarks_title.ps to produce the new PDF file.
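The filtering can also be done with a short script instead of findstr. This is a rough sketch, not a PDF parser: it assumes the bookmark objects are stored uncompressed and readable, as in the object shown above, and that titles are literal strings in parentheses. The names TITLE_RE, extract_titles, and draft_pdfmarks are made up for illustration. It will also pick up the /Title of the document information dictionary if there is one, so the result still needs a manual check, and the page number is only a placeholder to be edited by hand.

```python
import re

# /Title followed by a literal string; allows backslash escapes and one
# level of nested parentheses inside the string.
TITLE_RE = re.compile(
    rb"/Title\s*\(((?:\\.|[^\\()]|\([^()]*\))*)\)", re.DOTALL
)


def extract_titles(pdf_bytes):
    """Return the raw (still escaped) text of every /Title literal string."""
    return [m.decode("latin-1") for m in TITLE_RE.findall(pdf_bytes)]


def draft_pdfmarks(pdf_bytes, placeholder_page=1):
    """Build draft pdfmark lines; the page number must be fixed by hand.

    The title is copied as found, escapes included, so it is not
    escaped a second time.
    """
    return [
        f"[ /Title ({title}) /Page {placeholder_page} /OUT pdfmark"
        for title in extract_titles(pdf_bytes)
    ]
```

To use it, read the file in binary mode, for example data = open("pdfwithpdfmarks.pdf", "rb").read(), and write the lines returned by draft_pdfmarks(data) to bookmarks_title.ps before editing the page numbers.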
Good Luck!