How can one create a polyglot PDF?
Asked Answered
F

2

11

I like reading the PoC||GTFO issues and one thing I found remarkable when I first discovered it, was the "polyglot" nature of their PDF files.

Let met explain: when you consider for example their 8th issue, you may unzip files from it; execute the encryption they are talking about by running it as a script and even better(worse?) with their 9th issue you can even play it as a music file!

I'm currently in the process of writing small scripts every week and writing each time a little one page PDF in LaTeX to explain the said scripts. So I would really enjoy being able to create the same kind of PDF files. Sadly they explained (partly) in their first issue how to include zip files, but they did so through three small sketches of cmd lines without actual explanations.

So my question is basically : how can one create such a polyglot PDF file containing stuff like a zip as well as being a shell script which may be run using arguments just like normal scripts?

I'm asking here about the process of creation, not just an explanation of how this is possible. The ideal way for me would that there are already some scripts or programs allowing to create easily such PDF files.

I've tried to search the net for the keywords "polyglot files" and others of the kind and wasn't able to find any useful matches. Maybe this process has another name?

I've already read the presentation by Julia Wolf which explains how things works, but I sadly haven't had time to apply the knowledge there to real world, because I'm sadly not used to play with file headers and the way a PDF is constructed.

EDIT: Okay, I've read more and found the 7th edition of PoC||GTFO to be really informative concerning this subject. I may end up being able to create my own scripts to do such polyglot PDF files if I have some more time to consider it.

Frager answered 7/3, 2016 at 9:29 Comment(1)
github.com/perfaram/pdf-zip-nes-polyglotMontpellier
T
3

I played around with polyglots myself after attending Ange's talks and also talking to him in person. You really need to understand the file formats to be able to nest them into each other.
However, long story short, here are some links I found extremely useful for creating polyglots:

Some older Google Code Trunk
PoC of the polyglot stuff

Especially the second link (to github) will help you creating polyglots, but also understanding how they are working and how they are implemented. Since it is mostly Python stuff and very well / clean written, it is very useful and easy to follow.

Tenderloin answered 16/4, 2016 at 22:1 Comment(0)
C
3

I feel dissecting some file formats would be a good place to start. You can find many file format specifications for different file types through Google, but they can be a tough read and will likely take you some time to translate into whatever language you are using.

PDF: https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf

ELF: https://www.cs.cmu.edu/afs/cs/academic/class/15213-s00/doc/elf.pdf

ZIP: http://kat.sdf.org/zip_file_format.txt

The language(s) you select will need a way to read and write raw bytes (not just ascii alphanumeric), so perhaps C would be good for more direct access to memory. Some Python tricks could help with open sourcing the scripts easily.

To dissect the files, you may want to build a tool kinda like https://github.com/kvesel/zipbrk/ to take them apart, then put them all back together in a polyglot format. For example, zip does not require the section headers to be at the start (or even contiguous for that matter), and PDF magic number can appear in multiple places within the file as well. I also believe I recall a polyglot tool being included in one of the PoC||GTFO publishings (maybe issue 8 or 2??) as a polyglot in the pdf file.

Don't forget the hackers bible! :) https://nostarch.com/gtfo

Cattan answered 4/4, 2018 at 20:10 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.