I'm having an issue with a filter program I wrote. It detects if a file is a PDF document by reading the first 5 bytes of the file and comparing it to a fixed buffer :
25 50 44 46 2D
This works fine except that I'm seeing a few files that starts with a byte order mark instead:
EF BB BF 25 50 44 46 2D
^-------^
I'm wondering if that is actually allowed by the PDF specs. If I check section 7.5 of that documentation, I read it as "no":
The first line of a PDF file shall be a header consisting of the 5 characters %PDF– followed by a version number of the form 1.N, where N is a digit between 0 and 7
Yet, I see these documents in the wild and the users gets confused because PDF reader programs can open these documents by my filter reject them.
So: are BOM markers allowed at the start of PDF documents ? (I'm NOT talking about string objects here but the PDF file itself)