Here is a command-line method for extracting ICC color profiles from a PDF. It uses the Python script pdf-parser.py, written by security researcher Didier Stevens, which you can download here.
However, it is not a specialized tool for ICC extraction. (I do not know of such a tool.) It is a generic command-line tool for investigating PDF files.
Therefore you need to go through several steps in order to achieve the extraction.
Step 1: Determine the PDF object ID of the ICC profile
You have to use -s to search for the string ICCBased. (PDF files without an embedded ICC profile will not have this keyword [with the exception of possibly using it in their text contents...].)
pdf-parser.py -s ICCBased my.pdf
My test PDF returned this:
obj 18 0
Type:
Referencing: 21 0 R
It seems that an ICC profile is to be found in PDF object 21.
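For readers who want to see roughly what the search in step 1 amounts to, here is a minimal Python sketch. It is not how pdf-parser.py works internally; it simply scans the raw PDF bytes with a regular expression. The function name find_iccbased_refs is made up for this illustration, and the sketch assumes the referencing object is stored uncompressed in the file (it will miss objects packed into compressed object streams).

```python
import re

def find_iccbased_refs(pdf_bytes):
    """Return (containing_object_id, referenced_object_id) pairs.

    Hypothetical helper: a naive byte scan, not a real PDF parser.
    """
    results = []
    # Match each "N G obj ... endobj" body; DOTALL because objects span lines.
    for m in re.finditer(rb"(\d+)\s+(\d+)\s+obj\b(.*?)endobj", pdf_bytes, re.DOTALL):
        obj_id, body = int(m.group(1)), m.group(3)
        # An ICCBased color space is written as an array: [/ICCBased N G R]
        for ref in re.finditer(rb"/ICCBased\s+(\d+)\s+\d+\s+R", body):
            results.append((obj_id, int(ref.group(1))))
    return results

if __name__ == "__main__":
    # Tiny hand-made fragment that mimics the situation in my test PDF.
    sample = b"18 0 obj\n[/ICCBased 21 0 R]\nendobj\n"
    print(find_iccbased_refs(sample))
```

To try it on a real file, read the PDF in binary mode (open("my.pdf", "rb").read()) and pass the bytes to the function.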
Step 2: Look at the PDF object found in step 1
You have to use -o 21 to see what PDF object 21 is:
pdf-parser.py -o 21 my.pdf
My test PDF returns this:
obj 21 0
Type:
Referencing:
Contains stream
<<
/Alternate /DeviceRGB
/Filter /FlateDecode
/Length 2574
/N 3
>>
Ok, this looks like we are getting close...
Step 3: Dump the stream contained in the PDF object containing the profile
Step 2 gave us two important pieces of information:
- PDF object 21 contains a stream (the contents of which are not shown when using only the -o 21 parameter of pdf-parser.py).
- The object stream has to be decompressed with the /FlateDecode filter in order to get at its content.
Hence we now have to run pdf-parser.py with two additional arguments:
- -d filename in order to dump the stream of PDF object 21 to a file.
- -f in order to filter/decompress the object stream when dumping it to a file.
Command to run:
pdf-parser.py -o 21 -f -d 21.stream my.pdf
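If you are curious what the combination of -f and -d does conceptually, the following Python sketch locates the stream of a given object and inflates it with zlib, which is what /FlateDecode means. This is my own rough approximation, not code taken from pdf-parser.py. The helper name extract_flate_stream is hypothetical, and the sketch assumes a single /FlateDecode filter and an object that is stored directly in the file. It also finds the end of the data by looking for the endstream keyword instead of honoring /Length, which a real parser would do.

```python
import re
import zlib

def extract_flate_stream(pdf_bytes, obj_id):
    """Return the inflated stream of object `obj_id` (hypothetical helper)."""
    # "<id> <gen> obj", then the dictionary (without running past "endobj"),
    # then "stream<EOL>", the raw data, and finally "endstream".
    pattern = (rb"(?<!\d)%d\s+\d+\s+obj\b(?:(?!endobj).)*?"
               rb"stream\r?\n(.*?)endstream" % obj_id)
    match = re.search(pattern, pdf_bytes, re.DOTALL)
    if match is None:
        raise ValueError("object %d has no stream" % obj_id)
    # decompressobj() stops at the end of the zlib data, so the end-of-line
    # marker that precedes "endstream" is ignored instead of raising an error.
    return zlib.decompressobj().decompress(match.group(1))

if __name__ == "__main__":
    # Build a tiny fake object 21 so the sketch can be run without a real PDF.
    payload = b"pretend this is an ICC profile"
    compressed = zlib.compress(payload)
    fake_pdf = (b"21 0 obj\n<< /Filter /FlateDecode /N 3 >>\nstream\n"
                + compressed + b"\nendstream\nendobj\n")
    print(extract_flate_stream(fake_pdf, 21) == payload)
```

Writing the returned bytes to a file opened in "wb" mode would give you the equivalent of 21.stream.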
Step 4: Verify what was extracted
We have now dumped the stream of PDF object 21 to a file named 21.stream. Let's see what it contains:
file 21.stream
21.stream: Microsoft ICM Color Profile
Looks like we succeeded. :-)
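If the file utility is not available, a similar sanity check can be sketched in a few lines of Python. The ICC profile header is 128 bytes long, begins with the profile size as a big-endian 32-bit integer, and carries the signature 'acsp' at byte offset 36. The function name looks_like_icc_profile is my own invention, and the check is deliberately shallow: it only tells you that the header is plausible, not that the profile is valid.

```python
import struct

def looks_like_icc_profile(data):
    """Shallow plausibility check on an ICC profile header (hypothetical helper)."""
    if len(data) < 128:
        return False  # too short to even hold the fixed-size header
    (declared_size,) = struct.unpack(">I", data[0:4])
    # The signature must be present, and the size the profile declares
    # for itself must not exceed the number of bytes we actually have.
    return data[36:40] == b"acsp" and declared_size <= len(data)

if __name__ == "__main__":
    import sys
    # Usage sketch: python check_icc.py 21.stream
    with open(sys.argv[1], "rb") as f:
        print(looks_like_icc_profile(f.read()))
```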
Step 5: Open the color profile
Let's see whether my Mac OS X system accepts this profile:
mv 21.stream 21.icm
open 21.icm
OS X uses the 'ColorSync Utility' to open the file and display a window. Clicking on the list entries opens different information panes at the bottom of the window.
Step 6: Use Argyll's iccdump to dump the contents of the ICC profile as text
Note that Graeme Gill's ArgyllCMS, the open source color management software available for Linux, Mac OS X and Windows, ships with a whole suite of command line tools. One of these is iccdump. We can use it to look at the properties of the newly extracted 21.icm file:
iccdump 21.icm
icc:
Header:
size = 3144 bytes
CMM = 'Lino'
Version = 2.1.0
Device Class = Display
Color Space = RGB
Conn. Space = XYZ
Date, Time = 9 Feb 1998, 6:49:00
Platform = Microsoft
Flags = Not Embedded Profile, Use anywhere
Dev. Mnfctr. = 'IEC '
Dev. Model = 'sRGB'
Dev. Attrbts = Reflective, Glossy
Rndrng Intnt = Perceptual
Illuminant = 0.964203, 1.000000, 0.824905 [Lab 100.000000, 0.000498, -0.000436]
Creator = 'HP '
tag 0:
sig 'cprt'
type 'text'
offset 336
size 51
Text:
No. chars = 43
0x0000: Copyright (c) 1998 Hewlett-Packard Company
tag 1:
sig 'desc'
type 'desc'
offset 388
size 108
TextDescription:
ASCII data, length 18 chars:
0x0000: sRGB IEC61966-2.1
No Unicode data
ScriptCode Data, Code 0x0, length 18 chars
0x0000: 73 52 47 42 20 49 45 43 36 31 39 36 36 2d 32 2e 31 00
tag 2:
sig 'wtpt'
type 'XYZ '
offset 496
size 20
XYZArray:
No. elements = 1
tag 3:
sig 'bkpt'
type 'XYZ '
offset 516
size 20
XYZArray:
No. elements = 1
tag 4:
sig 'rXYZ'
type 'XYZ '
offset 536
size 20
XYZArray:
No. elements = 1
tag 5:
sig 'gXYZ'
type 'XYZ '
offset 556
size 20
XYZArray:
No. elements = 1
tag 6:
sig 'bXYZ'
type 'XYZ '
offset 576
size 20
XYZArray:
No. elements = 1
tag 7:
sig 'dmnd'
type 'desc'
offset 596
size 112
TextDescription:
ASCII data, length 22 chars:
0x0000: IEC http://www.iec.ch
No Unicode data
ScriptCode Data, Code 0x0, length 22 chars
0x0000: 49 45 43 20 68 74 74 70 3a 2f 2f 77 77 77 2e 69 65 63 2e 63 68 00
tag 8:
sig 'dmdd'
type 'desc'
offset 708
size 136
TextDescription:
ASCII data, length 46 chars:
0x0000: IEC 61966-2.1 Default RGB colour space - sRGB
No Unicode data
ScriptCode Data, Code 0x0, length 46 chars
0x0000: 49 45 43 20 36 31 39 36 36 2d 32 2e 31 20 44 65 66 61 75 6c 74 20
...
tag 9:
sig 'vued'
type 'desc'
offset 844
size 134
TextDescription:
ASCII data, length 44 chars:
0x0000: Reference Viewing Condition in IEC61966-2.1
No Unicode data
ScriptCode Data, Code 0x0, length 44 chars
0x0000: 52 65 66 65 72 65 6e 63 65 20 56 69 65 77 69 6e 67 20 43 6f 6e 64
...
tag 10:
sig 'view'
type 'view'
offset 980
size 36
Viewing Conditions:
XYZ value of illuminant in cd/m^2 = 19.644501, 20.371796, 16.808899
XYZ value of surround in cd/m^2 = 3.928894, 4.074387, 3.361786
Illuminant type = D50
tag 11:
sig 'lumi'
type 'XYZ '
offset 1016
size 20
XYZArray:
No. elements = 1
tag 12:
sig 'meas'
type 'meas'
offset 1036
size 36
Measurement:
Standard Observer = 1931 Two Degrees
XYZ for Measurement Backing = 0.000000, 0.000000, 0.000000 [Lab 0.000000, 0.000000, 0.000000]
Measurement Geometry = Unknown
Measurement Flare = 1.0%
Standard Illuminant = D65
tag 13:
sig 'tech'
type 'sig '
offset 1072
size 12
Signature
Technology = Cathode Ray Tube Display
tag 14:
sig 'rTRC'
type 'curv'
offset 1084
size 2060
Curve:
No. elements = 1024
tag 15:
sig 'gTRC'
type 'curv'
offset 1084
size 2060
Curve:
No. elements = 1024
tag 16:
sig 'bTRC'
type 'curv'
offset 1084
size 2060
Curve:
No. elements = 1024
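The tag list in this output comes from the profile's tag table, which sits right after the 128-byte header: a big-endian 32-bit tag count, followed by one 12-byte entry per tag (4-byte signature, 4-byte offset, 4-byte size). As a cross-check against the listing above: with 17 tags, the table ends at 128 + 4 + 17 * 12 = 336, which is exactly the offset reported for tag 0. The sketch below reads that table in Python. It is an illustration only, not a replacement for iccdump; the function name read_tag_table is hypothetical, and the sketch does not decode the tag contents.

```python
import struct

def read_tag_table(profile):
    """Return a list of (signature, offset, size) tuples (hypothetical helper)."""
    # The tag count is stored directly after the 128-byte header.
    (tag_count,) = struct.unpack_from(">I", profile, 128)
    tags = []
    for i in range(tag_count):
        # Each table entry is 12 bytes: signature, offset, size.
        sig, offset, size = struct.unpack_from(">4sII", profile, 132 + 12 * i)
        tags.append((sig.decode("ascii", "replace"), offset, size))
    return tags

if __name__ == "__main__":
    import sys
    # Usage sketch: python tags.py 21.icm
    with open(sys.argv[1], "rb") as f:
        for sig, offset, size in read_tag_table(f.read()):
            print(sig, offset, size)
```

Note that several entries may point at the same offset, as rTRC, gTRC and bTRC do in the listing above; the table allows tags to share their data.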
P.S.:
ArgyllCMS contains a command line tool, extracticc, which can extract an embedded ICC profile from a TIFF file. It does not have a tool to extract a profile from a PDF file.