TIFF plot generation and compression: R vs. GIMP vs. IrfanView vs. Photoshop file sizes

I generated some high-resolution, publication-quality plots, for example:

library(plot3D)
Volcano <- volcano
zf <- 10  # zoom factor
tiff("Volcano.tif", width = 1800 * zf, height = 900 * zf, res = 175 * zf, compression = "lzw")
image2D(z = Volcano, clab = "height, m",
        colkey = list(dist = -0.20, shift = 0.15, side = 3, length = 0.5,
                      width = 0.5, cex.clab = 1.2, col.clab = "white",
                      line.clab = 2, col.axis = "white", col.ticks = "white",
                      cex.axis = 0.8))
dev.off()

The resulting file is 22 MB.

Now I open the file in GIMP and, without changing the resolution or anything else, export it as "Volcano gimp.tif". The file GIMP generates is 1.9 MB.

ImageMagick reports similar image stats for both files:

$ identify Volcano.tif
Volcano.tif TIFF 18000x9000 18000x9000+0+0 8-bit DirectClass 22.37MB 0.000u 0:00.000
$ identify "Volcano gimp.tif"
Volcano gimp.tif TIFF 18000x9000 18000x9000+0+0 8-bit DirectClass 1.89MB 0.000u 0:00.000

Even with identify -verbose, the two files appear to be similar.

What is the difference between these files? Why are their file sizes so different?

UPDATE: OK, things are getting crazier. I did the same thing with IrfanView and got yet another file size. The initial file is the Volcano.tif generated from R with compression="lzw". Note how Volcano irfan.tif and Volcano gimp.tif differ in size while all other stats are the same: memory footprint, DPI, colors and resolution are identical; only the size on disk differs.

UPDATE 2: Adobe Photoshop saves the file at 2.6 MB.

WinRAR reports that the original R-generated TIFF is highly compressible (22 MB -> 3.6 MB).

UPDATE 3: This issue might be similar to Montage / Join 2 TIFF images in a 2 col x 1 row tile without losing quality

UPDATE 4: The R generated TIFF file can be found here http://ge.tt/7ZvRd4C1/v/0?c

Bootlace answered 2/1, 2014 at 10:11 Comment(9)
There seems to be something amiss with the tiff function. On my Win7 machine (with a slightly out-of-date R v2.15.2), R won't create a valid image file at all using compression rle, jpeg or zip. I will investigate further on a different machine later. In the meantime, try playing around with the tiff options and see if you can replicate my odd behaviour. There could be a bug buried here. - Zosima
compression="zip" crashes my session! - Bootlace
Using LZW with and without the predictor option on 24-bpp data can make a huge difference in the compression ratio (like the one you are observing). Post the TIFFs somewhere I can download them and I will tell you why they are different sizes. - Waverly
Here is the R-generated TIFF file: ge.tt/7ZvRd4C1/v/0?c - Bootlace
The R-generated TIFF file is not using the TIFF predictor. This causes the terrible compression on 24-bpp data, since LZW compression works 8 bits at a time. The predictor allows the constant-color sections to "cancel each other out", become black and compress much better. - Waverly
OK, thanks for the info. What does this mean practically? Is the problem solely in the compression? Should I output uncompressed and then compress with GIMP? Also, please make this an answer rather than a comment (it would be helpful to include some more details; I am considering filing this as a bug). - Bootlace
In the future, you can use my TIFFTOOL to see all the details of why those files were different: bitbanksoftware.com/tinytools.html - Waverly
I just published an OSX version of my TIFFTOOL for those of you who don't use Windows: itunes.apple.com/us/app/tifftool/id955437526?mt=12 - Waverly
The issue seems to be resolved when using compression="lzw+p". - Bootlace

The TIFF LZW compressor used by R is apparently not using an important option, the TIFF predictor, and this leads to an extremely large file. Data compression works best when it can recognize redundancies in the data. Here the image data consists of 24-bit (3-byte) pixels holding 8-bit red, green and blue values. Standard LZW compression scans a stream of bytes for repeating patterns. Treating the color image simply as a stream of bytes, it sees repeating 3-byte patterns rather than runs of constant color. Enabling the TIFF predictor applies a differencing filter that stores the delta between each pixel and its neighbor. Where neighboring pixels have the same color, it stores 0s. A long run of 0s compresses much better than repeating patterns of non-zero values that are at least 3 bytes long.

Here is an example of how it works on a 6-pixel line. When encoding, the predictor starts from the right edge of each scan line and works left:

Original data:
2A 50 40 2A 50 40 2A 50 40 2A 50 40 2A 50 40 2A 50 40 (6 pixels of the same color)

After horizontal differencing (TIFF predictor):
2A 50 40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

The data is much more compressible after the predictor since long runs of the same value (0x00) are easier for LZW to compress.
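
The differencing step above can be sketched in a few lines of base R. This is only an illustration of the idea, not the code any TIFF library actually runs; the function names horizontal_diff and horizontal_undiff and the samples argument are made up for this example.

```r
# Illustrative sketch of horizontal differencing on one scan line.
# `row` holds interleaved RGB byte values (0-255); `samples` is the
# number of channels per pixel. Both function names are hypothetical.
horizontal_diff <- function(row, samples = 3L) {
  n <- length(row)
  out <- row
  if (n > samples) {
    idx <- (samples + 1L):n
    # Subtract the same channel of the previous pixel, wrapping modulo 256.
    out[idx] <- (row[idx] - row[idx - samples]) %% 256
  }
  out
}

# Inverse filter: a running sum per channel restores the original bytes,
# which is why the predictor does not lose any information.
horizontal_undiff <- function(diffed, samples = 3L) {
  out <- diffed
  n <- length(out)
  if (n > samples) {
    for (i in (samples + 1L):n) {
      out[i] <- (out[i] + out[i - samples]) %% 256
    }
  }
  out
}

pixel <- c(0x2A, 0x50, 0x40)
row <- rep(pixel, 6)                 # six pixels of the same color
diffed <- horizontal_diff(row)
restored <- horizontal_undiff(diffed)

# Print the filtered line as two-digit hex bytes.
cat(format(as.hexmode(diffed), width = 2, upper.case = TRUE), "\n")
```

The printed line keeps the first pixel and turns the remaining five into zeros, matching the example above, and undoing the filter gives back the original bytes.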

Conclusion: This should be filed as a bug with the owner of the R compression code, since using LZW on full-color images without the predictor produces poor results. In the meantime, a workaround is needed to compress the file more efficiently.
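
A comment on the question reports that compression="lzw+p" resolves the issue. A minimal sketch for comparing the two settings on your own machine follows. It assumes an R build with TIFF support whose tiff() accepts "lzw+p", uses base image() instead of plot3D::image2D() to stay self-contained, and uses a much smaller canvas than the original plot. The helper name write_volcano is made up for this example.

```r
# Hypothetical helper: write the volcano image as a TIFF with the given
# compression setting and return the path of the file.
write_volcano <- function(path, compression) {
  tiff(path, width = 1800, height = 900, res = 175, compression = compression)
  on.exit(dev.off())                 # close the device even if plotting fails
  image(volcano)                     # base-graphics stand-in for image2D()
  invisible(path)
}

if (capabilities("tiff")) {
  plain <- write_volcano(tempfile(fileext = ".tif"), "lzw")    # LZW only
  pred  <- write_volcano(tempfile(fileext = ".tif"), "lzw+p")  # LZW + predictor
  # Compare the sizes on disk (in bytes) of the two files.
  print(file.size(c(plain, pred)))
}
```

Both files should hold the same pixels, since LZW is lossless with or without the predictor; only the size on disk is expected to differ.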

Waverly answered 2/1, 2014 at 19:17 Comment(4)
Excellent. Thank you. I filed a bug: bugs.r-project.org/bugzilla/show_bug.cgi?id=15626 . What should I do in the meantime? Should I save uncompressed TIFFs and then compress them with GIMP or ImageMagick, or save the plots as PNG and then convert them to TIFF? - Bootlace
PNG should get you the smallest file since it takes advantage of both horizontal and vertical symmetries. Uncompressed TIFFs would take up huge amounts of disk space, so even the poorly compressed ones would be a better choice. The choice of final file format depends on what software will be opening the files. They all use lossless compression, so the original data is preserved. - Waverly
What happens when I take the poorly compressed TIFF generated by R, open it and save it with GIMP? Does the LZW compression work properly? Is this lossless? Also, is PNG->TIFF lossless? (My publisher requires TIFF.) - Bootlace
PNG and TIFF LZW are lossless (with or without the predictor). All of the file conversions you plan to use will result in identical output, so the only difference will be the file size. - Waverly
