Am I creating lossless PNG images?
I am doing image processing in a scientific context. Whenever I need to save an image to the hard drive, I want to be able to reopen it at a later time and get exactly the data that I had before saving it. I exclusively use the PNG format, having always been under the impression that it is a lossless format. Is this always correct, provided I am not using the wrong bit-depth? Should encoder and decoder play no role at all? Specifically, the images I save

  • are present as 2D numpy arrays
  • have integer values from 0 to 255
  • are encoded with the OpenCV imwrite() function, e.g. cv2.imwrite("image.png", array)
Waxwork answered 19/12, 2017 at 10:33 Comment(14)
I do not know of any way that you could get anything other than "pixel perfect" data in that scenario.Flurried
if you are in doubt, load the image again and compute absdiff and test whether any result pixel isn't 0, for some good amount of sample images.Stubby
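That round-trip check could be sketched as follows; the `save` and `load` callables are placeholders for whatever writer/reader pair you use (this helper and its name are illustrative, not from any library):

```python
import numpy as np

def roundtrip_equal(array, save, load, path="roundtrip_check.png"):
    """Write `array` with `save`, reload it with `load`, and report
    whether the pixel data survived bit-exactly."""
    save(path, array)
    restored = load(path)
    return restored is not None and np.array_equal(array, restored)
```

With OpenCV this could be called as `roundtrip_equal(img, cv2.imwrite, lambda p: cv2.imread(p, cv2.IMREAD_UNCHANGED))`; the `cv2.IMREAD_UNCHANGED` flag matters, because the default read mode converts grayscale data to 3-channel BGR, which would make the comparison fail for reasons unrelated to the codec.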
@Stubby Good idea, but it would be heuristic after all. There is a lot of information out there, and people told me PNG can be lossy in this case and that case, that it's only good for gray values or certain textures, etc. I was left somewhat confused and was hoping for a definitive answer for at least my special case :)Waxwork
My current information is that PNG compression is absolutely never lossy, but the user can screw it up by cramming too many bits per pixel into the format, resulting in loss of color/value range.Waxwork
Use TIFF format?Primero
@Primero To what advantage? AFAIK, there isn't one TIFF format. It seems like you have to look very closely at what you're doing when using it. The English Wikipedia page lists over 20 different compression modes, some lossless, some lossy. On a second look, apparently only 5 of those are used frequently. Still, it seems like a very complex format with many versions and degrees of freedomWaxwork
being the storage choice of professional photographers next to .RAW files for examplePrimero
There is no benefit to using TIFF here - it just complicates things and adds dependencies. In fact, I would go the other way and take the simplest possible format, which doesn't support compression - namely one of the NetPBM formats, e.g. PGM for greyscale or PPM for colour - especially as OpenCV can read/write that without any library dependencies. Plus they also support 16-bit if higher colour resolution becomes necessary later... en.wikipedia.org/wiki/Netpbm_formatFlurried
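To illustrate how simple the NetPBM formats mentioned above are, here is a hedged sketch of a binary PGM (P5) writer and reader for 8-bit grayscale arrays. It is deliberately minimal: it handles only files it wrote itself, and skips features a general parser would need (comment lines, maxval > 255):

```python
import numpy as np

def write_pgm(path, img):
    """Write an 8-bit grayscale image as binary PGM (P5): a three-line
    ASCII header followed by the raw pixel bytes. No compression."""
    h, w = img.shape
    with open(path, "wb") as f:
        f.write(b"P5\n%d %d\n255\n" % (w, h))
        f.write(img.astype(np.uint8).tobytes())

def read_pgm(path):
    """Read a binary PGM written by write_pgm (not a general PGM parser:
    no comment-line handling, maxval assumed to be 255)."""
    with open(path, "rb") as f:
        assert f.readline().strip() == b"P5"
        w, h = map(int, f.readline().split())
        f.readline()  # maxval line, assumed 255
        return np.frombuffer(f.read(), dtype=np.uint8).reshape(h, w)
```

Because the pixel bytes are stored verbatim, a round trip through such a file is trivially bit-exact; the cost is that the file is exactly width x height bytes plus the header, with no compression at all.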
@MarkSetchell Thanks, I hadn't heard of those before. I do want to use compression though, since I'm saving a large number of large images. Otherwise I would just write the arrays to the hard drive using numpy.save() without having to rely on any additional image reader/writer :)Waxwork
@speedymcs Are you also worrying about checksums etc.?Variscite
@Variscite I'm not familiar with them in this context.. should I?Waxwork
@speedymcs It was more that you said you wanted to ensure that - in the scientific context - you get the right data back. At one point we used checksums to maintain data integrity through hard-drive round trips - but perhaps I am taking your question too far :)Variscite
@Variscite Ah, I see :) I think it's good for comparing two files that should be identical, like an original and a copy, say a download from the web, but when compressing an image file, the checksum changes of course. I guess decompressing the new file and comparing the sum of squared differences (SSD) between pixel values, similar to what was proposed in the second comment, could be seen as a kind of checksum in this context though.Waxwork
Good point re checksum changing - red herring - sorry! :) I sort of meant just maintaining the integrity of the compressed file: memory -> disk -> memory. I suppose in astronomy at least sometimes people worry about bit flips, but it's actually probably a minority sport in the endVariscite

PNG is a lossless format by design:

Since PNG's compression is fully lossless--and since it supports up to 48-bit truecolor or 16-bit grayscale--saving, restoring and re-saving an image will not degrade its quality, unlike standard JPEG (even at its highest quality settings).
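The "fully lossless" part follows from PNG's internals: after per-row filtering, the pixel bytes go through DEFLATE (the zlib algorithm), which is an exact, invertible byte-stream codec. A small sketch with Python's standard zlib module illustrates that compressing and decompressing is a perfect round trip (the array shape and values here are arbitrary example data):

```python
import zlib

import numpy as np

# Simulate a buffer of 8-bit pixel data and run it through DEFLATE,
# the same compression algorithm PNG uses internally.
pixels = np.random.randint(0, 256, 10_000, dtype=np.uint8).tobytes()
restored = zlib.decompress(zlib.compress(pixels))
print(restored == pixels)  # DEFLATE is lossless: the exact bytes come back
```

Any loss you observe after a PNG round trip therefore comes from steps outside the codec, e.g. converting a high-bit-depth array down to 8 bits before writing.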

The encoder and decoder should not matter with regard to reading the images correctly (assuming, of course, they are not buggy).

And unlike TIFF, the PNG specification leaves no room for implementors to pick and choose what features they'll support; the result is that a PNG image saved in one app is readable in any other PNG-supporting application.

Digitiform answered 19/12, 2017 at 21:30 Comment(1)
Thanks! Encoders seem to vary dramatically in terms of compression rate though, with the default one, also used by OpenCV, performing much worse than others: https://mcmap.net/q/644412/-opencv-imread-imwrite-increases-the-size-of-pngWaxwork

While PNG is lossless, that does not mean it is uncompressed by default.

You can specify the compression level using the IMWRITE_PNG_COMPRESSION flag. It ranges from 0 (no compression) to 9 (maximum compression). So if you want an uncompressed PNG:

cv2.imwrite(filename, data, [cv2.IMWRITE_PNG_COMPRESSION, 0])

The more you compress, the longer it takes to save.

Link to docs
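That size/time trade-off can be sketched with zlib directly, since PNG's DEFLATE stage exposes the same 0-9 level scale (exact sizes and timings will of course vary with the data and machine; the numbers printed here are just illustrative):

```python
import time
import zlib

import numpy as np

# Low-entropy example data (few distinct values), so it compresses well.
data = np.random.randint(0, 8, 1_000_000, dtype=np.uint8).tobytes()

for level in (0, 1, 9):
    t0 = time.perf_counter()
    compressed = zlib.compress(data, level)
    elapsed = time.perf_counter() - t0
    assert zlib.decompress(compressed) == data  # lossless at every level
    print(f"level {level}: {len(compressed):>9} bytes in {elapsed * 1000:.1f} ms")
```

The key point, matching the answer above: the level only trades file size against CPU time; every level, including 0, decompresses back to identical bytes.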

Geognosy answered 26/1, 2020 at 18:46 Comment(0)