Image sanitization library

Asked 17/4, 2012 at 17:51 Answered 16/3, 2021 at 22:56

I have a website that displays images submitted by users. I am concerned about some wiseguy uploading an image which may exploit some 0-day vulnerability in a browser rendering engine. Moreover, I would like to purge images of metadata (like EXIF data), and attempt to compress them further in a lossless manner (there are several such command line utilities for PNG and JPEG).

With the above in mind, my question is as follows: is there some C/C++ library out there that caters to the above scenario? And even if the full pipeline of parsing -> purging -> sanitizing -> compressing -> writing is not available in any single library, can I at least implement the parsing -> purging -> sanitizing -> writing pipeline (without compressing) in a library that supports JPEG/PNG/GIF?

Pronty answered 17/4, 2012 at 17:51 Comment(0)

Your requirement is impossible to fulfill: if there is a 0-day vulnerability in one of the image reading libraries you use, then your code may be exploitable when it tries to parse and sanitize the incoming file. By "presanitizing" as soon as the image is received, you'd just be moving the point of exploitation earlier rather than later.

The only thing that would help is to parse and sanitize incoming images in a sandbox, so that, at least, if there was a vulnerability, it would be contained to the sandbox. The sandbox could be a separate process running as an unprivileged user in a chroot environment (or VM, for the very paranoid), with an interface consisting only of bytestream in, sanitized image out.
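A minimal sketch of that "bytestream in, sanitized image out" interface, assuming a POSIX host and Python as the glue language. The names `run_sandboxed` and `ECHO_WORKER`, and the limit defaults, are hypothetical and not from any library. The `jail`, `uid` and `gid` arguments only work when the parent runs as root, and the worker binary must then exist inside the chroot. Resource limits and a chroot alone are not a complete sandbox (there is no network or syscall filtering here), so treat this as a starting point.

```python
import os
import resource
import subprocess
import sys

# Stand-in worker for demonstration only: copies stdin to stdout unchanged.
# A real worker would decode and re-encode the image.
ECHO_WORKER = [sys.executable, "-c",
               "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read())"]


def _limit(res, value):
    """Lower both soft and hard limits, never exceeding the current hard limit."""
    _, hard = resource.getrlimit(res)
    if hard != resource.RLIM_INFINITY:
        value = min(value, hard)
    resource.setrlimit(res, (value, value))


def run_sandboxed(argv, data, *, jail=None, uid=None, gid=None,
                  cpu_seconds=10, memory_bytes=1 << 30, timeout=30):
    """Feed `data` to the worker's stdin and return whatever it writes to stdout."""

    def enter_sandbox():
        # Runs in the child after fork() and before exec().
        _limit(resource.RLIMIT_CPU, cpu_seconds)
        _limit(resource.RLIMIT_AS, memory_bytes)
        if jail is not None:          # requires root
            os.chroot(jail)
            os.chdir("/")
        if gid is not None:           # drop the group first, then the user
            os.setgroups([])
            os.setgid(gid)
        if uid is not None:
            os.setuid(uid)

    proc = subprocess.run(argv, input=data, stdout=subprocess.PIPE,
                          stderr=subprocess.DEVNULL, preexec_fn=enter_sandbox,
                          timeout=timeout)
    if proc.returncode != 0:
        raise RuntimeError(f"worker failed with status {proc.returncode}")
    return proc.stdout
```

Calling `run_sandboxed(ECHO_WORKER, uploaded_bytes)` exercises the plumbing. In a real deployment the worker would be the image converter, and the parent would treat the returned bytes as untrusted until it has checked them.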

The sanitization itself could be as simple as opening the image with ImageMagick, decoding it to a raster, then re-encoding it and emitting it in a standard format (say, PNG or JPEG). Note that if the input and output are both lossy formats (like JPEG), this transformation will be lossy.
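One way to drive that decode and re-encode step is through the ImageMagick command line, sketched below. It assumes the ImageMagick 6 `convert` binary is on the PATH (ImageMagick 7 calls it `magick`). The helper names `sniff_format`, `reencode_argv` and `reencode` are made up for this example. The input format is checked by magic number and passed as an explicit `format:` prefix so that ImageMagick does not pick a coder on its own, and `-strip` drops profiles and comments such as EXIF.

```python
import subprocess

# Magic numbers for the three formats named in the question.
MAGIC = {
    "png": (b"\x89PNG\r\n\x1a\n",),
    "jpeg": (b"\xff\xd8\xff",),
    "gif": (b"GIF87a", b"GIF89a"),
}


def sniff_format(data):
    """Return 'png', 'jpeg' or 'gif' based on the leading bytes, else raise."""
    for name, prefixes in MAGIC.items():
        if data.startswith(prefixes):
            return name
    raise ValueError("not a PNG, JPEG, or GIF")


def reencode_argv(in_fmt, out_fmt="png", binary="convert"):
    """Build the command line: read in_fmt from stdin, write out_fmt to stdout."""
    if in_fmt not in MAGIC or out_fmt not in MAGIC:
        raise ValueError("unsupported format")
    return [binary, f"{in_fmt}:-", "-strip", f"{out_fmt}:-"]


def reencode(data, out_fmt="png", timeout=30):
    """Decode and re-encode `data` with ImageMagick (must be installed)."""
    argv = reencode_argv(sniff_format(data), out_fmt)
    proc = subprocess.run(argv, input=data, stdout=subprocess.PIPE,
                          stderr=subprocess.DEVNULL, timeout=timeout,
                          check=True)
    return proc.stdout
```

This still runs the full ImageMagick decoder on hostile input, so it belongs inside the sandbox rather than in the web process. Animated GIFs would need extra handling that this sketch leaves out.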

Bluhm answered 17/4, 2012 at 18:28 Comment(2)

Yes, my idea was to run the sanitization within a hermetic environment. The rationale is that I have full control of the sanitization and can protect myself and my users that way. – Pronty 18/4, 2012 at 13:38

"with an interface consisting only of bytestream in, sanitized image out." This is important... as people have reported recent vulnerabilities in ImageMagick that allow for remote code execution... You need to limit what is available in this sandbox. Even simple things like allowing outbound http traffic can open you up to having your hosts being taken over for use in things like DDOS attacks. – Mica 17/7, 2017 at 15:50

I know, I'm 9 years late, but...

You could use an idea similar to the PDF sanitizer in Qubes OS, which copies a PDF to a disposable virtual machine and runs a PDF parser there that converts the PDF to what are basically TIFF images. These are sent back to the originating VM and reassembled into a PDF there. This way you reduce your attack surface to TIFF files, which is tiny.

(For a diagram of this setup, see this article: https://blog.invisiblethings.org/2013/02/21/converting-untrusted-pdfs-into-trusted.html)

If that PDF really does carry a 0-day exploit for your specific parser, it compromises the disposable VM. But since the originating VM accepts only valid TIFF and the disposable VM is discarded once the process is done, the compromise gains the attacker nothing. Unless, of course, the attacker also has either a Xen exploit at hand to break out of the disposable VM, or a Spectre-type full memory read primitive coupled with a side channel to leak data to their machines. Since the disposable VM is not connected to the internet and has no audio hardware assigned, this boils down to creating EM interference by modulating the CPU power consumption, so the attacker would probably need a big antenna and a location close to your server.
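The "only a valid simple format is accepted" part is what keeps the trusted side safe, so here is a rough sketch of what such a check could look like for images. The wire format is invented for this example (an ASCII `width height` line followed by raw 8-bit RGB pixels) and is not the actual Qubes protocol. `parse_raster` and `MAX_PIXELS` are hypothetical names. The point is that the trusted side never runs a complex decoder, only a few length checks.

```python
import re

# Hypothetical wire format: b"<width> <height>\n" followed by raw RGB bytes.
HEADER = re.compile(rb"(\d{1,5}) (\d{1,5})\n")
MAX_PIXELS = 50_000_000  # arbitrary cap, chosen for illustration


def parse_raster(blob):
    """Accept the blob only if it is exactly a header plus width*height*3 bytes."""
    m = HEADER.match(blob)
    if m is None:
        raise ValueError("malformed header")
    width, height = int(m.group(1)), int(m.group(2))
    if width == 0 or height == 0 or width * height > MAX_PIXELS:
        raise ValueError("unreasonable dimensions")
    pixels = blob[m.end():]
    if len(pixels) != width * height * 3:
        raise ValueError("pixel data length does not match header")
    return width, height, pixels
```

Anything the sandbox returns that fails this check is discarded. The trusted side then encodes the accepted pixels into PNG or JPEG itself, so the encoder only ever sees data whose shape it has already verified.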

It would be an expensive attack.

Hoseahoseia answered 16/3, 2021 at 22:56 Comment(0)
