Python: Extract Metadata from PNG

I am able to extract the necessary information using R, but for consistency within the overall project, I would like to be able to do it with Python (preferably Python3). I need the contents of a single tag called "Settings". This tag contains XML which will then need to be parsed.

Getting the metadata in R is incredibly easy:

library(exifr)
library(XML)

path = file.path('path', 'to', 'file')

x = read_exif(file.path(path,'image.png'))
x$Settings

It doesn't look like Python can do it, which boggles my mind. Or doing so requires me to have far more knowledge of Python and PNGs than I have at the moment. How can I extract PNG metadata using Python?


Here's the list of things I've tried:

PyPNG

PyPNG seems promising. Examining the length of each chunk, it seems likely the "Settings" tag lives in the zTXt chunk.

import png

filename = "C:\\path\\to\\image.png"

im = png.Reader(filename)

for c in im.chunks():
    print(c[0], len(c[1]))

>>>
IHDR 13
tIME 7
pHYs 9
IDAT 47775
zTXt 714
IEND 0

The above was taken from this post. However, it's still unclear how to extract the zTXt data.
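
For reference, the PNG specification lays out a zTXt chunk body as a Latin-1 keyword, a null byte, a one-byte compression method (0 means zlib/deflate), and then the compressed text. A tEXt chunk is the same without the compression byte. The following is a minimal sketch that uses only the standard library instead of PyPNG; the function names iter_chunks and read_text_chunks are made up for illustration, CRCs are not verified, and iTXt chunks are ignored.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def iter_chunks(png_bytes):
    """Yield (chunk_type, chunk_data) pairs from the raw bytes of a PNG file."""
    if not png_bytes.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(png_bytes):
        # Each chunk is: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
        length, chunk_type = struct.unpack(">I4s", png_bytes[pos:pos + 8])
        data = png_bytes[pos + 8:pos + 8 + length]
        yield chunk_type, data
        pos += 8 + length + 4  # skip header, data, and the (unchecked) CRC


def read_text_chunks(png_bytes):
    """Return {keyword: text} for every tEXt and zTXt chunk in the file."""
    found = {}
    for chunk_type, data in iter_chunks(png_bytes):
        if chunk_type not in (b"tEXt", b"zTXt"):
            continue
        keyword, _, rest = data.partition(b"\x00")
        if chunk_type == b"zTXt":
            # The byte after the separator is the compression method (0 = deflate).
            if rest[:1] != b"\x00":
                raise ValueError("unsupported zTXt compression method")
            rest = zlib.decompress(rest[1:])
        found[keyword.decode("latin-1")] = rest.decode("latin-1")
    return found
```

To use it, read the file in binary mode, for example open(filename, "rb").read(), pass the bytes to read_text_chunks, and look up the "Settings" key in the returned dict. The same split-and-decompress step should also work on the data element of the zTXt pair that PyPNG's chunks() yields.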

hachoir3

Using the hachoir3 package, I tried the following:

from hachoir.parser import createParser
from hachoir.metadata import extractMetadata

filename = "C:\\path\\to\\file\\image.png"
parser = createParser(filename)
metadata = extractMetadata(parser)

for line in metadata.exportPlaintext():
    print(line)

This gives me the following:

Metadata:
- Image width: 1024 pixels
- Image height: 46 pixels
- Bits/pixel: 16
- Pixel format: RGB
- Compression rate: 2.0x
- Image DPI width: 1 DPI
- Image DPI height: 1 DPI
- Creation date: 2016-07-13 19:09:28
- Compression: deflate
- MIME type: image/png
- Endianness: Big endian

I can't seem to get at the field I need, the "Settings" one referenced in the R code. I've had no luck with other methods, such as metadata.get. As far as I can tell, those seem to be the two options for parsing PNG metadata. The docs read,

Some good (but not perfect ;-)) parsers:

- Matroska video
- Microsoft RIFF (AVI video, WAV audio, CDA file)
- PNG picture
- TAR and ZIP archive

Maybe it just doesn't have the functionality I need?

Pillow

Following the advice given in this post:

from PIL import Image
filename = "C:\\path\\to\\file\\image.png"
im = Image.open(filename)

This reads in the image, but im.info only returns {'aspect': (1, 1)}. Reading through the documentation, it doesn't look like any of the methods get at the metadata. I read through the PNG description provided in the post, but honestly I don't know how to make use of that information or how Pillow would help me with it.

There are some posts which imply that what I need can be done, but they do not work. For example, this post suggests using the ExifTags library:

from PIL import Image, ExifTags
filename = "C:\\path\\to\\file\\image.png"
im = Image.open(filename)
exif = { ExifTags.TAGS[k]: v for k, v in im._getexif().items() if k in ExifTags.TAGS}

The problem is, AttributeError: 'PngImageFile' object has no attribute '_getexif'. According to the documentation, the ._getexif feature is experimental and only applies to JPGs.

Reading through the overall Pillow documentation, it really only talks about JPG and TIFF. Processing PNG files doesn't seem to be part of the package at all. So like hachoir, maybe it can't be done?

PIL

There's apparently another package PIL from which Pillow was forked. It looks like it was abandoned in 2009.

Bronnie answered 5/2, 2018 at 21:37 Comment(3)
Possible duplicate of How to extract metadata from a image using python? - Doth
PNG tags are only four characters long, so those "Settings" are not actually a tag. You are right, it looks like it's compressed data. Do you have a link to such a PNG file? If so, I can probably adjust my PyPNG code to extract it. - Redeemable
Okay, some bad news here. I found an R-created PNG with a zTXt chunk, and it contained just a profile in that chunk, no metadata. So I still need one containing your settings. - Redeemable

You can get the EXIF metadata with Pillow by accessing the info dict of the loaded image.

As of Pillow 6.0, EXIF data can be read from PNG images. However, unlike other image formats, EXIF data is not guaranteed to be present in info until load() has been called. https://pillow.readthedocs.io/en/stable/handbook/image-file-formats.html#png

from PIL import Image

filename = 'path/to/img'
im = Image.open(filename)
im.load()  # Needed only for .png EXIF data (see citation above)
print(im.info['meta_to_read'])  # 'meta_to_read' is a placeholder; use the key you need, e.g. 'Settings'

I am using Python 3.7 and pillow 7.1.2 from the conda repo.
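
Since the question says the "Settings" value holds XML, here is a minimal sketch of going from the info dict to a parsed XML tree with the standard library's ElementTree. The helper names settings_from_info and read_settings are hypothetical, and the sketch assumes the file really stores its XML in a text chunk whose keyword is "Settings" and that Pillow exposes it in info once load() has run.

```python
import xml.etree.ElementTree as ET


def settings_from_info(info, key="Settings"):
    """Parse the XML stored under `key` in a Pillow-style info dict."""
    raw = info.get(key)
    if raw is None:
        raise KeyError(f"no {key!r} entry; available keys: {sorted(info)}")
    if isinstance(raw, bytes):
        raw = raw.decode("utf-8")  # assumption: byte values are UTF-8 text
    return ET.fromstring(raw)


def read_settings(filename, key="Settings"):
    """Open a PNG with Pillow and return the parsed XML root for `key`."""
    from PIL import Image  # imported lazily so the parser above works without Pillow

    with Image.open(filename) as im:
        im.load()  # chunks stored after the image data are only read on load
        return settings_from_info(im.info, key)
```

Calling read_settings('path/to/img') returns an Element, so the usual find() and iter() calls apply from there. If the key is missing, the KeyError lists the keys that are present, which helps when the metadata is stored under a different name.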

Grievous answered 18/6, 2020 at 17:51 Comment(0)

Here is an inelegant and clumsy but working solution.

Adapted from here: https://motherboard.vice.com/en_us/article/aekn58/hack-this-extra-image-metadata-using-python

You can call the exiftool command-line application from within Python and then parse the results.

Below is the code which works in Python 3.6.3 under Ubuntu 16.04:

import subprocess

result = subprocess.run(['exiftool', '-h', '/home/jason/Pictures/kitty_mask.png'], stdout=subprocess.PIPE)
print(type(result))
print("\n\n", result.stdout)
normal_string = result.stdout.decode("utf-8")
print("\n\n", normal_string)

It produces the following results for my test image:

> <class 'subprocess.CompletedProcess'>
> 
> 
>  b'<!-- /home/jason/Pictures/kitty_mask.png
> -->\n<table>\n<tr><td>ExifTool Version Number</td><td>10.80</td></tr>\n<tr><td>File
> Name</td><td>kitty_mask.png</td></tr>\n<tr><td>Directory</td><td>/home/jason/Pictures</td></tr>\n<tr><td>File
> Size</td><td>25 kB</td></tr>\n<tr><td>File Modification
> Date/Time</td><td>2018:07:02 09:35:00+01:00</td></tr>\n<tr><td>File
> Access Date/Time</td><td>2018:07:09
> 16:23:24+01:00</td></tr>\n<tr><td>File Inode Change
> Date/Time</td><td>2018:07:02 09:35:00+01:00</td></tr>\n<tr><td>File
> Permissions</td><td>rw-r--r--</td></tr>\n<tr><td>File
> Type</td><td>PNG</td></tr>\n<tr><td>File Type
> Extension</td><td>png</td></tr>\n<tr><td>MIME
> Type</td><td>image/png</td></tr>\n<tr><td>Image
> Width</td><td>2448</td></tr>\n<tr><td>Image
> Height</td><td>3264</td></tr>\n<tr><td>Bit
> Depth</td><td>8</td></tr>\n<tr><td>Color
> Type</td><td>RGB</td></tr>\n<tr><td>Compression</td><td>Deflate/Inflate</td></tr>\n<tr><td>Filter</td><td>Adaptive</td></tr>\n<tr><td>Interlace</td><td>Noninterlaced</td></tr>\n<tr><td>Image
> Size</td><td>2448x3264</td></tr>\n<tr><td>Megapixels</td><td>8.0</td></tr>\n</table>\n'
> 
> 
>  <!-- /home/jason/Pictures/kitty_mask.png --> <table> <tr><td>ExifTool
> Version Number</td><td>10.80</td></tr> <tr><td>File
> Name</td><td>kitty_mask.png</td></tr>
> <tr><td>Directory</td><td>/home/jason/Pictures</td></tr> <tr><td>File
> Size</td><td>25 kB</td></tr> <tr><td>File Modification
> Date/Time</td><td>2018:07:02 09:35:00+01:00</td></tr> <tr><td>File
> Access Date/Time</td><td>2018:07:09 16:23:24+01:00</td></tr>
> <tr><td>File Inode Change Date/Time</td><td>2018:07:02
> 09:35:00+01:00</td></tr> <tr><td>File
> Permissions</td><td>rw-r--r--</td></tr> <tr><td>File
> Type</td><td>PNG</td></tr> <tr><td>File Type
> Extension</td><td>png</td></tr> <tr><td>MIME
> Type</td><td>image/png</td></tr> <tr><td>Image
> Width</td><td>2448</td></tr> <tr><td>Image
> Height</td><td>3264</td></tr> <tr><td>Bit Depth</td><td>8</td></tr>
> <tr><td>Color Type</td><td>RGB</td></tr>
> <tr><td>Compression</td><td>Deflate/Inflate</td></tr>
> <tr><td>Filter</td><td>Adaptive</td></tr>
> <tr><td>Interlace</td><td>Noninterlaced</td></tr> <tr><td>Image
> Size</td><td>2448x3264</td></tr>
> <tr><td>Megapixels</td><td>8.0</td></tr> </table>
Schilit answered 9/7, 2018 at 15:57 Comment(1)
If you're going to call exiftool, then I would suggest removing the -h option. This would give you the raw output so you don't need to parse out the html, just the tags you would need. - Anesthesia
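
Following up on that comment, one way to avoid parsing HTML is exiftool's -j option, which prints a JSON array with one object per input file. The sketch below is a minimal example under that assumption; the helper names parse_exiftool_json and exiftool_tags are made up here, and exiftool must be installed and on the PATH.

```python
import json
import subprocess


def parse_exiftool_json(stdout):
    """Turn `exiftool -j` output (a JSON array, one object per file) into a dict."""
    records = json.loads(stdout)
    if not records:
        raise ValueError("exiftool returned no records")
    return records[0]


def exiftool_tags(filename):
    """Run exiftool on one file and return its tags as a dict."""
    result = subprocess.run(
        ["exiftool", "-j", filename],
        stdout=subprocess.PIPE,
        check=True,  # raise CalledProcessError if exiftool exits with an error
    )
    return parse_exiftool_json(result.stdout.decode("utf-8"))
```

With this, exiftool_tags(path).get("Settings") would return the tag's value if exiftool reports one under that name, and None otherwise.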
