Can matplotlib add metadata to saved figures?
U

6

35

I want to be able to ascertain the provenance of the figures I create using matplotlib, i.e. to know which version of my code and data created these figures. (See this essay for more on provenance.)

I imagine the most straightforward approach would be to add the revision numbers of the code and data to the metadata of the saved figures, or as comments in a postscript file for example.

Is there any easy way to do this in Matplotlib? The savefig function doesn't seem to be capable of this but has someone come up with a workable solution?

Underpass answered 10/5, 2012 at 11:0 Comment(7)
Just add some text to the plot...Halberd
That might be straightforward but I don't want to have to submit figures for publication with "commit 5d3414b19986fe3c08df4088d87b8786a660c387" written underneath.Underpass
Then hide it using Steganography. Sorry for stupid suggestions but I'm not aware of any support for this in matplotlib. What I'm suggesting is something like adding a pixelvalue in position (0,0) that differs from background with a value you can correlate with the revision...Halberd
You could look at putting it in EXIF data? I guess you don't want to use JPEGs, but apparently TIFF supports EXIF as well.Partly
I mainly use PDFs or EPS, but I did think EXIF would be a good approach for the others. I might look at writing a wrapper for savefig that adds a string to EXIF for JPEGs, a comment to an EPS file or adds metadata to a PDF. I was interested in whether anyone had already tried to do this.Underpass
EPS files are just text files, with lines beginning with % being a comment. So it would be easy to add a few lines yourself. PDFs are compressed EPS (more or less) so above should work too, best done with some PDF library. (I salute your efforts to track provenance. I've been doing it for model runs but not for figures so far, may start now.)Kianakiang
Did you ever get around to writing such a wrapper? I'd be interested. An alternative would be to write a wrapper that simply stores a text file next to every saved figure.Papaya
I
21

I don't know of a way using matplotlib, but you can add metadata to PNGs with PIL:

f = "test.png"
METADATA = {"version":"1.0", "OP":"ihuston"}

# Create a sample image
import pylab as plt
import numpy as np
X = np.random.random((50,50))
plt.imshow(X)
plt.savefig(f)

# Use PIL to save some image metadata
from PIL import Image
from PIL import PngImagePlugin

im = Image.open(f)
meta = PngImagePlugin.PngInfo()

for x in METADATA:
    meta.add_text(x, METADATA[x])
im.save(f, "png", pnginfo=meta)

im2 = Image.open(f)
print(im2.info)

This gives:

{'version': '1.0', 'OP': 'ihuston'}
Innocuous answered 11/5, 2012 at 13:53 Comment(1)
I'm going to accept this answer for the time being, given that there seems to be no way of adding metadata in matplotlib in a format agnostic manner.Underpass
M
11

If you are interested in PDF files, then you can have a look at the matplotlib module matplotlib.backends.backend_pdf. At this link there is a nice example of its usage, which could be "condensed" into the following:

import pylab as pl
import numpy as np
from matplotlib.backends.backend_pdf import PdfPages

pdffig = PdfPages('figure.pdf')

x=np.arange(10)

pl.plot(x)
pl.savefig(pdffig, format="pdf")

metadata = pdffig.infodict()
metadata['Title'] = 'Example'
metadata['Author'] = 'Pluto'
metadata['Subject'] = 'How to add metadata to a PDF file within matplotlib'
metadata['Keywords'] = 'PdfPages example'

pdffig.close()
Meerschaum answered 4/7, 2013 at 4:55 Comment(0)
Z
10

As of matplotlib version 2.1.0, the savefig command accepts the keyword argument metadata. You pass in a dictionary with string key/value pairs to be saved.

This only fully works with the 'agg' backend for PNG files.

For PDF and PS files you can use a pre-defined list of tags.
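
As a minimal sketch of the above (assuming a reasonably recent matplotlib and Pillow are installed), the following saves the same figure as PNG and PDF with provenance metadata. The key names `code_revision` and `data_revision`, the revision values, and the file names are made up for illustration. Because the PDF backend only accepts the standard info-dictionary keys, the revision strings are packed into `Keywords` here, which is just one possible convention.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
from PIL import Image

# Hypothetical provenance values; in practice these would come from your VCS.
provenance = {"code_revision": "5d3414b", "data_revision": "v1.2"}

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])

# PNG: arbitrary string keys are stored as text chunks in the file.
fig.savefig("provenance_demo.png", metadata=provenance)

# PDF: only predefined keys (Title, Author, Subject, Keywords, ...) are
# accepted, so the revisions are folded into "Keywords" as an assumption.
fig.savefig(
    "provenance_demo.pdf",
    metadata={"Title": "Provenance demo", "Keywords": "code=5d3414b data=v1.2"},
)
plt.close(fig)

# Read the PNG text chunks back with Pillow to confirm they were written.
png_info = Image.open("provenance_demo.png").info
print(png_info["code_revision"])

# Check that a PDF file was produced (first four bytes of the header).
with open("provenance_demo.pdf", "rb") as fh:
    pdf_header = fh.read(4)
```

Reading the PDF metadata back requires a PDF library or a viewer's document-properties dialog, which is not shown here.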

Zetland answered 1/7, 2019 at 17:10 Comment(0)
A
3

If you are generating SVG files, you can simply append text as an XML comment at the end of the SVG file. Editors like Inkscape appear to preserve this text, even if you subsequently edit an image.

Here's an example, based on the answer from Hooked:

import pylab as plt
import numpy as np

f = "figure.svg"
X = np.random.random((50,50))
plt.imshow(X)
plt.savefig(f)

# Append the comment and make sure the file is closed afterwards
with open(f, 'a') as svg_file:
    svg_file.write("<!-- Here is some invisible metadata. -->\n")
Amianthus answered 14/8, 2014 at 15:52 Comment(1)
Btw, this metadata can be accessed in JavaScript like this: document.getElementsByTagName('svg')[0].nextSiblingDelphadelphi
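
The trailing comment can also be read back in Python. The sketch below uses only the standard library and a tiny hand-written SVG as a stand-in for a file produced by `plt.savefig`. The `provenance:` prefix and the helper names `append_provenance` and `read_provenance` are assumptions made for this example, not part of matplotlib.

```python
import re
from pathlib import Path

svg_path = Path("provenance_demo.svg")

# Stand-in for a file produced by plt.savefig("figure.svg").
svg_path.write_text(
    '<?xml version="1.0" encoding="utf-8"?>\n'
    '<svg xmlns="http://www.w3.org/2000/svg" width="10" height="10">\n'
    '  <rect width="10" height="10"/>\n'
    '</svg>\n',
    encoding="utf-8",
)


def append_provenance(path, text):
    """Append an XML comment carrying the provenance string (hypothetical helper)."""
    if "--" in text:
        # A double hyphen is not allowed inside an XML comment.
        raise ValueError("XML comments must not contain '--'")
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(f"<!-- provenance: {text} -->\n")


def read_provenance(path):
    """Return the last 'provenance:' comment in the file, or None if absent."""
    content = Path(path).read_text(encoding="utf-8")
    matches = re.findall(
        r"<!--\s*provenance:\s*(.*?)\s*-->", content, flags=re.DOTALL
    )
    return matches[-1] if matches else None


append_provenance(svg_path, "commit 5d3414b")
recovered = read_provenance(svg_path)
print(recovered)
```

Using a fixed prefix makes it possible to tell the provenance comment apart from other comments that the SVG writer may have emitted.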
D
0

This is an old question but there should be an updated answer.

matplotlib's savefig now accepts a metadata dictionary as a parameter. The Creator key is allowed for both SVG and PNG.

from matplotlib import pyplot as plt
plt.plot([1,2],[1,4])
f = "line.png"
metadata={"Creator":"them"}
plt.savefig(f, metadata=metadata)
plt.close()

Note that for PNG files, inspecting the metadata is hard, as GIMP won't see it. The Python PIL library can extract it:

from PIL import Image
im2 = Image.open(f)
print(im2.info)

>{'Software': 'Matplotlib version3.6.2, https://matplotlib.org/', 'Creator': 'them', 'dpi': (100, 100)}
Digitalism answered 17/4, 2024 at 4:33 Comment(0)
C
0

In another slight variation on some of the above answers for saved images, matplotlib.pyplot.imsave has a pil_kwargs keyword. This lets you pass a PIL.PngImagePlugin.PngInfo() object with the metadata directly (under the "pnginfo" key), so you don't need to reopen the image with PIL to add it.
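
A minimal sketch of that approach, assuming matplotlib 3.1 or later and Pillow. The key name `code_revision`, its value, and the file name are placeholders chosen for illustration.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
from PIL import Image
from PIL.PngImagePlugin import PngInfo

# Build the PNG text chunks up front.
info = PngInfo()
info.add_text("code_revision", "5d3414b")  # hypothetical revision string

data = np.random.random((50, 50))

# pil_kwargs is forwarded to PIL's Image.save, so "pnginfo" is written as-is.
plt.imsave("imsave_demo.png", data, pil_kwargs={"pnginfo": info})

# Read the metadata back to confirm it was stored.
read_back = Image.open("imsave_demo.png").info
print(read_back)
```

When `pnginfo` is supplied this way it takes precedence over the `metadata` argument, so put everything you want stored into the `PngInfo` object.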

Cohbath answered 26/6, 2024 at 15:47 Comment(0)
