Plotting at full resolution with matplotlib.pyplot, imshow() and savefig()?

I have a medium-sized array (e.g. 1500x3000) that I want to plot to scale, since it is an image. However, the vertical and horizontal scales are very different. For simplicity, let's say there is one metre per row and 10 metres per column. The plot should then produce an image of about 1500x30000. I use the kwarg extent for the scales and aspect = 1 to avoid distortion. Whether I use the plotting window (Qt4) with imshow() or savefig(), I have never succeeded in producing the image to scale and at full resolution.

I have looked at many proposed solutions, as indicated here, here, or here, and there or there in case it was a bug. I have altered my matplotlibrc and placed it in ~/.config/matplotlib to try to force my display / savefig options, but to no avail. I also tried pcolormesh(), without success. I use Python 2.7 and matplotlib 1.3 from the Ubuntu 14.04 repository, with Qt4Agg as the backend. I tried TkAgg too, but it is slow and gives the same results. I have the impression that the resolution is right along the x axis but is definitely downsampled in the vertical direction. Here is a piece of code that should reproduce my issue.

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.colors

R, C = 1500, 3000
DATA = np.random.random((R, C))
DATA[::2, :] *= -1  # make every other line negative
Yi, Xi = 1, 10 # increment
CMP = 'seismic'
ImageFormat ='pdf'
Name = 'Image'


DataRange = (np.absolute(DATA)).max() # I want my data centred on 0
EXTENT = [0, Xi*C, 0 ,Yi*R]
NORM = matplotlib.colors.Normalize(vmin =-DataRange, vmax= DataRange, clip =True)

for i in range(1,4):
    Fig=plt.figure(figsize=(45, 10), dpi = 100*i, tight_layout=True)
    Fig.suptitle(Name+str(i)+'00DPI')
    ax = Fig.add_subplot(1, 1, 1)
    Plot = ax.imshow(DATA, cmap=plt.get_cmap(CMP), norm = NORM, extent = EXTENT, aspect = 1, interpolation='none') 
    ax.set_xlabel('metres')
    ax.set_ylabel('metres')
    Fig.savefig(Name+str(i)+'00DPI.'+ImageFormat,  format = ImageFormat, dpi = Fig.dpi)
plt.close()

In imshow(), setting interpolation to 'none', 'nearest' or 'bilinear' does not change the resolution for some reason, although I think it is supposed to, at least in the Qt4 window if I call show() instead of savefig(). Notice that the resolution of the saved figures is the same whatever you set in plt.figure(dpi=).

I am out of ideas and at the limit of my understanding of how things work with this system. Any help is very welcome.

Thanks in advance.

Superfamily answered 16/10, 2015 at 15:35 Comment(4)
Is saving as an SVG an option? plt.savefig("test.svg") - Hygrometer
I have not noticed an improvement in vertical resolution when saving as svg. - Superfamily
I modified the code so that the image alternates positive and negative values vertically. The main idea is that if the image is fully resolved, we should be able to distinguish blue and red horizontal stripes. - Superfamily
Have you considered that this might be a problem of the .pdf viewer? When I run your example, open it using okular and zoom in, I see the stripes. When I zoom out, they are still there. Only when okular decides to downsample the image to free some memory do the stripes disappear. - Ricotta

Running your example, everything looks good in matplotlib after zooming: no matter the resolution, results are the same and I see one pixel per axis unit. Also, trying with smaller arrays, pdfs (or other formats) work well.

This is my explanation: when you set the figure dpi, you are setting the dpi of the entire figure (not only the data area). On my system, the plot area ends up occupying about 20% of the figure height. If you set 300 dpi and a height of 10 inches, the vertical data axis gets a total of 300 x 10 x 0.2 = 600 pixels, which is not enough to represent 1500 rows; this explains to me why the output must be resampled. Note that reducing the width sometimes works incidentally, because it changes the fraction of the figure occupied by the data plot.
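The arithmetic above can be sketched in a few lines. The function names are made up for illustration, and the 0.2 axes fraction is simply the value quoted above for one system, not a matplotlib constant.

```python
import math


def axes_pixels(dpi, fig_height_in, axes_fraction):
    """Pixels available along the vertical axis of the data area."""
    return dpi * fig_height_in * axes_fraction


def min_dpi_for_rows(n_rows, fig_height_in, axes_fraction):
    """Smallest whole dpi that gives at least one pixel per data row."""
    return math.ceil(n_rows / (fig_height_in * axes_fraction))


# Values from the explanation: 300 dpi, 10 inch figure height, axes ~20% of it.
available = axes_pixels(300, 10, 0.2)
needed_dpi = min_dpi_for_rows(1500, 10, 0.2)
print(available, needed_dpi)
```

Under these assumptions, roughly 600 pixels are available for 1500 rows, and a dpi of about 750 would be needed for one pixel per row.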

So you have to increase the dpi and also set interpolation='none' (it shouldn't matter if the resolution is set perfectly, but it matters if it is only close). You can also adjust the plot position and size so that it takes up a larger part of the figure. Coming back to the optimal resolution settings: ideally you want the number of pixels along each axis to be an integer multiple of the number of data points, otherwise some kind of interpolation must happen (think of how you would plot two points on three pixels, or vice versa).
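To illustrate the point about multiples, here is a minimal sketch using a simplified nearest-neighbour model. It is an assumption made for illustration, not matplotlib's actual resampling code, and pixels_per_sample is a hypothetical helper.

```python
import numpy as np


def pixels_per_sample(n_samples, n_pixels):
    """Count how many output pixels each data sample lands on.

    Simplified nearest-neighbour model: pixel p shows sample
    floor(p * n_samples / n_pixels).
    """
    sample_of_pixel = (np.arange(n_pixels) * n_samples) // n_pixels
    return np.bincount(sample_of_pixel, minlength=n_samples)


uneven = pixels_per_sample(2, 3)  # two points on three pixels
even = pixels_per_sample(2, 4)    # pixel count is a multiple of the data
# 1500 rows squeezed onto 600 pixels: count the rows that get no pixel at all.
dropped = int((pixels_per_sample(1500, 600) == 0).sum())
print(uneven, even, dropped)
```

In this model, two samples on three pixels get unequal widths, two samples on four pixels get equal widths, and with fewer pixels than rows some rows are not drawn at all.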

I don't know if the following is the best way to do it (there might be more suitable methods and properties in matplotlib), but I would try something like this to calculate the optimal dpi:

vsize = ax.get_position().size[1]  # fraction of figure height occupied by the axes
axesdpi = int(np.ceil(R / (Fig.get_size_inches()[1] * vsize)))  # rows per inch of axes height (or use Yi*R, according to what you want to do)

Then your code (reduced to the first loop iteration) becomes:

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.colors

R, C = 1500, 3000
DATA = np.random.random((R, C))
DATA[::2, :] *= -1  # make every other line negative
Yi, Xi = 1, 10 # increment
CMP = 'seismic'
ImageFormat ='pdf'
Name = 'Image'


DataRange = (np.absolute(DATA)).max() # I want my data centred on 0
EXTENT = [0, Xi*C, 0 ,Yi*R]
NORM = matplotlib.colors.Normalize(vmin =-DataRange, vmax= DataRange, clip =True)

for i in (1,):
    print(i)
    Fig=plt.figure(figsize=(45, 10), dpi = 100*i, tight_layout=True)
    Fig.suptitle(Name+str(i)+'00DPI')
    ax = Fig.add_subplot(1, 1, 1)
    Plot = ax.imshow(DATA, cmap=plt.get_cmap(CMP), norm = NORM, extent = EXTENT, aspect = 1, interpolation='none') 
    ax.set_xlabel('metres')
    ax.set_ylabel('metres')
    vsize = ax.get_position().size[1]  # fraction of figure height occupied by the axes
    axesdpi = int(np.ceil(R / (Fig.get_size_inches()[1] * vsize)))  # rows per inch of axes height (or use Yi*R, according to what you want to do)
    Fig.savefig(Name+str(axesdpi)+'DPI.'+ImageFormat,  format = ImageFormat, dpi = axesdpi)
    #plt.close()

This works reasonably for me.
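A quick way to sanity-check such a dpi is to measure the axes height after the layout has settled and confirm that it covers at least one pixel per row. The sketch below assumes a headless Agg backend and a small stand-in array; axes_height_inches is a hypothetical helper, not a matplotlib function.

```python
import math

import matplotlib
matplotlib.use("Agg")  # headless backend, assumed for this sketch
import matplotlib.pyplot as plt
import numpy as np


def axes_height_inches(fig, ax):
    """Height of the axes box in inches, measured after one draw."""
    fig.canvas.draw()  # lets layout and aspect adjustments take effect
    return fig.get_size_inches()[1] * ax.get_position().height


rows, cols = 150, 300  # small stand-in for the 1500 x 3000 array
data = np.random.random((rows, cols))

fig = plt.figure(figsize=(9, 2))
ax = fig.add_subplot(1, 1, 1)
ax.imshow(data, extent=[0, 10 * cols, 0, rows], aspect=1, interpolation="none")
fig.tight_layout()

height_in = axes_height_inches(fig, ax)
dpi = math.ceil(rows / height_in)  # at least one pixel per data row
height_px = height_in * dpi
print(dpi, height_px)
plt.close(fig)
```

The measurement is taken after drawing because the axes box can change once the layout and the fixed aspect ratio are applied.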

Commonwealth answered 19/2, 2016 at 20:51 Comment(0)

Firstly, when you save as a .pdf, you are implicitly using the pdf backend, even if you specify another backend in your options. This means your image is saved in a vector format, and dpi is therefore pretty meaningless. At any dpi setting, if I load your PDF in a decent viewer (I used Inkscape; others are available), the stripes are clearly visible. I actually found them easier to see if every second row is set to zero. All the generated PDFs contain the complete information needed to reproduce the stripes and are consequently virtually identical. Since you specify figsize=(45, 10), all the generated PDFs have a suggested display size of 45 inches x 10 inches.

If I specify png as the image type, I see a difference in file size based on the dpi parameter, which I think is what you're expecting. The 100 dpi image has 4500000 pixels, the 200 dpi image has 18000000 pixels (4x as many) and the 300 dpi image has 40500000 (9x as many). You will notice that 4500000 == 1500 x 3000, i.e. one pixel per member of your original array. It follows, then, that the larger dpi settings don't really gain you any further definition; instead, your stripes are 2 or 3 pixels wide respectively instead of 1.
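The pixel counts quoted above follow from the figure size multiplied by the dpi. A minimal sketch, assuming the figure is saved without bbox_inches='tight' so that the raster is exactly figsize x dpi (figure_pixels is a hypothetical helper):

```python
def figure_pixels(figsize_in, dpi):
    """Return (width_px, height_px, total_px) of a raster saved at this dpi."""
    width_px = round(figsize_in[0] * dpi)
    height_px = round(figsize_in[1] * dpi)
    return width_px, height_px, width_px * height_px


sizes = {dpi: figure_pixels((45, 10), dpi) for dpi in (100, 200, 300)}
for dpi, (w, h, total) in sizes.items():
    print(dpi, w, h, total)
```

Note that these counts cover the whole figure canvas, including margins, title and axis labels, not only the data area.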

I think what you want to do is effectively plot every column 10 times, so you get an image 1500 x 30000 pixels. To do this, using all your own code, you could use np.repeat to do something like the following:

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.colors

R, C = 1500, 3000
DATA = np.random.random((R, C))
DATA[::2, :] = 0  # make every other line plain white
Yi, Xi = 1, 10 # increment
DATA = np.repeat(DATA, Xi, axis=1)
DATA = np.repeat(DATA, Yi, axis=0)  # repeat rows; without axis, np.repeat flattens the array

CMP = 'seismic'
ImageFormat ='pdf'
Name = 'Image'


DataRange = (np.absolute(DATA)).max() # I want my data centred on 0
EXTENT = [0, Xi*C, 0 ,Yi*R]
NORM = matplotlib.colors.Normalize(vmin =-DataRange, vmax= DataRange, clip =True)

for i in range(1,4):
    Fig=plt.figure(figsize=(45, 10), dpi = 100*i, tight_layout=True)
    Fig.suptitle(Name+str(i)+'00DPI')
    ax = Fig.add_subplot(1, 1, 1)
    Plot = ax.imshow(DATA, cmap=plt.get_cmap(CMP), norm = NORM, extent = EXTENT, aspect = 1, interpolation='none') 
    ax.set_xlabel('metres')
    ax.set_ylabel('metres')
    Fig.savefig(Name+str(i)+'00DPI.'+ImageFormat,  format = ImageFormat, dpi = Fig.dpi)
plt.close()

Caveat: This is a memory-intensive solution; there may be better ways out there. If you don't need the vector graphics output of pdf, you can change your ImageFormat variable to png.
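As a small illustration of how np.repeat behaves and why it costs memory, here is a sketch on a tiny stand-in array (the names small, wide and flat are made up for the example):

```python
import numpy as np

small = np.arange(6).reshape(2, 3)   # stand-in for DATA: 2 rows x 3 columns
wide = np.repeat(small, 10, axis=1)  # each column repeated 10 times
flat = np.repeat(small, 10)          # no axis given: the input is flattened first

growth = wide.nbytes // small.nbytes  # memory grows by the repeat factor
print(wide.shape, flat.shape, growth)
```

The axis argument is what keeps the array two-dimensional, and the repeated array takes ten times the memory of the original for a repeat factor of 10.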


It strikes me that the other thing you might be concerned with is giving the picture the appropriate aspect ratio (i.e. 20 times as wide as it is high). This you're already doing. So, if you look at how each pixel is represented in the pdf, it is rectangular (10 times as wide as it is tall), not square.

Abiogenetic answered 21/1, 2016 at 17:26 Comment(1)
Just to clarify, the increment is a scale in metres. The data is sampled every metre vertically and every 10 metres horizontally. So I do not really want to repeat the values; I would prefer an interpolation (like a median filter). However, I understand that my code at the moment is not designed for that (I removed that part for simplicity). Thanks for the explanations, I did not know the np.repeat function. I will also use a raster format, as it is more meaningful. It is just that I prefer the rendering of the fonts/axes in pdf, which I can then edit in Inkscape for publication. - Superfamily

If you'd like it as a function:

This code is based on the solution by Vincenzooo, but the math is updated for Python 3 and matplotlib 3.7.2.

import math
import os
import matplotlib.pyplot as plt
import numpy as np


def calculate_dpi_needed_for_image(n_rows: int, n_columns: int, fig, ax, min_dpi: int=300, dots_per_entry=2):
    ax_width_inches = fig.get_size_inches()[0] * ax.get_position().size[0] # figsize * fraction of figure occupied by axes
    dpi_w = dots_per_entry * n_columns / ax_width_inches # dpi * inches >= n_columns
    ax_height_inches = fig.get_size_inches()[1] * ax.get_position().size[1] # figsize * fraction of figure occupied by axes
    dpi_h = dots_per_entry * n_rows / ax_height_inches # dpi * inches >= n_rows
    dpi = max(dpi_h, dpi_w)
    if min_dpi is not None:
        if dpi < min_dpi: # Ensure DPI is at least min_dpi for legibility.
            dpi = dpi * math.ceil(min_dpi / dpi) # scale up by a whole-number factor
    return dpi


# Create a sample dataset.
# data = np.random.rand(10, 20)
data = np.random.rand(1280, 1920)
data[::2, :] = 1.0 # make every other row equal one
data[:, ::2] = 1.0 # make every other column equal one

# Plot the data.
fig = plt.figure(figsize=(10, 10))
ax = plt.subplot(1, 1, 1)
ax.imshow(data, cmap='hot', interpolation='none')
dpi = calculate_dpi_needed_for_image(data.shape[0], data.shape[1], fig, ax, dots_per_entry=2, min_dpi=None)
print(f'Using DPI: {dpi}')
path = os.path.join(os.path.dirname(__file__), 'test_dpi_util.png')
plt.savefig(path, dpi=dpi)
Iives answered 21/9, 2023 at 18:3 Comment(0)
