How to make Matplotlib scatterplots transparent as a group?
G

6

42

I'm making some scatterplots using Matplotlib (Python 3.4.0, Matplotlib 1.4.3, running on Linux Mint 17). It's easy enough to set alpha transparency for each point individually; is there any way to set it for a group as a whole, so that two overlapping points from the same group don't change the color?

Example code:

import matplotlib.pyplot as plt
import numpy as np

def points(n=100):
    x = np.random.uniform(size=n)
    y = np.random.uniform(size=n)
    return x, y
x1, y1 = points()
x2, y2 = points()
fig = plt.figure(figsize=(4,4))
ax = fig.add_subplot(111, title="Test scatter")
ax.scatter(x1, y1, s=100, color="blue", alpha=0.5)
ax.scatter(x2, y2, s=100, color="red", alpha=0.5)
fig.savefig("test_scatter.png")

Results in this output:

[image]

but I want something more like this one:

[image]

I can work around this by saving as SVG, manually grouping the points in Inkscape, and then setting the transparency, but I'd really prefer something I can code. Any suggestions?

Greathouse answered 7/5, 2015 at 17:54 Comment(1)
Probably not, because doing that is counter to what a scatterplot is usually trying to show. - Simplex
N
20

Yes, interesting question. You can get this scatterplot with Shapely. Here is the code:

import matplotlib.pyplot as plt
import matplotlib.patches as ptc
import numpy as np
from shapely.geometry import Point
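from shapely.ops import unary_union  # cascaded_union is deprecated in favour of unary_union

n = 100
size = 0.02
alpha = 0.5

def points():
    x = np.random.uniform(size=n)
    y = np.random.uniform(size=n)
    return x, y

def as_polygons(geom):
    # A union is a MultiPolygon (iterate via .geoms) or, if everything merged, a single Polygon
    return list(getattr(geom, "geoms", [geom]))

x1, y1 = points()
x2, y2 = points()
# Turn every point into a disc of radius `size`, then merge the discs of each group
polygons1 = [Point(x1[i], y1[i]).buffer(size) for i in range(n)]
polygons2 = [Point(x2[i], y2[i]).buffer(size) for i in range(n)]
polygons1 = unary_union(polygons1)
polygons2 = unary_union(polygons2)

fig = plt.figure(figsize=(4,4))
ax = fig.add_subplot(111, title="Test scatter")
for polygon1 in as_polygons(polygons1):
    # Newer Shapely versions need .coords to get the outline as an array
    patch1 = ptc.Polygon(np.array(polygon1.exterior.coords), facecolor="red", lw=0, alpha=alpha)
    ax.add_patch(patch1)
for polygon2 in as_polygons(polygons2):
    patch2 = ptc.Polygon(np.array(polygon2.exterior.coords), facecolor="blue", lw=0, alpha=alpha)
    ax.add_patch(patch2)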
ax.axis([-0.2, 1.2, -0.2, 1.2])

fig.savefig("test_scatter.png")

and the result is:

[image: Test scatter]

Nihil answered 7/5, 2015 at 20:20 Comment(3)
Very cool use of Shapely where I would never have expected it! Do you think the descartes package would simplify the plotting at all? - Earthworm
Thanks! Yes, the descartes package can be used. After the union, create patches with descartes.PolygonPatch, use matplotlib.collections.PathCollection, and replace add_patch with add_collection. This will do the job with fewer lines. - Nihil
I ran this code and got "for polygon1 in polygons1: TypeError: 'MultiPolygon' object is not iterable". Is this due to changes in Shapely? - Television
C
13

Interesting question. I think any use of transparency will result in the stacking effect you want to avoid. You could manually set an opaque colour that mimics the transparent one to get closer to the result you want:

import matplotlib.pyplot as plt
import numpy as np

def points(n=100):
    x = np.random.uniform(size=n)
    y = np.random.uniform(size=n)
    return x, y
x1, y1 = points()
x2, y2 = points()
fig = plt.figure(figsize=(4,4))
ax = fig.add_subplot(111, title="Test scatter")
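alpha = 0.5
# Opaque colours that match red/blue at this alpha over a white background
# (identical to the original values when alpha = 0.5)
ax.scatter(x1, y1, s=100, lw=0, color=[1., 1 - alpha, 1 - alpha])
ax.scatter(x2, y2, s=100, lw=0, color=[1 - alpha, 1 - alpha, 1.])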
plt.show()

The overlap between the different colours is not shown this way, but you get:

enter image description here
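
The pre-blended colours can also be computed instead of typed by hand. Here is a minimal sketch, assuming a white axes background; `blend_with_background` is a hypothetical helper name, not a Matplotlib function. It applies the usual compositing formula, alpha * colour + (1 - alpha) * background, to each channel:

```python
from matplotlib import colors as mcolors

def blend_with_background(color, alpha, background="white"):
    """Return the opaque RGB colour that `color` at `alpha` produces over `background`."""
    fg = mcolors.to_rgb(color)       # accepts names, hex strings, RGB tuples
    bg = mcolors.to_rgb(background)  # assumption: the axes background colour
    return tuple(alpha * f + (1.0 - alpha) * b for f, b in zip(fg, bg))

opaque_red = blend_with_background("red", 0.5)
opaque_blue = blend_with_background("blue", 0.5)
```

With alpha = 0.5 this gives (1.0, 0.5, 0.5) for red and (0.5, 0.5, 1.0) for blue, the same values as in the scatter calls above. The result can be passed straight to `color=` in `ax.scatter`.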

Caster answered 7/5, 2015 at 18:23 Comment(4)
Bonus: it doesn't require an additional library! - And
You can't see the red through the blue or vice versa, though. - Pelasgian
I think the color is wrong: it should be an RGBA tuple instead of RGB, so [0,0,1,0.5] should be transparent blue. - Resolute
@Kev1n91, setting alpha to anything other than one (the default for an RGB value with no alpha) means you can see the overlap, which the OP specified they did not want: "overlapping points from the same group don't change the color". - Caster
P
9

This is a terrible, terrible hack, but it works.

You see, while Matplotlib plots data points as separate objects that can overlap, it plots the line between them as a single object, even if that line is broken into several pieces by NaNs in the data.

With that in mind, you can do this:

import numpy as np
from matplotlib import pyplot as plt

plt.rcParams['lines.solid_capstyle'] = 'round'

def expand(x, y, gap=1e-4):
    add = np.tile([0, gap, np.nan], len(x))
    x1 = np.repeat(x, 3) + add
    y1 = np.repeat(y, 3) + add
    return x1, y1

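# points() is the helper from the question
def points(n=100):
    x = np.random.uniform(size=n)
    y = np.random.uniform(size=n)
    return x, y

x1, y1 = points()
x2, y2 = points()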
fig = plt.figure(figsize=(4,4))
ax = fig.add_subplot(111, title="Test scatter")
ax.plot(*expand(x1, y1), lw=20, color="blue", alpha=0.5)
ax.plot(*expand(x2, y2), lw=20, color="red", alpha=0.5)

fig.savefig("test_scatter.png")
plt.show()

And each color will overlap with the other color but not with itself.

[image]

One caveat is that you have to be careful with the spacing between the two points you use to make each circle. If they're too far apart, the separation will be visible on your plot, but if they're too close together, Matplotlib doesn't plot the line at all. That means the separation needs to be chosen based on the range of your data. If you plan to make an interactive plot, there's a risk of all the data points suddenly vanishing if you zoom out too much, and of them stretching if you zoom in too much.

As you can see, I found 1e-4 to be a good separation for data with a range of [0,1].
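
Since the gap has to follow the range of the data, one option is to derive it from that range instead of hard-coding it. This is only a sketch under that assumption: `expand_scaled` and `rel_gap` are hypothetical names, and the right relative gap may still depend on figure size and zoom level.

```python
import numpy as np

def expand_scaled(x, y, rel_gap=1e-4):
    """Like expand(), but the gap is a fraction of each axis' data range."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Absolute gap per axis; fall back to 1.0 when all values are identical
    gap_x = rel_gap * (np.ptp(x) or 1.0)
    gap_y = rel_gap * (np.ptp(y) or 1.0)
    # Each point becomes: the point, the point shifted by the gap, then a NaN break
    pattern = np.tile([0.0, 1.0, np.nan], len(x))
    return np.repeat(x, 3) + gap_x * pattern, np.repeat(y, 3) + gap_y * pattern
```

It is meant as a drop-in replacement for `expand` in the `ax.plot(*expand(x1, y1), ...)` calls.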

Psychosis answered 6/5, 2018 at 19:39 Comment(1)
This was exactly what I needed! For logarithmic plots, adding [0, gap, nan] doesn't work simultaneously over many orders of magnitude, so I multiply by [1, 1+gap, nan] instead. - Plover
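
The multiplicative variant from the comment could look roughly like this. `expand_log` is a hypothetical name, and the sketch assumes strictly positive data, as required on a log axis:

```python
import numpy as np

def expand_log(x, y, gap=1e-4):
    """Variant of expand() for log-scaled axes: the gap is a factor, not an offset."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Each point becomes: the point, the point scaled by (1 + gap), then a NaN break
    factor = np.tile([1.0, 1.0 + gap, np.nan], len(x))
    return np.repeat(x, 3) * factor, np.repeat(y, 3) * factor
```

Because the second point is a fixed ratio away from the first, the segment has the same length on a log axis whatever the magnitude of the value.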
S
7

Just pass edgecolors='none' to plt.scatter().
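
For reference, a minimal sketch of that call using the question's data layout. The Agg backend and the seeded generator are assumptions made only so the snippet runs anywhere. As far as I can tell, this removes the marker outlines but does not by itself stop overlapping semi-transparent fills from compounding.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
x, y = rng.uniform(size=(2, 100))

fig, ax = plt.subplots(figsize=(4, 4))
# edgecolors='none' draws the markers without an outline
sc = ax.scatter(x, y, s=100, color="blue", alpha=0.5, edgecolors="none")
fig.canvas.draw()
```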

Surpass answered 25/9, 2020 at 10:9 Comment(0)
D
1

Here's a hack if you have more than just a few points to plot. I had to plot >500000 points, and the Shapely solution does not scale well. I also wanted to plot a shape other than a circle. I opted instead to plot each layer separately with alpha=1, read in the resulting image with np.frombuffer (as described here), then add the alpha to the whole image and plot the overlays using plt.imshow. Note that this solution forfeits access to the original fig object and its attributes, so any other modifications to the figure should be made before it's drawn.

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.backends.backend_agg import FigureCanvasAgg as FigureCanvas
from matplotlib.figure import Figure

def arr_from_fig(fig):
    canvas = FigureCanvas(fig)
    canvas.draw()
    # tostring_rgb() is deprecated in recent Matplotlib; read the RGBA buffer and drop alpha
    img = np.frombuffer(canvas.buffer_rgba(), dtype=np.uint8)
    img = img.reshape(canvas.get_width_height()[::-1] + (4,))[..., :3]
    return img

def points(n=100):
    x = np.random.uniform(size=n)
    y = np.random.uniform(size=n)
    return x, y

x1, y1 = points()
x2, y2 = points()
imgs = list()
figsize = (4, 4)
dpi = 200

for x, y, c in zip([x1, x2], [y1, y2], ['blue', 'red']):
    fig = plt.figure(figsize=figsize, dpi=dpi, tight_layout={'pad':0})
    ax = fig.add_subplot(111)
    ax.scatter(x, y, s=100, color=c, alpha=1)
    ax.axis([-0.2, 1.2, -0.2, 1.2])
    ax.axis('off')
    imgs.append(arr_from_fig(fig))
    plt.close()


fig = plt.figure(figsize=figsize)
alpha = 0.5

alpha_scaled = 255*alpha
for img in imgs:
    img_alpha = np.where((img == 255).all(-1), 0, alpha_scaled).reshape([*img.shape[:2], 1])
    img_show = np.concatenate([img, img_alpha], axis=-1).astype(int)
    # Canvas rows run top to bottom, so flip them to match origin='lower'
    plt.imshow(img_show[::-1], origin='lower')

ticklabels = ['{:03.1f}'.format(i) for i in np.linspace(-0.2, 1.2, 8, dtype=np.float16)]
plt.xticks(ticks=np.linspace(0, dpi*figsize[0], 8), labels=ticklabels)
plt.yticks(ticks=np.linspace(0, dpi*figsize[1], 8), labels=ticklabels)
plt.title('Test scatter')

enter image description here
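
The core of the trick is the masking step, which is easier to see on a tiny array. A minimal sketch, assuming a pure white background; `add_group_alpha` is a hypothetical helper name:

```python
import numpy as np

def add_group_alpha(rgb, alpha=0.5, background=255):
    """Attach one alpha value to every non-background pixel of an RGB uint8 image."""
    rgb = np.asarray(rgb, dtype=np.uint8)
    # A pixel is background when all three channels equal the background value
    is_background = (rgb == background).all(axis=-1)
    # Background becomes fully transparent, everything else gets the same alpha
    a = np.where(is_background, 0, int(255 * alpha)).astype(np.uint8)
    return np.dstack([rgb, a])

layer = np.full((2, 2, 3), 255, dtype=np.uint8)  # all-white 2x2 image
layer[0, 0] = (0, 0, 255)                        # one blue "marker" pixel
rgba = add_group_alpha(layer)
```

Every drawn pixel ends up with the same alpha no matter how many markers overlapped there, which is what makes the group transparent as a whole. One side effect is that any genuinely white marker pixels would also be treated as background.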

Decemvirate answered 30/11, 2021 at 7:14 Comment(0)
U
0

I encountered the same issue recently. In my case there are too many points very close to each other: with something like 100 points of alpha 0.3 on top of each other, the alpha of the color in the generated image is almost 1. So instead of setting the alpha value in the cmap or in scatter, I save the plot to a Pillow image and set the alpha channel there. My code:

import io
import os

import numpy as np
import numpy.ma as ma
import matplotlib.pyplot as plt
from matplotlib import colors
from PIL import Image

from dhi_base import DHIBase

class HeatMapPlot(DHIBase):

    def __init__(self) -> None:
        super().__init__()

        # these 4 values are precalculated
        top=75
        left=95
        width=1314
        height=924

        self.crop_box = (left, top, left+width, top+height)
        # alpha 0.5, [0-255]
        self.alpha = 128

    def get_cmap(self):

        v = [
                ...
        ]

        return colors.LinearSegmentedColormap.from_list(
            'water_level', v, 512)

    def png3857(self):
        """Generate flooding images

        """

        muids = np.load(os.path.join(self.npy_dir, 'myfilename.npy'))

        cmap = self.get_cmap()

        i = 0

        for npyf in os.listdir(self.npy_dir):

            if not npyf.startswith('flooding'):
                continue

            flooding_num = np.load(os.path.join(self.npy_dir, npyf))
            image_file = os.path.join(self.img_dir, npyf.replace('npy', 'png'))

            # if os.path.isfile(image_file):
            #     continue

            # keep only water levels above 0.001 (the mask marks the values to keep)
            masked_arr = ma.masked_where(flooding_num > 0.001, flooding_num)

            flooding_masked = flooding_num[masked_arr.mask]
            muids_masked = muids[masked_arr.mask, :]

            plt.figure(figsize=(self.grid2D['numJ'] / 500, self.grid2D['numK'] / 500))
            plt.axis('off')
            plt.tight_layout()

            plt.scatter(muids_masked[:, 0], muids_masked[:, 1], s=0.1, c=flooding_masked, 
                        alpha=1, edgecolors='none', linewidths=0,
                        cmap=cmap, 
                        vmin=0, vmax=1.5)

            img_buf = io.BytesIO()

            plt.savefig(img_buf, transparent=True, dpi=200, format='png')#, pad_inches=0)

            plt.clf()
            plt.close()

            img_buf.seek(0)

            img = Image.open(img_buf)

            # Cropped image of above dimension
            # (It will not change original image)
            img = img.crop(self.crop_box)

            alpha_channel = img.getchannel('A')

            # Make all opaque pixels into semi-opaque
            alpha_channel = alpha_channel.point(lambda i: self.alpha if i>0 else 0)

            img.putalpha(alpha_channel)

            img.save(image_file)

            self.logger.info("PNG saved to {}".format(image_file))

            i += 1

            # if i > 15:
            #     break

if __name__ == "__main__":

    hp = HeatMapPlot()

    hp.png3857()
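
Stripped of the project-specific parts, the alpha step can be tried on its own. A minimal sketch with a made-up 2x1 test image; `flatten_alpha` is a hypothetical name, and the Pillow calls are the same ones used above:

```python
from PIL import Image

def flatten_alpha(img, alpha=128):
    """Set every non-transparent pixel of an image to the same alpha (0-255)."""
    img = img.convert("RGBA")  # works on a copy, leaving the input untouched
    # Map the alpha channel: anything drawn becomes `alpha`, the rest stays 0
    mask = img.getchannel("A").point(lambda a: alpha if a > 0 else 0)
    img.putalpha(mask)
    return img

demo = Image.new("RGBA", (2, 1), (0, 0, 0, 0))  # two fully transparent pixels
demo.putpixel((0, 0), (0, 0, 255, 255))         # one opaque blue pixel
out = flatten_alpha(demo)
```

This relies on the figure being saved with transparent=True, so that only drawn pixels have a non-zero alpha to begin with.
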
Un answered 26/8, 2022 at 3:52 Comment(0)
