Create stacked histogram from unequal length arrays

2

68

I'd like to create a stacked histogram. If I have a single 2-D array, made of three equal length data sets, this is simple. Code and image below:

import numpy as np
from matplotlib import pyplot as plt

# create 3 data sets with 1,000 samples
mu, sigma = 200, 25
x = mu + sigma*np.random.randn(1000,3)

#Stack the data
plt.figure()
n, bins, patches = plt.hist(x, 30, stacked=True, density = True)
plt.show()


However, if I try similar code with three data sets of different lengths, the histograms are drawn on top of one another instead of being stacked. Is there a way to create a stacked histogram from data sets of unequal length?

##Continued from above
###Now as three separate arrays
x1 = mu + sigma*np.random.randn(990,1)
x2 = mu + sigma*np.random.randn(980,1)
x3 = mu + sigma*np.random.randn(1000,1)

#Stack the data
plt.figure()
plt.hist(x1, bins, stacked=True, density = True)
plt.hist(x2, bins, stacked=True, density = True)
plt.hist(x3, bins, stacked=True, density = True)
plt.show()


Thaddeusthaddus answered 26/8, 2013 at 17:26 Comment(0)
102

Well, this is simple. I just need to put the three arrays in a list and pass that list to a single plt.hist call.

##Continued from above
###Now as three separate 1-D arrays
###(a list of (N,1) arrays can raise "ValueError: x must have 2 or fewer dimensions")
x1 = mu + sigma*np.random.randn(990)
x2 = mu + sigma*np.random.randn(980)
x3 = mu + sigma*np.random.randn(1000)

#Stack the data
plt.figure()
plt.hist([x1,x2,x3], bins, stacked=True, density=True)
plt.show()
Thaddeusthaddus answered 26/8, 2013 at 17:35 Comment(3)
What's the best way to add a legend for x1, x2, and x3 separately? - Firedog
plt.hist([x1,x2,x3], bins, stacked=True, color=["red", "blue", "violet"], label=["x1", "x2", "x3"], density=True); plt.legend() - Metonymy
This code throws an error for me: ValueError: x must have 2 or fewer dimensions - Jugum
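The two points raised in the comments can be sketched together. The ValueError appears when the list elements are themselves 2-D, for example arrays of shape (N, 1) such as those produced by np.random.randn(N, 1); flattening each one to 1-D before building the list avoids it. Passing label= to hist and then calling legend() gives one legend entry per data set. The seed, the variable names data and labels, and the use of the object-oriented fig/ax interface are assumptions made for this illustration, not part of the original answer.

```python
import numpy as np
from matplotlib import pyplot as plt

mu, sigma = 200, 25
rng = np.random.default_rng(365)  # arbitrary seed, only for reproducibility

# Column vectors of shape (N, 1) with unequal lengths
x1 = mu + sigma * rng.standard_normal((990, 1))
x2 = mu + sigma * rng.standard_normal((980, 1))
x3 = mu + sigma * rng.standard_normal((1000, 1))

# Flatten each array to 1-D so the list is a sequence of 1-D data sets
data = [a.ravel() for a in (x1, x2, x3)]
labels = ["x1", "x2", "x3"]

fig, ax = plt.subplots()
# One hist call with a list of arrays stacks the data sets on shared bins
n, bins, patches = ax.hist(data, bins=30, stacked=True, density=True, label=labels)
ax.legend()  # picks up the labels passed to hist
```

Call plt.show() afterwards to display the figure. Because the arrays stay in a plain Python list, they never need to be the same length.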
8
import pandas as pd
import numpy as np

# create the uneven arrays
mu, sigma = 200, 25
np.random.seed(365)
x1 = mu + sigma*np.random.randn(990, 1)
x2 = mu + sigma*np.random.randn(980, 1)
x3 = mu + sigma*np.random.randn(1000, 1)

# create the dataframe; enumerate is used to make column names, and concat pads the shorter columns with NaN
df = pd.concat([pd.DataFrame(a, columns=[f'x{i}']) for i, a in enumerate([x1, x2, x3], 1)], axis=1)

# plot the data
df.plot.hist(stacked=True, bins=30, density=True, figsize=(10, 6), grid=True)


Sunbreak answered 9/9, 2020 at 23:18 Comment(2)
Do we know what the y-axis decimals refer to? - Quiteri
@kudo See the density parameter in matplotlib.pyplot.hist for an explanation of the y-axis values. - Sunbreak
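As a rough illustration of what those y-axis values mean: with density=True each bar height is the bin count divided by (total count × bin width), so the bar areas sum to 1 rather than the heights. The sketch below checks this with np.histogram on synthetic data; the seed and variable names are assumptions made for the example. According to the Matplotlib documentation, when stacked=True is also set, it is the sum of the stacked histograms that is normalized to 1.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
samples = 200 + 25 * rng.standard_normal(1000)

# Raw counts per bin and the bin edges
counts, edges = np.histogram(samples, bins=30)
widths = np.diff(edges)

# Density by hand: count / (total count * bin width)
density_manual = counts / (counts.sum() * widths)

# Density as computed by NumPy with density=True
density_numpy, _ = np.histogram(samples, bins=30, density=True)

# Total area under the bars (height * width summed over bins)
area = (density_manual * widths).sum()

print(np.allclose(density_manual, density_numpy))
print(area)
```

With a mean near 200 and a standard deviation of 25, the bins are only a few units wide, which is why the individual heights come out as small decimals.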
