Histogram has only one bar

2

10

My data (a 196,585-record numpy array extracted from a pandas dataframe) are being placed into a single bin by matplotlib's hist. The data were originally integers, so I tried converting them to float as well, as shown below, but they are still not being distributed among 10 bins.

Interestingly, a small sub-sample (using df.sample(0.00x)) of the integer data are successfully distributed.

Any suggestions on where I may be erring in data preparation or use of matplotlib's histogram function would be appreciated.

[Image: histogram output]

x = df[(df['UNIT']=='X')].OPP_VALUE.values
num_bins = 10
n, bins, patches = plt.hist(x[x > 0].astype(float), num_bins, density=False, facecolor='0.5', alpha=0.8)
plt.show()
Chloe answered 2/8, 2016 at 17:47 Comment(3)
Try using log=True: your sample contains very few large values, which skew the distribution. You may have to think about removing them. - Bonne
Yup. Looks like you need to zoom all the way in. Can you post the output of print(n); print(bins)? - Puli
You hit the nail on the head, so much so that even log=True doesn't work:
print(bins)
[ 1.00000000e+00 3.00000000e+09 6.00000000e+09 9.00000000e+09 1.20000000e+10 1.50000000e+10 1.80000000e+10 2.10000000e+10 2.40000000e+10 2.70000000e+10 3.00000000e+10]
print(n)
[ 1.86114000e+05 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 1.00000000e+00]
- Chloe
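The printed edges show why log=True does not help: it only rescales the count axis, while the ten bins are still equally wide between 1 and 3e10. One possible workaround, sketched here on synthetic data standing in for OPP_VALUE (the sample values and variable names are assumptions, not the asker's data), is to make the bin edges themselves logarithmic:

```python
import numpy as np

# Synthetic stand-in for the positive values: many small numbers plus one
# enormous outlier (assumed shape, chosen to mimic the printed output above).
rng = np.random.default_rng(0)
x = np.concatenate([rng.integers(1, 1000, size=10_000), [30_000_000_000]]).astype(float)

# Ten equal-width bins: the outlier stretches the range, so every small value
# falls into the first bin.
n_lin, edges_lin = np.histogram(x, bins=10)

# Ten log-spaced bins between the smallest and largest value.
# np.geomspace requires strictly positive endpoints, hence the x > 0 filter
# in the question matters here.
edges_log = np.geomspace(x.min(), x.max(), num=11)
n_log, _ = np.histogram(x, bins=edges_log)

print(n_lin)
print(n_log)
```

The same edges can be passed to matplotlib as plt.hist(x, bins=edges_log), followed by plt.xscale('log') so the bars are drawn with comparable widths.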
5

Most likely, the number of data points with x > 0.5 is very small, but a few outliers force the hist function to pick the scale it does. Try removing all values > 0.5 (or 1, if you do not want to convert to float) and then plot again.
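A minimal sketch of that idea, assuming the cutoff is taken from the data rather than fixed in advance. The 99th-percentile threshold and the synthetic array are illustrative assumptions; the right cutoff depends on the actual values:

```python
import numpy as np

# Synthetic stand-in: many small values plus one huge outlier (assumed data).
rng = np.random.default_rng(1)
x = np.concatenate([rng.integers(1, 1000, size=10_000), [30_000_000_000]]).astype(float)

# Hypothetical threshold: keep everything at or below the 99th percentile.
cutoff = np.percentile(x, 99)
trimmed = x[x <= cutoff]

# With the outlier gone, ten equal-width bins cover only the bulk of the data.
n, edges = np.histogram(trimmed, bins=10)
print(n)
print(edges)
```

Passing trimmed to plt.hist with the same number of bins should then spread the counts across all bars instead of one.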

Heartwood answered 23/5, 2019 at 1:8 Comment(1)
I'm also facing this issue. Could you explain in a little more detail? I am plotting after removing outliers using a z-score and I am getting this. - Kile
-1

You should modify the number of bins, for example:

number_of_bins = 200
bin_cutoffs = np.linspace(np.percentile(x,0), np.percentile(x,99),number_of_bins)
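The snippet above only builds the edges. A sketch of how they might be used follows, on synthetic data (an assumption). Note that N edges define N - 1 bins, so this version asks np.linspace for number_of_bins + 1 points, and any value above the last edge is simply left out of the histogram:

```python
import numpy as np

# Synthetic stand-in: many small values plus one huge outlier (assumed data).
rng = np.random.default_rng(2)
x = np.concatenate([rng.integers(1, 1000, size=10_000), [30_000_000_000]]).astype(float)

number_of_bins = 200
# number_of_bins + 1 edges give exactly number_of_bins bins, spanning the
# minimum up to the 99th percentile of the data.
bin_cutoffs = np.linspace(np.percentile(x, 0), np.percentile(x, 99), number_of_bins + 1)

# Values beyond the last edge (the top 1%, including the outlier) are not counted.
n, edges = np.histogram(x, bins=bin_cutoffs)
print(len(n), int(n.sum()), len(x))
```

The equivalent plotting call would be plt.hist(x, bins=bin_cutoffs).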
Dower answered 16/12, 2022 at 4:41 Comment(0)
