Histogram calculation in julia-lang

According to the Julia documentation:

hist(v[, n]) → e, counts

Compute the histogram of v, optionally using approximately n bins. The return values are a range e, which correspond to the edges of the bins, and counts containing the number of elements of v in each bin. Note: Julia does not ignore NaN values in the computation.

I chose a sample range of data:

testdata=0:1:10;

Then I used the hist function to compute histograms with 1 to 5 bins:

hist(testdata,1) # => (-10.0:10.0:10.0,[1,10])
hist(testdata,2) # => (-5.0:5.0:10.0,[1,5,5])
hist(testdata,3) # => (-5.0:5.0:10.0,[1,5,5])
hist(testdata,4) # => (-5.0:5.0:10.0,[1,5,5])
hist(testdata,5) # => (-2.0:2.0:10.0,[1,2,2,2,2,2])

As you can see, when I ask for 1 bin it computes 2 bins, and when I ask for 2 bins it computes 3.

Why does this happen?

Lyda answered 7/9, 2015 at 10:5 Comment(0)

As the person who wrote the underlying function: the aim is to get bin widths that are "nice" in terms of a base-10 counting system (i.e. 10^k, 2×10^k, 5×10^k). If you want more control you can also specify the exact bin edges.
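
As an illustration of specifying exact bin edges: the hist function is no longer available in newer Julia versions, so this minimal sketch assumes the StatsBase package is installed and passes the edges to fit instead. The edge values [0, 5, 11] are chosen only for this example. StatsBase bins are left-closed by default, so the last edge must lie above the maximum for the value 10 to be counted.

```julia
using StatsBase  # assumption: installed via `import Pkg; Pkg.add("StatsBase")`

testdata = 0:1:10

# Explicit edges give exactly two left-closed bins: [0, 5) and [5, 11).
h = fit(Histogram, testdata, [0, 5, 11])

println(h.edges[1])   # the edges exactly as supplied
println(h.weights)    # counts per bin: [5, 6]
```

With explicit edges no "nice" rounding takes place, so the number of bins is always one less than the number of edges.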

Piccolo answered 7/9, 2015 at 16:6 Comment(4)
Thanks for your response. I think the documentation is lacking there. - Lyda
Is there an easy way to plot the results of the hist function? - Software
Yes: using StatsBase; h = fit(Histogram, randn(1000)); using StatPlots; plot(h) - Trussell
This function seems to be deprecated in the Base module. Is there any alternative? - Explore

The key word in the documentation is "approximately". You can check for yourself what hist actually does by reading its source in Julia's Base module.

When you call hist(testdata,3), you're actually calling

hist(v::AbstractVector, n::Integer) = hist(v,histrange(v,n))

That is, in a first step the n argument is converted into a FloatRange by the histrange function, which is also defined in Base. The calculation of these steps is not entirely straightforward, so it is worth experimenting with this function to see how it constructs the range that forms the basis of the histogram.
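
To make the idea concrete, here is a minimal sketch of how such a range could be built. It is a reconstruction for illustration, not a copy of the Base source: nice_edges is a hypothetical name, and the rounding thresholds are an assumption chosen to be consistent with the outputs in the question. It also assumes right-closed bins, which is why the first edge is placed strictly below the minimum and an extra bin appears.

```julia
# Hypothetical helper for illustration; not part of Base or StatsBase.
function nice_edges(v, n::Integer)
    lo, hi = extrema(v)
    if lo == hi
        step = 1.0                      # constant data: fall back to unit width
    else
        bw = (hi - lo) / n              # raw width if we used exactly n bins
        e = 10.0^floor(log10(bw))       # largest power of 10 not exceeding bw
        r = bw / e                      # 1 <= r < 10
        # Round the width up to the next "nice" value: 1, 2, 5 or 10 times e.
        if r <= 1
            step = e
        elseif r <= 2
            step = 2 * e
        elseif r <= 5
            step = 5 * e
        else
            step = 10 * e
        end
    end
    # Bins are assumed right-closed, (a, b], so the first edge must be
    # strictly below the minimum for the minimum to fall into a bin.
    start = step * (ceil(lo / step) - 1)
    nbins = ceil(Int, (hi - start) / step)
    return start:step:(start + nbins * step)
end

for n in 1:5
    println(n, " => ", nice_edges(0:10, n))
end
```

For the data in the question, the minimum 0 sits exactly on a multiple of the step, so the first edge moves one full step below it. That bin holds only the value 0, which accounts for the leading count of 1 in every result.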

Mentor answered 7/9, 2015 at 12:18 Comment(0)

In new versions of Julia, the hist function is not present.

To calculate a histogram, one should use StatsBase.Histogram and StatsBase.fit, e.g.:

    using StatsBase
    h = fit(Histogram, rand(100))
    print(h)

Output:

Histogram{Int64, 1, Tuple{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}}}
edges:
  0.0:0.2:1.0
weights: [21, 22, 17, 16, 24]
closed: left
isdensity: false
Champollion answered 15/11, 2023 at 9:49 Comment(0)