How do I fit the best probability distribution model to my data in Python?

I have about 20,000 rows of data like this:

Id | value
1    30
2    3
3    22
..
n    27

I computed summary statistics for my data: mean 33.85, median 30.99, min 2.8, max 206, 95% confidence interval 0.21. So most values are around 33, with a few outliers, which suggests a distribution with a long right tail.

I am new to both distributions and Python. I tried the fitter package (https://pypi.org/project/fitter/) to test many distributions from SciPy, and the loglaplace distribution showed the lowest error (although I do not quite understand it).

I read almost all the questions in this thread and concluded there are two approaches: (1) fit a distribution model and then draw random values from it in my simulation; (2) compute the frequencies of different groups of values and sample from those, but this approach will never produce a value above 206, for example.
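To make the difference between the two approaches concrete, here is a minimal sketch using only the standard library. The data are synthetic stand-ins, and the log-normal family is an assumption chosen only because its maximum-likelihood fit is simple (mean and standard deviation of the logged values); it is not a claim about which family suits the real data.

```python
import math
import random
import statistics

random.seed(0)

# Stand-in data (hypothetical): replace with your own list of values.
data = [random.lognormvariate(3.4, 0.45) for _ in range(20_000)]

# Approach 1: fit a parametric model, then sample from it.
# Assumption: a log-normal family, whose maximum-likelihood estimates
# are the mean and (population) standard deviation of log(values).
logs = [math.log(v) for v in data]
mu = statistics.fmean(logs)
sigma = statistics.pstdev(logs)
parametric_draws = [random.lognormvariate(mu, sigma) for _ in range(100_000)]

# Approach 2: resample the observed values with replacement.
# Every draw is one of the observed values, so the sample can never
# exceed the largest value in the data.
empirical_draws = random.choices(data, k=100_000)

print("max of data:            ", max(data))
print("max of empirical draws: ", max(empirical_draws))
print("max of parametric draws:", max(parametric_draws))
```

The parametric draws can land beyond the observed maximum because the fitted model extrapolates the tail, while the resampled draws are bounded by the data.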

Given that my data are numeric values, what is the best approach to fit a distribution to them in Python? In my simulation I need to draw random numbers that follow the same pattern as my data. I also need to validate that the model represents my data well by plotting my data together with the model curve.
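One possible way to do all three steps (fit, validate, draw) with SciPy is sketched below. The data are synthetic stand-ins, the list of candidate families is arbitrary, pinning the location at 0 with `floc=0` is an assumption that only makes sense for strictly positive values, and `plot_fit` is a hypothetical helper name. The Kolmogorov-Smirnov statistic is used here only as a distance for ranking the fits; its p-value is not reliable when the parameters were estimated from the same data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Stand-in data (hypothetical): replace with your own 1-D array of values.
data = stats.loglaplace.rvs(c=4.0, scale=31.0, size=5_000, random_state=rng)

# Candidate families to compare (an arbitrary short list, not a recommendation).
candidates = {
    "loglaplace": stats.loglaplace,
    "lognorm": stats.lognorm,
    "gamma": stats.gamma,
}

results = {}
for name, family in candidates.items():
    # Maximum-likelihood fit with the location parameter fixed at 0.
    params = family.fit(data, floc=0)
    # KS statistic: largest gap between the empirical and the fitted CDF.
    ks = stats.kstest(data, family.cdf, args=params)
    results[name] = (params, ks.statistic)

# Keep the family with the smallest KS distance.
best_name = min(results, key=lambda n: results[n][1])
best_params = results[best_name][0]
best = candidates[best_name](*best_params)  # frozen distribution

# Draw simulated values from the fitted model.
simulated = best.rvs(size=1_000, random_state=rng)

print(best_name, best_params, results[best_name][1])


def plot_fit(values, dist, bins=100):
    """Overlay the fitted density on a normalized histogram of the data."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    grid = np.linspace(values.min(), values.max(), 500)
    plt.hist(values, bins=bins, density=True, alpha=0.5, label="data")
    plt.plot(grid, dist.pdf(grid), label="fitted pdf")
    plt.legend()
    plt.show()
```

Calling `plot_fit(data, best)` shows the histogram of the data with the model curve on top, which is the visual check described above.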

    import openturns as ot

    # Define x as a Sample object. It is a sample of size 11 and dimension 1
    sample = ot.Sample([[xi] for xi in x])

    # define distributions you want to test on the sample
    tested_distributions = [ot.WeibullMaxFactory(), ot.NormalFactory(), ot.UniformFactory()]

    # find the best distribution according to BIC and print its parameters
    best_model, best_bic = ot.FittingTest.BestModelBIC(sample, tested_distributions)
    print(best_model)
    >>> Uniform(a = -0.769231, b = 10.7692)
