statistics Questions

13

Solved

I'm using Python and NumPy to calculate a best-fit polynomial of arbitrary degree. I pass a list of x values, y values, and the degree of the polynomial I want to fit (linear, quadratic, etc.). Th...
Joellyn asked 21/5, 2009 at 15:55
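
The excerpt above is cut off, so the asker's full requirement is not visible. As a minimal sketch of the fitting step it describes, assuming `numpy.polyfit` is acceptable (the helper name `fit_polynomial` and the sample data are made up for illustration):

```python
import numpy as np

def fit_polynomial(x, y, degree):
    """Return least-squares polynomial coefficients, highest power first."""
    return np.polyfit(x, y, degree)

# Illustrative data generated from y = x^2 + x + 1 (not from the question).
x = [0, 1, 2, 3, 4]
y = [1, 3, 7, 13, 21]

coeffs = fit_polynomial(x, y, 2)
fitted = np.polyval(coeffs, x)  # evaluate the fitted polynomial at x
```

`np.polyval` evaluates the returned coefficients, which is usually the next step when comparing the fit against the observed y values.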

10

Solved

How would you create a qq-plot using Python? Assuming that you have a large set of measurements and are using some plotting function that takes XY-values as input. The function should plot the qua...
Crackling asked 13/12, 2012 at 17:54

3

Solved

I have a table that looks like this. > dput(theft_loc) structure(c(13704L, 14059L, 14263L, 14450L, 14057L, 15503L, 14230L, 16758L, 15289L, 15499L, 16066L, 15905L, 18531L, 19217L, 12410L, 133...
Joiejoin asked 7/2, 2017 at 17:31

1

I am trying to write a multiple linear regression model from scratch to predict the key factors contributing to number of views of a song on Facebook. About each song we collect this information, i...
Maddocks asked 15/1, 2018 at 4:58

3

Solved

I need to generate bins for the purposes of calculating a histogram. Language is C#. Basically I need to take in an array of decimal numbers and generate a histogram plot out of those. Haven't be...
Ryurik asked 5/3, 2010 at 15:40

15

I'm trying to find an efficient, numerically stable algorithm to calculate a rolling variance (for instance, a variance over a 20-period rolling window). I'm aware of the Welford algorithm that eff...
Gobetween asked 28/2, 2011 at 20:46
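
One possible approach, sketched here under the assumption that a fixed-size window and the sample variance (divisor n - 1) are wanted: keep a running mean and a running sum of squared deviations, and update both when the oldest value leaves the window. This is a sliding-window variant of a Welford-style update, not necessarily what the asker ended up using; the function name `rolling_variance` is hypothetical.

```python
from collections import deque

def rolling_variance(values, window):
    """Sample variance over each full window of `window` consecutive values."""
    if window < 2:
        raise ValueError("window must be at least 2")
    buf = deque()
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the mean
    out = []
    for x in values:
        if len(buf) < window:
            # Warm-up phase: plain Welford update while the window fills.
            buf.append(x)
            delta = x - mean
            mean += delta / len(buf)
            m2 += delta * (x - mean)
        else:
            # Steady state: swap the oldest value for the new one.
            old = buf.popleft()
            buf.append(x)
            new_mean = mean + (x - old) / window
            m2 += (x - old) * (x - new_mean + old - mean)
            mean = new_mean
        if len(buf) == window:
            # Clamp tiny negative values caused by floating-point rounding.
            out.append(max(m2, 0.0) / (window - 1))
    return out

result = rolling_variance([1, 2, 4, 7, 11, 16], 3)
```

Each step costs O(1) regardless of the window size. Over very long streams rounding error can still accumulate, so periodically recomputing the sums from the buffer is a reasonable safeguard.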

2

Solved

My question involves statistics and python and I am a beginner in both. I am running a simulation, and for each value for the independent variable (X) I produce 1000 values for the dependent variab...
Stupefy asked 11/9, 2016 at 8:52

2

Solved

I have a data frame with a column called Means. I want to get just the first quartile from this column. I know I can use quartile (df) or summary (df) but this gives me all the quartiles. How do I ...
Environs asked 1/9, 2015 at 10:30

3

We're working on panel data, and there is a command in Stata, xtsum, that gives you within and between variance for the variables in the data set. Is there a similar command for R, that produces cl...
Ona asked 14/3, 2018 at 15:43

8

Solved

I have a matrix data with m rows and n columns. I used to compute the correlation coefficients between all pairs of rows using np.corrcoef: import numpy as np data = np.array([[0, 1, -1], [0, -1, ...
Sought asked 26/6, 2014 at 13:39

6

In machine learning cost function, if we want to minimize the influence of two parameters, let's say theta3 and theta4, it seems like we have to give a large value of regularization parameter just ...
Stonechat asked 25/6, 2017 at 0:26

8

Solved

Example input: SELECT * FROM test; id | percent ----+---------- 1 | 50 2 | 35 3 | 15 (3 rows) How would you write a query such that, on average, 50% of the time I get the row with id=1, ...
X asked 23/10, 2012 at 22:22

13

Solved

I need to know whether a number, compared to a set of numbers, is more than 1 standard deviation from the mean, etc.
Planetarium asked 22/5, 2009 at 0:26
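
The question does not name a language, so the following is only a sketch in Python using the standard-library `statistics` module. The helper name `is_outside_stddev`, the multiplier `k`, and the sample data are assumptions for illustration.

```python
import statistics

def is_outside_stddev(value, data, k=1.0):
    """True if `value` lies more than k sample standard deviations from the mean."""
    mean = statistics.mean(data)
    sd = statistics.stdev(data)  # sample std dev; use pstdev for a full population
    return abs(value - mean) > k * sd

data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values, mean is 5
far = is_outside_stddev(8, data)   # 3 away from the mean
near = is_outside_stddev(6, data)  # 1 away from the mean
```

Passing `k=2.0` or `k=3.0` covers the "etc." part of the question for wider bands.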

3

Solved

TL;DR: How to plot the result of np.histogram(..., density=True) correctly with NumPy? Using density=True should help to match the histogram of the sample, and the density function of the underlyin...
Ardeb asked 25/10, 2023 at 20:29

11

How can I find the p-value (significance) of each coefficient? lm = sklearn.linear_model.LinearRegression() lm.fit(x,y)
Michaelmichaela asked 13/1, 2015 at 17:46
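
`LinearRegression` itself does not report p-values. A minimal sketch of one way to obtain them, assuming ordinary least squares with an intercept and the usual t-test on each coefficient, computed directly with NumPy and SciPy (the function name `ols_p_values` and the synthetic data are hypothetical):

```python
import numpy as np
from scipy import stats

def ols_p_values(x, y):
    """Fit OLS with an intercept; return coefficients and two-sided p-values."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.shape[0]
    design = np.column_stack([np.ones(n), x])  # prepend the intercept column
    p = design.shape[1]
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    dof = n - p
    sigma2 = residuals @ residuals / dof       # unbiased residual variance
    cov = sigma2 * np.linalg.inv(design.T @ design)
    se = np.sqrt(np.diag(cov))                 # standard error per coefficient
    t_stats = beta / se
    p_values = 2.0 * stats.t.sf(np.abs(t_stats), dof)
    return beta, p_values

# Synthetic data: only the first feature actually drives y.
rng = np.random.default_rng(0)
x = rng.normal(size=(100, 2))
y = 3.0 * x[:, 0] + rng.normal(size=100)

beta, p_values = ols_p_values(x, y)  # order: intercept, feature 0, feature 1
```

The coefficients should match what `lm.fit(x, y)` produces for the same data, since both solve the same least-squares problem; the p-values rely on the standard OLS assumptions about the errors.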

2

Solved

In the context of the Student's t-distribution cumulative distribution function, R Version 4.3.1's ?dt documentation highlights the following result: However, upon attempting to verify the accurac...
Whitson asked 9/10, 2023 at 13:15

5

Solved

With scipy.stats.linregress I am performing a simple linear regression on some sets of highly correlated x,y experimental data, and initially visually inspecting each x,y scatter plot for outliers....
Knitwear asked 19/4, 2012 at 15:14

18

Solved

I often find myself with a file that has one number per line. I end up importing it in excel to view things like median, standard deviation and so forth. Is there a command line utility in linux t...
Flintshire asked 20/3, 2012 at 15:31

4

Solved

I am having trouble making a scatter plot from a date array and a bunch of PM 2.5 values. My lists would look like the following: dates = ['2015-12-20','2015-09-12'] PM_25 = [80, 55]
Latrinalatrine asked 7/7, 2016 at 23:21
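
A minimal sketch of one way to do this, assuming Matplotlib is the plotting library and the date strings are always in `YYYY-MM-DD` form. The lists are taken from the question; the `Agg` backend line is only there so the example runs without a display.

```python
from datetime import datetime

import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt

dates = ['2015-12-20', '2015-09-12']
PM_25 = [80, 55]

# Convert the strings to datetime objects so the x axis is a real time axis.
parsed = [datetime.strptime(d, "%Y-%m-%d") for d in dates]

fig, ax = plt.subplots()
ax.scatter(parsed, PM_25)
ax.set_xlabel("Date")
ax.set_ylabel("PM 2.5")
fig.autofmt_xdate()  # rotate the date labels so they do not overlap
```

Call `plt.show()` or `fig.savefig(...)` afterwards to display or store the figure.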

11

Solved

I'm trying to sort a bunch of products by customer ratings using a 5 star system. The site I'm setting this up for does not have a lot of ratings and continues to add new products so it will usually...
Antlion asked 11/9, 2009 at 14:24

16

Solved

I would like to convert a NumPy array to a unit vector. More specifically, I am looking for an equivalent version of this normalisation function: def normalize(v): norm = np.linalg.norm(v) if nor...
Alvinalvina asked 9/1, 2014 at 20:25

2

Solved

I have two GMMs that I used to fit two different sets of data in the same space, and I would like to calculate the KL-divergence between them. Currently I am using the GMMs defined in sklearn (htt...
Westerman asked 27/9, 2014 at 22:44

2

Solved

In the documentation of scipy, the 'frozen pdf', etc, is mentioned sometimes, but I don't know the meaning of it? Is it a statistical concept or scipy terminology?
Halford asked 10/3, 2020 at 7:36
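
"Frozen" is SciPy terminology rather than a statistical concept: calling a distribution object with its parameters returns a copy with those parameters fixed. A short sketch using `scipy.stats.norm` (the parameter values are arbitrary):

```python
from scipy import stats

# Freeze a normal distribution at loc=2.0, scale=0.5.
frozen = stats.norm(loc=2.0, scale=0.5)

# The frozen object no longer needs the parameters on each call...
pdf_frozen = frozen.pdf(2.5)

# ...whereas the unfrozen form takes them every time.
pdf_direct = stats.norm.pdf(2.5, loc=2.0, scale=0.5)
```

Both calls evaluate the same density; freezing is a convenience when the same parameters are reused across `pdf`, `cdf`, `rvs`, and similar methods.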

2

Solved

I have a dataset where I need to be able to control to what extent the Outlier Detection Model (Isolation Forest, Elliptic Envelope, OneClassSVM...) considers a given point an outlier or not (somet...

1

Solved

I just recently came across the fact that quantile() is defined differently in Julia and Matlab. I was unable to align the two definitions and always get different results. Does anybody know why is ...
Exclamatory asked 10/9, 2023 at 18:42
