statistics Questions
13
Solved
I'm using Python and Numpy to calculate a best fit polynomial of arbitrary degree. I pass a list of x values, y values, and the degree of the polynomial I want to fit (linear, quadratic, etc.).
Th...
Joellyn asked 21/5, 2009 at 15:55
10
Solved
How would you create a qq-plot using Python?
Assuming that you have a large set of measurements and are using some plotting function that takes XY-values as input. The function should plot the qua...
Crackling asked 13/12, 2012 at 17:54
3
Solved
I have a table that looks like this.
> dput(theft_loc)
structure(c(13704L, 14059L, 14263L, 14450L, 14057L, 15503L, 14230L,
16758L, 15289L, 15499L, 16066L, 15905L, 18531L, 19217L, 12410L,
133...
Joiejoin asked 7/2, 2017 at 17:31
1
I am trying to write a multiple linear regression model from scratch to predict the key factors contributing to number of views of a song on Facebook. About each song we collect this information, i...
Maddocks asked 15/1, 2018 at 4:58
3
Solved
I need to generate bins for the purposes of calculating a histogram. Language is C#. Basically I need to take in an array of decimal numbers and generate a histogram plot out of those.
Haven't be...
Ryurik asked 5/3, 2010 at 15:40
15
I'm trying to find an efficient, numerically stable algorithm to calculate a rolling variance (for instance, a variance over a 20-period rolling window). I'm aware of the Welford algorithm that eff...
Gobetween asked 28/2, 2011 at 20:46
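One way the rolling-window variance described above could be sketched: keep a running mean and a running sum of squared deviations, update them Welford-style while the window fills, and once it is full swap the oldest value for the newest in a single step. The class name `RollingVariance` and its methods are hypothetical, chosen for illustration; this is a minimal sketch, not a tuned implementation.

```python
from collections import deque
import math


class RollingVariance:
    """Sample variance over the last `window` values, updated in O(1) per value."""

    def __init__(self, window):
        if window < 2:
            raise ValueError("window must be at least 2")
        self.window = window
        self.values = deque()
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the current mean

    def push(self, x):
        if len(self.values) < self.window:
            # Window still filling: plain Welford update.
            self.values.append(x)
            delta = x - self.mean
            self.mean += delta / len(self.values)
            self.m2 += delta * (x - self.mean)
        else:
            # Window full: replace the oldest value with the new one.
            old = self.values.popleft()
            self.values.append(x)
            new_mean = self.mean + (x - old) / self.window
            self.m2 += (x - old) * (x - new_mean + old - self.mean)
            self.mean = new_mean
        # Guard against tiny negative values caused by rounding.
        self.m2 = max(self.m2, 0.0)

    def variance(self):
        n = len(self.values)
        return self.m2 / (n - 1) if n > 1 else math.nan
```

Over very long streams the incremental updates can accumulate rounding error, so recomputing the statistics from the stored window every so often may be worth considering.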
2
Solved
My question involves statistics and python and I am a beginner in both. I am running a simulation, and for each value for the independent variable (X) I produce 1000 values for the dependent variab...
Stupefy asked 11/9, 2016 at 8:52
2
Solved
I have a dataframe with a column called Means. I want to get just the first quartile from this column. I know I can use quartile (df) or summary (df) but these give me all the quartiles. How do I ...
Environs asked 1/9, 2015 at 10:30
3
We're working on panel data, and there is a command in Stata, xtsum, that gives you within and between variance for the variables in the data set.
Is there a similar command for R, that produces cl...
Ona asked 14/3, 2018 at 15:43
8
Solved
I have a matrix data with m rows and n columns. I used to compute the correlation coefficients between all pairs of rows using np.corrcoef:
import numpy as np
data = np.array([[0, 1, -1], [0, -1, ...
Sought asked 26/6, 2014 at 13:39
6
In a machine learning cost function, if we want to minimize the influence of two parameters, let's say theta3 and theta4, it seems like we have to give the regularization parameter a large value just ...
Stonechat asked 25/6, 2017 at 0:26
8
Solved
Example input:
SELECT * FROM test;
 id | percent
----+---------
  1 |      50
  2 |      35
  3 |      15
(3 rows)
How would you write a query such that, on average, 50% of the time I would get the row with id=1, ...
X asked 23/10, 2012 at 22:22
13
Solved
I need to know whether a number, compared to a set of numbers, is more than 1 standard deviation from the mean, etc.
Planetarium asked 22/5, 2009 at 0:26
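A minimal sketch of the check described above, using only Python's standard library. The helper name `is_outside_k_sigma` is hypothetical, and the sketch assumes the sample standard deviation (n - 1 denominator) is the one wanted; `statistics.pstdev` would give the population version instead.

```python
import statistics


def is_outside_k_sigma(value, data, k=1.0):
    """Return True if `value` is more than k sample standard deviations from the mean of `data`."""
    mean = statistics.mean(data)
    stdev = statistics.stdev(data)  # sample standard deviation (n - 1 denominator)
    return abs(value - mean) > k * stdev


data = [2, 4, 4, 4, 5, 5, 7, 9]      # mean is 5, sample stdev is about 2.14
print(is_outside_k_sigma(5, data))   # False
print(is_outside_k_sigma(9, data))   # True
```

Passing `k=2` or `k=3` covers the wider bands with the same function.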
3
Solved
TL;DR: How to plot the result of np.histogram(..., density=True) correctly with Numpy?
Using density=True should help to match the histogram of the sample, and the density function of the underlyin...
Ardeb asked 25/10, 2023 at 20:29
11
How can I find the p-value (significance) of each coefficient?
lm = sklearn.linear_model.LinearRegression()
lm.fit(x,y)
Michaelmichaela asked 13/1, 2015 at 17:46
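As far as I know, `LinearRegression` in scikit-learn does not report p-values itself. One possible approach, sketched here under the usual ordinary least squares assumptions (independent, homoscedastic, roughly normal errors), is to compute the coefficient standard errors and t-statistics directly with NumPy and SciPy. The function name `ols_p_values` is hypothetical, and the sketch assumes `y` is one-dimensional.

```python
import numpy as np
from scipy import stats


def ols_p_values(x, y):
    """Fit y = b0 + x @ b by least squares; return (coefficients, two-sided p-values)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    design = np.column_stack([np.ones(len(x)), x])  # intercept column first
    n, p = design.shape
    coef, _, _, _ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    sigma2 = residuals @ residuals / (n - p)          # unbiased error variance
    cov = sigma2 * np.linalg.inv(design.T @ design)   # coefficient covariance
    std_err = np.sqrt(np.diag(cov))
    t_stats = coef / std_err
    p_values = 2 * stats.t.sf(np.abs(t_stats), df=n - p)
    return coef, p_values
```

The first entry of each returned array belongs to the intercept. The statsmodels package reports the same quantities through the `pvalues` attribute of a fitted OLS result, which may be more convenient than computing them by hand.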
2
Solved
In the context of the Student's t-distribution cumulative distribution function, R Version 4.3.1's ?dt documentation highlights the following result:
However, upon attempting to verify the accurac...
Whitson asked 9/10, 2023 at 13:15
5
Solved
With scipy.stats.linregress I am performing a simple linear regression on some sets of highly correlated x,y experimental data, and initially visually inspecting each x,y scatter plot for outliers....
Knitwear asked 19/4, 2012 at 15:14
18
Solved
I often find myself with a file that has one number per line. I end up importing it into Excel to view things like median, standard deviation and so forth.
Is there a command-line utility in Linux t...
Flintshire asked 20/3, 2012 at 15:31
4
Solved
I am having trouble making a scatter plot from a date array and a bunch of PM 2.5 values. My lists would look like the following:
dates = ['2015-12-20','2015-09-12']
PM_25 = [80, 55]
Latrinalatrine asked 7/7, 2016 at 23:21
11
Solved
I'm trying to sort a bunch of products by customer ratings using a 5 star system. The site I'm setting this up for does not have a lot of ratings and continues to add new products so it will usually...
Antlion asked 11/9, 2009 at 14:24
16
Solved
I would like to convert a NumPy array to a unit vector. More specifically, I am looking for an equivalent version of this normalisation function:
def normalize(v):
    norm = np.linalg.norm(v)
    if nor...
Alvinalvina asked 9/1, 2014 at 20:25
2
Solved
I have two GMMs that I used to fit two different sets of data in the same space, and I would like to calculate the KL-divergence between them.
Currently I am using the GMMs defined in sklearn (htt...
Westerman asked 27/9, 2014 at 22:44
2
Solved
In the documentation of scipy, the 'frozen pdf', etc., is sometimes mentioned, but I don't know what it means. Is it a statistical concept or scipy terminology?
Halford asked 10/3, 2020 at 7:36
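For illustration, "frozen" here is scipy terminology rather than a statistical concept: calling a distribution object such as `scipy.stats.norm` with its shape, location, and scale parameters returns an object with those parameters fixed, so later method calls no longer need them. A short sketch, with arbitrary example parameter values:

```python
from scipy import stats

# Unfrozen: the parameters are passed on every call.
p_unfrozen = stats.norm.pdf(1.0, loc=2.0, scale=3.0)

# Frozen: the parameters are fixed once, then reused by every method.
frozen = stats.norm(loc=2.0, scale=3.0)
p_frozen = frozen.pdf(1.0)

print(abs(p_unfrozen - p_frozen) < 1e-12)  # both routes evaluate the same density
print(frozen.mean(), frozen.std())         # parameters are remembered by the frozen object
```

The frozen object exposes the same methods (`pdf`, `cdf`, `rvs`, and so on) as the unfrozen distribution.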
2
Solved
I have a dataset where I need to be able to control to what extent the Outlier Detection Model (Isolation Forest, Elliptic Envelope, OneClassSVM...) considers a given point an outlier or not (somet...
Komara asked 24/7, 2020 at 12:51
1
Solved
I recently came across the fact that quantile() is defined differently in Julia and Matlab.
I was unable to align the two definitions and always get different results.
Does anybody know why is ...
Exclamatory asked 10/9, 2023 at 18:42
© 2022 - 2024 — McMap. All rights reserved.