quantile Questions
1
Solved
I have a PySpark dataframe which contains an ID and then a couple of variables for which I want to calculate the 95% point.
Part of the printSchema():
root
|-- ID: string (nullable = true)
|--...
Obligation asked 19/9, 2018 at 12:3
2
Solved
I'm having trouble finding quantile functions for well-known probability distributions in Python, do they exist? In particular, is there an inverse normal distribution function? I couldn't find any...
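For the normal distribution specifically, one option that needs no third-party package is the standard library's `statistics.NormalDist` (Python 3.8 and later), whose `inv_cdf` method is the quantile function. A minimal sketch; the parameter values below are illustrative only and are not taken from the question:

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0.0, sigma=1.0)

# inv_cdf(p) returns the x for which P(X <= x) = p.
z_975 = std_normal.inv_cdf(0.975)

# A non-standard normal works the same way (illustrative parameters).
scaled = NormalDist(mu=100.0, sigma=15.0)
scaled_p95 = scaled.inv_cdf(0.95)

print(round(z_975, 4))
print(round(scaled_p95, 2))
```

Applying `cdf` to the result should give back the original probability, which is a quick sanity check.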
2
Solved
I want to get summary data of the first quartile for a table in Hive. Below is a query to get the maximum number of views in each quartile:
SELECT NTILE(4) OVER (ORDER BY total_views) AS quartile,...
4
Solved
I'm sorry for what may be a silly question.
When I do:
> quantile(df$column, .75) #get 3rd quartile
I get something like
75%
1234.5
Is there a way to just get the value (1234.5) without ...
2
Solved
All,
I have an ML pipeline set up as below
import org.apache.spark.ml.feature.QuantileDiscretizer
import org.apache.spark.sql.types.{StructType,StructField,DoubleType}
import org.apache.spark.ml.P...
Charnel asked 26/4, 2017 at 16:1
1
Solved
I have a 2D-distribution of points (roughly speaking, two np.arrays, x and y) as shown in the figure attached.
How can I select the points of the distribution that are part of the n-th quantile o...
Dicot asked 30/3, 2018 at 8:44
1
Solved
I have a dataframe in Spark, and would like to calculate the 0.1 quantile after grouping by a specific column.
For example:
> library(sparklyr)
> library(tidyverse)
> con = spark_connect...
Fishbein asked 12/2, 2018 at 12:32
1
Solved
According to the docs:
Returns the approximate boundaries for a group of expression values, where number represents the number of quantiles to create. This function returns an array of number + ...
Falciform asked 18/1, 2018 at 17:10
1
Solved
Calculating the maximum quantile over all dataseries is a problem for me:
query
http_response_time{job=~"^(x|y)$", quantile="0.95",...}
result
http_response_time{job="x",...} 0.26
http_respons...
Stefanistefania asked 19/12, 2017 at 12:32
1
Solved
I'm giving a sorted array of 24 numbers to d3.quantile and asking it to calculate the first quartile value. Since the array can be split evenly into four groups of 6 values, my assumption was that ...
Sessions asked 18/9, 2017 at 1:59
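The d3-array documentation describes `d3.quantile` as using the R-7 method, which interpolates linearly between neighbouring sorted values instead of splitting the array into equal groups. The sketch below re-implements that rule in Python to show the arithmetic; `quantile_r7` and the sample array are hypothetical and not taken from the question:

```python
import math


def quantile_r7(sorted_values, p):
    """Linear-interpolation (R-7) quantile of an already sorted sequence."""
    n = len(sorted_values)
    if n == 0:
        raise ValueError("need at least one value")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    h = (n - 1) * p              # fractional zero-based position
    lo = math.floor(h)           # index of the value just below h
    hi = min(lo + 1, n - 1)      # index of the value just above h
    return sorted_values[lo] + (h - lo) * (sorted_values[hi] - sorted_values[lo])


values = list(range(1, 25))      # hypothetical data: 1, 2, ..., 24
q1 = quantile_r7(values, 0.25)
print(q1)                        # position 5.75, i.e. between the 6th and 7th values
```

With 24 values the first-quartile position is (24 - 1) * 0.25 = 5.75, so the result lies three quarters of the way from the 6th sorted value to the 7th.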
1
Solved
I want to calculate the multivariate Gaussian density function for a data set I have in Python. My dataset has 21 variables and there are 75 data points.
I have calculated the covariance matrix (cov)...
1
Solved
I would like to do quantile cuts (cut into n bins with equal number of points) for each group
qcut = function(x, n) {
quantiles = seq(0, 1, length.out = n+1)
cutpoints = unname(quantile(x, quant...
Kilocycle asked 22/3, 2017 at 10:2
6
Solved
I need to compute the quantiles for a large set of data.
Let's assume we can get the data only in portions (e.g. one row of a large matrix at a time). To compute the Q3 quantile one needs to get all th...
Holocaine asked 14/5, 2010 at 20:14
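One possible way to estimate a quantile without holding all the data in memory is to stream the portions into a fixed-size histogram and read the quantile off the cumulative counts. This is only a sketch under strong assumptions: the value range `[lo, hi]` is known in advance, the answer is approximate (accurate to about one bin width), and the function name and parameters are hypothetical.

```python
def approx_quantile_from_chunks(chunks, q, lo, hi, n_bins=1000):
    """Approximate the q-quantile of values arriving chunk by chunk.

    Assumes every value lies in [lo, hi]; values outside are clamped
    into the first or last bin.
    """
    counts = [0] * n_bins
    total = 0
    width = (hi - lo) / n_bins
    for chunk in chunks:                 # e.g. one matrix row at a time
        for x in chunk:
            idx = int((x - lo) / width)
            idx = min(max(idx, 0), n_bins - 1)
            counts[idx] += 1
            total += 1
    if total == 0:
        raise ValueError("no data received")

    target = q * total                   # number of values at or below the quantile
    cumulative = 0
    for idx, c in enumerate(counts):
        if c and cumulative + c >= target:
            # Interpolate linearly inside the bin that crosses the target.
            frac = (target - cumulative) / c
            return lo + (idx + frac) * width
        cumulative += c
    return hi


# Hypothetical data: the values 0..999 delivered as 10 rows of 100.
rows = ([float(i) for i in range(r * 100, (r + 1) * 100)] for r in range(10))
q3 = approx_quantile_from_chunks(rows, 0.75, lo=0.0, hi=1000.0)
print(q3)
```

Memory use depends only on `n_bins`, not on the amount of data; more bins trade memory for accuracy.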
1
Solved
I have a dataframe:
df = pd.DataFrame(np.random.randint(0,100,size=(5, 2)), columns=list('AB'))
A B
0 92 65
1 61 97
2 17 39
3 70 47
4 56 6
Here are 5% quantiles:
down_quantiles = df.quantile(0...
6
Solved
I have a very simple table like that:
CREATE TABLE IF NOT EXISTS LuxLog (
Sensor TINYINT,
Lux INT,
PRIMARY KEY(Sensor)
)
It contains thousands of logs from different sensors.
I would like to...
Incapacity asked 3/7, 2015 at 14:53
1
I have a dataframe t_unit, which is the result of a pd.read_csv() function.
datetime B18_LR_T B18_B1_T
24/03/2016 09:00 21.274 21.179
24/03/2016 10:00 19.987 19.868
24/03/2016 11:00 21.632 21.417
...
Radon asked 26/9, 2016 at 16:43
2
Solved
I have a data.table and would like to compute stats by groups.
R) set.seed(1)
R) DT=data.table(a=rnorm(100),b=rnorm(100))
Those groups should be defined by
R) quantile(DT$a,probs=seq(.1,.9,.1))...
Alasdair asked 21/3, 2014 at 20:31
1
Solved
I have a time series of hourly values and I am trying to derive some basic statistics on a weekly/monthly basis.
If we use the following abstract dataframe, where each column is a time series:
rng =...
Inconsumable asked 31/8, 2016 at 10:7
2
Solved
I am performing an extreme value analysis for meteorological data, to be precise for precipitation data available in mm/d. I am using a threshold excess approach for estimating the parameters of a ...
2
My dataset contains multiple observations for different species. Each species has a different number of observations. Looking for a fast way in R to calculate the mean of the top 10% of values for ...
Apostate asked 13/4, 2016 at 0:18
1
Solved
I am trying to use ecdf, but I am not sure if I am doing it right. My ultimate purpose is to find what quantile corresponds to a specific value. As an example:
sample_set <- c(20, 40, 60, 80, 1...
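The idea of looking up which quantile a value corresponds to can be sketched with an empirical CDF: the fraction of sample points less than or equal to the value. The Python sketch below uses only the standard library; `ecdf_value` and the sample data are hypothetical and are not the values from the question.

```python
from bisect import bisect_right


def ecdf_value(sample, x):
    """Fraction of sample points that are <= x (the empirical CDF at x)."""
    ordered = sorted(sample)
    if not ordered:
        raise ValueError("sample must not be empty")
    # bisect_right counts how many sorted values are <= x.
    return bisect_right(ordered, x) / len(ordered)


sample = [3, 7, 7, 12, 20]       # hypothetical data
level = ecdf_value(sample, 7)
print(level)                     # three of the five values are <= 7
```

The returned fraction is the quantile level of the value within the sample, so a value with `ecdf_value` of 0.6 sits at the 60% point.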
2
Solved
I have the following df:
group = rep(seq(1,3),30)
variable = runif(90, 5.0, 7.5)
df = data.frame(group,variable)
I need to i) Define quantile by groups, ii) Assign each person to her quantile wi...
2
Solved
I need to get the Nth quantile of a beta distribution, or equivalently, the 95th or 99th percentile. This is so much easier in Maple, which allows symbolic input, but how is this done in Python?
I'...
Roundlet asked 26/10, 2015 at 1:41
1
Solved
Problem Setup
In the statsmodels Quantile Regression example, the Least Absolute Deviation summary output shows the Intercept. In that example, they are using a formula
from __future__ import print_...
Launder asked 10/10, 2015 at 5:25
3
Solved
I created a scatterplot (multiple groups GRP) with IV=time, DV=concentration. I wanted to add the quantile regression curves (0.025,0.05,0.5,0.95,0.975) to my plot.
And by the way, this is what I...
Bedpan asked 22/2, 2013 at 1:51