split-apply-combine Questions
2
I know this must be super easy, but I'm having trouble finding the right dplyr commands to do this. Let's say I want to group a dataset by two variables, and then summarize the count for each row. ...
Extravagancy asked 24/4, 2018 at 1:15
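A minimal sketch of the pattern this question describes (group by two variables, then count the rows in each group), written in plain Python rather than dplyr. The column names `species` and `site` and the sample rows are hypothetical, since the excerpt does not show the dataset.

```python
from collections import Counter

# Hypothetical rows; the column names "species" and "site" are illustrative only.
rows = [
    {"species": "a", "site": "x"},
    {"species": "a", "site": "x"},
    {"species": "a", "site": "y"},
    {"species": "b", "site": "y"},
]

# Split by the (species, site) pair and count the rows in each group.
counts = Counter((r["species"], r["site"]) for r in rows)

# Combine: one output row per group, sorted by the grouping keys.
summary = [
    {"species": s, "site": t, "n": n}
    for (s, t), n in sorted(counts.items())
]

for row in summary:
    print(row)
```

In dplyr the same idea is usually expressed with `group_by()` followed by `summarise(n = n())`; the sketch above only illustrates the logic.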
4
Solved
I have downloaded a list of all the towns and cities etc in the US from the census bureau. Here is a random sample:
dput(somewhere)
structure(list(state = structure(c(30L, 31L, 5L, 31L, 24L, 36L,
...
Mardellmarden asked 10/11, 2021 at 15:19
2
Solved
I'm wondering if there is an efficient way to do the following in Julia:
I have a DataFrame of the following form:
julia> df1 = DataFrame(var1=["a","a","a","b...
Aleida asked 26/11, 2020 at 15:29
2
Solved
When I need to apply multiple functions to multiple columns sequentially and aggregate by multiple columns and want the results to be bound into a data frame I usually use aggregate() in the follow...
Rosado asked 29/10, 2014 at 7:12
1
Solved
I'm sure this has been asked before, sorry if duplicate. Suppose I have the following dataframe:
df = pd.DataFrame({'key': ['A', 'B', 'C', 'A', 'B', 'C'],
'data': range(6)}, columns=['key'...
G asked 12/3, 2019 at 15:51
1
Solved
I have a dataframe that consists of truthIds and trackIds:
truthId = ['A', 'A', 'B', 'B', 'C', 'C', 'A', 'C', 'B', 'A', 'A', 'C', 'C']
trackId = [1, 1, 2, 2, 3, 4, 5, 3, 2, 1, 5, 4, 6]
df1 = pd.Da...
Blocking asked 6/2, 2018 at 18:51
2
Solved
I need to select half of a dataframe using the groupby, where the size of each group is unknown and may vary across groups. For example:
index summary participant_id
0 130599 17.0 13
1 130601 18....
Recurvate asked 27/6, 2017 at 19:42
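One way to read this question is "keep the first half of the rows in each group, whatever the group size". The sketch below shows that reading in plain Python rather than pandas. The records are hypothetical, and rounding down for odd-sized groups is an assumption, because the excerpt is cut off before it says how such groups should be handled.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical (participant_id, summary) records; values are illustrative only.
records = [
    (13, 17.0), (13, 18.0), (13, 16.5), (13, 19.0),
    (14, 20.0), (14, 21.0), (14, 22.0),
]

# itertools.groupby only merges adjacent items, so sort by the key first.
# sorted() is stable, so the original order within each group is preserved.
records_sorted = sorted(records, key=itemgetter(0))

first_halves = []
for participant_id, group in groupby(records_sorted, key=itemgetter(0)):
    members = list(group)
    half = len(members) // 2  # assumption: round down for odd-sized groups
    first_halves.extend(members[:half])

print(first_halves)
```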
2
Solved
Using dplyr, I'd like to summarize by a variable that I can vary (e.g. in a loop or apply-style command).
Typing the names in directly works fine:
library(dplyr)
ChickWeight %>% group_by...
Antimagnetic asked 8/2, 2015 at 0:22
1
Solved
In traditional plyr, returned rows are added automagically to the output even if they exceed the number of input rows for that grouping:
set.seed(1)
dat <- data.frame(x=runif(10),g=rep(letters[...
Embower asked 13/5, 2014 at 1:10
2
Solved
Ok, second R question in quick succession.
My data:
Timestamp St_01 St_02 ...
1 2008-02-08 00:00:00 26.020 25.840 ...
2 2008-02-08 00:10:00 25.985 25.790 ...
3 2008-02-08 00:20:00 25.930 25.765 ...
Tooling asked 28/5, 2012 at 16:19
2
Solved
I am working with an unbalanced, irregularly spaced cross-sectional time series. My goal is to obtain a lagged moving average vector for the "Quantity" vector, segmented by "Subject".
In other wor...
Photoluminescence asked 10/11, 2013 at 20:2
2
Solved
On a concrete problem, say I have a DataFrame DF
word tag count
0 a S 30
1 the S 20
2 a T 60
3 an T 5
4 the T 10
I want to find, for every "word", the "tag" that has the most "count". So the r...
Attested asked 10/3, 2013 at 13:16
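The task in this question (for every "word", find the "tag" with the largest "count") can be sketched in plain Python using the five rows shown in the excerpt. This is an illustration of the logic, not the pandas solution from the original thread. Keeping the first row seen when counts tie is an assumption.

```python
# Rows copied from the question's example table: (word, tag, count).
rows = [
    ("a", "S", 30),
    ("the", "S", 20),
    ("a", "T", 60),
    ("an", "T", 5),
    ("the", "T", 10),
]

# For each word, remember the (tag, count) pair with the highest count so far.
best = {}
for word, tag, count in rows:
    # Strict ">" keeps the first row seen when counts tie (an assumption).
    if word not in best or count > best[word][1]:
        best[word] = (tag, count)

for word, (tag, count) in best.items():
    print(word, tag, count)
```

With a pandas DataFrame, a common idiom for the same result is `df.loc[df.groupby("word")["count"].idxmax()]`.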