split-apply-combine Questions

2

I know this must be super easy, but I'm having trouble finding the right dplyr commands to do this. Let's say I want to group a dataset by two variables, and then summarize the count for each row. ...
Extravagancy asked 24/4, 2018 at 1:15

4

Solved

I have downloaded a list of all the towns, cities, etc. in the US from the Census Bureau. Here is a random sample: dput(somewhere) structure(list(state = structure(c(30L, 31L, 5L, 31L, 24L, 36L, ...
Mardellmarden asked 10/11, 2021 at 15:19

2

Solved

I'm wondering if there is an efficient way to do the following in Julia: I have a DataFrame of the following form: julia> df1 = DataFrame(var1=["a","a","a","b...
Aleida asked 26/11, 2020 at 15:29

2

Solved

When I need to apply multiple functions to multiple columns sequentially and aggregate by multiple columns and want the results to be bound into a data frame I usually use aggregate() in the follow...
Rosado asked 29/10, 2014 at 7:12

1

Solved

I'm sure this has been asked before, sorry if duplicate. Suppose I have the following dataframe: df = pd.DataFrame({'key': ['A', 'B', 'C', 'A', 'B', 'C'], 'data': range(6)}, columns=['key'...
G asked 12/3, 2019 at 15:51

1

Solved

I have a dataframe that consists of truthIds and trackIds: truthId = ['A', 'A', 'B', 'B', 'C', 'C', 'A', 'C', 'B', 'A', 'A', 'C', 'C'] trackId = [1, 1, 2, 2, 3, 4, 5, 3, 2, 1, 5, 4, 6] df1 = pd.Da...
Blocking asked 6/2, 2018 at 18:51

2

Solved

I need to select half of a dataframe using the groupby, where the size of each group is unknown and may vary across groups. For example: index summary participant_id 0 130599 17.0 13 1 130601 18....
Recurvate asked 27/6, 2017 at 19:42

2

Solved

Using dplyr, I'd like to summarize [sic] by a variable that I can vary (e.g. in a loop or apply-style command). Typing the names in directly works fine: library(dplyr) ChickWeight %>% group_by...
Antimagnetic asked 8/2, 2015 at 0:22

1

Solved

In traditional plyr, returned rows are added automagically to the output even if they exceed the number of input rows for that grouping: set.seed(1) dat <- data.frame(x=runif(10),g=rep(letters[...
Embower asked 13/5, 2014 at 1:10

2

Solved

Ok, second R question in quick succession. My data:

  Timestamp           St_01  St_02  ...
1 2008-02-08 00:00:00 26.020 25.840 ...
2 2008-02-08 00:10:00 25.985 25.790 ...
3 2008-02-08 00:20:00 25.930 25.765 ...
Tooling asked 28/5, 2012 at 16:19

2

Solved

I am working with an unbalanced, irregularly spaced cross-sectional time series. My goal is to obtain a lagged moving average vector for the "Quantity" vector, segmented by "Subject". In other wor...
Photoluminescence asked 10/11, 2013 at 20:2

2

Solved

On a concrete problem, say I have a DataFrame DF:

  word tag count
0 a    S   30
1 the  S   20
2 a    T   60
3 an   T   5
4 the  T   10

I want to find, for every "word", the "tag" that has the most "count". So the r...
Attested asked 10/3, 2013 at 13:16
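
The question above is a compact instance of split-apply-combine, so a small sketch may help. It uses only the five rows shown in the excerpt and plain Python rather than pandas; the helper name `top_tag_per_word` is made up for illustration and is not taken from the original thread. Ties are resolved in favour of the first row seen, which is an assumption the excerpt does not address.

```python
# Rows copied from the question excerpt: (word, tag, count)
rows = [
    ("a", "S", 30),
    ("the", "S", 20),
    ("a", "T", 60),
    ("an", "T", 5),
    ("the", "T", 10),
]


def top_tag_per_word(rows):
    """Return {word: (tag, count)} with the highest-count tag for each word.

    Hypothetical helper, written only to illustrate the pattern.
    """
    # Split: bucket the (tag, count) pairs by word.
    groups = {}
    for word, tag, count in rows:
        groups.setdefault(word, []).append((tag, count))

    # Apply: pick the pair with the largest count in each bucket.
    # Combine: collect the winners into a single dict.
    return {word: max(pairs, key=lambda p: p[1]) for word, pairs in groups.items()}


result = top_tag_per_word(rows)
for word, (tag, count) in result.items():
    print(word, tag, count)
```

With pandas, one commonly used way to express the same idea is `DF.loc[DF.groupby("word")["count"].idxmax()]`, which selects the row holding the maximum count within each group; whether that matches the accepted answer to this question is not known from the excerpt.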