summarize Questions
5
I have a dataset that I want to summarise by mean, while also calculating the max of just one of the variables.
Let me start with an example of what I would like to achieve:
iris %>%
group_by(...
4
Solved
I'm trying to create an output that calculates the percentage of counts, out of total counts (in a data frame), by factor level, but can't seem to figure out how to retain the grouping structure in...
6
Solved
I started getting a new message (see post title) when running group_by and summarise() after updating to dplyr development version 0.8.99.9003.
Here is an example to recreate the output:
library(...
15
3
Solved
I have a dataset that I want to summarize. First, I want the sum of the home and away games, which I can do. However, I also want to know how many outliers (defined as more than 300 points) are wit...
2
I know this must be super easy, but I'm having trouble finding the right dplyr commands to do this. Let's say I want to group a dataset by two variables, and then summarize the count for each row. ...
Extravagancy asked 24/4, 2018 at 1:15
3
I am grouping data and then summarizing it, but would also like to retain another column. I do not need to do any evaluations of that column's content as it will always be the same as the group_by ...
3
Solved
I have a relatively straightforward question that I've been unable to find a solution for.
Suppose I have the following dataset:
ID | dummy_var | String1 | String2 | String3
1 | 0 | Tom | NA | NA
1 | 1 | NA | ...
3
Solved
I want to convert my R code using dplyr package into pandas where I group-by and perform multiple summarizations.
Here is my current code:
import pandas as pd
data = pd.DataFrame(
{'col1':[1,1,1,1...
4
Solved
Starting with dplyr version 0.7, the methods ending with an underscore, such as summarize_ and group_by_, are deprecated, since we are supposed to use quosures.
See:
https://cran.r-project.org/web/packages/d...
3
Solved
I have data where I want to get a bunch of summary statistics for multiple columns with the tidyverse approach. However, utilizing tidyverse's summarize function, it will create each column statist...
2
Solved
I'm trying to reduce a df of observations to a single observation (single line).
I would like to use summarize_if to take the mean of numeric columns and the mode of string or factor columns. The code below doesn...
2
Solved
I am summarizing group means from a table using the summarize function from the dplyr package in R. I would like to do this dynamically, using a column name string stored in another variable.
The ...
3
Solved
Building on this question: Summarize with conditions in dplyr
I would like to use dplyr to summarize a column based on a mathematical condition (not string matching as in the linked post). I need t...
Ewold asked 5/12, 2019 at 16:18
1
Solved
I am fairly new to R and even newer to dplyr. I have a small data set comprised of 2 columns - var1 and var2. The var1 column is comprised of num values. The var2 column is comprised of factors wit...
3
Solved
I believe this may have a simple solution but I'm having trouble describing what I need to do (and hence what to search for). I think I need the summarize function. My goal output is at the very bo...
2
Solved
I recently built a simple R script to summarize three different data frames. Since updating to the newest version of R and R Studio, I am running into an output I haven't seen before when using the...
1
Solved
I have a big table similar to datadf with 3000 thousand columns and rows. I saw some methods to obtain my expected summary on Stack Overflow (Frequency of values per column in table), but even the ...
Rienzi asked 11/9, 2018 at 19:53
3
Solved
I have a dataframe with records spanning multiple years:
WarName | StartDate | EndDate
---------------------------------------------
'fakewar1' 01-01-1990 02-02-1995
'examplewar' 05-01-1990 03-0...
Attestation asked 20/5, 2018 at 23:9
1
Solved
I'm fooling around with babynames pkg. A group_by command works, but after the summarize, one of the groups is dropped from the group list.
library(babynames)
babynames[1:10000, ] %>% group_by(...
2
Solved
I have found that data.table and dplyr have differing results when trying to do the same thing. I would like to use dplyr syntax, but have it compute in the way that data.table does. The use case i...
Fullfledged asked 20/1, 2018 at 15:32
1
Solved
I want to find the difference between the cases that were observed and those that were not by type of case:
set.seed(42)
df <- data.frame(type = factor(rep(c("A", "B", "C"), 2)), observed = rep...
3
Solved
My question is very similar to Applying group_by and summarise on data while keeping all the columns' info
but I would like to keep columns which get excluded because they conflict after groupi...
4
Solved
Using python I have created following data frame which contains similarity values:
cosinFcolor cosinEdge cosinTexture histoFcolor histoEdge histoTexture jaccard
1 0.770 0.489 0.388 0.57500000 0.5...
1
Solved
I would like, when summarizing after grouping, to count the number of a specific level of another factor.
In the working example below, I would like to count the number of "male" levels in each g...