EDIT: input
I'm very new to this.
I have a problem similar to the one in this question: "group by and then count missing variables?"
Taking the input data from that question:
df1 <- data.frame(
  Z = sample(LETTERS[1:5], size = 10000, replace = TRUE),
  X1 = sample(c(1:10, NA), 10000, replace = TRUE),
  X2 = sample(c(1:25, NA), 10000, replace = TRUE),
  X3 = sample(c(1:5, NA), 10000, replace = TRUE))
As one user proposed, it's possible to use summarise_each:
df1 %>%
  group_by(Z) %>%
  summarise_each(funs(sum(is.na(.))))
#Source: local data frame [5 x 4]
#
# Z X1 X2 X3
# (fctr) (int) (int) (int)
#1 A 169 77 334
#2 B 170 77 316
#3 C 159 78 348
#4 D 181 79 326
#5 E 174 69 341
However, I would like to get only the total number of missing values per group, i.e. the NA counts of X1, X2 and X3 added together for each value of Z.
I've also tried the approach from "R count NA by group", but it didn't work.
Ideally, it should give me something like:
# Z sumNA
# (fctr) (int)
#1 A 580
#2 B 563
#3 C 585
#4 D 586
#5 E 584
Thanks in advance.
dput(df). Or, if it is too big, with the output of dput(head(df, 20)). (df is the name of your dataset.) – Bugle
group_by(df1, Z) %>% summarize(n = sum(is.na(X1)))? Those aren't the numbers you show here, but that may be due to the uncontrolled randomness (should have used set.seed). – Taber
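Picking up on the comments about reproducibility, here is a minimal sketch of one way the per-group total could be computed. It is only an illustration under a few assumptions: it uses base R (rowSums and aggregate) instead of the dplyr pipeline from the question, the seed value 1 is arbitrary (so the counts will not match the numbers shown above), and the names na_per_row and res are made up for this example.

```r
# Assumption: an arbitrary seed, only so that repeated runs give the same data.
set.seed(1)

df1 <- data.frame(
  Z  = sample(LETTERS[1:5], size = 10000, replace = TRUE),
  X1 = sample(c(1:10, NA), 10000, replace = TRUE),
  X2 = sample(c(1:25, NA), 10000, replace = TRUE),
  X3 = sample(c(1:5, NA), 10000, replace = TRUE)
)

# Count the missing values in each row across the three X columns.
# (na_per_row is a hypothetical helper name, not from the question.)
na_per_row <- rowSums(is.na(df1[c("X1", "X2", "X3")]))

# Add up the row-wise counts within each group of Z.
res <- aggregate(list(sumNA = na_per_row), by = list(Z = df1$Z), FUN = sum)
print(res)
```

The result is a data frame with one row per value of Z and a single sumNA column, which is the shape asked for in the question. Staying closer to the dplyr style used above, the same total could presumably be obtained by adding sum(is.na(X1)), sum(is.na(X2)) and sum(is.na(X3)) inside a single summarise() call after group_by(Z).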