I would like to make a cross tab in R using dplyr. I have good reasons for not just using the base table() command.
table(mtcars$cyl, mtcars$gear)
      3  4  5
  4   1  8  2
  6   2  4  1
  8  12  0  2
library(dplyr)
library(tidyr)
mtcars %>%
  group_by(cyl, gear) %>%
  tally() %>%
  spread(gear, n, fill = 0)
Source: local data frame [3 x 4]
  cyl  3 4 5
1   4  1 8 2
2   6  2 4 1
3   8 12 0 2
This is all well and good, but it falls apart when the group_by() variables contain missing values.
mtcars %>%
  mutate(
    cyl = ifelse(cyl > 6, NA, cyl),
    gear = ifelse(gear > 4, NA, gear)
  ) %>%
  group_by(cyl, gear) %>%
  tally()
Source: local data frame [8 x 3]
Groups: cyl
  cyl gear  n
1   4    3  1
2   4    4  8
3   4   NA  2
4   6    3  2
5   6    4  4
6   6   NA  1
7  NA    3 12
8  NA   NA  2
# ... same pipeline as above ... %>%
  spread(gear, n)
Error in if (any(names2(x) == "")) { :
missing value where TRUE/FALSE needed
I guess what I would like is an NA column, like the one you get from table(..., useNA = "always"). Any tips?
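For reference, here is a minimal sketch of the base R result I am trying to match, using the same recoding as above. The names `d` and `tab` are introduced here only for illustration; the dplyr/tidyr version should ideally give the same counts, including the NA row and column.

```r
# Same recoding as in the pipeline above: cyl > 6 and gear > 4 become NA.
# `d` and `tab` are illustrative names, not part of the original code.
d <- transform(
  mtcars,
  cyl  = ifelse(cyl > 6, NA, cyl),
  gear = ifelse(gear > 4, NA, gear)
)

# Base R cross tab that keeps an explicit NA row and NA column
tab <- table(d$cyl, d$gear, useNA = "always")
print(tab)
```

The last row and last column of `tab` hold the counts for missing cyl and missing gear respectively, which is the layout I would like spread() to produce.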