Using the example data below, my goal is to create a table (publication-ready would be great, but fine if not) where I calculate what percent of each group within each column (city, race, and gender) attend, fail, or both.
For example: what percent of city=6 attend, what percent of city=6 fail, and what percent of city=6 do both. I then want to repeat that for each group within city, each group within race, and each group within gender, so that every group is a row in the output table and attend, fail, and both are its 3 columns.
What I attempted was to calculate percentages for each group within each column separately, and then stack them all using kableExtra. The kableExtra attempt was so messy and so incorrect that I didn't even save it, but this is how I started my calculations:
race_percentages <- d %>%
  group_by(race) %>%
  summarize(
    percent_attend = mean(attend) * 100,
    percent_fail = mean(fail) * 100,
    percent_both = mean(both) * 100)

gender_percentages <- d %>%
  group_by(gender) %>%
  summarize(
    percent_attend = mean(attend) * 100,
    percent_fail = mean(fail) * 100,
    percent_both = mean(both) * 100)

city_percentages <- d %>%
  group_by(city) %>%
  summarize(
    percent_attend = mean(attend) * 100,
    percent_fail = mean(fail) * 100,
    percent_both = mean(both) * 100)
So if there is some way to take those resulting data frames and stack them on top of each other, that would get close to what I'm hoping for as the end result.
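The stacking described above could be sketched roughly as follows. This is only a sketch under assumptions: `toy` is a made-up stand-in for `d` with plain numeric columns (not the `haven_labelled` ones in the real data), and `pct_by` is a hypothetical helper name, not a function from any package.

```r
library(dplyr)

# Toy stand-in for `d` (made-up values, plain numeric columns)
toy <- tibble(
  city   = c(6, 6, 9, 12),
  race   = c(2, 3, 3, 2),
  gender = c(0, 1, 0, 1),
  attend = c(1, 0, 0, 1),
  fail   = c(0, 1, 0, 1),
  both   = c(0, 0, 0, 1)
)

# Hypothetical helper: percent of each outcome within one grouping column.
# The grouping column is renamed to `value` so every result has the same
# shape and can be stacked.
pct_by <- function(data, var) {
  data %>%
    group_by(value = .data[[var]]) %>%
    summarize(across(c(attend, fail, both), ~ mean(.x) * 100),
              .groups = "drop") %>%
    mutate(Group = paste0(var, value), .before = 1) %>%
    select(-value)
}

# One summary per grouping column, stacked row-wise into a single table
stacked <- bind_rows(lapply(c("race", "gender", "city"), pct_by, data = toy))
stacked
```

With the real data, the labelled columns would probably need converting first (for instance with `haven::as_factor()`) so that the Group column shows the labels rather than the codes. The stacked frame could then be passed to something like `knitr::kable(stacked, digits = 1)`.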
A shortened example of how I hope the final table will be organized:
Group | Attend | Fail | Both |
---|---|---|---|
Race1 | X% | X% | X% |
Race2 | X% | X% | X% |
Male | X% | X% | X% |
Female | X% | X% | X% |
City6 | X% | X% | X% |
City9 | X% | X% | X% |
City12 | X% | X% | X% |
Data:
d<-structure(list(city = structure(c(9, 6, 9, 12, 12, 6, 6, 12,
12, 6, 6, 9, 12, 12, 6, 6, 9, 6, 9, 6, 6, 12, 12, 12, 6, 12,
9, 6, 12, 6), format.stata = "%9.0g"), race = structure(c(3,
3, 3, 3, 3, 3, 2, 3, 3, 3, 3, 3, 2, 2, 2, 2, 2, 2, 2, 3, 2, 2,
2, 2, 2, 2, 2, 3, 3, 2), format.stata = "%9.0g", labels = c(White = 1,
Black = 2, Hispanic = 3, Other = 4), class = c("haven_labelled",
"vctrs_vctr", "double")), gender = structure(c(0, 1, 0, 1, 0,
0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1,
1, 0, 0, 0), label = "gender of subject", format.stata = "%12.0g", labels = c(female = 0,
male = 1), class = c("haven_labelled", "vctrs_vctr", "double"
)), attend = structure(c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), format.stata = "%9.0g"),
fail = structure(c(0, 1, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 1,
1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1), format.stata = "%9.0g"),
both = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)), row.names = c(NA,
-30L), class = c("tbl_df", "tbl", "data.frame"))