I would like to draw a plot with percentage labels per x-axis group. This works fine without empty groups:
# library
library(ggplot2)
library(reshape2)
# example data from reshape2
str(tips)
#> 'data.frame': 244 obs. of 7 variables:
#> $ total_bill: num 17 10.3 21 23.7 24.6 ...
#> $ tip : num 1.01 1.66 3.5 3.31 3.61 4.71 2 3.12 1.96 3.23 ...
#> $ sex : Factor w/ 2 levels "Female","Male": 1 2 2 2 1 2 2 2 2 2 ...
#> $ smoker : Factor w/ 2 levels "No","Yes": 1 1 1 1 1 1 1 1 1 1 ...
#> $ day : Factor w/ 4 levels "Fri","Sat","Sun",..: 3 3 3 3 3 3 3 3 3 3 ...
#> $ time : Factor w/ 2 levels "Dinner","Lunch": 1 1 1 1 1 1 1 1 1 1 ...
#> $ size : int 2 3 3 2 4 4 2 4 2 2 ...
# function to count percentage per day
comp_pct <- function(count, day) {
count / tapply(count, day, sum)[day]
}
# correct plot
ggplot(tips, aes(x = day, group = sex)) +
geom_bar(aes(y = ..prop.., fill = factor(..group..)), stat = "count") +
geom_text(aes(
label = after_stat(scales::percent(comp_pct(count, x))),
y = ..prop..
), stat = "count", position = position_stack(vjust = 0.5)) +
labs(y = "Percent", fill = "sex") +
scale_x_discrete(drop=FALSE) +
scale_y_continuous(labels = scales::percent)
However, after adding an empty level, the labelling with after_stat
does not work anymore as expected. I am not sure if this is caused by the ordering of the output from the tapply()
in comp_pct
. However, I am unable to solve it.
# additional empty level
tips -> tips1
tips1$day <- factor(tips$day, levels=c("NewDay",levels(tips$day)))
levels(tips1$day)
#> [1] "NewDay" "Fri" "Sat" "Sun" "Thur"
# bars OK, labels not OK
ggplot(tips1, aes(x = day, group = sex)) +
geom_bar(aes(y = ..prop.., fill = factor(..group..)), stat = "count") +
geom_text(aes(
label = after_stat(scales::percent(comp_pct(count, x))),
y = ..prop..
), stat = "count", position = position_stack(vjust = 0.5)) +
labs(y = "Percent", fill = "sex") +
scale_x_discrete(drop=FALSE) +
scale_y_continuous(labels = scales::percent)
#> Warning: Removed 2 rows containing missing values (geom_text).
Created on 2022-04-02 by the reprex package (v2.0.1)
ggplot
's trickystat
functions, why not just pre-calculate what you need and plot that directly? It would also be easier to help if you could provide the output ofdput(tips)
. – Periodatetips
is part ofreshape2
, adput
would be very long. I added astr
. – Lyonnaise