Inconsistent bar widths when month-scaling in ggplot
I searched for a while but could not find a post discussing a similar issue. I am having a problem with date scaling in ggplot, which I think is related to the way ggplot handles dates. I am trying to remove all of the white space between the columns, because my end result will be similar to the following capacity planning chart, which shows projects spanning the months:

End Goal

The gap between the columns appears inconsistent even with default scaling and width parameters, which led me to believe that ggplot may be drawing all of the columns at a set width but placing them on the x-axis relative to the number of days in each month. However, even after creating a "month length" variable to scale each column independently, I am unable to close all of the gaps without causing other columns to overlap. Can someone provide some insight into how ggplot is handling dates?

Here is some reproducible code to play with:

library(lubridate)
library(ggplot2)
library(scales)

Months <- c("2015-01-01", "2015-02-01", "2015-03-01", "2015-04-01", "2015-05-01", 
            "2015-06-01", "2015-07-01", "2015-08-01", "2015-09-01", "2015-10-01")
Months <- as.Date(Months)
Totals <- c(1330, 1010, 950, 1110, 1020, 1160, 1320, 880, 620, 320)
df <- data.frame(Months, Totals)
df$MonthLength <- days_in_month(df$Months)  # the column is named Months; it is already a Date

ggplot()+
  geom_bar(data=df, aes (x = Months, y = Totals, fill = Months, width = MonthLength), 
                    position = "dodge", stat = "identity", alpha = 0.80)+
  coord_cartesian(ylim = c(0, 1600))+
  scale_x_date(breaks="1 month",limits=c(as.Date("2015-01-01"),as.Date("2015-10-01")), 
                                labels = date_format("%b-%Y"))
Overfill answered 9/11, 2015 at 18:48 Comment(4)
Looks to me like the separation between the months is constant, and the width parameter you are setting is simply the width of the bar that is drawn. I think if you want something different than that, then you will have to program it from scratch using grid or something. – Especially
@flaco777, may I ask why you unaccepted my answer below? Can I do anything to improve it? – Sennar
This question is under discussion on meta. – Rickard
@Sennar Sure. After I reviewed the answer later on, I realized it didn't answer the original question, which was a question about how ggplot was handling date plotting. It was more a question of why I can't use the date labels in ggplot to handle this, rather than how to find a workaround. – Overfill
S
22

I think what is happening is that ggplot puts each tick mark, and hence each bar, at the 15th of the month. Therefore, if you make the bars larger, the February bar overlaps with March but falls short of January. (I would call that a bug, but who am I ;-))
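The overlap and gap described above can be checked by computing the bar edges by hand. This is only a sketch: it assumes each bar is centred on the x value it is given (the 1st of the month in the question's data), that it extends width/2 on either side, and that a Date axis is measured in days. The names `month_start`, `width_days` and `gap_to_next` are made up for illustration.

```r
# First four months from the question's data
month_start <- as.Date(c("2015-01-01", "2015-02-01", "2015-03-01", "2015-04-01"))

# Days in each month: distance to the first day of the following month
next_start <- seq(month_start[1], by = "month",
                  length.out = length(month_start) + 1)[-1]
width_days <- as.numeric(next_start - month_start)   # 31 28 31 30

# Assumed bar edges: x - width/2 and x + width/2, in days since 1970-01-01
xmin <- as.numeric(month_start) - width_days / 2
xmax <- as.numeric(month_start) + width_days / 2

# Positive = gap in days before the next bar, negative = overlap
gap_to_next <- xmin[-1] - xmax[-length(xmax)]
gap_to_next
#> [1]  1.5 -1.5  0.5
```

Under these assumptions the January and February bars are 1.5 days apart while February and March overlap by 1.5 days, which matches the uneven spacing in the question.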

The way around this is to convert your months to a factor and then set width=1:

df$Fmonth <- factor(month(df$Months))  # month number as a discrete factor
ggplot(data=df, aes (x = Fmonth, y = Totals, fill = Months)) +
  geom_bar(position = "dodge", stat = "identity", alpha = 0.80, width=1) +
  coord_cartesian(ylim = c(0, 1600))


Another way around it might be to calculate the middle of each month by hand, position your data there, and then follow your original route of setting the bar width to MonthLength.
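A minimal sketch of that mid-month idea, using the data from the question. Note the swap: instead of `geom_bar` with `width` inside `aes()`, it uses `geom_tile`, which documents `width` and `height` as aesthetics and centres each tile on `x` and `y`. The column name `MidMonth` is an assumption introduced here, and the result has not been compared against the asker's target chart.

```r
library(lubridate)
library(ggplot2)

Months <- as.Date(c("2015-01-01", "2015-02-01", "2015-03-01", "2015-04-01",
                    "2015-05-01", "2015-06-01", "2015-07-01", "2015-08-01",
                    "2015-09-01", "2015-10-01"))
Totals <- c(1330, 1010, 950, 1110, 1020, 1160, 1320, 880, 620, 320)
df <- data.frame(Months, Totals)

# Width of each bar in days, and the midpoint of each month
df$MonthLength <- as.numeric(days_in_month(df$Months))
df$MidMonth <- df$Months + df$MonthLength / 2   # hypothetical helper column

# Each tile is centred on (MidMonth, Totals / 2), so it spans the whole
# month horizontally and runs from 0 to Totals vertically
p <- ggplot(df) +
  geom_tile(aes(x = MidMonth, y = Totals / 2,
                width = MonthLength, height = Totals), alpha = 0.8) +
  scale_x_date(date_breaks = "1 month", date_labels = "%b-%Y") +
  coord_cartesian(ylim = c(0, 1600))
p
```

Because every tile now starts on the 1st and ends where the next month begins, neighbouring bars should touch without overlapping.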

Sennar answered 9/11, 2015 at 22:39 Comment(2)
@Overfill Can I ask why you un-accepted this answer? – Sennar
The same thing happened to me @RHA. In my case, even the question had a bounty associated. The OP first accepted my answer without awarding me the bounty, and one day later awarded me the bounty and at the same exact time unaccepted my answer. My answer was fine, and as in your case it was the only one. These are people who do not value the effort put in an answer, which is FREE. You have to move on, but I blame MSE for my particular case. Anyway, +1 for you, your answer is good. – Mouthwash
W
0

I get around it by setting width in aes(). This works in ggplot2 3.4.4, but it is not really supported: it produces a warning, and the maintainers are considering disabling it: https://github.com/tidyverse/ggplot2/issues/3142

library(dplyr, warn.conflicts = FALSE)
library(lubridate, warn.conflicts = FALSE)
library(ggplot2, warn.conflicts = FALSE)

set.seed(1)

df <- tibble(
  date = seq.Date(ymd("2020-01-01"), ymd("2020-12-01"), by = "1 month"),
  quantity = sample(20:100, 12),
  ndays = days_in_month(date)  # width for different months
) |> 
  mutate(date = date + ndays / 2)  # reposition to fix the overlaps

df
#> # A tibble: 12 × 3
#>    date       quantity ndays
#>    <date>        <int> <int>
#>  1 2020-01-16       87    31
#>  2 2020-02-15       58    29
#>  3 2020-03-16       20    31
#>  4 2020-04-16       53    30
#>  5 2020-05-16       62    31
#>  6 2020-06-16       33    30
#>  7 2020-07-16       78    31
#>  8 2020-08-16       70    31
#>  9 2020-09-16       40    30
#> 10 2020-10-16       73    31
#> 11 2020-11-16       26    30
#> 12 2020-12-16       56    31

df |> 
  ggplot() +
  geom_col(aes(date, quantity, width = ndays), alpha = 0.7)
#> Warning in geom_col(aes(date, quantity, width = ndays), alpha = 0.7): Ignoring
#> unknown aesthetics: width

Bar chart by month with touching bars
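If the unsupported `width` aesthetic is a concern, one possible alternative (not part of the original answer) is to draw the bars with `geom_rect`, giving each rectangle explicit start and end dates. This is a sketch only: the data frame `rects` and its columns `start` and `end` are names invented here, and the `quantity` values are copied from the printed tibble above.

```r
library(ggplot2)

# Month boundaries: each bar runs from the 1st of its month
# to the 1st of the following month
rects <- data.frame(
  start = seq(as.Date("2020-01-01"), by = "1 month", length.out = 12),
  end   = seq(as.Date("2020-02-01"), by = "1 month", length.out = 12),
  quantity = c(87, 58, 20, 53, 62, 33, 78, 70, 40, 73, 26, 56)
)

# xmin/xmax/ymin/ymax are regular geom_rect aesthetics,
# so no "unknown aesthetics" warning is expected
ggplot(rects) +
  geom_rect(aes(xmin = start, xmax = end, ymin = 0, ymax = quantity),
            alpha = 0.7)
```

Since each `end` equals the next row's `start`, adjacent bars share an edge regardless of how many days the month has.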

Wendell answered 2/2 at 15:18 Comment(0)
