Add legend to ggplot histogram with different types of aesthetics

I want to add a legend to one of my plots, but the plot uses several different aesthetics, and since I have never created a legend before, I am finding it difficult to work out how to build one.

One of the aesthetics is the fill colour, which I set manually with a vector. The other is a vertical line that I added with geom_vline.

From the graph below, there are three characteristics that I want to add to the legend: 1) The bars with color dark blue, 2) The bars with color light blue and 3) The vertical line.

Does anyone have a suggestion for me on how to code this efficiently?

#df
df <- data.frame(Time_Diff <- runif(1000, 0, 200))


# Show median, IQR range and outliers
colors <- c(rep("blue",3), rep("paleturquoise2",38))
bp_overall <- ggplot(data = df, aes(Time_Diff)) 
bp_overall + 
  geom_histogram(binwidth = 5, fill = colors) + #create histogram
  ggtitle("Time Difference")  +
  xlab("Time in Days") +
  ylab("Amount") +
  geom_vline(xintercept = 3, linetype = "twodash", size = 1, colour = "darkblue") + #show median
  scale_x_continuous(breaks = seq(0, 202, 10)) +
  theme_light() +
  theme(panel.grid.minor = element_blank(),
    panel.border = element_blank(), #remove all border lines
    axis.line.x = element_line(size = 0.5, linetype = "solid", colour = "black"), #add x-axis border line
    axis.line.y = element_line(size = 0.5, linetype = "solid", colour = "black")) + #add y-axis border line
  theme(plot.title = element_text(family = windowsFont("Verdana"), color = "black", size = 14, hjust = 0.5)) +
  theme(axis.title = element_text(family = windowsFont("Verdana"), color = "black", size = 12))

Following Djork's suggestion, I arrived at the script below, which works and which I am happy with. The only thing I am still trying to accomplish is to merge the two legends, so that the Histogram Legend and the Line Legend appear as one coherent whole. Does anyone have a suggestion?

# reformat data
set.seed(1)
df <- data.frame(runif(1000, 0, 200))
colnames(df) <- "Time_Diff"

bp_overall + 
  geom_histogram(data = subset(df, Time_Diff <= 12.5), aes(x = Time_Diff, fill="BAR BLUE"), binwidth = 5) + # subset for blue data, where aes fill is fill group 1 label
  geom_histogram(data = subset(df, Time_Diff > 12.5), aes(x = Time_Diff, fill="BAR TURQUOISE"), binwidth = 5) + # subset for turquoise data, where aes fill is fill group 2 label
  scale_fill_manual("Histogram Legend", values=c("blue", "paleturquoise2")) + # manually assign histogram fill colors
  geom_vline(aes(xintercept = 3, colour="LINE DARK BLUE"), linetype="twodash", size = 1) + # where aes colour is vline label
  scale_colour_manual("Line Legend", values="darkblue") + # manually assign vline colour
  scale_x_continuous(breaks = seq(0, 202, 10)) +
  ggtitle("Time Difference")  +
  xlab("Time in Days") +
  ylab("Amount") +
  theme_light() +
  theme(panel.grid.minor = element_blank(),
        panel.border = element_blank(), 
        axis.line.x = element_line(size = 0.5, linetype = "solid", colour = "black"), 
        axis.line.y = element_line(size = 0.5, linetype = "solid", colour = "black"),
        legend.position = c(0.95, 0.95),
        legend.justification = c("right", "top"),
        legend.box.just = ("right"))
Floccule answered 16/10, 2017 at 7:10 Comment(2)
The easiest way to get this legend is to specify the characteristics inside aes(). To do that, you have to transform your data.frame so that it includes the statistics and a column with the colors. See here or here. – Apparatus
@Jimbou I think I understand what you mean by transforming the data frame. I have done this in my problem (see the code below CONCEPT OF SOLUTION), but I can't figure out how to adjust the code for my ggplot figure. – Floccule

I think @Jimbou's suggestion is preferable, but there is a workaround for creating the legends artificially: assign a character value to the fill aesthetic in geom_histogram and to the colour aesthetic in geom_vline, then set the actual colors in scale_fill_manual and scale_colour_manual.

However, with this approach the fill inside aes() takes only one value (length 1), so you have to subset your df into the blue and turquoise values and plot a histogram for each, with the cutoff determined by your binwidth.

Here is the approach. Note that your data needed reformatting.

# reformat data
set.seed(1)
df <- data.frame(runif(1000, 0, 200))
colnames(df) <- "Time_Diff"


bp_overall <- ggplot(data = df) 
bp_overall +
  geom_histogram(data = subset(df, Time_Diff <= 12.5), aes(x = Time_Diff, fill="BAR BLUE"), binwidth = 5) + # subset for blue data, where aes fill is fill group 1 label
  geom_histogram(data = subset(df, Time_Diff > 12.5), aes(x = Time_Diff, fill="BAR TURQUOISE"), binwidth = 5) + # subset for turquoise data, where aes fill is fill group 2 label
  scale_fill_manual("Histogram Legend", values=c("blue", "paleturquoise2")) + # manually assign histogram fill colors
  geom_vline(aes(xintercept = 3, colour="LINE DARK BLUE"), linetype="twodash", size = 1) + # where aes colour is vline label
  scale_colour_manual("Line Legend", values="darkblue") + # manually assign vline colors
  scale_x_continuous(breaks = seq(0, 202, 10)) +
  ggtitle("Time Difference")  +
  xlab("Time in Days") +
  ylab("Amount") +
  theme_light() +
  theme(panel.grid.minor = element_blank(),
    panel.border = element_blank(), 
    axis.line.x = element_line(size = 0.5, linetype = "solid", colour = "black"), 
    axis.line.y = element_line(size = 0.5, linetype = "solid", colour = "black"))

Then add your remaining theme settings.
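
The 12.5 cutoff used for the subsets is tied to the bins. Assuming ggplot2's default bin placement (when only binwidth is given, bins are centred on multiples of the width), a binwidth of 5 puts bin edges at 2.5, 7.5, 12.5 and so on, so the first three bars fall entirely inside the blue subset. A small sketch for checking the edges on your own data follows; `p_check` and `bins` are illustrative names, not part of the code above.

```r
library(ggplot2)

set.seed(1)
df <- data.frame(Time_Diff = runif(1000, 0, 200))

# Plain histogram with the same binwidth as in the answer
p_check <- ggplot(df, aes(x = Time_Diff)) +
  geom_histogram(binwidth = 5)

# layer_data() returns the computed bins, including their edges
bins <- layer_data(p_check, 1)
print(head(bins[, c("xmin", "xmax", "count")], 4))
```

If you change the binwidth, move the subset cutoff to one of the new `xmax` values so that no bar is split between the two colours.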


EDITED: To answer the question on how to unify the legend and decrease the spacing between the two legend types:

(1) Remove the legend name for the vline by setting it to "" in scale_colour_manual, and change the histogram fill legend name to "Legend" in scale_fill_manual.

(2) Specify the order in which the legends should appear, fill first and then colour, using guides with guide_legend.

(3) Remove the y-spacing between the two legend types by setting legend.spacing.y to 0, and remove the margins on top and bottom using legend.margin in theme.

bp_overall <- ggplot(data = df) 
bp_overall +
  geom_histogram(data = subset(df, Time_Diff <= 12.5), aes(x = Time_Diff, fill="BAR BLUE"), binwidth = 5) + 
  geom_histogram(data = subset(df, Time_Diff > 12.5), aes(x = Time_Diff, fill="BAR TURQUOISE"), binwidth = 5) + 
  scale_fill_manual(name="Legend", values=c("blue", "paleturquoise2")) +
  geom_vline(aes(xintercept = 3, colour="LINE DARK BLUE"), linetype="twodash", size = 1) + 
  scale_colour_manual(name="", values="darkblue") + 
  scale_x_continuous(breaks = seq(0, 202, 10)) +
  ggtitle("Time Difference")  +
  xlab("Time in Days") +
  ylab("Amount") +
  theme_light() +
  theme(panel.grid.minor = element_blank(),
    panel.border = element_blank(),
    axis.line.x = element_line(size = 0.5, linetype = "solid", colour = "black"),
    axis.line.y = element_line(size = 0.5, linetype = "solid", colour = "black"),
    legend.spacing.y = unit(0, "cm"),
    legend.margin=margin(t=0, r=0.5, b=0, l=0.5, unit="cm")) +
  guides(fill = guide_legend(order = 1), 
     colour = guide_legend(order = 2))
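
For comparison, here is a rough sketch of what I understand @Jimbou's suggestion to be: add a grouping column to the data and map it to fill inside aes(), so that a single geom_histogram layer produces both fill legend entries. The column name `bar_group` and the object name `p_grouped` are made up for this example, and the 12.5 cutoff again assumes that the bins produced by binwidth = 5 have an edge there.

```r
library(ggplot2)

set.seed(1)
df <- data.frame(Time_Diff = runif(1000, 0, 200))

# Label each observation with the colour group its bar should belong to
df$bar_group <- ifelse(df$Time_Diff <= 12.5, "BAR BLUE", "BAR TURQUOISE")

p_grouped <- ggplot(df, aes(x = Time_Diff, fill = bar_group)) +
  geom_histogram(binwidth = 5) +  # one layer, fill taken from the data
  scale_fill_manual("Histogram Legend",
                    values = c("BAR BLUE" = "blue",
                               "BAR TURQUOISE" = "paleturquoise2")) +
  geom_vline(aes(xintercept = 3, colour = "LINE DARK BLUE"),
             linetype = "twodash") +
  scale_colour_manual("Line Legend",
                      values = c("LINE DARK BLUE" = "darkblue"))

print(p_grouped)
```

Naming the entries in `values` ties each colour to its label explicitly instead of relying on alphabetical order. The theme and guides() settings from the code above can be added to this plot in the same way.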


Minnesota answered 16/10, 2017 at 9:12 Comment(7)
Thank you for your suggestion. Since you and Jimbou both indicate that there is a neater way to do this, I added a little piece of code to transform the data frame to make it suitable for use in ggplot, but I don't know how to take it from there. Can you check the new code in my problem? – Floccule
I tried to adapt my code to yours, but I get the following error: 'Error: Aesthetics must be either length 1 or the same as the data (77): x, fill, binwidth'. What do you mean by reshaping the df? – Floccule
You will have to provide your data, or a subset that reproduces the error, for us to troubleshoot it, but it is likely that you are giving aes something longer than a single character value, which is what I used above. – Minnesota
Sorry, one more question :). It would be perfect to have just one legend (not a separate Histogram Legend and Line Legend). I get one step closer by removing the "Line Legend" text in the scale_colour_manual line of code (replacing it with ""). But the legend entry for the line is still too far below the rest of the legend entries. Do you know if there is a way to make the "LINE DARK BLUE" entry appear as one of the "Histogram Legend" entries (at least visually)? – Floccule
Please see the edited answer for a workaround to unify the legends. If this works for you, please accept the answer to mark the question as closed. – Minnesota
I understand what you are doing here, but unfortunately I get the error: 'legend.spacing.y' is not a valid theme element name. – Floccule
Which version of ggplot2 are you running? I am running the most current one, ggplot2_2.2.1; otherwise, check your syntax. – Minnesota
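
One way to answer the version question from within R is sketched below. It assumes, to the best of my knowledge, that legend.spacing.y became available with ggplot2 2.2.0; treat that threshold as an assumption and check the package NEWS if in doubt. `has_legend_spacing` is only an illustrative name.

```r
library(ggplot2)

# Version of ggplot2 that is actually installed and loaded
installed_version <- packageVersion("ggplot2")
print(installed_version)

# Assumed threshold: legend.spacing.y is taken to exist from 2.2.0 onwards
has_legend_spacing <- installed_version >= "2.2.0"

if (!has_legend_spacing) {
  message("Consider updating with install.packages(\"ggplot2\")")
}
```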
