Filling in the area under a line graph in ggplot2: geom_area()

For the data:

    def.percent period  valence
1   6.4827843   1984-1985   neg
2   5.8232425   1985-1986   neg
3   -2.4003260  1986-1987   pos
4   -3.5994399  1987-1988   pos

If I connect the points with a line, how can I use ggplot2's geom_area() to fill the area under the line with a different color for each valence value ("neg" and "pos")?

I tried this:

ggplot(data, aes(x=period, y=def.percent, group = 1)) +
  geom_area(aes(fill=valence)) +
  geom_line() + geom_point() + geom_hline(yintercept=0)

But R returns the error:

Error: Aesthetics can not vary with a ribbon

The same code works for a different dataset, and I don't understand why it fails here. For example:

library(gcookbook) # For the data set
cb <- subset(climate, Source=="Berkeley")
cb$valence[cb$Anomaly10y >= 0] <- "pos"
cb$valence[cb$Anomaly10y < 0] <- "neg"

ggplot(cb, aes(x=Year, y=Anomaly10y)) +
  geom_area(aes(fill=valence)) +
  geom_line() +
  geom_hline(yintercept=0)
Lowry answered 25/2, 2015 at 21:35 Comment(0)

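This happens because in your case period is a categorical variable (a factor or character column). With a categorical x you need group = 1 to connect the points, which puts both valence values into a single ribbon, and a ribbon cannot change its fill along its length. If you convert period to numeric and drop group = 1, it works fine: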

Data

df <- read.table(header=TRUE, text='  def.percent period  valence
1   6.4827843   1984   neg
2   5.8232425   1985   neg
3   -2.4003260  1986   pos
4   -3.5994399  1987   pos')

Solution

ggplot(df, aes(x=period, y=def.percent)) +
  geom_area(aes(fill=valence)) +
  geom_line() + geom_point() + geom_hline(yintercept=0)

Plot

[Figure: plot produced by the code above]
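
The data above replaces the question's period labels ("1984-1985") with plain years. If the column still holds the original labels, one possible way to get a numeric axis is to extract the starting year. This is only a minimal sketch, assuming every label begins with a four-digit year; the start_year column name is made up for illustration:

```r
library(ggplot2)

# The question's data, with period still stored as text labels
df <- data.frame(
  def.percent = c(6.4827843, 5.8232425, -2.4003260, -3.5994399),
  period      = c("1984-1985", "1985-1986", "1986-1987", "1987-1988"),
  valence     = c("neg", "neg", "pos", "pos")
)

# Assumption: the first four characters of each label are the starting year
df$start_year <- as.numeric(substr(df$period, 1, 4))

# Same plot as above, but with the numeric column on the x axis
p <- ggplot(df, aes(x = start_year, y = def.percent)) +
  geom_area(aes(fill = valence)) +
  geom_line() + geom_point() + geom_hline(yintercept = 0)
```

Printing p draws the plot. The original labels stay available in df$period if they are wanted later as axis labels.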

Heliocentric answered 25/2, 2015 at 21:49 Comment(2)
Note that the working code in the question also has these uncolored regions. - Term
Yes, it would be nice if we could get rid of the uncolored regions. - Lowry

A little data-wrangling is needed to "get rid of the uncoloured regions". Note that these are present in the plot "for a different dataset", but you need to "zoom in" to see this (e.g., focus on the years from 1930 to 1940).

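You need to decide which colour the "uncoloured regions" (the segments where the valence changes) will get. The code below uses lead(), which gives each such segment the fill of the valence that follows it; switch to lag() to give it the fill of the valence that precedes it instead.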

library(tidyverse)

df <- tribble(
  ~def.percent, ~period, ~valence,
  6.4827843, 1984, "neg",
  5.8232425, 1985, "neg",
  -2.4003260, 1986, "pos",
  -3.5994399, 1987, "pos"
)

df |> 
  arrange(period) |>
  mutate(pos_valence = valence == "pos") |>
  mutate(switch = coalesce(valence != lead(valence), FALSE)) |>
  ggplot(aes(x = period, y = def.percent)) +
  geom_ribbon(mapping = aes(ymax = if_else(!pos_valence | (pos_valence & switch), 
                                           def.percent, NA),
                            ymin = 0,
                            fill = FALSE)) +
  geom_ribbon(mapping = aes(ymax = if_else(pos_valence | (!pos_valence & switch), 
                                           def.percent, NA),
                            ymin = 0,
                            fill = TRUE)) +
  geom_line() +
  labs(fill = "valence=='pos'")

Created on 2024-04-05 with reprex v2.1.0
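
To see what the choice between lead() and lag() changes, it can help to look at the switch flag on its own. The sketch below reuses the valence column from the data above; the names switch_lead and switch_lag are only for illustration:

```r
library(dplyr)

valence <- c("neg", "neg", "pos", "pos")

# lead(): flags the last point before the valence changes
switch_lead <- coalesce(valence != lead(valence), FALSE)
switch_lead
# FALSE  TRUE FALSE FALSE

# lag(): flags the first point after the valence changes
switch_lag <- coalesce(valence != lag(valence), FALSE)
switch_lag
# FALSE FALSE  TRUE FALSE
```

The flagged point is the one that gets included in both ribbons. With lead(), the 1985 point joins the "pos" ribbon, so the 1985 to 1986 segment takes the "pos" fill; with lag(), the 1986 point joins the "neg" ribbon, so that segment takes the "neg" fill.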

Trainbearer answered 5/4 at 10:16 Comment(1)
Just what I was looking for! - Torre