Use curved lines in bumps chart

I'm trying to make a bumps chart (like parallel coordinates but with an ordinal x-axis) to show ranking over time. I can make a straight-line chart very easily:

library(ggplot2)
set.seed(47)

df <- as.data.frame(as.table(replicate(8, sample(4))), responseName = 'rank')
df$Var2 <- as.integer(df$Var2)

head(df)
#>   Var1 Var2 rank
#> 1    A    1    4
#> 2    B    1    2
#> 3    C    1    3
#> 4    D    1    1
#> 5    A    2    3
#> 6    B    2    4

ggplot(df, aes(Var2, rank, color = Var1)) + geom_line() + geom_point()

Wonderful. Now, though, I want to make the connecting lines curved. Despite never having more than one y per x, geom_smooth offers some possibilities. loess seems like it should work, as it can ignore points except the closest. However, even with tweaking the best I can get still misses lots of points and overshoots others where it should be flat:

ggplot(df, aes(Var2, rank, color = Var1)) + 
    geom_smooth(method = 'loess', span = .7, se = FALSE) + 
    geom_point()

I've tried a number of other splines, like ggalt::geom_xspline, but they all still overshoot or miss the points:

ggplot(df, aes(Var2, rank, color = Var1)) + ggalt::geom_xspline() + geom_point()

Is there an easy way to curve these lines? Do I need to build my own sigmoidal spline? To clarify, I'm looking for something like D3.js's d3.curveMonotoneX which hits every point and whose local maxima and minima do not exceed the y values:

[Image: d3.curveMonotoneX example]

Ideally it would probably have a slope of 0 at each point, too, but that's not absolutely necessary.
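
To make the "sigmoidal spline" idea concrete, here is a minimal base-R sketch of one way it could look. The helper name `smoothstep_interp` is hypothetical (it is not from any package), and the smoothstep blend `3t^2 - 2t^3` is just one assumed choice of sigmoid. It passes through every point, has zero slope at each point, and never leaves the range spanned by two neighbouring y values:

```r
# Hypothetical helper (not from any package): smoothstep interpolation.
# Between neighbouring knots it blends y[i] into y[i + 1] with
# s(t) = 3t^2 - 2t^3, which has zero slope at t = 0 and t = 1.
# xout is assumed to lie within range(x); x must be strictly increasing.
smoothstep_interp <- function(x, y, xout) {
  stopifnot(length(x) == length(y), length(x) >= 2,
            !is.unsorted(x, strictly = TRUE))
  # Index i of the interval [x[i], x[i + 1]] that contains each xout value
  i <- findInterval(xout, x, rightmost.closed = TRUE, all.inside = TRUE)
  t <- (xout - x[i]) / (x[i + 1] - x[i])  # position within the interval, 0..1
  s <- 3 * t^2 - 2 * t^3                  # sigmoid-shaped blend weight
  y[i] + (y[i + 1] - y[i]) * s
}

# Example: one series of ranks over four time points
x <- 1:4
y <- c(4, 3, 3, 1)
grid <- seq(min(x), max(x), length.out = 61)
curve_y <- smoothstep_interp(x, y, grid)
```

Applied per group, the resulting `grid`/`curve_y` pairs could be handed to `geom_line()` as a separate data frame.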

Pushed answered 4/5, 2017 at 0:19 Comment(4)
As per this answer, what about the cobs package? "COBS stands for Constrained B-splines. Possible constraints include going through specific points, setting derivatives to specified values, monotonicity (increasing or decreasing), concavity, convexity, periodicity, etc." I can't immediately get it to work but there's promise. - Dividers
Ooh, that looks promising. I was trying fda::smooth.monotone, but its parameters are ridiculously complex. - Pushed
I think you can do this with loess by tweaking the degree and span: geom_smooth(method = 'loess', span = 0.3, se = FALSE, method.args=list(degree=1)) - Germ
@Germ Interesting, that works! It does spew a lot of warnings, though, and even tweaking the span further, it still fails on my real data, which is a little bigger. - Pushed

Using signal::pchip with a grid of X-values works, at least in your example with numeric axes. A proper geom_ would be nice, but hey...

library(tidyverse)
library(signal)
set.seed(47)

df <- as.data.frame(as.table(replicate(8, sample(4))), responseName = 'rank')
df$Var2 <- as.integer(df$Var2)

head(df)
#>   Var1 Var2 rank
#> 1    A    1    4
#> 2    B    1    2
#> 3    C    1    3
#> 4    D    1    1
#> 5    A    2    3
#> 6    B    2    4

ggplot(df, aes(Var2, rank, color = Var1)) +
  geom_line(data = df %>%
              group_by(Var1) %>%
              do({
                tibble(Var2 = seq(min(.$Var2), max(.$Var2),length.out=100),
                       rank = pchip(.$Var2, .$rank, Var2))
              })) +
  geom_point()

Result: [image of the resulting plot]
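
For anyone curious why this kind of interpolant does not overshoot, here is a rough base-R sketch of the idea. It is not signal's implementation: `pchip_like` is a hypothetical name, the interior slopes use a plain harmonic mean (which I believe matches pchip only for equally spaced x), and the end slopes are simply the first and last secant slopes, which is a simplification. The slopes are set to zero wherever the data change direction or go flat, and the curve is then built with `stats::splinefunH()`:

```r
# Hypothetical helper: a simplified pchip-style interpolant in base R.
# Returns a function that evaluates the curve at new x values.
pchip_like <- function(x, y) {
  n <- length(x)
  stopifnot(n == length(y), n >= 2, !is.unsorted(x, strictly = TRUE))
  d <- diff(y) / diff(x)          # secant slope of each interval
  m <- numeric(n)                 # slope assigned to each knot
  m[1] <- d[1]                    # simplified end conditions (assumption)
  m[n] <- d[n - 1]
  if (n > 2) {
    left  <- d[-(n - 1)]          # secant to the left of each interior knot
    right <- d[-1]                # secant to the right of each interior knot
    same_sign <- left * right > 0
    # Harmonic mean where the data keep their direction, zero at turning
    # points and next to flat intervals
    m[2:(n - 1)] <- ifelse(same_sign, 2 * left * right / (left + right), 0)
  }
  stats::splinefunH(x, y, m)      # cubic Hermite spline through (x, y, m)
}

# Example: one series of ranks over eight time points
x <- 1:8
y <- c(4, 3, 3, 1, 2, 4, 4, 2)
f <- pchip_like(x, y)
grid <- seq(min(x), max(x), length.out = 100)
curve_y <- f(grid)
```

Because every turning point gets a slope of zero and the remaining slopes are kept small relative to the neighbouring secants, each piece of the curve stays between the two ranks it connects.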

Sanorasans answered 4/5, 2017 at 8:48 Comment(1)
Beautiful! I've got to say, I think this is the best first answer I've ever seen. With some slight trickery, you can hack it all into ggplot, too: ggplot(df, aes(Var2, rank, color = Var1)) + geom_point() + lapply(split(df, df$Var1), function(.data){stat_function(aes(color = Var1), .data, fun = function(x){signal::pchip(.data$Var2, .data$rank, x)})}) - Pushed

Building on Henrik's answer, this wraps up pchip (I'm using the one from pracma here but the result is the same) so it can be used alongside existing smooth methods more easily:

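# Fit function in the shape geom_smooth() calls: returns the pchip interpolant, tagged with an S3 class
ggpchip <- function(formula, data, weights) structure(pracma::pchipfun(data$x, data$y), class = 'ggpchip')
# predict() method: evaluate the interpolant at newdata$x; an exact interpolant has no error band, so se is 0
predict.ggpchip <- function(object, newdata, se.fit = FALSE, ...) {
  fit <- unclass(object)(newdata$x)
  if (se.fit) list(fit = data.frame(fit, lwr = fit, upr = fit), se.fit = fit * 0) else fit
}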

Then the actual ggplot call is straightforward:

ggplot(df, aes(Var2, rank, color = Var1)) + geom_smooth(method = 'ggpchip', se = FALSE) + geom_point()

You can then use pchip to smooth other geoms, eg area plots:

ggplot(df, aes(Var2, rank, fill=Var1)) + stat_smooth(method='ggpchip', geom='area', position='fill')
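
If you want to see what the wrapper above has to provide, the contract can be exercised by hand. As far as I understand it, the smoothing stat calls the method with a formula and a data frame holding `x` and `y` columns, then calls `predict()` on the result with a `newdata` data frame holding `x`. The sketch below assumes that contract and swaps in base R's linear `approxfun()` for `pracma::pchipfun()` so it runs without extra packages; `gglinear` is a hypothetical name used only for illustration:

```r
# Stand-in for the ggpchip wrapper, using linear interpolation from base R
# instead of pchip (an assumption made only to avoid extra dependencies).
gglinear <- function(formula, data, weights) {
  structure(stats::approxfun(data$x, data$y), class = "gglinear")
}

# S3 predict method: strip the class and call the stored interpolant
predict.gglinear <- function(object, newdata, se.fit = FALSE, ...) {
  unclass(object)(newdata$x)
}

# Call the two pieces the way the smoothing stat is assumed to call them
d <- data.frame(x = 1:4, y = c(4, 3, 3, 1))
model <- gglinear(y ~ x, data = d)
pred <- predict(model, newdata = data.frame(x = c(1, 1.5, 4)))
```

Any interpolator that returns a function of `x` can be dropped into the same two-function pattern.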
Cheesy answered 9/1, 2019 at 12:10 Comment(0)
