ggplot2: sorting a plot

I have a data.frame that is sorted by value from highest to lowest. For example:

x <- structure(list(variable = structure(c(10L, 6L, 3L, 4L, 2L, 8L, 
9L, 5L, 1L, 7L), .Label = c("a", "b", "c", "d", "e", "f", "g", 
"h", "i", "j"), class = c("ordered", "factor")), value = c(0.990683229813665, 
0.975155279503106, 0.928571428571429, 0.807453416149068, 0.717391304347826, 
0.388198757763975, 0.357142857142857, 0.201863354037267, 0.173913043478261, 
0.0496894409937888)), .Names = c("variable", "value"), row.names = c(10L, 
6L, 3L, 4L, 2L, 8L, 9L, 5L, 1L, 7L), class = "data.frame")

ggplot(x, aes(x=variable,y=value)) + geom_bar(stat="identity") + 
 scale_y_continuous("",label=scales::percent) + coord_flip() 

Now, the data is nice and sorted, but when I plot it, the bars come out in factor-level order instead. It's annoying; how do I fix it?

Humfrey answered 19/9, 2010 at 1:29 Comment(3)
With R version 3.2.2, I get an error: scale_y_continuous("", formatter = "percent") : unused argument (formatter = "percent")Septuagesima
Yes, I believe it's scale_y_continuous(labels=percent), and you must also load the scales package.Humfrey
Then I have a new error Error: stat_count() must not be used with a y aesthetic.Septuagesima
D
65

Here are a couple of ways.

The first will order things based on the order seen in the data frame:

x$variable <- factor(x$variable, levels=unique(as.character(x$variable)) )

The second orders the levels based on another variable (value in this case):

x <- transform(x, variable=reorder(variable, -value) ) 
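To see why either approach works, it can help to inspect the factor levels directly, since ggplot2 lays out a discrete axis in level order rather than row order. Below is a minimal base-R sketch; the small data frame `df` is made up for illustration and is not the asker's data.

```r
# Hypothetical toy data, rows already sorted by value (descending)
df <- data.frame(
  variable = factor(c("j", "f", "c"), levels = c("c", "f", "j")),
  value    = c(0.99, 0.98, 0.93)
)

# Sorting the rows did not touch the levels: they are still alphabetical
levels(df$variable)

# Approach 1: take the levels from the order of appearance in the rows
by_row <- factor(df$variable, levels = unique(as.character(df$variable)))
levels(by_row)

# Approach 2: order the levels by another column (descending value)
by_value <- reorder(df$variable, -df$value)
levels(by_value)
```

Both approaches end up with the levels in the order j, f, c here. Note that with coord_flip() the first level is drawn at the bottom of the plot, which is probably why dropping the "-" gives the largest bar on top.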
Diametral answered 19/9, 2010 at 3:31 Comment(3)
The second one consistently provided the result I was looking for without the "-".Humfrey
reorder() will be overwritten by the gdata package. If you're at a loss for why it's not working, this could be why.Humfrey
I wish ggplot2 would be rewritten to make this a little easier. I already sorted my data.frame, so why is the order not respected by the plot?Susuable
I
91

This seems to be what you're looking for:

g <- ggplot(x, aes(reorder(variable, value), value))
g + geom_bar(stat="identity") + scale_y_continuous(labels=scales::percent) + coord_flip()

The reorder() function reorders the levels of variable according to value, so the x-axis items are drawn in order of value instead of alphabetically.
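When a level appears on more than one row, reorder() first summarises the values per level (by default with mean) and then sorts the levels by that summary. A minimal base-R sketch, using made-up data in a hypothetical data frame `df`:

```r
# Hypothetical data: levels "a" and "b" each appear on two rows
df <- data.frame(
  variable = c("a", "a", "b", "b", "c"),
  value    = c(1, 9, 6, 6, 10)
)

# Default summary is the mean per level: a = 5, b = 6, c = 10
by_mean <- reorder(df$variable, df$value)
levels(by_mean)

# A different summary can be passed via FUN, e.g. the maximum per level:
# a = 9, b = 6, c = 10
by_max <- reorder(df$variable, df$value, FUN = max)
levels(by_max)
```

The same call works inside aes(), so the choice of FUN decides which per-level summary drives the axis order.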

Inch answered 21/9, 2010 at 8:24 Comment(4)
Would be good to add an explanation of what this is supposed to do.Ashes
In case someone is having issues with the formatter= argument: this has changed to labels = scales::percent in more recent versions (see https://mcmap.net/q/126438/-show-percent-instead-of-counts-in-charts-of-categorical-variables).Gwenn
What does this do when there are multiple groups?Azure
Unfortunately the x-labels are now numeric, and not the variable's values.Hygienic
R
12

I've recently been struggling with a related issue, discussed at length here: Order of legend entries in ggplot2 barplots with coord_flip() .

As it happens, the reason I had a hard time explaining my issue clearly involved the relation between (the order of) factor levels and coord_flip(), as seems to be the case here.

I get the desired result by adding + xlim(rev(levels(x$variable))) to the ggplot statement:

ggplot(x, aes(x=variable,y=value)) + geom_bar(stat="identity") + 
scale_y_continuous("",labels=scales::percent) + coord_flip() +
xlim(rev(levels(x$variable)))

This reverses the order of the factor levels, as found in the original data frame, on the x-axis, which becomes the y-axis with coord_flip(). Notice that in this particular example the levels also happen to be in alphabetical order, but specifying an arbitrary order of levels within xlim() should work in general.

Ridinger answered 5/9, 2011 at 16:41 Comment(0)
T
5

I don't know why this question was reopened but here is a tidyverse option.

library(tidyverse)  # provides %>%, arrange(), mutate(), fct_reorder() and ggplot()

x %>% 
  arrange(desc(value)) %>%
  mutate(variable=fct_reorder(variable,value)) %>% 
  ggplot(aes(variable,value,fill=variable)) + geom_bar(stat="identity") + 
  scale_y_continuous("",label=scales::percent) + coord_flip() 
Trudytrue answered 18/12, 2018 at 11:53 Comment(4)
It wasn't reopened. Someone updated the parameter arguments to reflect changes to ggplot2 since 2010 :)Humfrey
Isn't it that geom_bar(stat="identity") = geom_col()?Hygienic
@JabroJacob Yes but not sure why you asked since OP used stat identity and just needed a sorted plot.Trudytrue
Just to hint that it can be used insteadHygienic
S
2

You need to make the x variable into an ordered factor with the ordering you want, e.g.:

x <- data.frame(variable=factor(letters[1:5]), value=rnorm(5)) ## example data; variable must be a factor
x <- x[with(x,order(-value)), ] ## Sorting
x$variable <- ordered(x$variable, levels=levels(x$variable)[unclass(x$variable)])

ggplot(x, aes(x=variable,y=value)) + geom_bar(stat="identity") +
   scale_y_continuous("",labels=scales::percent) + coord_flip()

I don't know any better way to do the ordering operation. What I have there will only work if there are no duplicate levels for x$variable.
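The duplicate-level limitation comes from indexing the levels once per row: a level that occurs on two rows then shows up twice in the levels argument. A small base-R sketch of one possible workaround, de-duplicating with unique() first; the vector `v` is made up for illustration:

```r
# Hypothetical factor in which "b" occurs on two rows
v <- factor(c("b", "a", "b"))

# One entry per row, so "b" is repeated: "b" "a" "b"
lv <- levels(v)[unclass(v)]

# Dropping the repeats keeps each level once, in order of first appearance
fixed <- ordered(v, levels = unique(lv))
levels(fixed)
```

This gives the same result as building the levels from unique(as.character(v)).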

Stacy answered 19/9, 2010 at 1:40 Comment(3)
This works for the example I have provided, but it doesn't seem to translate for my actual problem.Humfrey
I've changed the example to provide actual data that I'm working withHumfrey
It doesn't need to be an ordered factor - it just needs to be a factor with the right order.Ornithomancy
