Order discrete x scale by frequency/value
Asked Answered
B

7

158

I am making a dodged bar chart using ggplot with discrete x scale, the x axis are now arranged in alphabetical order, but I need to rearrange it so that it is ordered by the value of the y-axis (i.e., the tallest bar will be positioned on the left).

I tried order or sort, but result in sort the x-axis, but not the bars respectively.

What have I done wrong?

Beldam answered 15/7, 2010 at 8:11 Comment(0)
C
113

Try manually setting the levels of the factor on the x-axis. For example:

library(ggplot2)
# Automatic levels
ggplot(mtcars, aes(factor(cyl))) + geom_bar()    

ggplot of the cars dataset with factor levels automatically determined

# Manual levels
cyl_table <- table(mtcars$cyl)
cyl_levels <- names(cyl_table)[order(cyl_table)]
mtcars$cyl2 <- factor(mtcars$cyl, levels = cyl_levels)
# Just to be clear, the above line is no different than:
# mtcars$cyl2 <- factor(mtcars$cyl, levels = c("6","4","8"))
# You can manually set the levels in whatever order you please. 
ggplot(mtcars, aes(cyl2)) + geom_bar()

ggplot of the cars dataset with factor levels reordered manually

As James pointed out in his answer, reorder is the idiomatic way of reordering factor levels.

mtcars$cyl3 <- with(mtcars, reorder(cyl, cyl, function(x) -length(x)))
ggplot(mtcars, aes(cyl3)) + geom_bar()

ggplot of the cars dataset with factor levels reordered using the reorder function

Coumas answered 15/7, 2010 at 12:23 Comment(0)
N
216

The best way for me was using vector with categories in order I need as limits parameter to scale_x_discrete. I think it is pretty simple and straightforward solution.

ggplot(mtcars, aes(factor(cyl))) + 
  geom_bar() + 
  scale_x_discrete(limits=c(8,4,6))

enter image description here

Natatorium answered 18/9, 2011 at 22:15 Comment(5)
@HendyIrawan there is no legend unless you've got other dimensions (color, fill) also mapped to the same variable.Mishap
I think this is the best answer. It controls the order of the x-axis values and doesn't transform or affect the data-frame. The use of factor and reorder changes characteristics of the data, albeit within the ggplot() call, and so does more than it need do for the problem at hand.Sealy
This should be the accepted answer !! Why complicate things by writing 2 to 3 lines of code for something you can do in a single elegant (predefined) line of code?Ulises
This also worked for me to order x by value of y: scale_x_discrete(limits = DT$x[order(-DT$y)])+Faunie
If you want to specify the order for all possible levels but don't want unused levels to end up in you plot, check out my answer below.Epistyle
C
113

Try manually setting the levels of the factor on the x-axis. For example:

library(ggplot2)
# Automatic levels
ggplot(mtcars, aes(factor(cyl))) + geom_bar()    

ggplot of the cars dataset with factor levels automatically determined

# Manual levels
cyl_table <- table(mtcars$cyl)
cyl_levels <- names(cyl_table)[order(cyl_table)]
mtcars$cyl2 <- factor(mtcars$cyl, levels = cyl_levels)
# Just to be clear, the above line is no different than:
# mtcars$cyl2 <- factor(mtcars$cyl, levels = c("6","4","8"))
# You can manually set the levels in whatever order you please. 
ggplot(mtcars, aes(cyl2)) + geom_bar()

ggplot of the cars dataset with factor levels reordered manually

As James pointed out in his answer, reorder is the idiomatic way of reordering factor levels.

mtcars$cyl3 <- with(mtcars, reorder(cyl, cyl, function(x) -length(x)))
ggplot(mtcars, aes(cyl3)) + geom_bar()

ggplot of the cars dataset with factor levels reordered using the reorder function

Coumas answered 15/7, 2010 at 12:23 Comment(0)
A
45

You can use reorder:

qplot(reorder(factor(cyl),factor(cyl),length),data=mtcars,geom="bar")

Edit:

To have the tallest bar at the left, you have to use a bit of a kludge:

qplot(reorder(factor(cyl),factor(cyl),function(x) length(x)*-1),
   data=mtcars,geom="bar")

I would expect this to also have negative heights, but it doesn't, so it works!

Aloise answered 15/7, 2010 at 14:39 Comment(4)
I'm shocked this answer doesn't have more upvotes, 90% of the time this is the proper way to do it.Mishap
I think both the factor calls are superfluous. There is an implicit call to factor for the first argument and hte second argument is assumed to be numeric.Locomotor
An explanation that helped me figure out what these solutions were doing under the hood: rstudio-pubs-static.s3.amazonaws.com/…Jannet
desc(reorder(.)) can give you the tallest bar on the left.Folkrock
S
41

Hadley has been developing a package called forcats. This package makes the task so much easier. You can exploit fct_infreq() when you want to change the order of x-axis by the frequency of a factor. In the case of the mtcars example in this post, you want to reorder levels of cyl by the frequency of each level. The level which appears most frequently stays on the left side. All you need is the fct_infreq().

library(ggplot2)
library(forcats)

ggplot(mtcars, aes(fct_infreq(factor(cyl)))) +
geom_bar() +
labs(x = "cyl")

If you wanna go the other way around, you can use fct_rev() along with fct_infreq().

ggplot(mtcars, aes(fct_rev(fct_infreq(factor(cyl))))) +
geom_bar() +
labs(x = "cyl") 

enter image description here

Sheree answered 22/12, 2016 at 6:46 Comment(1)
This works for plot_ly (R plotly) too.Cracknel
D
2

I realize this is old, but maybe this function I created is useful to someone out there:

order_axis<-function(data, axis, column)
{
  # for interactivity with ggplot2
  arguments <- as.list(match.call())
  col <- eval(arguments$column, data)
  ax <- eval(arguments$axis, data)

  # evaluated factors
  a<-reorder(with(data, ax), 
             with(data, col))

  #new_data
  df<-cbind.data.frame(data)
  # define new var
  within(df, 
         do.call("<-",list(paste0(as.character(arguments$axis),"_o"), a)))
}

Now, with this function you can interactively plot with ggplot2, like this:

ggplot(order_axis(df, AXIS_X, COLUMN_Y), 
       aes(x = AXIS_X_o, y = COLUMN_Y)) +
        geom_bar(stat = "identity")

As can be seen, the order_axis function creates another dataframe with a new column named the same but with a _oat the end. This new column has levels in ascending order, so ggplot2 automatically plots in that order.

This is somewhat limited (only works for character or factor and numeric combinations of columns and in ascending order) but I still find it very useful for plotting on the go.

Dye answered 8/10, 2015 at 23:0 Comment(1)
I guess I don't see the advantage of this compared to just using reorder directly. Doesn't ggplot(df, aes(x = reorder(AXIS_X, COLUMN_Y), y = COLUMN_Y)) + ... do the same thing, about as concisely, and without the helper function?Mishap
E
1

@Yuriy Petrovskiy's answer is great if you know the levels you want to plot beforehand. If you don't (e.g. because you don't want to plot levels not present in the data), consider using a limit function instead to specify the order:

library(ggplot2)

my_order <- as.character(c(8,3,4,5,6))    # the `as.character` calls are only
ggplot(mtcars, aes(as.character(cyl))) +  # necessary for the OP's example
  geom_bar() +
  scale_x_discrete(limits = function(x) intersect(my_order, x))

From the documentation of scale_x_discrete:

limits
      One of:
      - NULL to use the default scale values
      - A character vector that defines possible values of the scale and their order
      - A function that accepts the existing (automatic) values and returns new ones

Otherwise your graph would end up like this (might be preferable):

ggplot(mtcars, aes(as.character(cyl))) +
  geom_bar() +
  scale_x_discrete(limits = my_order)

Epistyle answered 4/3, 2021 at 21:9 Comment(0)
A
1

Another option is to manually set the order along the x-axis using fct_relevel from forcats (part of tidyverse). However, for arranging by frequency, @jazzurro provides the best answer by using fct_infreq (also from forcats).

library(tidyverse)

ggplot(iris, aes(
  x = fct_relevel(Species, 'virginica', 'versicolor', 'setosa'),
  y = Petal.Width)) +
  geom_col() +
  xlab("Species") +
  ylab("Petal Width") +
  theme_bw()

Output

enter image description here

Further, the variable needs to be a factor before using fct_relevel inside ggplot. So, just apply factor to the variable first, then use fct_relevel.

ggplot(mtcars, aes(fct_relevel(factor(cyl), "6", "8", "4"))) +
  geom_bar() +
  labs(x = "cyl")

Output

enter image description here

Ailsun answered 26/12, 2021 at 18:32 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.