Ordering the axis labels in geom_tile

5

12

I have a data frame containing order data for each of 20+ products from each of 20+ countries. I have put it in a highlight table using ggplot2 with code similar to this:

require(ggplot2)
require(reshape)
require(scales)

mydf <- data.frame(industry = c('all industries','steel','cars'), 
    'all regions' = c(250,150,100), americas = c(150,90,60), 
     europe = c(150,60,40), check.names = FALSE)
mydf

mymelt <- melt(mydf, id.var = c('industry'))
mymelt

ggplot(mymelt, aes(x = industry, y = variable, fill = value)) +
    geom_tile() + geom_text(aes(label = value))

Which produces a plot like this:

[Image: highlight table]

In the real plot, the 450 cell table very nicely shows the 'hotspots' where orders are concentrated. The last refinement I want to implement is to arrange the items on both the x-axis and y-axis in alphabetical order. So in the plot above, the y-axis (variable) would be ordered as all regions, americas, then europe and the x-axis (industry) would be ordered all industries, cars and steel. In fact the x-axis is already ordered alphabetically, but I wouldn't know how to achieve that if it were not already the case.

I feel somewhat embarrassed about having to ask this question, as I know there are many similar ones on SO, but sorting and ordering in R remains my personal bugbear and I cannot get this to work. Although I do try, in all except the simplest cases I get lost in a welter of calls to factor, levels, sort, order and with.

Q. How can I arrange the above highlight table so that both y-axis and x-axis are ordered alphabetically?

EDIT: The answers from smillig and joran below do resolve the question with the test data but with the real data the problem remains: I can't get an alphabetical sort. This leaves me scratching my head as the basic structure of the data frame looks the same. Clearly I have omitted something, but what??

> str(mymelt)
'data.frame':   340 obs. of  3 variables:
 $ Industry: chr  "Animal and vegetable products" "Food and beverages" "Chemicals" "Plastic and rubber goods" ...
 $ variable: Factor w/ 17 levels "Other areas",..: 17 17 17 17 17 17 17 17 17 17 ...
 $ value   : num  0.000904 0.000515 0.007189 0.007721 0.000274 ...

However, applying the with statement doesn't result in levels with an alphabetical sort.

> with(mymelt,factor(variable,levels = rev(sort(unique(variable)))))

  [1] USA                   USA                   USA                  
  [4] USA                   USA                   USA                  
  [7] USA                   USA                   USA                  
 [10] USA                   USA                   USA                  
 [13] USA                   USA                   USA                  
 [16] USA                   USA                   USA                  
 [19] USA                   USA                   Canada               
 [22] Canada                Canada                Canada               
 [25] Canada                Canada                Canada               
 [28] Canada                Canada                Canada    

All the way down to:

 [334] Other areas           Other areas           Other areas          
 [337] Other areas           Other areas           Other areas          
 [340] Other areas

And if you do a levels() it seems to show the same thing:

 [1] "Other areas"           "Oceania"               "Africa"               
 [4] "Other Non-Eurozone"    "UK"                    "Other Eurozone"       
 [7] "Holland"               "Germany"               "Other Asia"           
[10] "Middle East"           "ASEAN-5"               "Singapore"            
[13] "HK/China"              "Japan"                 "South Central America"
[16] "Canada"                "USA"  

That is, the non-reversed version of the above.

The following shot shows what the plot of the real data looks like. As you can see, the x-axis is sorted and the y-axis is not. I'm perplexed. I'm missing something but can't see what it is.

[Image: screenshot of plot with real data]

Spongy answered 22/7, 2012 at 9:14 Comment(0)
6

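The y-axis on your chart is also already ordered alphabetically, but starting from the origin, so it reads from bottom to top. I think you can achieve the order of the axes that you want by using xlim and ylim (this relies on industry and variable both being factors, since levels() returns NULL for a character column). For example: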

ggplot(mymelt, aes(x = industry, y = variable, fill = value)) +
    geom_tile() + geom_text(aes(label = value)) +
    ylim(rev(levels(mymelt$variable))) + xlim(levels(mymelt$industry))

will order the y-axis from all regions at the top, followed by americas, and then europe at the bottom (which is reverse alphabetical order, technically). The x-axis is alphabetically ordered from all industries to steel with cars in between.
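If the columns might be character rather than factor (for example with stringsAsFactors = FALSE), one way to sidestep levels() altogether is to build the limits from the label text itself. This is only a sketch: the data frame below re-creates the melted toy data by hand, and x_order and y_order are names introduced here for illustration.

```r
library(ggplot2)

# Toy long-format data mirroring the question; columns are plain character
mymelt <- data.frame(
  industry = rep(c("all industries", "steel", "cars"), times = 3),
  variable = rep(c("all regions", "americas", "europe"), each = 3),
  value    = c(250, 150, 100, 150, 90, 60, 150, 60, 40),
  stringsAsFactors = FALSE
)

# Build the limits from the label text, so it does not matter whether
# the columns are character or factor
x_order <- sort(unique(as.character(mymelt$industry)))
y_order <- rev(sort(unique(as.character(mymelt$variable))))

p <- ggplot(mymelt, aes(x = industry, y = variable, fill = value)) +
  geom_tile() +
  geom_text(aes(label = value)) +
  xlim(x_order) +
  ylim(y_order)
```

Printing p draws the plot. The y limits are reversed because the first limit is drawn nearest the origin, at the bottom.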


Otis answered 22/7, 2012 at 11:5 Comment(8)
Thank you for your answer. If I take the ggplot call above and slot it into my code, I get this error: Error in UseMethod("limits") : no applicable method for 'limits' applied to an object of class "NULL". Was there something else you did in addition to adding the xlim and ylim statements? - Spongy
No, I just ran your code exactly as it appears above except for adding the xlim and ylim bits. I'm sorry that I don't know what that error means. - Otis
I suspect it's because levels(mymelt$industry) returns NULL. I appreciate the attempt. - Spongy
It shouldn't though. For me, levels(mymelt$industry) gives [1] "all industries" "cars" "steel". What does str(mymelt) tell you? Both industry and variable should be Factors. - Otis
Well, machines vary. I have options(stringsAsFactors=FALSE) in my startup - that's probably the cause. - Spongy
Then wouldn't mymelt$industry <- as.factor(mymelt$industry) solve your problem? - Otis
I added mydf$industry <- as.factor(mydf$industry) and the example above now seems to work. Will experiment with the real data and report back. Cheers - Spongy
The suggested changes do reverse the y-axis (mymelt$variable) but they don't leave it sorted alphabetically. - Spongy
4

As smillig says, the default is already to order the axes alphabetically, but the y axis will be ordered from the lower left corner up.

The basic rule with ggplot2 that applies to almost anything that you want in a specific order is:

  • If you want something to appear in a particular order, you must make the corresponding variable a factor, with the levels sorted in your desired order.

In this case, all you should need to do is this:

mymelt$variable <- with(mymelt,factor(variable,levels = rev(sort(unique(variable)))))

which should work regardless of whether you're running R with stringsAsFactors = TRUE or FALSE.

This principle applies to ordering axis labels, ordering bars, ordering segments within bars, ordering facets, etc.
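As a rough illustration of that rule applied to both axes at once, here is a sketch with made-up data (the orders data frame and its values are invented for the example, and the columns start out as plain character):

```r
library(ggplot2)

# Made-up long-format data with plain character columns
orders <- data.frame(
  industry = rep(c("steel", "cars", "all industries"), times = 3),
  region   = rep(c("europe", "all regions", "americas"), each = 3),
  value    = c(60, 40, 150, 150, 100, 250, 90, 60, 150),
  stringsAsFactors = FALSE
)

# x-axis: alphabetical, left to right
orders$industry <- factor(orders$industry,
                          levels = sort(unique(orders$industry)))

# y-axis: reversed, because the first level is drawn at the bottom
orders$region <- factor(orders$region,
                        levels = rev(sort(unique(orders$region))))

levels(orders$industry)  # "all industries" "cars" "steel"
levels(orders$region)    # "europe" "americas" "all regions"

p <- ggplot(orders, aes(x = industry, y = region, fill = value)) +
  geom_tile() +
  geom_text(aes(label = value))
```

Checking levels() before plotting is a cheap way to confirm the axis order you are about to get.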

For continuous variables there is a convenient scale_*_reverse() but apparently not for discrete variables, which would be a nice addition, I think.
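In more recent ggplot2 releases the discrete scales accept a function for limits, which gives much the same effect as a reverse scale. The sketch below assumes such a version is installed (3.3.0 or later, as far as I recall); df is a made-up data frame.

```r
library(ggplot2)

# Made-up data; character columns are plotted in alphabetical order by default
df <- data.frame(
  industry = rep(c("cars", "steel"), times = 2),
  region   = rep(c("americas", "europe"), each = 2),
  value    = c(60, 90, 40, 60)
)

# limits = rev flips whatever order the scale would otherwise use,
# so the alphabetical order reads from top to bottom
p <- ggplot(df, aes(x = industry, y = region, fill = value)) +
  geom_tile() +
  geom_text(aes(label = value)) +
  scale_y_discrete(limits = rev)
```

This leaves the data untouched, which can be handy when the same factor is used elsewhere in a different order.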

Fortenberry answered 22/7, 2012 at 15:4 Comment(6)
@Spongy This is very simple: character variables -> default alphabetical ordering. factor variables -> ordered in the order their levels are in. That's all there is to it. - Fortenberry
Thanks, this is actually the bit I can't get right! I have tried something like within(mymelt, variable <- factor(mymelt$variable, levels = mymelt$variable[order(mymelt$variable, decreasing = T)], ordered = TRUE)) but this didn't do the trick. - Spongy
@Spongy (1) Using within means you can omit the mymelt$, (2) try it with levels = sort(levels(variable)) (or rev it if needed). - Fortenberry
Appreciate the help; as shown below, it's not quite getting there, as variable still isn't sorted. within(mymelt, variable <- factor(variable, levels = sort(levels(variable)), ordered = TRUE)) gives: Industry variable value / 1 Animal and vegetable products USA 9.039006e-04 / 2 Food and beverages USA 5.152928e-04 - Spongy
@Spongy If you actually bother to give me a reproducible data set I will demonstrate exactly how to do this. You will be able to copy+paste it and it will work. Until then, I can't help any more. - Fortenberry
Let us continue this discussion in chat. - Fortenberry
1

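Another possibility is to use fct_reorder from the forcats package. Note that this orders the y-axis by value rather than alphabetically.

library(ggplot2)
library(dplyr)    # mutate() and %>%
library(tidyr)    # pivot_longer()
library(forcats)  # fct_reorder()

mydf %>%
  pivot_longer(cols = c('all regions', 'americas', 'europe')) %>%
  mutate(name1 = fct_reorder(name, value, .desc = FALSE)) %>%
  ggplot(aes(x = industry, y = name1, fill = value)) +
  geom_tile() + geom_text(aes(label = value))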
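For the alphabetical order the question actually asks for, forcats can also re-level a factor by passing a function to fct_relevel. A small sketch, assuming a forcats version that accepts a function there; the region vector is made up to mimic a factor whose levels are not in alphabetical order:

```r
library(forcats)

# Made-up factor whose levels are deliberately not alphabetical
region <- factor(c("USA", "Canada", "Japan", "USA"),
                 levels = c("USA", "Canada", "Japan"))

# fct_relevel() calls sort() on the current levels and applies the result
alpha <- fct_relevel(region, sort)
levels(alpha)           # "Canada" "Japan" "USA"

# fct_rev() flips the level order, for an axis that should read top-down
levels(fct_rev(alpha))  # "USA" "Japan" "Canada"
```

The same two calls can go inside mutate() in a pipeline like the one above.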
Sourdine answered 29/7, 2020 at 17:55 Comment(0)
0

maybe this StackOverflow question can help:

Order data inside a geom_tile

specifically the first answer by Brandon Bertelsen:

"Note it's not an ordered factor, it's a factor in the right order"

It helped me to get the right order of the y-axis in a ggplot2 geom_tile plot.

Beep answered 11/7, 2014 at 12:51 Comment(0)
-1

Maybe a little bit late,

with(mymelt,factor(variable,levels = rev(sort(unique(variable)))))

this call doesn't sort alphabetically, because "variable" is already a factor: sort() on a factor follows the existing order of its levels, not the text of the labels.

You should first convert the variable to character with the as.character function and assign the result back, like so:

mymelt$variable <- with(mymelt,factor(variable,levels = rev(sort(unique(as.character(variable))))))
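The difference is easy to see on a tiny factor. This sketch uses a made-up region vector whose levels are deliberately out of alphabetical order, much like the melted column in the question:

```r
# Made-up factor; the levels follow the original column order, not the alphabet
region <- factor(c("USA", "Canada", "Africa", "USA"),
                 levels = c("USA", "Canada", "Africa"))

# sort() on a factor follows the level order, so nothing changes
by_level <- as.character(sort(unique(region)))
by_level  # "USA" "Canada" "Africa"

# Converting to character first gives a true alphabetical sort
by_text <- sort(unique(as.character(region)))
by_text   # "Africa" "Canada" "USA"

# Rebuild the factor with reversed alphabetical levels for the y-axis
fixed <- factor(region, levels = rev(by_text))
levels(fixed)  # "USA" "Canada" "Africa"
```

In this toy case the reversed alphabetical order happens to match the original levels; with real data the two will generally differ.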
Sharpset answered 26/2, 2015 at 6:44 Comment(0)
