Transform only one axis to log10 scale with ggplot2
T

4

68

I have the following problem: I would like to visualize a discrete and a continuous variable in a boxplot, where the continuous variable has a few extremely high values. These make the boxplot meaningless (the points and even the "body" of the chart are too small), which is why I would like to show it on a log10 scale. I know I could leave the extreme values out of the visualization, but I do not want to.

Let's see a simple example with diamonds data:

library(ggplot2)
m <- ggplot(diamonds, aes(y = price, x = color))


The problem is not serious here, but I hope you can imagine why I would like to see the values on a log10 scale. Let's try it:

m + geom_boxplot() + coord_trans(y = "log10")


As you can see, the y axis is log10 scaled and looks fine, but there is a problem with the x axis, which makes the plot look very strange.

The problem does not occur with scale_y_log10, but this is not an option for me, as I cannot use a custom formatter this way. E.g.:

m + geom_boxplot() + scale_y_log10() 


My question: does anyone know how to draw the boxplot with a log10 scale on the y axis whose labels can be freely formatted with a formatter function, like in this thread?


Editing the question to help answerers based on answers and comments:

What I am really after: one log10-transformed axis (y) with non-scientific labels. I would like to label it as dollars (formatter=dollar) or with any custom format.

If I try @hadley's suggestion I get the following warnings:

> m + geom_boxplot() + scale_y_log10(formatter=dollar)
Warning messages:
1: In max(x) : no non-missing arguments to max; returning -Inf
2: In max(x) : no non-missing arguments to max; returning -Inf
3: In max(x) : no non-missing arguments to max; returning -Inf

The y-axis labels remain unchanged.

Tamanaha answered 15/1, 2011 at 12:7 Comment(6)
That's a bug in coord_trans - but you can specify custom labels to scale_y_log10...Horntail
Thank you @hadley, I must be missing something, but e.g. + scale_y_continuous(formatter=dollar) just does not work. I cannot see the effect of any formatter I give, and I also get three "In max(x) : no non-missing arguments to max; returning -Inf" warning messages.Tamanaha
@daroczig: The examples I have seen for the formatter argument have all involved quoted names, so perhaps formatter="dollar"?Eleventh
@DWin: I tried with quotes also, but the result is exactly the same.Tamanaha
Formatter doesn't work (yet) but you can still set the labels manually...Horntail
@hadley: I will look into this (manual/custom labels) as well. For now, it looks like transforming the data and using a scale_y_continuous formatter solved the problem. Thanks!Tamanaha
E
53

The simplest approach is to pass the name of the desired log function to the 'trans' argument of scale_x_continuous or scale_y_continuous (older versions of this answer used 'formatter' here):

library(ggplot2)  # which formerly required pkg:plyr
m + geom_boxplot() + scale_y_continuous(trans='log10')

EDIT: Or if you don't like that, then either of these appears to give different but useful results:

m <- ggplot(diamonds, aes(y = price, x = color), log="y")
m + geom_boxplot() 
m <- ggplot(diamonds, aes(y = price, x = color), log10="y")
m + geom_boxplot()

EDIT2 & 3: Further experiments (after discarding the one that attempted successfully to put "$" signs in front of logged values):

# Need a function that accepts an x argument
# wrap desired formatting around numeric result
fmtExpLg10 <- function(x) paste(plyr::round_any(10^x/1000, 0.01) , "K $", sep="")

ggplot(diamonds, aes(color, log10(price))) + 
  geom_boxplot() + 
  scale_y_continuous("Price, log10-scaling", labels = fmtExpLg10)  # label function goes in 'labels', not 'trans'


Note added mid 2017 in comment about package syntax change:

scale_y_continuous(formatter = 'log10') is now scale_y_continuous(trans = 'log10') (ggplot2 v2.2.1)
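
With a reasonably current ggplot2 and scales, the axis transformation and the label formatting are separate arguments, so the original goal (log10 y axis with dollar labels) might be sketched as below. This is only a sketch: it assumes scales::label_dollar() is available (scales 1.1.0 or later), and the object name p is just for illustration.

```r
library(ggplot2)
library(scales)

# Log10-transform the y axis and format the tick labels as dollars.
# The data stay untransformed; only the scale changes.
p <- ggplot(diamonds, aes(x = color, y = price)) +
  geom_boxplot() +
  scale_y_log10(labels = label_dollar())

print(p)
```

Any function that takes a numeric vector of breaks and returns a character vector can be passed to labels in the same way.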

Eleventh answered 15/1, 2011 at 14:35 Comment(13)
Thank you @DWin, but this is not what I was looking for. This way the y axis labels are converted to log10, but the axis itself is not transformed. What I would like to get: one transformed axis (y) with non-scientific labels.Tamanaha
@daroczig: See if this is more satisfactory. I would have sworn that the first time I ran my first solution that I got even powers of ten but I cannot reproduce. Maybe I was so focused on seeing the x-positions that I overlooked the obvious problemsEleventh
Thank you @DWin, I just tested your proposals, but as I can see both commands give back the same: the first image I attached to my question. What I would like to get: the last plots in my question (no. 3 and 4, as they are the same) with customizable label formatting.Tamanaha
@daroczig: The "successful experiment" with "dollarizing" used fmtLg10dlr <- function(x) dollar(log10(x)); m + geom_boxplot() + scale_y_continuous(formatter='fmtLg10dlr') , but it just looks "wrong" to me.Eleventh
I suspect you're trying to do something like ggplot(diamonds, aes(color, log10(price))) + geom_boxplot() + scale_y_continuous(formatter = function(x) format(10 ^ x)) - you need to transform the data and back-transform the labels.Horntail
@DWin and @hadley: thank you both, I just got to the same solution fifteen minutes before, that I have to transform and later retransform the data. See the other answer. Sorry for bothering!Tamanaha
@hadley: Got it. Thks. But shouldn't you fix the ylab, now that it is not logged values at the tick marks?Eleventh
@DWin: please update your answer to first transform the data and then apply the formatter function, as discussed here in the comments, so that I can accept and upvote your answer. Thank you!Tamanaha
@daroczig: I did so and added the fix I was suggesting for scale label.Eleventh
Another similar solution, using sprintf: fmtdol<- function(x)sprintf('$%sK',x/1000)Gangplank
scale_y_continuous(formatter = 'log10') is now scale_y_continuous(trans = 'log10') (ggplot2 v2.2.1)Equitable
getting an error cannot coerce type 'closure' to vector of type 'character' when using a functionRoan
I now get the same error. The 'scales' package appears to have changed its mechanism for handling transformations. User-defined transformations no longer succeed for a variety of reasons, one of them a failure with naming and another a difficulty with scoping. See help(pac='scales', as.trans)Eleventh
S
20

I had a similar problem and this scale worked for me like a charm:

library(scales)  # provides comma()
breaks = 10**(1:10)
scale_y_log10(breaks = breaks, labels = comma(breaks))

If you want the intermediate levels too (e.g. 10^3.5), you need to tweak the formatting:

breaks = 10**(1:10 * 0.5)
m <- ggplot(diamonds, aes(y = price, x = color)) + geom_boxplot()
m + scale_y_log10(breaks = breaks, labels = comma(breaks, digits = 1))

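
Following the comment below about dollar labels, the half-decade breaks can be combined with a dollar formatter. A minimal sketch, assuming current ggplot2 and scales; the names half_decades and p are made up for this example, and accuracy = 1 is one possible choice that rounds labels to whole dollars.

```r
library(ggplot2)
library(scales)

# Breaks at every half decade: 10^0.5, 10^1, ..., 10^5
half_decades <- 10^(seq(1, 10) * 0.5)

p <- ggplot(diamonds, aes(x = color, y = price)) +
  geom_boxplot() +
  # A labelling function formats whichever breaks fall inside the axis range
  scale_y_log10(breaks = half_decades, labels = label_dollar(accuracy = 1))

print(p)
```

Passing a function to labels instead of a pre-formatted vector avoids having to keep breaks and labels the same length by hand.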

Sfax answered 9/2, 2011 at 10:30 Comment(2)
I just noticed this very similar problem has the same solution.Sfax
thank you for pointing my attention to this alternate solution which would be complete with specifying the simple dollar formatter or by writing a custom one: + scale_y_log10(breaks = breaks, labels = dollar(breaks))Tamanaha
V
11

Another solution, using scale_y_log10 with trans_breaks, trans_format and annotation_logticks():

library(ggplot2)

m <- ggplot(diamonds, aes(y = price, x = color))

m + geom_boxplot() +
  scale_y_log10(
    breaks = scales::trans_breaks("log10", function(x) 10^x),
    labels = scales::trans_format("log10", scales::math_format(10^.x))
  ) +
  theme_bw() +
  annotation_logticks(sides = 'lr') +
  theme(panel.grid.minor = element_blank())
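
To see what the two helper functions contribute, they can be called outside of a plot. The sketch below is only an illustration: the names log_breaks_fun, log_labels_fun, brk and lab are invented here, and the exact breaks chosen may differ between scales versions.

```r
library(ggplot2)  # only for the diamonds data
library(scales)

# Build the same break and label functions used in the scale above
log_breaks_fun <- trans_breaks("log10", function(x) 10^x)
log_labels_fun <- trans_format("log10", math_format(10^.x))

price_range <- range(diamonds$price)
brk <- log_breaks_fun(price_range)  # numeric break positions on the original scale
lab <- log_labels_fun(brk)          # plotmath expressions of the form 10^exponent

print(brk)
print(lab)
```

The breaks are computed on the log10 scale and then mapped back with 10^x, which is why they land on round exponents.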

Vudimir answered 8/8, 2018 at 16:42 Comment(2)
Very elegant outputColiseum
In 2020, this is the first answer that copies, pastes n' works. (Yes, I tried them all.) Thanks!Tritheism
T
2

I think I got it at last by doing some manual transformations with the data before visualization:

d <- diamonds
# computing logarithm of prices
d$price <- log10(d$price)

Then define a formatter that back-transforms the logged values for the labels:

formatBack <- function(x) 10^x 
# or with special formatter (here: "dollar")
formatBack <- function(x) paste(round(10^x, 2), "$", sep=' ') 

Then draw the plot with that formatter:

m <- ggplot(d, aes(y = price, x = color))
m + geom_boxplot() + scale_y_continuous(labels = formatBack)  # 'formatter' was replaced by 'labels'

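
As a quick sanity check of the back-transformation, the formatter can be applied to a few logged values by hand. This is a small sketch that reuses the formatBack definition from above; the input values and the name logged are arbitrary.

```r
# Same back-transforming formatter as above
formatBack <- function(x) paste(round(10^x, 2), "$", sep = " ")

# Values as they would sit on the logged axis
logged <- log10(c(100, 1000))

# The labels show the original prices again
print(formatBack(logged))
```

Because the data are logged before plotting, the boxplot statistics are computed on log10(price), just as they are with scale_y_log10.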

Apologies to the community for bothering you with a question I could have solved myself! The funny part is: I was working hard to make this plot work a month ago but did not succeed. After asking here, I got it.

Anyway, thanks to @DWin for motivation!

Tamanaha answered 15/1, 2011 at 17:47 Comment(1)
I think formatter now changed to labels => #10146609Accommodation
