change x location of violin plot in ggplot2

I am trying to create a violin plot with a continuous variable, converted to a factor, on the x axis. The x values are 0, 3, 5, and 8. When I plot them, the violins are equally spaced. Is there a way to force the violins to sit at x = 0, 3, 5, and 8?

I included some sample data and the line I was essentially trying to run.

     condition movedur
 [1,]         5   0.935
 [2,]         0   1.635
 [3,]         3   0.905
 [4,]         8   0.875
 [5,]         3   1.060
 [6,]         8   1.110
 [7,]         3   1.830
 [8,]         5   1.060
 [9,]         5   1.385
[10,]         5   1.560
[11,]         0   1.335
[12,]         3   0.880
[13,]         0   1.030
[14,]         8   1.300
[15,]         3   1.230
[16,]         3   1.210
[17,]         5   1.710
[18,]         3   1.000
[19,]         0   1.365
[20,]         0   1.000

ggplot(a, aes(x = condition, y = movedur, fill = condition)) +
geom_violin()

When I run the full code I get the image below. But the x axis is equally spaced instead of being spaced by the values.

[Image: violin plot with the violins equally spaced along the x axis]

Tablecloth answered 2/7, 2018 at 16:58 Comment(0)

If you leave condition as an integer/numeric variable for the x axis but use it as a factor for fill, you get the plot you want.

Note that the example dataset already has condition as an integer. If it is a factor in your data, you can convert it with

a$condition = as.numeric(as.character(a$condition))
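
The detour through as.character() matters because calling as.numeric() directly on a factor returns its internal level codes rather than the labels. A minimal sketch of the difference, using a made-up factor (condition_fct is a hypothetical name, not part of the question's data):

```r
# Hypothetical factor holding the same kind of values as condition
condition_fct <- factor(c(5, 0, 3, 8))

# as.numeric() on a factor gives the positions of the levels ("0", "3", "5", "8")
codes <- as.numeric(condition_fct)
codes
#> [1] 3 1 2 4

# Converting to character first recovers the original values
values <- as.numeric(as.character(condition_fct))
values
#> [1] 5 0 3 8
```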

I set the breaks in scale_x_continuous() so that the axis ticks fall on the actual condition values.

ggplot(a, aes(x = condition, y = movedur, fill = factor(condition))) +
     geom_violin() +
     scale_x_continuous(breaks = c(0, 3, 5, 8) )

[Image: resulting violin plot]
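
For a self-contained version, the sample data from the question could be rebuilt as a data frame along these lines. This is only a sketch: the explicit group = condition, deriving the breaks from the data, and the legend title are my additions, not something the question requires.

```r
library(ggplot2)

# Sample data from the question, rebuilt as a data frame
a <- data.frame(
  condition = c(5, 0, 3, 8, 3, 8, 3, 5, 5, 5, 0, 3, 0, 8, 3, 3, 5, 3, 0, 0),
  movedur = c(0.935, 1.635, 0.905, 0.875, 1.06, 1.11, 1.83, 1.06, 1.385, 1.56,
              1.335, 0.88, 1.03, 1.3, 1.23, 1.21, 1.71, 1, 1.365, 1)
)

# x stays numeric so the violins sit at 0, 3, 5, 8;
# group (my addition) makes the one-violin-per-condition split explicit;
# factor() in fill gives a discrete fill scale
p <- ggplot(a, aes(x = condition, y = movedur,
                   group = condition, fill = factor(condition))) +
  geom_violin() +
  scale_x_continuous(breaks = sort(unique(a$condition))) +
  labs(fill = "condition")

p
```

Taking the breaks from sort(unique(a$condition)) avoids hard-coding 0, 3, 5, 8 if the set of conditions changes.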

Copse answered 2/7, 2018 at 17:11 Comment(0)

This happens because violin plots are designed for categorical data on the x axis, so ggplot2 treats the distinct values of condition as categories rather than positions on a continuous axis. To get the desired result, you can insert rows with missing values for the other axis values using complete() from tidyr, as shown below. Note that you need to wrap condition in factor() for the fill aesthetic so that ggplot2 uses a discrete fill scale.

library(tidyverse)
tbl <- structure(list(condition = c(5L, 0L, 3L, 8L, 3L, 8L, 3L, 5L, 5L, 5L, 0L, 3L, 0L, 8L, 3L, 3L, 5L, 3L, 0L, 0L), movedur = c(0.935, 1.635, 0.905, 0.875, 1.06, 1.11, 1.83, 1.06, 1.385, 1.56, 1.335, 0.88, 1.03, 1.3, 1.23, 1.21, 1.71, 1, 1.365, 1)), row.names = c(NA, -20L), class = c("tbl_df", "tbl", "data.frame"), spec = structure(list(cols = list(condition = structure(list(), class = c("collector_integer", "collector")), movedur = structure(list(), class = c("collector_double", "collector"))), default = structure(list(), class = c("collector_guess", "collector"))), class = "col_spec"))

tbl %>%
  complete(condition = 0:8) %>%
  ggplot() +
  geom_violin(aes(x = condition, y = movedur, fill = factor(condition)))
#> Warning: Removed 5 rows containing non-finite values (stat_ydensity).

Created on 2018-07-02 by the reprex package (v0.2.0).
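
The warning about removed rows presumably comes from the placeholder rows that complete() adds for the condition values with no observations (1, 2, 4, 6, 7), each with movedur set to NA. A small sketch on a toy table (toy and filled are hypothetical names, and the four rows are made up for illustration):

```r
library(tidyr)

# Toy data: one made-up observation per condition that actually occurs
toy <- data.frame(
  condition = c(0L, 3L, 5L, 8L),
  movedur = c(1.0, 0.9, 1.1, 1.3)
)

# complete() adds a row for every value in 0:8 that is not yet present
filled <- complete(toy, condition = 0:8)

# The added rows are the ones with a missing movedur
filled[is.na(filled$movedur), ]
```

Here filled has nine rows, five of which carry NA in movedur; those are the rows that geom_violin() would drop with a warning.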

Waterline answered 2/7, 2018 at 17:6 Comment(0)
