Restrain scattered jitter points within a violin plot by ggplot2

The following code is used to generate the violin plot in ggplot2:

ggplot(violin,aes(x=variable,y=log(value+0.5),color=Group)) + 
  geom_violin(scale="width") + 
  geom_jitter(aes(group=Group), position=position_jitterdodge()) + 
  stat_summary(fun.y="mean",geom="crossbar", mapping=aes(ymin=..y.., ymax=..y..), 
     width=1, position=position_dodge(),show.legend = FALSE) + 
  theme(axis.text.x = element_text(angle = 45, margin=margin(0.5, unit="cm")))

The resulting plot looks like this:

enter image description here

As you can see, some points are jittered outside the boundary of the violin shape, and I need those points to stay inside the violin. I've tried different levels of jittering but haven't had any success. I'd appreciate any pointers on how to achieve this.

Langevin answered 27/6, 2018 at 19:16 Comment(2)
Where does the data violin come from? – Naoise
Maybe try ggforce::geom_sina. – Walt

This is a somewhat old question, but I think there is a better solution.

As @Richard Telford pointed out in a comment, geom_sina (from the ggforce package) is the best solution IMO.

simulate data

df <- data.frame(data=rnorm(1200), 
                 group=rep(c("A","A","A", "B","B","C"),
                           200)
                 )

make plot

library(ggplot2)
library(ggforce)  # provides geom_sina()

ggplot(df, aes(y=data,x=group,color=group)) +
  geom_violin()+
  geom_sina()

result

enter image description here
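For the grouped layout in the question (several groups dodged side by side within each x category), the same idea can be sketched as below. This is only a minimal sketch: the data frame df2 and its columns are simulated stand-ins for the asker's violin data, which is not shown, and scale = "width" is passed to both layers on the assumption that the points should follow the same width scaling as the violins.

```r
library(ggplot2)
library(ggforce)  # provides geom_sina()

# Hypothetical data standing in for the asker's `violin` data frame
set.seed(1)
df2 <- data.frame(
  variable = rep(c("v1", "v2"), each = 200),
  Group    = rep(c("ctrl", "trt"), times = 200),
  value    = rexp(400)
)

# Both layers use their default dodge position, so each group's points
# are drawn over that group's violin
p_sina <- ggplot(df2, aes(x = variable, y = log(value + 0.5), color = Group)) +
  geom_violin(scale = "width") +
  geom_sina(scale = "width")

p_sina
```

If the points and violins do not line up for your data, setting the same position_dodge(width = ...) explicitly on both layers is worth trying.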

Hope this is helpful.

Edirne answered 13/3, 2020 at 15:51 Comment(1)
Hi @akh22, I am glad it works for you. If you like the answer, you may want to upvote it and/or mark it as the answer to your question. Thanks. – Edirne

The package ggbeeswarm provides the geoms geom_quasirandom and geom_beeswarm, which do exactly what you are looking for: https://github.com/eclarke/ggbeeswarm
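As a rough illustration for the grouped, dodged layout in the question, something like the following might work. The data frame df2 is simulated and hypothetical (the asker's data is not shown), and the widths 0.9 and 0.15 are illustrative guesses rather than recommended values. The dodge.width argument of geom_quasirandom is set to match the dodge width of the violins so that each group's points sit over its own violin.

```r
library(ggplot2)
library(ggbeeswarm)  # provides geom_quasirandom() and geom_beeswarm()

# Hypothetical data standing in for the asker's `violin` data frame
set.seed(1)
df2 <- data.frame(
  variable = rep(c("v1", "v2"), each = 200),
  Group    = rep(c("ctrl", "trt"), times = 200),
  value    = rexp(400)
)

p_qr <- ggplot(df2, aes(x = variable, y = log(value + 0.5), color = Group)) +
  # dodge the violins of the two groups within each x category
  geom_violin(scale = "width", position = position_dodge(width = 0.9)) +
  # use the same dodge width for the points; `width` limits their spread
  geom_quasirandom(dodge.width = 0.9, width = 0.15, alpha = 0.5)

p_qr
```

Reducing width pulls the points closer to the centre line of each violin; increasing it spreads them out.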

Hammon answered 27/6, 2018 at 19:41 Comment(0)

Option 1

Using the function geom_quasirandom from the package ggbeeswarm:

The quasirandom geom is a convenient means to offset points within categories to reduce overplotting. Uses the vipor package.

library(ggbeeswarm)
p <- ggplot(mpg, aes(class, hwy))
p + geom_violin(width = 1.3) + geom_quasirandom(alpha = 0.2, width = 0.2)

enter image description here

Option 2

This is not a fully satisfactory answer, because restricting the horizontal jitter defeats the purpose of reducing overplotting. However, you can widen the violins (width = 1.3), use alpha for transparency, and limit the horizontal jitter (width = .02).

p <- ggplot(mpg, aes(class, hwy))
p + geom_violin(width = 1.3) + geom_jitter(alpha = 0.2, width = .02)

enter image description here

Lenlena answered 27/6, 2018 at 22:25 Comment(0)