I have a columnar data set from which I am plotting a series of box plots, much like the setup in this example: Boxplot of table using ggplot2
require(ggplot2)
require(reshape2)
ggplot(data = melt(dd), aes(x=variable, y=value)) + geom_boxplot(aes(fill=variable))
However, in my case, each of the boxplots represents a different number of data points. For example, Column A might have 8000 data points, Column B might have 6000, Column C might have 2500, and Column D might have 800.
To help communicate this, I thought I could vary the alpha (transparency) of each box's fill color to reflect the number of data points: the darker the box, the more data points were used in computing the statistics that the boxplot represents.
In the ggplot2 help file for geom_histogram, they use aes(fill=..count..) to shade each bin according to the number of observations in it.
m <- ggplot(movies, aes(x=rating))
m + geom_histogram(aes(fill=..count..))
(Wanted to include a picture of the example histogram here, but can't because I don't have enough reputation points...sorry)
I tried using this with geom_boxplot, but it does not seem to recognize ..count.. there. Here is the line that generates my boxplot:
ggplot(meltedData, aes(x=variable, y=value)) + geom_boxplot(aes(fill=variable), outlier.size = 1) + ylim(-4,3)
Does anyone have any pointers? I know I can set the "alpha" property on geom_boxplot, but how can I apply it to each box individually, based on the number of data points in that box?
Thanks in advance.
I don't know the ..count.. system very well, but I think it works with histograms because of the stat="bin" argument. You may have to just add count to the data itself. – Desiderata
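A minimal sketch of what "adding count to the data itself" could look like. The data frame below is made up to mimic the group sizes mentioned in the question, and the column name n and the alpha range of 0.3 to 1 are arbitrary choices for illustration, not anything taken from the original post. The idea is to store each group's size on every row of the melted data and then map alpha to that column, so the value is constant within each box.

```r
library(ggplot2)

# Hypothetical stand-in for meltedData: four groups of unequal size
set.seed(1)
meltedData <- rbind(
  data.frame(variable = "A", value = rnorm(8000)),
  data.frame(variable = "B", value = rnorm(6000)),
  data.frame(variable = "C", value = rnorm(2500)),
  data.frame(variable = "D", value = rnorm(800))
)

# Add the per-group count to every row (n is an assumed column name);
# only non-missing values are counted
meltedData$n <- ave(meltedData$value, meltedData$variable,
                    FUN = function(v) sum(!is.na(v)))

# Map alpha to the count, which is constant within each box
p <- ggplot(meltedData, aes(x = variable, y = value)) +
  geom_boxplot(aes(fill = variable, alpha = n), outlier.size = 1) +
  scale_alpha_continuous(range = c(0.3, 1))  # assumed range; keeps sparse boxes visible

print(p)
```

With this mapping the group with the most rows is drawn most opaque and the smallest group most transparent. Whether a continuous alpha scale or a few discrete levels reads better will depend on how far apart the group sizes are.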