ggplot2 and cumsum()
I have a set of UNIX timestamps and URIs and I'm trying to plot the cumulative count of requests for each URI. I managed to do that for one URI at a time using a dummy column:

x.df$count <- apply(x.df,1,function(row) 1) # Create a dummy column for cumsum
x.df <- x.df[order(x.df$time, decreasing=FALSE),] # Sort
ggplot(x.df, aes(x=time, y=cumsum(count))) + geom_line()

However, that would make roughly 30 plots in my case.

ggplot2 does allow you to plot multiple lines into one plot (I copied this piece of code from here):

ggplot(data=test_data_long, aes(x=date, y=value, colour=variable)) +
    geom_line()

The problem is that, this way, cumsum() would keep accumulating across all URIs instead of restarting at zero for each one.

Does anybody have an idea?

Thwart answered 2/4, 2013 at 15:29 Comment(2)
Can you provide a small example data set that illustrates what you're describing? - Streamway
This sounds like a job for plyr or data.table. Split the data by URI, then do your cumsum on each piece. data.table(x); x[, list(count=.I), by=URI] Or something like that... - Lubbi

Here's some test data. It uses plyr's ddply with transform to calculate the cumulative sum per group first, and then plots the result with ggplot2:

set.seed(45)
DF <- data.frame(grp = factor(rep(1:5, each=10)), x=rep(1:10, 5))
DF <- transform(DF, y=runif(nrow(DF)))

# use plyr to calculate the cumulative sum of y within each grp
require(plyr)
DF.t <- ddply(DF, .(grp), transform, cy = cumsum(y))

# plot
require(ggplot2)
ggplot(DF.t, aes(x=x, y=cy, colour=grp, group=grp)) + geom_line()
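For the timestamp/URI data in the question, the same idea might look roughly like the sketch below. The data is made up, and the column names `time`, `uri` and `n` are assumptions, since the original data set was never posted. It uses base R's ave() rather than plyr to get a running count per URI, and it only plots if ggplot2 is installed.

```r
# Hypothetical request log; the column names `time` and `uri` are assumptions.
set.seed(1)
x.df <- data.frame(
  time = sample(1364900000:1364910000, 30),          # made-up UNIX timestamps
  uri  = sample(c("/a", "/b", "/c"), 30, replace = TRUE)
)

# Sort by time first so the running count follows request order.
x.df <- x.df[order(x.df$time), ]

# Running count of requests within each URI (restarts at 1 for every URI).
x.df$n <- ave(rep(1, nrow(x.df)), x.df$uri, FUN = cumsum)

# One line per URI, all in a single plot.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(x.df, aes(x = time, y = n, colour = uri, group = uri)) +
    geom_line()
  print(p)
}
```

Because the count is computed per URI before plotting, there is no need for cumsum() inside aes(), and each line restarts from 1.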


Tebet answered 2/4, 2013 at 15:54 Comment(5)
Sorry, I didn't understand what you meant by sample data; I'm still rather new to R. Your plot doesn't actually show the cumulative sums, though. The lines would have to be monotonic. (BTW: You wouldn't need cumsum to create that kind of line; ECDF would do the job.) - Thwart
I generated some sample data, as you didn't provide any (see @joran's comment). I guess you are still looking at the old plot? This is monotonic and is the cumulative sum. If you want to see the points, just add + geom_point(). - Tebet
Hi Arun, thanks for your help. The example worked, but I simply didn't get a monotonic plot for the actual data. So I started to play around with the numbers in the example and I think there's a problem with big numbers. Can you reproduce this? - Thwart
@Thwart the onus is on you to provide a reproducible example. - Headward
I think I have identified the problem with my data. It looks like you will not get a monotonic plot if your data is not sorted in increasing order. - Thwart
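A tiny illustration of that last point, using made-up toy data (the column names are assumptions): cumsum() works in row order, so if the rows are not sorted by time, the running total ends up attached to the wrong x values and the line zigzags.

```r
# Hypothetical toy data: four requests, rows deliberately out of time order.
d <- data.frame(time = c(3, 1, 4, 2), count = 1)

# cumsum() runs in row order, so the totals are attached to the wrong times.
d$unsorted <- cumsum(d$count)

# Sorting by time first makes the running total increase along the x axis.
d <- d[order(d$time), ]
d$sorted <- cumsum(d$count)

d[, c("time", "unsorted", "sorted")]
# time 1..4 -> unsorted: 2 4 1 3, sorted: 1 2 3 4
```

Plotting `unsorted` against `time` would not be monotonic, while `sorted` is, which matches what happened with the real data.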
