Generate stochastic random deviates from a density object with R

Asked 14/9, 2009 at 15:22 Answered 14/9, 2009 at 16:34

I have a density object dd created like this:

x1 <- rnorm(1000) 
x2 <- rnorm(1000, 3, 2) 
x <- rbind(x1, x2)
dd <- density(x) 
plot(dd)

Which produces this very non-Gaussian distribution:

alt text http://www.cerebralmastication.com/wp-content/uploads/2009/09/nongaus.png

I would ultimately like to get random deviates from this distribution similar to how rnorm gets deviates from a normal distribution.

The way I am trying to crack this is to get the CDF of my kernel and then get it to tell me the variate if I pass it a cumulative probability (inverse CDF). That way I can turn a vector of uniform random variates into draws from the density.

It seems like what I am trying to do should be something basic that others have done before me. Is there a simple way or a simple function to do this? I hate reinventing the wheel.

FWIW I found this R Help article but I can't grok what they are doing and the final output does not seem to produce what I am after. But it could be a step along the way that I just don't understand.

I've considered just going with a Johnson distribution from the suppdists package but Johnson won't give me the nice bimodal hump which my data has.

Verbality answered 14/9, 2009 at 15:22 Comment(2)

more of a stats question than programming... – Salmanazar 14/9, 2009 at 15:33

I know the stats. I want to implement the stats method in a given language. That's programming. – Verbality 14/9, 2009 at 15:34

Alternative approach:

sample(x, n, replace = TRUE)

Peru answered 14/9, 2009 at 16:34 Comment(3)

yeah, I've been over thinking this. If I do sample + a draw from a normal I should end up thickening my tails the same way the kernel would, right? Assuming I param my normal the same way the kerneling method would. – Verbality 14/9, 2009 at 17:2

Yes, add normal rvs with zero mean and sd=bandwidth from density estimate: sample(x, n, replace=TRUE) + rnorm(n,0,sd=0.4214) Simulating like this is discussed in Silverman's 1986 book on density estimation. – Ramayana 14/9, 2009 at 22:54

Or, sampling from the density curve, rather than from the data itself sample(dd$x, prob=dd$y, replace=T). – Fortuitism 19/1, 2014 at 9:47

This is just a mixture of normals. So why not something like:

rmnorm <- function(n,mean, sd,prob) {
    nmix <- length(mean)
    if (length(sd)!=nmix) stop("lengths should be the same.")
    y <- sample(1:nmix,n,prob=prob, replace=TRUE)
    mean.mix <- mean[y]
    sd.mix <- sd[y]
    rnorm(n,mean.mix,sd.mix)
}
plot(density(rmnorm(10000,mean=c(0,3), sd=c(1,2), prob=c(.5,.5))))

This should be fine if all you need are samples from this mixture distribution.

Coralline answered 14/9, 2009 at 16:17 Comment(2)

I like the idea! But my example is oversimplified for illustrative purposes. In reality I don't know the two modes and it might have just one mode and a long ass tail (i.e. leptokurtosis). But I like your example. I could not have programmed that near as succinctly. BTW, i believe are missing a c in : plot(density(rmnorm(10000,mean=c(0,3), sd=c(1,2), prob=c(.5,.5)))) – Verbality 14/9, 2009 at 16:29

Which is why you want Hadley's answer -- resampling it is. Remember that your density plot is /just an estimate/ that also depends on your smoothing parameter. – Ringe 14/9, 2009 at 16:49

Recommended topics

Hot tags