adjusted bootstrap confidence intervals (BCa) with parametric bootstrap in boot package

I am attempting to use boot.ci from R's boot package to calculate bias- and skew-corrected bootstrap confidence intervals from a parametric bootstrap. From my reading of the man pages and experimentation, I've concluded that I have to compute the jackknife estimates myself and feed them into boot.ci, but this isn't stated explicitly anywhere. I haven't been able to find other documentation, although to be fair I haven't looked at the original Davison and Hinkley book on which the code is based ...

If I naively run b1 <- boot(...,sim="parametric") and then boot.ci(b1), I get the error influence values cannot be found from a parametric bootstrap. This error occurs if and only if I specify type="all" or type="bca"; boot.ci(b1,type="bca") gives the same error. So does empinf(b1). The only way I can get things to work is to explicitly compute jackknife estimates (using empinf() with the data argument) and feed these into boot.ci.

Construct data:

set.seed(101)
d <- data.frame(x=1:20,y=runif(20))
m1 <- lm(y~x,data=d)

Bootstrap:

b1 <- boot(d$y,
           statistic=function(yb,...) {
             coef(update(m1,data=transform(d,y=yb)))
           },
           R=1000,
           ran.gen=function(d,m) {
             unlist(simulate(m))
           },
           mle=m1,
           sim="parametric")

Fine so far.

boot.ci(b1)
boot.ci(b1,type="bca")
empinf(b1)

all give the error described above.

This works:

L <- empinf(data=d$y,type="jack",
            stype="i",
            statistic=function(y,f) {
              coef(update(m1,data=d[f,]))
            })

boot.ci(b1,type="bca",L=L)

Does anyone know if this is the way I'm supposed to be doing it?

update: The original author of the boot package responded to an e-mail:

... you are correct that the issue is that you are doing a parametric bootstrap. The bca intervals implemented in boot are non-parametric intervals and this should have been stated explicitely somewhere. The formulae for parametric bca intervals are not the same and depend on derivatives of the least favourable family likelihood when there are nuisance parameters as in your case. (See pp 206-207 in Davison & Hinkley) empinf assumes that the statistic is in one of forms used for non-parametric bootstrapping (which you did in your example call to empinf) but your original call to boot (correctly) had the statistic in a different form appropriate for parametric resampling.

You can certainly do what you're doing but I am not sure of the theoretical properties of mixing parametric resampling with non-parametric interval estimation.

require(boot) lmcoef <- function(data, i){ d <- data[i, ] d.reg <- lm(y~x, d) c(coef(d.reg)) } lmboot <- boot(d, lmcoef, R=999) m1 boot.ci(lmboot, index=2) # I am presuming that the interest is in the x-coefficient #---------------------------------- BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS Based on 999 bootstrap replicates CALL : boot.ci(boot.out = lmboot, index = 2) Intervals : Level Normal Basic 95% (-0.0210, 0.0261 ) (-0.0236, 0.0245 ) Level Percentile BCa 95% (-0.0171, 0.0309 ) (-0.0189, 0.0278 ) Calculations and Intervals on Original Scale Warning message: In boot.ci(lmboot, index = 2) : bootstrap variances needed for studentized intervals

Recommended topics

Hot tags