R calculate the standard error using bootstrap
I have this array of values:

> df
[1] 2 0 0 2 2 0 0 1 0 1 2 1 0 1 3 0 0 1 1 0 0 0 2 1 2 1 3 1 0 0 0 1 1 2 0 1 3
[38] 1 0 2 1 1 2 2 1 2 2 2 1 1 1 2 1 0 0 0 0 0 0 0 0 0 0 1 0 1 1 0 1 0 0 0 0 0
[75] 0 0 0 0 0 1 1 0 1 1 1 1 3 1 3 0 1 2 2 1 2 3 1 0 0 1

I want to use the boot package to calculate the standard error of the data (see http://www.ats.ucla.edu/stat/r/faq/boot.htm).

So I ran the following:

library(boot)
boot(df, mean, R=10)

and I got this error:

Error in mean.default(data, original, ...) : 
'trim' must be numeric of length one

Can someone help me figure out the problem? Thanks

Coulee answered 20/8, 2013 at 17:38 Comment(1)
What is your function definition for c? The base c function is not suitable for bootstrapping. - Adaline

If you are bootstrapping the mean you can do as follows:

set.seed(1)
library(boot)
x <- rnorm(100)
# boot() calls the statistic with the data and a vector of resampled indices
meanFunc <- function(x, i) mean(x[i])
bootMean <- boot(x, meanFunc, R = 100)
bootMean

ORDINARY NONPARAMETRIC BOOTSTRAP


Call:
boot(data = x, statistic = meanFunc, R = 100)


Bootstrap Statistics :
     original      bias    std. error
t1* 0.1088874 0.002614105  0.07902184

If you pass mean directly as the statistic, you get the same error you saw. boot passes the index vector as the second argument, and mean.default treats its second argument as trim:

bootMean <- boot(x,mean,100)
Error in mean.default(data, original, ...) : 
  'trim' must be numeric of length one
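The failure can be seen without boot at all. The sketch below only illustrates the argument matching: it calls mean with a vector in the second position, which is what boot does internally according to the error message above. The name idx is a hypothetical stand-in for the index vector.

```r
set.seed(1)
x <- rnorm(100)
idx <- seq_along(x)  # hypothetical stand-in for the indices boot supplies

# The second positional argument of mean.default is 'trim', so this call fails
bad <- tryCatch(mean(x, idx), error = function(e) e)

# Subsetting first and then averaging is what the statistic should do
good <- mean(x[idx])
```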
Spot answered 20/8, 2013 at 17:46 Comment(1)
I'm using the bootstrap function and wondering if there's a way to automatically pull the standard error out of the boot call? There doesn't seem to be a subset of "bootMean" that can be called to pull up the individual statistics separately. - Balneal
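On the question in the comment above: the object returned by boot stores the original statistic in $t0 and the bootstrap replicates in the matrix $t, so the printed standard error can be recomputed as the standard deviation of the replicates. A minimal sketch, reusing the names from the answer (boot_se and boot_bias are names introduced here for illustration):

```r
library(boot)

set.seed(1)
x <- rnorm(100)
meanFunc <- function(x, i) mean(x[i])
bootMean <- boot(x, meanFunc, R = 100)

# bootMean$t0: statistic on the original data
# bootMean$t:  R x 1 matrix of bootstrap replicates
boot_se <- sd(bootMean$t[, 1])
boot_bias <- mean(bootMean$t[, 1]) - bootMean$t0
```

Exact values depend on the R version and its random number settings, so they may not match the printout in the answer.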

I have never really used boot, since I don't see what it adds here.

Given that the standard error of the mean is defined as:

sd(sampled.df) / sqrt(length(df))

I believe you can simply use the following function to get this done:

custom.boot <- function(times, data=df) {
  boots <- numeric(times)
  for (i in seq_len(times)) {
    # standard error of the mean for one resample drawn with replacement
    boots[i] <- sd(sample(data, length(data), replace=TRUE))/sqrt(length(data))
  }
  boots
}

You can then average the result yourself, since the function returns one standard error per resample:

# Mean standard error
mean(custom.boot(times=1000))
[1] 0.08998023

Some years later...

I think this is nicer:

times <- 1000
mean(replicate(times, sd(sample(df, replace = TRUE)) / sqrt(length(df))))
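Note that the code above averages the formula-based standard error over resamples. The usual bootstrap estimate of the standard error is instead the standard deviation of the statistic across resamples. A minimal sketch for comparison, where y is a hypothetical stand-in for the data (simulated counts, an assumption made only so the example is self-contained):

```r
set.seed(1)
# Hypothetical stand-in for the question's data (assumption: small counts)
y <- rpois(100, lambda = 0.8)

# Bootstrap estimate: spread of the mean across resamples drawn with replacement
boot_means <- replicate(1000, mean(sample(y, replace = TRUE)))
se_boot <- sd(boot_means)

# Formula-based estimate of the standard error of the mean, for comparison
se_formula <- sd(y) / sqrt(length(y))
```

For the mean the two estimates are typically close. The resampling version also carries over to statistics that have no simple formula.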
Gamekeeper answered 20/8, 2013 at 18:35 Comment(0)

The function c is not sufficient for boot. If you look at the help for boot, you'll see that your function must accept the data and an index vector, so you need to write your own function. It should also return the statistic whose standard error you want, such as the mean.
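Applied to the data in the question, that could look like the sketch below. The name mean_stat, the seed, and R = 1000 are choices made here for illustration; the question used R = 10, which is likely too few replicates for a stable estimate.

```r
library(boot)

# The 100 values printed in the question
df <- c(2, 0, 0, 2, 2, 0, 0, 1, 0, 1, 2, 1, 0, 1, 3, 0, 0, 1, 1, 0,
        0, 0, 2, 1, 2, 1, 3, 1, 0, 0, 0, 1, 1, 2, 0, 1, 3,
        1, 0, 2, 1, 1, 2, 2, 1, 2, 2, 2, 1, 1, 1, 2, 1, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 3, 1, 3, 0, 1, 2, 2, 1,
        2, 3, 1, 0, 0, 1)

# The statistic takes the data and the resampled indices, and returns the mean
mean_stat <- function(data, indices) mean(data[indices])

set.seed(42)  # arbitrary seed, only for reproducibility
b <- boot(df, statistic = mean_stat, R = 1000)  # R = 1000 is an assumed choice

# Bootstrap standard error of the mean: spread of the replicates
se <- sd(b$t[, 1])
```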

Cynth answered 20/8, 2013 at 17:43 Comment(0)
