standard error binary variable R

How can I calculate the standard error for a binary variable using R? I have a group of participants performing a task across several conditions. The outcome is either 0 (incorrect) or 1 (correct). I have calculated the mean proportion of correct answers and the standard error (SE) as follows:

mean<-tapply(dataRsp$Accuracy, dataRsp$Condition, FUN=mean)

SE<- with(dataRsp, tapply(Accuracy, Condition, sd)/sqrt(summary(dataRsp$Condition)) )

But the SEs are so extremely tight that they can hardly be correct. Could someone give me some ideas? I found that the following might be the solution:

sqrt(p.est*(1-p.est)/n)

... but I don't know how to implement it in R.

Kilovolt asked 26/7, 2016 at 7:35

Suppose that for variable X there are only 2 outcomes (0/1) and we assume that the chance for success (1) equals p. This means that X follows a Bernoulli(p) distribution.

The mean and variance of X are then p and p*(1-p). The sample proportion, i.e. the mean of n independent observations of X, has variance p*(1-p)/n, where n is your sample size. Now replace p by p.est, where p.est is the observed proportion of correct answers.
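To see where the division by n comes from, here is a short derivation. It assumes the n observations X_1, ..., X_n are independent draws from the same Bernoulli(p) distribution; that independence assumption is mine and may not hold exactly when the same participant contributes many trials.

```latex
\begin{aligned}
\operatorname{E}[X_i] &= p, \qquad \operatorname{Var}(X_i) = p(1-p), \\
\hat{p} &= \frac{1}{n}\sum_{i=1}^{n} X_i, \\
\operatorname{Var}(\hat{p}) &= \frac{1}{n^{2}}\sum_{i=1}^{n}\operatorname{Var}(X_i) = \frac{p(1-p)}{n}, \\
\widehat{\operatorname{SE}}(\hat{p}) &= \sqrt{\frac{\hat{p}\,(1-\hat{p})}{n}}.
\end{aligned}
```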

So if you have a variable called binary with 1's for successes and 0's for failures:

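p.est <- mean(binary)
# length(), not nrow(): nrow() returns NULL for a plain vector
variance <- (p.est*(1-p.est))/length(binary)
# standard error of the estimated proportion
std.err <- sqrt(variance)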
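To get one SE per condition, as in the question, the same formula can be applied group-wise with tapply. The following is only a sketch: the column names Accuracy and Condition are taken from the question, but the toy data frame and the names n, se.formula and se.sd are made up for illustration.

```r
# Toy data: hypothetical stand-in for dataRsp (values are made up)
dataRsp <- data.frame(
  Condition = factor(rep(c("A", "B"), each = 10)),
  Accuracy  = c(1, 1, 1, 0, 1, 1, 0, 1, 1, 1,   # condition A: 8 of 10 correct
                1, 0, 0, 1, 0, 1, 1, 0, 0, 1)   # condition B: 5 of 10 correct
)

# Proportion correct and number of trials per condition
p.est <- tapply(dataRsp$Accuracy, dataRsp$Condition, mean)
n     <- tapply(dataRsp$Accuracy, dataRsp$Condition, length)

# Standard error of each proportion: sqrt(p * (1 - p) / n)
se.formula <- sqrt(p.est * (1 - p.est) / n)

# For comparison: sd() / sqrt(n). sd() divides by (n - 1), so this
# version is larger by a factor of sqrt(n / (n - 1)).
se.sd <- tapply(dataRsp$Accuracy, dataRsp$Condition, sd) / sqrt(n)

print(cbind(p.est, n, se.formula, se.sd))
```

The two versions differ only by the factor sqrt(n/(n-1)), so for large n the sd()/sqrt(n) approach from the question gives practically the same result as the formula.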

EDIT:

You also said that you found very small SE's, which seemed counterintuitive. Let us take a closer look at the formula for the variance of the estimate: p*(1-p)/n. The largest value the numerator p*(1-p) can take is only 0.25, which happens when p=0.5. Dividing by n (the number of observations) can only make it smaller. Suppose we have p=0.5 and n=100; the variance is then only 0.0025. To find the SE, we take the square root, which gives an SE of 0.05 in this example. With more observations, i.e., n>100, the variance and SE decrease even further (intuition: more data => more certainty => smaller variance/SE).

If the formula for the variance/SE is explained like this, is it still weird that you have small SE's?
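The arithmetic above is easy to check numerically. A small sketch, where se_prop is a hypothetical helper name introduced here and not a built-in R function:

```r
# Standard error of a proportion for probability p and sample size n
se_prop <- function(p, n) sqrt(p * (1 - p) / n)

# The example from the text: p = 0.5, n = 100
se_prop(0.5, 100)        # 0.05

# p * (1 - p) is largest at p = 0.5, so that is the worst case for a given n
p <- seq(0, 1, by = 0.1)
max(p * (1 - p))         # 0.25

# Holding p fixed, the SE shrinks like 1 / sqrt(n):
# quadrupling n halves the SE
n <- c(100, 400, 1600, 6400)
se_prop(0.5, n)          # 0.05, 0.025, 0.0125, 0.00625
```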

Diarthrosis answered 26/7, 2016 at 7:54 Comment(3)
Thank you for your help. The code works well, but I get a value of 0.006, which does not make any sense to me. With an average accuracy of, e.g., 85%, the expected SE should be much larger. Not sure what the underlying reason might be. - Kilovolt
Thank you Marcel. This was very clarifying. With (p*(1-p))=0.13 and n=3290, the variance is very low, and therefore I obtain SE=0.006. I thought it was counterintuitive, but given the large n it makes sense. Thank you. - Kilovolt
@Kilovolt Happy to help! If your question is fully answered, please check the checkmark box just below the up/downvote arrows. - Diarthrosis
