p-value from fisher.test() does not match phyper()

Fisher's exact test is based on the hypergeometric distribution, so I would expect these two commands to return identical p-values. Can anyone explain what I'm doing wrong, since they do not match?

#data (variable names chosen to match dhyper() argument names)
x = 14
m = 20
n = 41047
k = 40

#Fisher test, alternative = 'greater'
(fisher.test(matrix(c(x, m-x, k-x, n-(k-x)),2,2), alternative='greater'))$p.value 
#returns 2.01804e-39

#hypergeometric distribution, lower.tail = F, i.e. P[X > x]
phyper(x, m, n, k, lower.tail = F, log.p = F)
#returns 5.115862e-43
Leghorn answered 29/10, 2018 at 18:48 Comment(1)
To the close voters, there are two ways to answer this question. One involves looking at these two function calls, seeing how they relate, and what might need to be changed to produce the same result. That seems entirely on topic here. The other involves describing the statistical theory behind the function calls, which is probably best asked on another SE site. Since this question was asked here, and is answerable on topic here, I would expect that's what the OP wants. If not, please edit and migrate. (Howsoever)

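The p-value for alternative = 'greater' is P[X >= x], which includes the observed count, whereas phyper(x, m, n, k, lower.tail = F) returns P[X > x]. The call that matches is therefore phyper(x - 1, m, n, k, lower.tail = FALSE). You can confirm this in the source code of fisher.test, following the path taken by your call fisher.test(matrix(c(x, m-x, k-x, n-(k-x)),2,2), alternative='greater'). At line 138, PVAL is set to: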

switch(alternative,
    less = pnhyper(x, or),
    greater = pnhyper(x, or, upper.tail = TRUE),
    two.sided = {
        if (or == 0) as.numeric(x == lo)
        else if (or == Inf) as.numeric(x == hi)
        else {
            relErr <- 1 + 10^(-7)
            d <- dnhyper(or)
            sum(d[d <= d[x - lo + 1] * relErr])
        }
    })

Since alternative = 'greater', PVAL is set to pnhyper(x, or, upper.tail = TRUE). You can see pnhyper defined on line 122. Here or = 1 (the default null odds ratio of fisher.test), which is passed to ncp, and for ncp equal to 1 with upper.tail = TRUE pnhyper reduces to phyper(x - 1, m, n, k, lower.tail = FALSE).

With your values:

x = 14
m = 20
n = 41047
k = 40
phyper(x - 1, m, n, k, lower.tail = FALSE)
# [1] 2.01804e-39
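
As a quick sanity check, the gap between the two calls should be exactly the probability of the observed count, since P[X >= x] = P[X > x] + P[X = x]. The sketch below uses only base R (phyper, dhyper, fisher.test); the names p_gt, p_ge, p_eq, tab and rel_diff are made up here for illustration and do not come from the fisher.test source.

```r
# Values from the question (names follow the phyper() arguments)
x <- 14; m <- 20; n <- 41047; k <- 40

p_gt <- phyper(x, m, n, k, lower.tail = FALSE)      # P[X > x], the call in the question
p_ge <- phyper(x - 1, m, n, k, lower.tail = FALSE)  # P[X >= x], the call in this answer
p_eq <- dhyper(x, m, n, k)                          # P[X = x], probability of the observed count

# Same 2x2 table as in the question
tab <- matrix(c(x, m - x, k - x, n - (k - x)), 2, 2)
p_fisher <- fisher.test(tab, alternative = "greater")$p.value

# Hypothetical helper: relative difference. all.equal() is avoided on purpose,
# because it switches to an absolute tolerance for values this close to zero
# and would report TRUE for almost any pair of tiny numbers.
rel_diff <- function(a, b) abs(a - b) / abs(b)

rel_diff(p_gt + p_eq, p_ge)   # expected to be near machine precision
rel_diff(p_fisher, p_ge)      # expected to be (near) zero
```

Both relative differences should come out negligibly small, which is consistent with fisher.test including the observed table in its one-sided p-value.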
Howsoever answered 29/10, 2018 at 19:48 Comment(0)
