I'm running a weighted t test in Python and am seeing different results. It appears that the issue is the degree of freedom calculation. Would like to understand why I'm seeing different outputs.
Here is some sample code.
In R:
library(weights)
x <- c(373,398,245,272,238,241,134,410,158,125,198,252,577,272,208,260)
y <- c(411,471,320,364,311,390,163,424,228,144,246,371,680,384,279,303)
weightsa = c(rep(1,8), rep(2,8))
weightsb = c(rep(2,8), rep(1,8))
wtd.t.test(x = x,
y = y,
weight = weightsa,
weighty = weightsb, samedata=F)
$test [1] "Two Sample Weighted T-Test (Welch)"
$coefficients
t.value df p.value
-1.88907197 29.93637837 0.06860382
$additional Difference Mean.x Mean.y Std. Err -80.50000
267.12500 347.62500 42.61352
In Python:
import numpy as np
from statsmodels.stats.weightstats import ttest_ind
x = np.asarray([373,398,245,272,238,241,134,410,158,125,198,252,577,272,208,260])
y = np.asarray([411,471,320,364,311,390,163,424,228,144,246,371,680,384,279,303])
weightsa = [1] * 8 + [2] * 8
weightsb = [2] * 8 + [1] * 8
ttest_ind(x, y, usevar='unequal', weights=(weightsa, weightsb))
(-2.3391969704691085, 0.023733058922455107, 45.90244683439944)
P value is .06 in R, .02 in Python.
The R source code uses the Satterthwaite formula for degrees of freedom:
df <- (((vx/n) + (vy/n2))^2)/((((vx/n)^2)/(n - 1)) +
((vy/n2)^2/(n2 - 1)))
The Python function source code also purports to use this formula:
def dof_satt(self):
'''degrees of freedom of Satterthwaite for unequal variance
'''
d1 = self.d1
d2 = self.d2
#this follows blindly the SPSS manual
#except I use ``_var`` which has ddof=0
sem1 = d1._var / (d1.nobs-1)
sem2 = d2._var / (d2.nobs-1)
semsum = sem1 + sem2
z1 = (sem1 / semsum)**2 / (d1.nobs - 1)
z2 = (sem2 / semsum)**2 / (d2.nobs - 1)
dof = 1. / (z1 + z2)
return dof
The numerator here looks the same but the denominator appears quite different.