Different results with weighted t test between R and Python

I'm running a weighted t test in Python and am seeing different results. It appears that the issue is the degree of freedom calculation. Would like to understand why I'm seeing different outputs.

Here is some sample code.

In R:

library(weights)
x <- c(373,398,245,272,238,241,134,410,158,125,198,252,577,272,208,260)
y <- c(411,471,320,364,311,390,163,424,228,144,246,371,680,384,279,303)

weightsa = c(rep(1,8), rep(2,8))
weightsb = c(rep(2,8), rep(1,8))

wtd.t.test(x = x,
           y = y,
           weight = weightsa,
           weighty = weightsb, samedata=F)

$test [1] "Two Sample Weighted T-Test (Welch)"

$coefficients
    t.value          df     p.value 
-1.88907197 29.93637837  0.06860382 

$additional Difference     Mean.x     Mean.y   Std. Err   -80.50000 
267.12500  347.62500   42.61352

In Python:

import numpy as np
from statsmodels.stats.weightstats import ttest_ind
x = np.asarray([373,398,245,272,238,241,134,410,158,125,198,252,577,272,208,260])
y = np.asarray([411,471,320,364,311,390,163,424,228,144,246,371,680,384,279,303])

weightsa = [1] * 8 + [2] * 8
weightsb = [2] * 8 + [1] * 8

ttest_ind(x, y, usevar='unequal', weights=(weightsa, weightsb))

(-2.3391969704691085, 0.023733058922455107, 45.90244683439944)

P value is .06 in R, .02 in Python.
The R source code uses the Satterthwaite formula for degrees of freedom:

df <- (((vx/n) + (vy/n2))^2)/((((vx/n)^2)/(n - 1)) + 
      ((vy/n2)^2/(n2 - 1)))

The Python function source code also purports to use this formula:

def dof_satt(self):
        '''degrees of freedom of Satterthwaite for unequal variance
        '''
        d1 = self.d1
        d2 = self.d2
        #this follows blindly the SPSS manual
        #except I use  ``_var`` which has ddof=0
        sem1 = d1._var / (d1.nobs-1)
        sem2 = d2._var / (d2.nobs-1)
        semsum = sem1 + sem2
        z1 = (sem1 / semsum)**2 / (d1.nobs - 1)
        z2 = (sem2 / semsum)**2 / (d2.nobs - 1)
        dof = 1. / (z1 + z2)
        return dof

The numerator here looks the same but the denominator appears quite different.

Recommended topics

Hot tags