How can I do a Wald test in Python?

I want to test the joint hypothesis "intercept = 0, beta = 1", so I need a Wald test, and I am using the module 'statsmodels.formula.api'.

But I'm not sure which of the following hypotheses is the correct one to pass to the Wald test.

from statsmodels.datasets import longley
import statsmodels.formula.api as smf
data = longley.load_pandas().data

hypothesis_0 = '(Intercept = 0, GNP = 0)'
hypothesis_1 = '(GNP = 0)'
hypothesis_2 = '(GNP = 1)'
hypothesis_3 = '(Intercept = 0, GNP = 1)'
results = smf.ols('TOTEMP ~ GNP', data).fit()
wald_0 = results.wald_test(hypothesis_0)
wald_1 = results.wald_test(hypothesis_1)
wald_2 = results.wald_test(hypothesis_2)
wald_3 = results.wald_test(hypothesis_3)

print(wald_0)
print(wald_1)
print(wald_2)
print(wald_3)

print(results.summary())

At first I thought hypothesis_3 was the right one.

But the result for hypothesis_1 is the same as the regression F-test in the summary, which I understood to test the hypothesis 'intercept = 0 and beta = 0'.

So I thought that 'wald_test' sets 'intercept = 0' by default.

I'm not sure which one is correct.

Could you please tell me which one is right?

Janes answered 1/5, 2018 at 13:14 Comment(2)
The F-test of the regression tests that all slope coefficients are zero but leaves the intercept unrestricted, i.e. it is the same as hypothesis_1 in this case. – Acevedo
Yes, but in general, why would checking the intercept be necessary? – Tanaka

Hypothesis 3 is the correct joint null hypothesis for the Wald test. Hypothesis 1 is the same as the F-test in the summary output, which tests the hypothesis that all slope coefficients are zero and leaves the intercept unrestricted.
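As a quick check of that equivalence, the following sketch fits the model from the question on the Longley data and compares the Wald F statistic for 'GNP = 0' with the overall F statistic stored on the results object. The variable names `wald_f` and `overall_f` are only illustrative, and `np.squeeze` is used because the statistic may come back as a 1x1 array depending on the statsmodels version.

```python
import numpy as np
import statsmodels.formula.api as smf
from statsmodels.datasets import longley

# Same model as in the question
data = longley.load_pandas().data
results = smf.ols('TOTEMP ~ GNP', data).fit()

# Wald test that restricts only the slope; the intercept stays unrestricted
wald_slope = results.wald_test('(GNP = 0)', use_f=True)
wald_f = float(np.squeeze(wald_slope.statistic))

# Overall regression F statistic, as reported in results.summary()
overall_f = float(np.squeeze(results.fvalue))

print('Wald F for GNP = 0:', wald_f)
print('Regression F-statistic:', overall_f)
```

The two printed numbers should agree up to floating point error, because with a single regressor the overall F-test and the test of 'GNP = 0' impose the same restriction.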

I changed the example to use artificial data, so we can see the effect of different "true" beta coefficients.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
nobs = 100
np.random.seed(987125)
yx = np.random.randn(nobs, 2)
beta0 = 0
beta1 = 1
yx[:, 0] += beta0 + beta1 * yx[:, 1]
data = pd.DataFrame(yx, columns=['TOTEMP', 'GNP'])

hypothesis_0 = '(Intercept = 0, GNP = 0)'
hypothesis_1 = '(GNP = 0)'
hypothesis_2 = '(GNP = 1)'
hypothesis_3 = '(Intercept = 0, GNP = 1)'
results = smf.ols('TOTEMP ~ GNP', data).fit()
wald_0 = results.wald_test(hypothesis_0)
wald_1 = results.wald_test(hypothesis_1)
wald_2 = results.wald_test(hypothesis_2)
wald_3 = results.wald_test(hypothesis_3)

print('H0:', hypothesis_0)
print(wald_0)
print()
print('H0:', hypothesis_1)
print(wald_1)
print()
print('H0:', hypothesis_2)
print(wald_2)
print()
print('H0:', hypothesis_3)
print(wald_3)

In this case, with beta0=0 and beta1=1, hypotheses 2 and 3 both hold. Hypotheses 0 and 1 are not consistent with the simulated data.

The Wald tests reject the false hypotheses and do not reject the true ones, as expected: with this sample size and effect size the tests should have high power.

H0: (Intercept = 0, GNP = 0)
<F test: F=array([[ 58.22023709]]), p=2.167936332972888e-17, df_denom=98, df_num=2>

H0: (GNP = 0)
<F test: F=array([[ 116.33149937]]), p=2.4054199668085043e-18, df_denom=98, df_num=1>

H0: (GNP = 1)
<F test: F=array([[ 0.1205935]]), p=0.7291363441993846, df_denom=98, df_num=1>

H0: (Intercept = 0, GNP = 1)
<F test: F=array([[ 0.0623734]]), p=0.9395692694166834, df_denom=98, df_num=2>
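To see what the joint test for hypothesis 3 computes, the F statistic can be reproduced by hand from the estimated parameters and their covariance matrix. This is a minimal sketch, assuming the default nonrobust covariance and writing the null as R b = q. The names `R`, `q`, `f_manual` and `f_statsmodels` are illustrative, and the data are drawn with a different random generator than above, so the numbers will not match the output shown.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated data with true intercept 0 and true slope 1
rng = np.random.default_rng(12345)
nobs = 100
x = rng.standard_normal(nobs)
y = 0.0 + 1.0 * x + rng.standard_normal(nobs)
data = pd.DataFrame({'TOTEMP': y, 'GNP': x})
results = smf.ols('TOTEMP ~ GNP', data).fit()

# H0: R @ b = q with b = (Intercept, GNP), i.e. Intercept = 0 and GNP = 1
R = np.eye(2)
q = np.array([0.0, 1.0])

# F = (R b - q)' [R cov(b) R']^{-1} (R b - q) / number of restrictions
diff = R @ results.params.to_numpy() - q
cov = R @ results.cov_params().to_numpy() @ R.T
f_manual = float(diff @ np.linalg.solve(cov, diff) / R.shape[0])

# Same hypothesis through the string interface
wald = results.wald_test('(Intercept = 0, GNP = 1)', use_f=True)
f_statsmodels = float(np.squeeze(wald.statistic))

print('manual F:', f_manual)
print('wald_test F:', f_statsmodels)
```

Both restrictions enter the statistic explicitly, which shows that the intercept is only restricted when it is listed in the hypothesis.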

The same check can be repeated for other true parameter values by changing beta0 and beta1.
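One way to run that experiment is to wrap the simulation in a small function and loop over a few parameter pairs. The helper `joint_test_pvalue` below is hypothetical (it is not part of statsmodels), and the seed and the list of parameter pairs are arbitrary choices for illustration.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf


def joint_test_pvalue(beta0, beta1, nobs=100, seed=0):
    """Simulate TOTEMP = beta0 + beta1 * GNP + noise and return the
    p-value of the Wald F-test of H0: Intercept = 0 and GNP = 1."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(nobs)
    y = beta0 + beta1 * x + rng.standard_normal(nobs)
    data = pd.DataFrame({'TOTEMP': y, 'GNP': x})
    results = smf.ols('TOTEMP ~ GNP', data).fit()
    wald = results.wald_test('(Intercept = 0, GNP = 1)', use_f=True)
    return float(np.squeeze(wald.pvalue))


# Only the first pair satisfies the null hypothesis
for beta0, beta1 in [(0, 1), (1, 1), (0, 2)]:
    pvalue = joint_test_pvalue(beta0, beta1)
    print('beta0 =', beta0, 'beta1 =', beta1, 'p-value =', pvalue)
```

With 100 observations and unit noise variance, the pairs that violate the null should give very small p-values, while the p-value for the true pair varies with the random draw.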

Acevedo answered 1/5, 2018 at 14:11 Comment(2)
Why is it necessary to check that the intercept is 0? What does a non-zero intercept imply about the IV/DV relationship? – Tanaka
In some cases we might want to test whether the regression goes through the origin, i.e. has a zero intercept. But it's not a very common case. – Acevedo
