I am running logistic regressions with a panel data set from survey data and I want to correct the standard errors for the panel design. The weights included in this survey account for sampling probability, panel mortality and post stratification.
If I understood Berger et al 2017 correctly I should use clustered covariances for panel data. However, I am not sure if the commands work for data with survey weights.
I started by estimating the regressions with glm
and correcting the standard errors with coeftest
from the lmtest
package and vcovPL
from the sandwich
package. Subsequently, I used svydesign
and svyglm
from the survey
package to estimate a weighted model and corrected the standard errors again the same way.
In this Question R's sandwich package producing strange results for robust standard errors in linear model Zeileis writes that using svyglm
objects might produce incorrect results. However, I am not sure if this applies only for complex survey weight (with strata etc.) or as well for basic survey weights, since its pointed out here that without the use of strata there is not much difference. However, I don't fully understand the explanation there. Moreover, I wonder if the newly implemented features described in Berger et al 2017 allow for using a svyglm
object. So I don't know if I can use the commands in the sandwich
package for objects of class svyglm
if my design has no strata.
Here is the code I used:
# not weighted
model <- glm(depend_var ~ indep_var1 + indep_var2 ,family=quasibinomial(link='logit'),data=dataset)
m_vcov <- coeftest(model,vcov. = sandwich::vcovPL(x = model, cluster = ~ id_var,order.by = ~ year ,pairwise = T))
# weighted
design.ps <- svydesign(ids=~1, weights=~wgt, data=dataset)
model_wgt <- svyglm(depend_var ~ indep_var1 + indep_var2, design=design.ps,family=quasibinomial(link='logit'),data=dataset)
mwt_vcov <- coeftest(model_wgt, vcov. = sandwich::vcovPL(x = model_wgt, cluster = ~ id_var,order.by = ~ year ,pairwise = T))
Looking at the coefficients provided by the test code above, the results seem kind of reasonable unlike the results provided here: R's sandwich package producing strange results for robust standard errors in linear model
# basic model
> summary(model)
Estimate Std. Error t value Pr(>|t|)
(Intercept) -3.68570 0.01264 -291.68 <2e-16 ***
indep_var1_test 0.37538 0.01111 33.78 <2e-16 ***
indep_var2_test 1.05226 0.01100 95.62 <2e-16 ***
# basic model with SE correction
> m_vcov
Estimate Std. Error z value Pr(>|z|)
(Intercept) -3.685702 0.176121 -20.9271 < 2.2e-16 ***
indep_var1_test 0.375380 0.049817 7.5353 4.874e-14 ***
indep_var2_test 1.052258 0.068763 15.3027 < 2.2e-16 ***
# weighted model
> summary(model_wgt)
Estimate Std. Error t value Pr(>|t|)
(Intercept) -3.89702 0.01751 -222.57 <2e-16 ***
indep_var1_test 0.42373 0.01454 29.15 <2e-16 ***
indep_var2_test 1.05291 0.01439 73.15 <2e-16 ***
# weighted model with SE correction
> mwt_vcov
Estimate Std. Error z value Pr(>|z|)
(Intercept) -3.897021 0.319932 -12.1808 < 2.2e-16 ***
indep_var1_test. 0.423732 0.075202 5.6346 1.755e-08 ***
indep_var2_test 1.052915 0.126569 8.3189 < 2.2e-16 ***
My question is: Can I use the commands above to correct the standard errors?
I guess my question is similar to this unanswered one: https://stats.stackexchange.com/questions/260515/does-coeftest-correctly-use-weights-from-svydesign-in-svyglm-object?rq=1