Weighted linear regression in R with lm() and svyglm(). Same model, different results - McMap


Asked 27/9, 2020 at 15:8 Answered 27/9, 2020 at 21:59

Solved r linear-regression survey weighted

I want to run a linear regression with survey weights in R. I have seen that this can be done with lm(), which lets me specify the weights to use. It can also be done with svyglm() from the survey package, which fits the regression on a survey design object that has been weighted by the desired variable.

In theory, I see no reason for the results of these two regression models to be different, and the beta estimates are the same. However, the standard errors in each model are different, leading to different p-values and therefore to different levels of significance.

Which model is the most appropriate one? Any help would be greatly appreciated.

Here is the R code:

library(survey)  # provides svydesign() and svyglm()
dat <- read.csv("https://raw.githubusercontent.com/LucasTremlett/questions/master/questiondata.csv")

# Approach 1: lm() with a weights argument
model.weighted1 <- lm(DV ~ IV1 + IV2 + IV3, data = dat, weights = weight)
summary(model.weighted1)

# Approach 2: svyglm() on a weighted survey design object
dat.weighted <- svydesign(ids = ~1, data = dat, weights = ~weight)
model.weighted2 <- svyglm(DV ~ IV1 + IV2 + IV3, design = dat.weighted)
summary(model.weighted2)

Holzer answered 27/9, 2020 at 15:8 Comment(3)

Weighting is tricky; the mathematical/statistical definition of the weights differs across contexts. Which method is appropriate probably depends on what the weights actually mean in the context of your problem. notstatschat.rbind.io/2020/08/04/weights-in-statistics is a very good (IMO) explanation of the differences. – Cashmere 27/9, 2020 at 15:11

I see... Thanks for the helpful link. Based on the article, I think I want to use "sampling weights", as this is data from the European Voter Election Study (which is a survey). Does this mean the second model is more appropriate, as it comes from the "survey" package? The documentation does not really specify which of the three kinds of weights these are, but it does provide the means for weighted and unweighted samples (europeanelectionstudies.net/wp-content/uploads/2019/11/…). From the article it seems that the weights option in lm() uses precision weights. – Holzer 27/9, 2020 at 15:35

Yes, if you're working in a survey-data context, it's highly likely that you want to use svyglm – Cashmere 27/9, 2020 at 15:39

Mostly to confirm what is in the comments already:

lm and svyglm will always give the same point estimates, but will typically give different standard errors. In the terminology I use in the post that @BenBolker already linked (thanks!), lm assumes precision weights and svyglm assumes sampling weights.
For that particular survey data set, you have sampling weights and want svyglm.
From the description of the survey you'd also expect to have a stratum variable, but it looks as though they don't supply it. If they did, it would go into svydesign and would be used to reduce the standard errors in svyglm.
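
To make the three points above concrete, here is a minimal sketch on simulated data (not the survey data from the question). The data frame sim, the stratum column, and all the numbers are made up for illustration; only lm(), svydesign() and svyglm() are real functions. It fits the same model three ways and puts the coefficients and standard errors side by side.

```r
library(survey)

set.seed(1)
n <- 400

# Simulated data: `stratum`, `weight` and every value here are hypothetical
sim <- data.frame(
  stratum = rep(c("a", "b"), each = n / 2),
  IV1     = rnorm(n)
)
sim$DV     <- 1 + 0.5 * sim$IV1 + ifelse(sim$stratum == "a", 0, 2) + rnorm(n)
sim$weight <- ifelse(sim$stratum == "a", 1, 3)  # treated as sampling weights

# lm(): weights are interpreted as precision weights
fit_lm <- lm(DV ~ IV1, data = sim, weights = weight)

# svyglm(): weights are interpreted as sampling weights, no strata declared
des_plain <- svydesign(ids = ~1, weights = ~weight, data = sim)
fit_plain <- svyglm(DV ~ IV1, design = des_plain)

# svyglm() again, this time telling the design about the strata
des_strat <- svydesign(ids = ~1, strata = ~stratum, weights = ~weight, data = sim)
fit_strat <- svyglm(DV ~ IV1, design = des_strat)

# Point estimates side by side
coefs <- cbind(
  lm        = coef(fit_lm),
  svy       = coef(fit_plain),
  svy_strat = coef(fit_strat)
)

# Standard errors side by side
ses <- cbind(
  lm        = sqrt(diag(vcov(fit_lm))),
  svy       = sqrt(diag(vcov(fit_plain))),
  svy_strat = sqrt(diag(vcov(fit_strat)))
)

print(coefs)
print(ses)
```

The coefficient columns should agree up to numerical tolerance, while the standard-error columns differ. How much the stratified design changes the standard errors depends on how strongly the strata are related to the outcome, so treat the simulated numbers as an illustration only.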

Redcoat answered 27/9, 2020 at 21:59 Comment(1)

answers are better than comments anyway. – Cashmere 27/9, 2020 at 22:52
