I want to run a linear regression with survey weights in R (using RStudio). One option is the lm() function, whose weights argument lets me specify the weights I want to use. Another is the svyglm() function from the survey package, which fits the regression on a survey design object that carries the weights.
In theory, I see no reason for these two models to give different results, and the beta estimates are indeed the same. However, the standard errors differ between the models, which leads to different p-values and therefore to different levels of significance.
Which model is the more appropriate one? Any help would be greatly appreciated.
Here is the R code:
library(survey)  # provides svydesign() and svyglm()
dat <- read.csv("https://raw.githubusercontent.com/LucasTremlett/questions/master/questiondata.csv")
# Weighted regression with lm()
model.weighted1 <- lm(DV ~ IV1 + IV2 + IV3, data = dat, weights = weight)
summary(model.weighted1)
# Same model fitted on a survey design object that carries the weights
dat.weighted <- svydesign(ids = ~1, data = dat, weights = ~weight)
model.weighted2 <- svyglm(DV ~ IV1 + IV2 + IV3, design = dat.weighted)
summary(model.weighted2)
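To make the discrepancy easier to see, here is a minimal sketch that puts the coefficients and standard errors of both fits side by side. It uses simulated data rather than my real file, so the data frame `sim`, the seed, and every simulated value are assumptions made up purely for illustration; only the model formula and the two fitting calls mirror the code above.

```r
library(survey)  # provides svydesign() and svyglm()

# Simulated stand-in for the real data; all values are invented for illustration
set.seed(1)
n <- 200
sim <- data.frame(IV1 = rnorm(n), IV2 = rnorm(n), IV3 = rnorm(n),
                  weight = runif(n, 0.5, 3))
sim$DV <- 1 + 0.5 * sim$IV1 - 0.3 * sim$IV2 + rnorm(n)

# Fit the same weighted model both ways
fit.lm  <- lm(DV ~ IV1 + IV2 + IV3, data = sim, weights = weight)
des     <- svydesign(ids = ~1, data = sim, weights = ~weight)
fit.svy <- svyglm(DV ~ IV1 + IV2 + IV3, design = des)

# Coefficients and standard errors side by side
comparison <- cbind(
  beta.lm  = coef(fit.lm),
  beta.svy = coef(fit.svy),
  se.lm    = sqrt(diag(vcov(fit.lm))),
  se.svy   = sqrt(diag(vcov(fit.svy)))
)
print(round(comparison, 4))
```

On my real data, the two beta columns agree while the two standard error columns do not, which is exactly the pattern I am asking about.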
– Cashmere