Clustered standard errors different in plm vs lfe

set.seed(572015) library(lfe) library(plm) library(lmtest) # clustering example x <- c(sapply(sample(1:20), rep, times = 1000)) + rnorm(20*1000, sd = 1) y <- 5 + 10*x + rnorm(20*1000, sd = 10) + c(sapply(rnorm(20, sd = 10), rep, times = 1000)) facX <- factor(sapply(1:20, rep, times = 1000)) mydata <- data.frame(y=y,x=x,facX=facX, state=rep(1:1000, 20)) model <- plm(y ~ x, data = mydata, index = c("facX", "state"), effect = "individual", model = "within") plmTest <- coeftest(model,vcov=vcovHC(model,type = "HC1", cluster="group")) lfeTest <- summary(felm(y ~ x | facX | 0 | facX)) data.frame(lfeClusterSE=lfeTest$coefficients[2], plmClusterSE=plmTest[2]) lfeClusterSE plmClusterSE 1 0.06746538 0.06572588

The difference is in the degrees-of-freedom adjustment. This is the usual first guess when looking for differences in supposedly similar standard errors (see e.g., Different Robust Standard Errors of Logit Regression in Stata and R). Here, the problem can be illustrated when comparing the results from (1) plm+vcovHC, (2) felm, (3) lm+cluster.vcov (from package multiwayvcov).

First, I refit all models:

m1 <- plm(y ~ x, data = mydata, index = c("facX", "state"),
  effect = "individual", model = "within")
m2 <- felm(y ~ x | facX | 0 | facX, data = mydata)
m3 <- lm(y ~ facX + x, data = mydata)

All lead to the same coefficient estimates. For m3 the fixed effects are explicitly reported while they are not for m1 and m2. Hence, for m3 only the last coefficient is extracted with tail(..., 1).

all.equal(coef(m1), coef(m2))
## [1] TRUE
all.equal(coef(m1), tail(coef(m3), 1))
## [1] TRUE

The non-robust standard errors also agree.

se <- function(object) tail(sqrt(diag(object)), 1)
se(vcov(m1))
##          x 
## 0.07002696 
se(vcov(m2))
##          x 
## 0.07002696 
se(vcov(m3))
##          x 
## 0.07002696

And when comparing the clustered standard errors we can now show that felm uses the degrees-of-freedom correction while plm does not:

se(vcovHC(m1))
##          x 
## 0.06572423 
m2$cse
##          x 
## 0.06746538 
se(cluster.vcov(m3, mydata$facX))
##          x 
## 0.06746538 
se(cluster.vcov(m3, mydata$facX, df_correction = FALSE))
##          x 
## 0.06572423

Recommended topics

Hot tags