plm or lme4 for Random and Fixed Effects model on Panel Data

Can I specify a Random and a Fixed Effects model on Panel Data using lme4?

I am redoing Example 14.4 from Wooldridge (2013, p. 494-5) in R. Thanks to this site and this blog post I've managed to do it in the plm package, but I'm curious whether I can do the same in the lme4 package.

Here's what I've done in the plm package. I would be grateful for any pointers as to how I can do the same using lme4. First, the packages needed and the loading of the data:

# install.packages(c("wooldridge", "plm", "stargazer"), dependencies = TRUE)
library(wooldridge) 
data(wagepan)

Second, I estimate the three models from Example 14.4 (Wooldridge 2013) using the plm package:

library(plm) 
Pooled.ols <- plm(lwage ~ educ + black + hisp + exper+I(exper^2)+ married + union +
                  factor(year), data = wagepan, index=c("nr","year") , model="pooling")

random.effects <- plm(lwage ~ educ + black + hisp + exper + I(exper^2) + married + union +
                      factor(year), data = wagepan, index = c("nr","year") , model = "random") 

fixed.effects <- plm(lwage ~ I(exper^2) + married + union + factor(year), 
                     data = wagepan, index = c("nr","year"), model="within")

Third, I output the results using stargazer to emulate Table 14.2 in Wooldridge (2013):

stargazer::stargazer(Pooled.ols,random.effects,fixed.effects, type="text",
           column.labels=c("OLS (pooled)","Random Effects","Fixed Effects"), 
          dep.var.labels = c("log(wage)"), keep.stat=c("n"),
          keep=c("edu","bla","his","exp","marr","union"), align = TRUE, digits = 4)
#> ======================================================
#>                         Dependent variable:           
#>              -----------------------------------------
#>                              log(wage)                
#>              OLS (pooled) Random Effects Fixed Effects
#>                  (1)           (2)            (3)     
#> ------------------------------------------------------
#> educ          0.0913***     0.0919***                 
#>                (0.0052)      (0.0107)                 
#>                                                       
#> black         -0.1392***    -0.1394***                
#>                (0.0236)      (0.0477)                 
#>                                                       
#> hisp            0.0160        0.0217                  
#>                (0.0208)      (0.0426)                 
#>                                                       
#> exper         0.0672***     0.1058***                 
#>                (0.0137)      (0.0154)                 
#>                                                       
#> I(exper2)     -0.0024***    -0.0047***    -0.0052***  
#>                (0.0008)      (0.0007)      (0.0007)   
#>                                                       
#> married       0.1083***     0.0640***      0.0467**   
#>                (0.0157)      (0.0168)      (0.0183)   
#>                                                       
#> union         0.1825***     0.1061***      0.0800***  
#>                (0.0172)      (0.0179)      (0.0193)   
#>                                                       
#> ------------------------------------------------------
#> Observations    4,360         4,360          4,360    
#> ======================================================
#> Note:                      *p<0.1; **p<0.05; ***p<0.01

Is there an equally simple way to do this in lme4? Should I stick to plm? Why/why not?

Pansie answered 28/2, 2018 at 15:24 Comment(4)
Wouldn't this be more suited for Cross Validated? - Radiophone
@Jaap, thank you for your comment. I see it as mainly a programming question, and not really a statistical/Cross Validated question. But I'm happy to move it if you think it belongs on CV. - Pansie
Please note that lme4 works in the maximum likelihood framework, so it won't be the "same". Ch. 7 of plm's vignette has some comparison to the nlme package, which is similar to lme4, and you should be able to take it from there. - Polysyndeton
@Helix123, thank you for your comment. I will look into that. - Pansie

Except for the difference in estimation method, it does indeed seem to be mainly a question of vocabulary and syntax.

# install.packages(c("wooldridge", "plm", "stargazer", "lme4"), dependencies = TRUE)
library(wooldridge) 
library(plm) 
#> Loading required package: Formula
library(lme4)
#> Loading required package: Matrix
data(wagepan)

Your first example is a simple linear model ignoring the groups nr.
You can't do that with lme4 because there is no "random effect" (in the lme4 sense).
This is what Gelman & Hill call a complete pooling approach.

Pooled.ols <- plm(lwage ~ educ + black + hisp + exper+I(exper^2)+ married + 
                      union + factor(year), data = wagepan, 
                  index=c("nr","year"), model="pooling")

Pooled.ols.lm <- lm(lwage ~ educ + black + hisp + exper+I(exper^2)+ married + union +
                      factor(year), data = wagepan)

Your second example seems to be equivalent to a random intercept mixed model with nr as random effect (but the slopes of all predictors are fixed).
This is what Gelman & Hill call a partial pooling approach.

random.effects <- plm(lwage ~ educ + black + hisp + exper + I(exper^2) + married + 
                          union + factor(year), data = wagepan, 
                      index = c("nr","year") , model = "random") 

random.effects.lme4 <- lmer(lwage ~ educ + black + hisp + exper + I(exper^2) + married + 
                                union + factor(year) + (1|nr), data = wagepan) 
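
The two fits are close but not identical because the variance components are estimated differently: plm's random model uses a GLS estimator (Swamy-Arora by default), while lmer uses REML by default. As a rough illustration of what "partial pooling" means on the plm side, here is a minimal base-R sketch on simulated data (not wagepan). It uses the textbook quasi-demeaning form of the random effects estimator for a balanced panel, and it assumes the variance components are known rather than estimated, which neither package does. All names (id, a, x, y, sigma_a, sigma_u, quasi_demeaned_slope) are made up for the example.

```r
set.seed(42)
n_id <- 200                      # number of individuals
n_t  <- 6                        # periods per individual (balanced panel)
id   <- rep(seq_len(n_id), each = n_t)

sigma_a <- 1                     # sd of the individual effect (assumed known here)
sigma_u <- 1                     # sd of the idiosyncratic error (assumed known here)
a <- rnorm(n_id, sd = sigma_a)[id]
x <- rnorm(n_id * n_t)
y <- 1 + 2 * x + a + rnorm(n_id * n_t, sd = sigma_u)

# Quasi-demeaning weight: 0 = complete pooling, 1 = within (no pooling)
theta <- 1 - sqrt(sigma_u^2 / (sigma_u^2 + n_t * sigma_a^2))

# OLS on data from which a fraction theta of each group mean is removed
quasi_demeaned_slope <- function(theta) {
  y_q <- y - theta * ave(y, id)
  x_q <- x - theta * ave(x, id)
  unname(coef(lm(y_q ~ x_q))["x_q"])
}

slopes <- c(pooled = quasi_demeaned_slope(0),
            random = quasi_demeaned_slope(theta),
            within = quasi_demeaned_slope(1))
print(theta)
print(slopes)
```

With theta = 0 this reproduces the pooled OLS slope, with theta = 1 the within (fixed effects) slope, and the random effects estimator uses a theta strictly between the two.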

Your third example seems to correspond to a case where nr is a fixed effect and you compute a different intercept for each nr group.
Again: you can't do that with lme4 because there is no "random effect" (in the lme4 sense).
This is what Gelman & Hill call a "no pooling" approach.

fixed.effects <- plm(lwage ~ I(exper^2) + married + union + factor(year), 
                     data = wagepan, index = c("nr","year"), model="within")

wagepan$nr <- factor(wagepan$nr)
fixed.effects.lm <- lm(lwage ~  I(exper^2) + married + union + factor(year) + nr, 
                     data = wagepan)
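
To see why a within model and lm with one dummy per group give the same slope coefficients, here is a minimal base-R sketch on simulated data (not wagepan). All names (id, alpha, x, y, lsdv, within_fit) are made up for the example. It compares the dummy-variable regression with OLS on group-demeaned data, which is the transformation the "within" estimator is named after.

```r
set.seed(1)
n_id <- 50                                 # number of individuals
n_t  <- 6                                  # periods per individual
id   <- rep(seq_len(n_id), each = n_t)

alpha <- rnorm(n_id)[id]                   # individual-specific intercepts
x <- rnorm(n_id * n_t) + 0.5 * alpha       # regressor correlated with the intercepts
y <- 1 + 2 * x + alpha + rnorm(n_id * n_t)

# "No pooling": one dummy (intercept) per individual
lsdv <- lm(y ~ x + factor(id))

# Within transformation: subtract each individual's mean, then plain OLS
y_dm <- y - ave(y, id)
x_dm <- x - ave(x, id)
within_fit <- lm(y_dm ~ x_dm - 1)

b_lsdv   <- unname(coef(lsdv)["x"])
b_within <- unname(coef(within_fit)["x_dm"])
print(c(lsdv = b_lsdv, within = b_within))
```

The two slopes agree up to numerical precision. The standard errors from the demeaned lm fit should not be compared, though: lm does not know that one mean per individual was removed, so its residual degrees of freedom are too large.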

Compare the results:

stargazer::stargazer(Pooled.ols, Pooled.ols.lm, 
                     random.effects, random.effects.lme4 , 
                     fixed.effects, fixed.effects.lm,
                     type="text",
                     column.labels=c("OLS (pooled)", "lm (pooled)",
                                     "Random Effects", "lme4 partial pool.", 
                                     "Fixed Effects", "lm (no pooling)"), 
                     dep.var.labels = c("log(wage)"), 
                     keep.stat=c("n"),
                     keep=c("edu","bla","his","exp","marr","union"), 
                     align = TRUE, digits = 4)
#> 
#> =====================================================================================================
#>                                                Dependent variable:                                   
#>              ----------------------------------------------------------------------------------------
#>                                                     log(wage)                                        
#>                 panel         OLS         panel            linear           panel           OLS      
#>                 linear                    linear       mixed-effects       linear                    
#>              OLS (pooled) lm (pooled) Random Effects lme4 partial pool. Fixed Effects lm (no pooling)
#>                  (1)          (2)          (3)              (4)              (5)            (6)      
#> -----------------------------------------------------------------------------------------------------
#> educ          0.0913***    0.0913***    0.0919***        0.0919***                                   
#>                (0.0052)    (0.0052)      (0.0107)         (0.0108)                                   
#>                                                                                                      
#> black         -0.1392***  -0.1392***    -0.1394***       -0.1394***                                  
#>                (0.0236)    (0.0236)      (0.0477)         (0.0485)                                   
#>                                                                                                      
#> hisp            0.0160      0.0160        0.0217           0.0218                                    
#>                (0.0208)    (0.0208)      (0.0426)         (0.0433)                                   
#>                                                                                                      
#> exper         0.0672***    0.0672***    0.1058***        0.1060***                                   
#>                (0.0137)    (0.0137)      (0.0154)         (0.0155)                                   
#>                                                                                                      
#> I(exper2)     -0.0024***  -0.0024***    -0.0047***       -0.0047***      -0.0052***     -0.0052***   
#>                (0.0008)    (0.0008)      (0.0007)         (0.0007)        (0.0007)       (0.0007)    
#>                                                                                                      
#> married       0.1083***    0.1083***    0.0640***        0.0635***        0.0467**       0.0467**    
#>                (0.0157)    (0.0157)      (0.0168)         (0.0168)        (0.0183)       (0.0183)    
#>                                                                                                      
#> union         0.1825***    0.1825***    0.1061***        0.1053***        0.0800***      0.0800***   
#>                (0.0172)    (0.0172)      (0.0179)         (0.0179)        (0.0193)       (0.0193)    
#>                                                                                                      
#> -----------------------------------------------------------------------------------------------------
#> Observations    4,360        4,360        4,360            4,360            4,360          4,360     
#> =====================================================================================================
#> Note:                                                                     *p<0.1; **p<0.05; ***p<0.01

Gelman A, Hill J (2007) Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge University Press (a very, very good book!)

Created on 2018-03-08 by the reprex package (v0.2.0).

Unconnected answered 8/3, 2018 at 8:49 Comment(4)
A truly excellent answer. Thanks a lot. Do you happen to know if Gelman and Hill (2007) cover the difference in estimation method? Thanks again! - Pansie
Gelman & Hill cover (restricted) maximum likelihood and MCMC/Bayesian approaches. But I don't think they cover the methods discussed in the plm package. - Unconnected
Great answer. I have a quick question: do (any of) the pooled OLS, RE or FE models count as strictly "longitudinal" analyses, or would you need to interact the IV with time, as in educ*year? Thanks in advance. - Reynolds
"Strictly longitudinal" should be defined (the meaning will probably be different for different people). In the mixed model (partial pooling) you generally make the distinction between random intercept and random slope models. In a random slope model like lmer(lwage ~ year + (1+year|nr), data = wagepan) you compute a different slope of lwage~year for each nr and then a sort of (weighted) average global (meta-parameter) slope. - Unconnected
