How to compare a model with no random effects to a model with a random effect using lme4?

I can use gls() from the nlme package to build mod1 with no random effects. I can then compare mod1 using AIC to mod2 built using lme() which does include a random effect.

mod1 = gls(response ~ fixed1 + fixed2, method="REML", data)
mod2 = lme(response ~ fixed1 + fixed2, random = ~1 | random1, method="REML",data)
AIC(mod1,mod2)

Is there something similar to gls() for the lme4 package which would allow me to build mod3 with no random effects and compare it to mod4 built using lmer() which does include a random effect?

mod3 = ???(response ~ fixed1 + fixed2, REML=T, data)
mod4 = lmer(response ~ fixed1 + fixed2 + (1|random1), REML=T, data)
AIC(mod3,mod4)
Mella asked 3/6, 2014 at 16:04

With modern (>1.0) versions of lme4 you can make a direct comparison between lmer fits and the corresponding lm model, but you have to use ML; it's hard to come up with a sensible analogue of the "REML criterion" for a model without random effects (because it would involve a linear transformation of the data that sets all of the fixed effects to zero ...)

You should be aware that there are theoretical issues with information-theoretic comparisons between models with and without variance components (the null value of a variance component lies on the boundary of its parameter space, so the usual asymptotics do not apply directly): see the GLMM FAQ for more information.

library(lme4)
fm1 <- lmer(Reaction~Days+(1|Subject),sleepstudy, REML=FALSE)
fm0 <- lm(Reaction~Days,sleepstudy)
AIC(fm1,fm0)
##     df      AIC
## fm1  4 1802.079
## fm0  3 1906.293

I prefer output in this format (delta-AIC rather than raw AIC values):

bbmle::AICtab(fm1,fm0)
##     dAIC  df
## fm1   0.0 4 
## fm0 104.2 3 
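
To make the boundary caveat above concrete, here is a minimal sketch of the corresponding likelihood-ratio test, reusing the ML fits fm1 and fm0 from above; the naive chi-squared p-value is conservative because the null value of the among-subject variance lies on the boundary of its parameter space, and a common ad hoc correction is to halve it:

## likelihood-ratio statistic comparing the ML fits with and without the random intercept
lrt <- 2 * (c(logLik(fm1)) - c(logLik(fm0)))
## naive p-value against a chi-squared distribution with 1 df (conservative at the boundary)
pchisq(lrt, df = 1, lower.tail = FALSE)
## ad hoc boundary correction: halve the p-value
## (the reference distribution is a 50:50 mixture of chi^2_0 and chi^2_1)
0.5 * pchisq(lrt, df = 1, lower.tail = FALSE)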

To test, let's simulate data with no random effect (I had to try a couple of random-number seeds to get an example where the among-subject std dev was actually estimated as zero):

rr <- simulate(~Days+(1|Subject),
               newparams=list(theta=0,beta=fixef(fm1),
                         sigma=sigma(fm1)),
               newdata=sleepstudy,
               family="gaussian",
               seed=103)[[1]]
ss <- transform(sleepstudy,Reaction=rr)
fm1Z <- update(fm1,data=ss)
VarCorr(fm1Z)
##  Groups   Name        Std.Dev.
##  Subject  (Intercept)  0.000  
##  Residual             29.241
fm0Z <- update(fm0,data=ss)
all.equal(c(logLik(fm0Z)),c(logLik(fm1Z)))  ## TRUE
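
As a small follow-up sketch, reusing the refitted objects above: since the two maximized log-likelihoods agree (up to numerical tolerance), an AIC comparison on these simulated-null data differs only by the penalty for the extra variance-component parameter.

AIC(fm0Z, fm1Z)
## fm1Z is penalized by 2 AIC units for the variance component that is estimated as zero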
Disjunction answered 3/6, 2014 at 16:48
With the example I gave, or with your own data? I can run the example fine with the current (devel) version of lme4. If it's with your own data, then more information is required; either ask a new question on StackOverflow, or send an e-mail to the r-sig-mixed-models mailing list [subscribe to the list first; you can find the info/subscription page by searching], with a reproducible example in either case (and the results of packageVersion("lme4")). – Disjunction
And what approach can one take when the random-effects model refitted with ML turns out to be singular, but it is not singular when fitted with REML? Any suggestion? – Mediant
Not really surprising, especially with a highly parameterized/unstable model. Can you post a reproducible example to CrossValidated or the r-sig-mixed-models mailing list? – Disjunction
Thanks a lot, Ben. I will try to do that... I agree with you, I think the model is too highly parameterized. Oddly enough, it still has a much lower AICc than other models where one of the factors has been modelled as linear and quadratic covariates. That means that I will need to fit the whole dataset in the reproducible example. – Mediant
Done, I provided a reproducible example at Stack Overflow: stackoverflow.com/q/60892398/13099627?sem=2 – Mediant
@BenBolker While I agree that what you suggest is the easiest solution, the restricted likelihood of a model without any random effects is fairly straightforward to compute; see my answer. – Protrude

While I agree with Ben that the simplest solution is to set REML=FALSE, the maximum REML log-likelihood of a model without random effects is well defined and fairly straightforward to compute via the well-known relation

$$\ell_{\mathrm{REML}}(\theta, \sigma^2) \;=\; \ell_P(\theta, \sigma^2) \;+\; \frac{p}{2}\log(2\pi) \;-\; \frac{1}{2}\log\det\!\left(X^\top V^{-1} X\right)$$

between the ordinary profile log-likelihood $\ell_P$ (the log-likelihood with the fixed effects $\beta$ profiled out) and the restricted log-likelihood $\ell_{\mathrm{REML}}$; here $p$ is the number of fixed-effect parameters, $X$ is the fixed-effect design matrix, and $V$ is the marginal covariance matrix of the response, which reduces to $\sigma^2 I_n$ in a model without random effects.

The following code simulates data for which the estimated variance of the random intercept of an LMM ends up at 0, so that the maximum restricted log-likelihood of the LMM should be equal to the restricted log-likelihood of the model without any random effects.

The restricted log-likelihood of the LM is computed via the above formula and evaluates to the same value as that of the LMM.

An even simpler alternative, shown at the end of the code below, is to use glmmTMB, which accepts REML=TRUE for models with or without random effects:

library(lme4)
#> Loading required package: Matrix
# simulate some toy data for which the LMM ends up at the boundary
set.seed(5)
n <- 100 # the sample size
x <- rnorm(n) 
y <- rnorm(n)
group <- factor(rep(1:10,10))

# fit the LMM via REML
mod1 <- lmer(y ~ x + (1|group), REML=TRUE, control=lmerControl(boundary.tol=1e-8))
#> boundary (singular) fit: see ?isSingular
logLik(mod1)
#> 'log Lik.' -147.8086 (df=4)

# fit a model without random effects and compute its maximum REML log likelihood
mod0 <- lm(y ~ x)
p <- length(coef(mod0)) # number of fixed effect parameters
X <- model.matrix(mod0) # the fixed effect design matrix
sigma.REML <- summary(mod0)$sigma # REMLE of sigma
# the maximum ordinary log likelihood evaluated at the REML estimates
logLik.lm.at.REML <- sum(dnorm(residuals(mod0), 0, sigma.REML, log=TRUE))
# the restricted log likelihood of the model without random effects (via above formula)
logLik.lm.at.REML + p/2*log(2*pi) - 1/2*(- p*log(sigma.REML^2) + determinant(crossprod(X))$modulus)
#> [1] -147.8086
#> attr(,"logarithm")
#> [1] TRUE

library(glmmTMB)
data <- data.frame(y,x,group)
logLik(glmmTMB(y~x, family = gaussian(), data=data, REML=TRUE))
#> 'log Lik.' -147.8086 (df=3)
logLik(glmmTMB(y~x+(1|group), family = gaussian(), data=data, REML=TRUE))
#> 'log Lik.' -147.8086 (df=4)
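
As a possible follow-up along the same lines (the object names fit0 and fit1 below are arbitrary): since glmmTMB computes a REML log-likelihood for both models and the fixed effects are identical, the AIC comparison from the original question can be carried out entirely under REML here.

fit0 <- glmmTMB(y ~ x, family = gaussian(), data = data, REML = TRUE)
fit1 <- glmmTMB(y ~ x + (1|group), family = gaussian(), data = data, REML = TRUE)
AIC(fit0, fit1)
## the log-likelihoods coincide in this boundary example, so fit1 differs only by the
## 2-unit penalty for the extra variance-component parameter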
Protrude answered 24/11, 2021 at 12:32
