Anova table with full model in one line in R

I am fitting a linear model in R with three variables like so

cube_mod <- lm(y ~ x + x_2 + x_3)

I then use the anova function to display the analysis of variance results and get the following table:

anova(cube_mod)
Analysis of Variance Table

Response: y
          Df Sum Sq Mean Sq  F value   Pr(>F)    
x          1     21      21   0.0083 0.928881    
x_2        1 658209  658209 254.2771 2.26e-10 ***
x_3        1  64967   64967  25.0977 0.000191 ***
Residuals 14  36240    2589                      
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

The table shows the F-test for each variable separately, but I want the following table which shows only the F-test for the full model.

Analysis of Variance Table

Response: y
          Df Sum Sq Mean Sq  F value   Pr(>F)    
Model      3 723197  241066    93.13        0 
Residuals 14  36240    2589                      
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Is there a simple way to get this table from a linear model object?

Caprice answered 24/7, 2023 at 13:56 Comment(2)
You are doing analysis of variance, so it breaks the result down by term as shown above. The F-test for the full model is given in the summary of the model, i.e. run summary(cube_mod) and you will have all the values you need to create the table. - Rodrigues
There is also the Anova function from the car package: car::Anova(cube_mod) - Lynxeyed
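As a rough sketch of the summary() route mentioned in the comment above: summary() of an lm fit stores the overall F statistic and its degrees of freedom in its fstatistic element, and deviance() returns the residual sum of squares, which is enough to rebuild the two-row table. The helper name overall_anova is made up for this illustration, and it assumes a model with an intercept and at least one predictor (otherwise fstatistic is not available).

```r
# Hypothetical helper (not part of base R): collapse an lm fit into a
# two-row ANOVA table (Model, Residuals) from summary() and deviance().
overall_anova <- function(fit) {
  fs  <- summary(fit)$fstatistic        # named vector: value, numdf, dendf
  f   <- fs[["value"]]
  df1 <- fs[["numdf"]]
  df2 <- fs[["dendf"]]
  rss <- deviance(fit)                  # residual sum of squares
  model_ss <- f * df1 * (rss / df2)     # since F = (model SS / df1) / (RSS / df2)
  tab <- data.frame(
    Df        = c(df1, df2),
    `Sum Sq`  = c(model_ss, rss),
    `Mean Sq` = c(model_ss / df1, rss / df2),
    `F value` = c(f, NA_real_),
    `Pr(>F)`  = c(pf(f, df1, df2, lower.tail = FALSE), NA_real_),
    row.names = c("Model", "Residuals"),
    check.names = FALSE
  )
  # Give it the "anova" class so it prints in the usual layout
  structure(tab,
            heading = "Analysis of Variance Table\n",
            class = c("anova", "data.frame"))
}

fm  <- lm(y1 ~ y2 + y3 + y4, anscombe)
tab <- overall_anova(fm)
tab
```

Because the helper only reads from the fitted object, it can be applied to an lm fit that already exists, such as cube_mod in the question.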
C
5

1) Using the built-in anscombe data frame, bind the predictors into a single matrix named Model so that anova() reports them as one term:

Model <- as.matrix(anscombe[6:8])
anova(lm(y1 ~ Model, anscombe))

giving:

Analysis of Variance Table

Response: y1
          Df Sum Sq Mean Sq F value  Pr(>F)  
Model      3 24.285  8.0948  3.3355 0.08577 .
Residuals  7 16.988  2.4269                  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

2) Or, starting from an existing lm object, fm, as discussed in the comments:

fm <- lm(y1 ~ y2 + y3 + y4, anscombe)

Model <- model.matrix(fm)
anova(update(fm, . ~ Model))
# same output as above

3) Another approach is to use aov1 from sasLM:

library(sasLM)
aov1(y1 ~ y2 + y3 + y4, anscombe)[c(1, 5), ]
##           Df   Sum Sq  Mean Sq  F value     Pr(>F)
## MODEL      3 24.28449 8.094828 3.335479 0.08576858
## RESIDUALS  7 16.98821 2.426887       NA         NA

Update

Added approach using fm, simplified it a bit and switched model to use y2, y3 and y4 as independent variables since x1, x2 and x3 are all the same in anscombe. Also added solution using sasLM package.

anscombe
##    x1 x2 x3 x4    y1   y2    y3    y4
## 1  10 10 10  8  8.04 9.14  7.46  6.58
## 2   8  8  8  8  6.95 8.14  6.77  5.76
## 3  13 13 13  8  7.58 8.74 12.74  7.71
## 4   9  9  9  8  8.81 8.77  7.11  8.84
## 5  11 11 11  8  8.33 9.26  7.81  8.47
## 6  14 14 14  8  9.96 8.10  8.84  7.04
## 7   6  6  6  8  7.24 6.13  6.08  5.25
## 8   4  4  4 19  4.26 3.10  5.39 12.50
## 9  12 12 12  8 10.84 9.13  8.15  5.56
## 10  7  7  7  8  4.82 7.26  6.42  7.91
## 11  5  5  5  8  5.68 4.74  5.73  6.89
Cosmic answered 24/7, 2023 at 14:32 Comment(2)
+1 This is really neat! Model <- model.matrix(formula, data) might be better since it can easily incorporate factors. - Hawker
This is great. Though note that the OP asks "Is there a simple way to get this table from a linear model object?", meaning they already have the fitted lm object. The issue is how to get the ANOVA table from that existing object. - Rodrigues
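The model.matrix() suggestion in the comment above can be sketched as follows, assuming the mtcars data and a formula with one factor term (the same model as in another answer in this thread). The name f is only illustrative, and dropping the intercept column is a choice made here, since lm() adds its own intercept.

```r
# Formula with a factor; model.matrix() expands it into dummy columns
f <- mpg ~ wt + qsec + factor(cyl)

# Design matrix without the "(Intercept)" column
Model <- model.matrix(f, mtcars)[, -1]

# All predictor columns enter as the single term "Model"
one_line <- anova(lm(mpg ~ Model, mtcars))
one_line
```

The factor with three levels contributes two columns, so the Model row carries 4 degrees of freedom rather than 3.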
M
3

I use the example data in @G.Grothendieck's answer.


You can compare the model with an intercept-only null model within anova().

Model <- lm(y1 ~ y2 + y3 + y4, anscombe)
anova(update(Model, . ~ 1), Model)

# Analysis of Variance Table
# 
# Model 1: y1 ~ 1
# Model 2: y1 ~ y2 + y3 + y4
#   Res.Df    RSS Df Sum of Sq      F  Pr(>F)  
# 1     10 41.273                              
# 2      7 16.988  3    24.285 3.3355 0.08577 .

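It gives the same F statistic and p-value as the overall F-test reported by summary(Model). The sequential sums of squares in anova(Model) add up to the same model sum of squares (23.2162 + 0.0487 + 1.0196 = 24.2845).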

summary(Model)
# ...skip
# F-statistic: 3.335 on 3 and 7 DF,  p-value: 0.08577

anova(Model)
# Response: y1
#           Df  Sum Sq Mean Sq F value  Pr(>F)  
# y2         1 23.2162 23.2162  9.5663 0.01749 *
# y3         1  0.0487  0.0487  0.0200 0.89139  
# y4         1  1.0196  1.0196  0.4201 0.53755  
# Residuals  7 16.9882  2.4269
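The agreement between the three outputs can be checked numerically. This is only a sketch; the object names cmp, seq_tab and model_ss are introduced here for illustration.

```r
Model <- lm(y1 ~ y2 + y3 + y4, anscombe)

# Full model compared with the intercept-only model
cmp <- anova(update(Model, . ~ 1), Model)

# Sequential (term-by-term) table; the last row is Residuals
seq_tab  <- anova(Model)
model_ss <- sum(seq_tab$`Sum Sq`[-nrow(seq_tab)])

# Model sum of squares agrees with the comparison's "Sum of Sq"
isTRUE(all.equal(model_ss, cmp$`Sum of Sq`[2]))

# Overall F from summary() agrees with the comparison's F
isTRUE(all.equal(summary(Model)$fstatistic[["value"]], cmp$F[2]))
```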
Marengo answered 24/7, 2023 at 15:21 Comment(1)
This is a neat way to do the comparison to the null model. - Rodrigues
S
2

Try the supernova function from the R package supernova, for example:

library(supernova)
supernova(lm(mpg ~ disp + cyl, data = mtcars))
Analysis of Variance Table (Type III SS)
 Model: mpg ~ disp + cyl

                               SS df      MS      F   PRE     p
 ----- --------------- | -------- -- ------- ------ ----- -----
 Model (error reduced) |  855.307  2 427.653 45.808 .7596 .0000
  disp                 |   37.594  1  37.594  4.027 .1219 .0542
   cyl                 |   46.418  1  46.418  4.972 .1464 .0337
 Error (from model)    |  270.740 29   9.336                   
 ----- --------------- | -------- -- ------- ------ ----- -----
 Total (empty model)   | 1126.047 31  36.324  
Standoffish answered 24/7, 2023 at 14:40 Comment(0)
H
1

You can manually calculate it:


fit <- lm(mpg ~ wt + qsec + as.factor(cyl), mtcars)
temp <- anova(fit)

out <- temp
n <- nrow(temp)
# Pool Df and Sum Sq over all model terms (rows 1..n-1); row n is Residuals
out$Df <- with(temp, c(sum(Df[1:(n - 1)]), Df[n], rep(NA_real_, n - 2)))
out$`Sum Sq` <- with(temp, c(sum(`Sum Sq`[1:(n - 1)]), `Sum Sq`[n], rep(NA_real_, n - 2)))
out$`Mean Sq` <- out$`Sum Sq` / out$Df
# Overall F statistic and its upper-tail p-value
out$`F value` <- c(out$`Mean Sq`[1] / out$`Mean Sq`[2], rep(NA_real_, n - 1))
out$`Pr(>F)` <- c(pf(out$`F value`[1], out$Df[1], out$Df[2], lower.tail = FALSE), rep(NA_real_, n - 1))
# Keep only the pooled Model row and the Residuals row
out <- out[1:2, ]
rownames(out) <- c("Model", "Residuals")
out

Analysis of Variance Table

Response: mpg
          Df Sum Sq Mean Sq F value    Pr(>F)    
Model      4 953.94 238.484  37.413 1.208e-10 ***
Residuals 27 172.11   6.374                      
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
                
Hawker answered 24/7, 2023 at 14:31 Comment(0)
