Checking Type III ANOVA results [duplicate]

Setting aside the debate about Type III ANOVA and the Principle of Marginality and all that...

I've set up two models whose sums of squares should differ (and Type III ANOVA would test that difference). Here's the code:

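library(car)        # Anova() for Type III tests
library(openintro)  # hsb2 data set
data(hsb2)
hsb2$gender <- factor(hsb2$gender)
# Sum-to-zero contrasts for both factors
contrasts(hsb2$gender) <- "contr.sum"
contrasts(hsb2$ses) <- "contr.sum"
# Model without the ses main effect
math_gender_int <- lm(math ~ gender + gender:ses, data = hsb2)
# "Full" model with the ses main effect
math_gender_ses_int <- lm(math ~ gender + ses + gender:ses, data = hsb2)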

Now I should be able to see a difference in the sum of squares between these two models. After all, the "full" model has one more term in it:

anova(math_gender_int, math_gender_ses_int)

But the output shows this:

Analysis of Variance Table

Model 1: math ~ gender + gender:ses
Model 2: math ~ gender + ses + gender:ses
  Res.Df   RSS Df  Sum of Sq F Pr(>F)
1    194 15858                       
2    194 15858  0 -1.819e-12      

What's going on here?

Boeotian answered 31/3, 2017 at 17:20 Comment(7)
If ses and gender are both factor variables, then lm turns these into dummy variables in the model. When two categorical variables are interacted, and every interacted set is observed, then the main effects of the categoricals are irrelevant as the interaction "saturates" the model. - Malliemallin
It's unclear to me why these two models should have the same number of coefficients when one model is clearly nested inside the other model. - Boeotian
@lmo: I posted at the same time you did. I see what you're saying. Is this an R thing or a stats thing? It seems that I should be able to "manually" check the Type III ANOVA results this way. But if R saturates the model with the interaction term, I'm not sure how else to go about it. - Boeotian
This is a math/stats thing. Consider two binary variables: color {"blue", "red"} and speed {"fast", "slow"}. The interaction would contain four levels: blue:fast, blue:slow, red:fast, red:slow. If these interaction terms are included, then there is no remaining variation to identify color and speed by themselves. The interaction variables cover every possibility. - Malliemallin
In Fox's notation, I am looking for SS(alpha | beta, gamma) = SS(alpha, beta, gamma) - SS(beta, gamma). From what you're saying, though, there would never be a difference between SS(alpha, beta, gamma) and SS(beta, gamma). - Boeotian
@李哲源ZheyuanLi: "Don't just judge from question title." I'm not sure what you mean. The question title indicates that I'm trying to check the results of Type III ANOVA using incremental F-tests that compare what I thought were two completely different (but nested) models. I take your point about R not being able to do it directly without manually modifying the model matrix. I'll try that. Thanks. - Boeotian
@李哲源ZheyuanLi: Dropping the columns from the model matrix worked like a charm. Thanks! - Boeotian
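To make the saturation point concrete, here is a minimal sketch that compares the design matrices R builds for the two formulas. It uses a small simulated data frame rather than hsb2 (the data frame `d`, the cell sizes, and the simulated `math` values are all assumptions for illustration), so it runs with base R only. The idea is that when `ses` has no main effect in the formula, R expands `gender:ses` separately within each gender level, so both formulas end up with the same number of columns spanning the same space.

```r
# Toy data (hypothetical, not hsb2): every gender x ses cell is observed,
# with deliberately unequal cell sizes.
set.seed(1)
cells <- expand.grid(gender = c("female", "male"),
                     ses = c("low", "middle", "high"))
n_per_cell <- c(12, 20, 35, 30, 18, 25)
d <- cells[rep(seq_len(nrow(cells)), times = n_per_cell), ]
d$math <- rnorm(nrow(d), mean = 50, sd = 10)
contrasts(d$gender) <- "contr.sum"
contrasts(d$ses) <- "contr.sum"

# Design matrices for the two formulas from the question
X_small <- model.matrix(~ gender + gender:ses, data = d)
X_full  <- model.matrix(~ gender + ses + gender:ses, data = d)

print(colnames(X_small))  # ses contrasts appear once per gender level
print(colnames(X_full))   # ses main-effect columns plus interaction columns
print(c(small = ncol(X_small), full = ncol(X_full)))

# If the combined matrix has the same rank as each matrix alone,
# the two sets of columns span the same space.
rank_small <- qr(X_small)$rank
rank_full  <- qr(X_full)$rank
rank_both  <- qr(cbind(X_small, X_full))$rank
print(c(small = rank_small, full = rank_full, both = rank_both))

# Same column space means identical fitted values and identical RSS
fit_small <- lm(math ~ gender + gender:ses, data = d)
fit_full  <- lm(math ~ gender + ses + gender:ses, data = d)
same_fit <- isTRUE(all.equal(fitted(fit_small), fitted(fit_full)))
print(same_fit)
```

Both fits are reparameterizations of the same cell-means model, which is why `anova()` on the two reports 0 degrees of freedom and a sum of squares that is zero up to rounding.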
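For anyone landing here later, the following is a rough sketch of what "dropping the columns from the model matrix" can look like. It is not the exact code from the comments (which is not shown in this thread); it assumes a simulated data frame `d` in place of hsb2, and the object names are made up for illustration. The `ses` main-effect columns are removed from the full model's design matrix by hand, so R cannot re-expand the interaction, and the reduced fit is then compared with the full fit.

```r
# Toy data (hypothetical, not hsb2), unbalanced on purpose
set.seed(1)
cells <- expand.grid(gender = c("female", "male"),
                     ses = c("low", "middle", "high"))
d <- cells[rep(seq_len(nrow(cells)), times = c(12, 20, 35, 30, 18, 25)), ]
d$math <- rnorm(nrow(d), mean = 50, sd = 10)
contrasts(d$gender) <- "contr.sum"
contrasts(d$ses) <- "contr.sum"

fit_full <- lm(math ~ gender + ses + gender:ses, data = d)

# The "assign" attribute maps each design-matrix column to a term
# (0 = intercept, 1 = first term label, and so on).
X <- model.matrix(fit_full)
term_labels <- attr(terms(fit_full), "term.labels")
ses_cols <- which(attr(X, "assign") == match("ses", term_labels))

# Drop only the ses main-effect columns; keep the interaction columns as coded
X_no_ses <- X[, -ses_cols, drop = FALSE]

# X_no_ses already contains the intercept column, hence the "- 1"
fit_no_ses <- lm(math ~ X_no_ses - 1, data = d)

# Incremental F-test for ses given gender and gender:ses
ss_ses <- deviance(fit_no_ses) - deviance(fit_full)
df_ses <- df.residual(fit_no_ses) - df.residual(fit_full)
f_ses  <- (ss_ses / df_ses) / (deviance(fit_full) / df.residual(fit_full))
p_ses  <- pf(f_ses, df_ses, df.residual(fit_full), lower.tail = FALSE)
print(c(SS = ss_ses, Df = df_ses, F = f_ses, p = p_ses))

print(anova(fit_no_ses, fit_full))
```

With sum-to-zero contrasts, `ss_ses` should agree with the `ses` row of `car::Anova(fit_full, type = "III")`. In base R, `drop1(fit_full, . ~ ., test = "F")` performs the same column-dropping for every term, which gives a quick cross-check.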
