I do not want main effect because it is collinear with a finer factor fixed effect, so it is annoying to have these NA
.
In this example:
lm(y ~ x * z)
I want the interaction of x
(numeric) and z
(factor), but not the main effect of z
.
I do not want main effect because it is collinear with a finer factor fixed effect, so it is annoying to have these NA
.
In this example:
lm(y ~ x * z)
I want the interaction of x
(numeric) and z
(factor), but not the main effect of z
.
R documentation of ?formula
says:
The ‘*’ operator denotes factor crossing: ‘a * b’ interpreted as ‘a + b + a : b
So it sounds like that dropping main effect is straightforward, by just doing one of the following:
a + a:b ## main effect on `b` is dropped
b + a:b ## main effect on `a` is dropped
a:b ## main effects on both `a` and `b` are dropped
Oh, really? No no no (too simple, too naive). In reality it depends on the variable class of a
and b
.
This kind of behavior is due to a magic function called model.matrix.default
, which constructs a design matrix from a formula. A numerical variable is just included as it is into a column, but a factor variable is automatically coded as many dummy columns. It is exactly this dummy recoding that is a magic. It is commonly believed that we can enable or disable contrasts to control it, but not really. We lose control of contrasts even in this simplest example. The problem is that model.matrix.default
has its own rule when doing dummy encoding, and it is very sensitive to how you specify the model formula. It is exactly for this reason that we can't drop main effect when an interaction between two factors exists.
From your question, x
is numeric and z
is a factor. You can specify a model with interaction but not with main effect of z
by
y ~ x + x:z
Since x
is numeric, it is equivalent to do
y ~ x:z
The only difference here is parametrization (or how model.matrix.default
does dummy encoding). Consider a small example:
set.seed(0)
y <- rnorm(10)
x <- rnorm(10)
z <- gl(2, 5, labels = letters[1:2])
fit1 <- lm(y ~ x + x:z)
#Coefficients:
#(Intercept) x x:zb
# 0.1989 -0.1627 -0.5456
fit2 <- lm(y ~ x:z)
#Coefficients:
#(Intercept) x:za x:zb
# 0.1989 -0.1627 -0.7082
From the names of the coefficients we see that in the 1st specification, z
is contrasted so its 1st level "a" is not dummy encoded, while in the 2nd specification, z
is not contrasted and both levels "a" and "b" are dummy encoded. Given that both specifications ends up with three coefficients, they are really equivalent (mathematically speaking, the design matrix in two cases have the same column space) and you can verify this by comparing their fitted values:
all.equal(fit1$fitted, fit2$fitted)
# [1] TRUE
So why is z
contrasted in the first case? Because otherwise we have two dummy columns for x:z
, and the sum of these two columns are just x
, aliased with the existing model term x
in the formula. In fact, in this case even if you require that you don't want contrasts, model.matrix.default
will not obey:
model.matrix.default(y ~ x + x:z,
contrast.arg = list(z = contr.treatment(nlevels(z), contrasts = FALSE)))
# (Intercept) x x:zb
#1 1 0.7635935 0.0000000
#2 1 -0.7990092 0.0000000
#3 1 -1.1476570 0.0000000
#4 1 -0.2894616 0.0000000
#5 1 -0.2992151 0.0000000
#6 1 -0.4115108 -0.4115108
#7 1 0.2522234 0.2522234
#8 1 -0.8919211 -0.8919211
#9 1 0.4356833 0.4356833
#10 1 -1.2375384 -1.2375384
So why in the 2nd case is z
not contrasted? Because if it is, we loose the effect of level "a" when constructing interaction. And even if you require a contrast, model.matrix.default
will just ignore you:
model.matrix.default(y ~ x:z,
contrast.arg = list(z = contr.treatment(nlevels(z), contrasts = TRUE)))
# (Intercept) x:za x:zb
#1 1 0.7635935 0.0000000
#2 1 -0.7990092 0.0000000
#3 1 -1.1476570 0.0000000
#4 1 -0.2894616 0.0000000
#5 1 -0.2992151 0.0000000
#6 1 0.0000000 -0.4115108
#7 1 0.0000000 0.2522234
#8 1 0.0000000 -0.8919211
#9 1 0.0000000 0.4356833
#10 1 0.0000000 -1.2375384
Oh, amazing model.matrix.default
. It is able to make the right decision!
Let me reiterate it: There is no way to drop main effect when interaction is present.
I will not provide extra example here, as I have one in Why do I get NA coefficients and how does lm
drop reference level for interaction. See the "Contrasts for interaction" section over there. In short, all the following specifications give the same model (they have the same fitted values):
~ year:treatment
~ year:treatment + 0
~ year + year:treatment
~ treatment + year:treatment
~ year + treatment + year:treatment
~ year * treatment
And in particular, the 1st specification leads to an NA
coefficient.
So once the RHS of ~
contains an year:treatment
, you can never ask model.matrix.default
to drop main effects.
People inexperienced with this behavior are to be surprised when producing ANOVA tables.
model.matrix.default
Some people consider model.matrix.default
annoying as it does not appear to have a consistent manner in dummy encoding. A "consistent manner" in their view is to always drop the 1st factor level. Well, no problem, you can bypass model.matrix.default
by manually doing the dummy encoding, and feed the resulting dummy matrix as a variable to lm
, etc.
However, you still need model.matrix.default
's help to easily do dummy encoding for a (yes, only one) factor variable. For example, for the variable z
in our previous example, its full dummy encoding is the following, and you can retain all or some of its columns for regression.
Z <- model.matrix.default(~ z + 0) ## no contrasts (as there is no intercept)
# za zb
#1 1 0
#2 1 0
#3 1 0
#4 1 0
#5 1 0
#6 0 1
#7 0 1
#8 0 1
#9 0 1
#10 0 1
#attr(,"assign")
#[1] 1 1
#attr(,"contrasts")
#attr(,"contrasts")$z
#[1] "contr.treatment"
Back to our simple example, if we don't want contrasts for z
in y ~ x + x:z
, we can do
Z2 <- Z[, 1:2] ## use "[" to remove attributes of `Z`
lm(y ~ x + x:Z2)
#Coefficients:
#(Intercept) x x:Z2za x:Z2zb
# 0.1989 -0.7082 0.5456 NA
Not surprisingly we see an NA
(because colSums(Z2)
is aliased with x
). And if we want to enforce contrasts in y ~ x:z
, we can do either of the following:
Z1 <- Z[, 1]
lm(y ~ x:Z1)
#Coefficients:
#(Intercept) x:Z1
# 0.34728 -0.06571
Z1 <- Z[, 2]
lm(y ~ x:Z1)
#Coefficients:
#(Intercept) x:Z1
# 0.2318 -0.6860
And the latter case is probably what contefranz is trying to do.
However, I do not really recommend this kind of hacking. When you pass a model formula to lm
, etc, model.matrix.default
is trying to give you the most sensible construction. Also, in reality we want to do prediction with a fitted model. If you have done dummy encoding yourself, you would have a hard time when providing newdata
to predict
.
This is a very good explanation, but let me add one more thing when in the process of selecting significant predictors.
Let's consider again the following model:
fit1 <- lm(y ~ x + x:z)
#Coefficients:
#(Intercept) x x:zb
# 0.1989 -0.1627 -0.5456
Suppose that the main effect x
is not statistically significant and you want to get rid of it. The most intuitive thing, for me at least, is to write the second model as above:
fit2 <- lm(y ~ x:z)
#Coefficients:
#(Intercept) x:za x:zb
# 0.1989 -0.1627 -0.7082
which eventually puts back the main effect masked as the interaction with the baseline level of the factor. Now, the only solution I was able to find in order to really not include main effects is to exploit lm.fit
which, as you all know, does not return an object of class lm
, but a list
. So the question is: do you know any method to get rid of main effects without loosing the lm
class?
x:zb
coefficients different? Is there a difference in the interpretation of the x:zb
interaction between the models? I just now posted this question on SE (stats.stackexchange.com/q/280265/161806) that seems related, care to take a look? –
Doall Using lm.fit but getting an lm class object: I'm not a programmer but a simple adaptation of the lm-function worked for me: I added the two arguments needed by lm.fit (the model matrix and the response variable) and replaced (in the lm-function) the x and y (used for lm.fit within lm) by my model matrix Xmat and response:
lm.2 <- function (formula, data, subset, weights, na.action, method = "qr",
model = TRUE, x = FALSE, y = FALSE, qr = TRUE, singular.ok = TRUE,
contrasts = NULL, offset, ..., Xmat=NA, response=NA)
{
ret.x <- x
ret.y <- y
cl <- match.call()
mf <- match.call(expand.dots = FALSE)
m <- match(c("formula", "data", "subset", "weights", "na.action",
"offset"), names(mf), 0L)
mf <- mf[c(1L, m)]
mf$drop.unused.levels <- TRUE
mf[[1L]] <- quote(stats::model.frame)
mf <- eval(mf, parent.frame())
if (method == "model.frame")
return(mf)
else if (method != "qr")
warning(gettextf("method = '%s' is not supported. Using 'qr'",
method), domain = NA)
mt <- attr(mf, "terms")
y <- model.response(mf, "numeric")
w <- as.vector(model.weights(mf))
if (!is.null(w) && !is.numeric(w))
stop("'weights' must be a numeric vector")
offset <- as.vector(model.offset(mf))
if (!is.null(offset)) {
if (length(offset) != NROW(y))
stop(gettextf("number of offsets is %d, should equal %d (number of observations)",
length(offset), NROW(y)), domain = NA)
}
if (is.empty.model(mt)) {
x <- NULL
z <- list(coefficients = if (is.matrix(y)) matrix(NA_real_,
0, ncol(y)) else numeric(), residuals = y, fitted.values = 0 *
y, weights = w, rank = 0L, df.residual = if (!is.null(w)) sum(w !=
0) else if (is.matrix(y)) nrow(y) else length(y))
if (!is.null(offset)) {
z$fitted.values <- offset
z$residuals <- y - offset
}
}
else {
z <- if (is.null(w))
lm.fit(Xmat, response, offset = offset, singular.ok = singular.ok,
...)
else lm.wfit(Xmat, response, w, offset = offset, singular.ok = singular.ok,
...)
}
class(z) <- c(if (is.matrix(y)) "mlm", "lm")
z$na.action <- attr(mf, "na.action")
z$offset <- offset
z$contrasts <- attr(x, "contrasts")
z$xlevels <- .getXlevels(mt, mf)
z$call <- cl
z$terms <- mt
if (model)
z$model <- mf
if (ret.x)
z$x <- x
if (ret.y)
z$y <- y
if (!qr)
z$qr <- NULL
z
}
Then, create the model matrix as usual but delet the columns that you don't want:
Xmat <- model.matrix(~x + x:z, data=mydata)
Xmat <- Xmat[,-(colnames(Xmat)=="x")]
mod <- lm.2(~x + x:z, data=mydata, Xmat=Xmat, response=mydata$y)
All this can certainly be more elaborated, but for me it worked -> it returns an lm-object that can be used with summary, resid, sim, plot ...
best Pius
© 2022 - 2024 — McMap. All rights reserved.
fz <- factor(z)
(just for notation), thenx:fz
should work in one sense (it partitions the variability in the data differently), but it will construct a model that is equivalent in terms of complexity, goodness-of-fit, etc. tox*fz
. – Wilfredwilfredalm(y ~ x * z - x - z)
. This will only estimate the interaction effect. – Ephram