How to get average marginal effects (AMEs) with standard errors of a multinomial logit model?

Asked 7/1, 2019 at 18:7 Answered 24/11, 2022 at 17:33

Solved r multinomial mlogit marginal-effects

I want to get the average marginal effects (AME) of a multinomial logit model with standard errors. For this I've tried different methods, but they haven't led to the goal so far.

Best attempt

My best attempt was to get the AMEs by hand using mlogit which I show below.

library(mlogit)
ml.d <- mlogit.data(df1, choice="Y", shape="wide")  # shape data for `mlogit()`
ml.fit <- mlogit(Y ~ 1 | D + x1 + x2, reflevel="1", data=ml.d)  # fit the model

# coefficient names
c.names <- all.vars(ml.fit$call)[2:4]

# get marginal effects
ME.mnl <- sapply(c.names, function(x) 
  stats::effects(ml.fit, covariate=x, data=ml.d), 
  simplify=FALSE) 

# get AMEs
(AME.mnl <- t(sapply(ME.mnl, colMeans)))
#              1            2            3           4          5
# D  -0.03027080 -0.008806072 0.0015410569 0.017186531 0.02034928
# x1 -0.02913234 -0.015749598 0.0130577842 0.013240212 0.01858394
# x2 -0.02724650 -0.005482753 0.0008575982 0.005331181 0.02654047

I know these values are the correct ones. However, I could not get the correct standard errors by simply doing the columns' standard deviations:

# standard errors - WRONG!
(AME.mnl.se <- t(sapply(E.mnl, colSdColMeans)))

(Note: colSdColMeans() for columns' SD is provided here.)

Accordingly this also led me to the wrong t-values:

# t values - WRONG!
AME.mnl / AME.mnl.se
#             1          2          3         4         5
# D  -0.7110537 -0.1615635 0.04013228 0.4190057 0.8951484
# x1 -0.7170813 -0.2765212 0.33325968 0.3656893 0.8907836
# x2 -0.7084573 -0.1155825 0.02600653 0.1281190 0.8559794

Whereas I know the correct t-values for this case are these:

# D  -9.26 -1.84  0.31 4.29 8.05   
# x1 -6.66 -2.48  1.60 1.50 3.22  
# x2 -2.95 -0.39  0.06 0.42 3.21

I learned that there should be a "delta method", but I only found some code for a very special case with interactions at Cross Validated.

Failed attempts

1.) Package margins doesn't seem to be able to handle "mlogit" objects:

library(margins)
summary(margins(ml.fit))

2.) There's another package for mlogits, nnet,

library(nnet) 
ml.fit2 <- multinom(Y ~ D + x1 + x2, data=df1)
summary(ml.fit2)

but margins can't handle this correctly either:

> summary(margins(ml.fit2))
 factor     AME SE  z  p lower upper
      D -0.0303 NA NA NA    NA    NA
     x1 -0.0291 NA NA NA    NA    NA
     x2 -0.0272 NA NA NA    NA    NA

3.) There's also a package around that claims to calculate "Average Effects for Multinomial Logistic Regression Models",

library(DAMisc)
mnlChange2(ml.fit2, varnames="D", data=df1)

but I couldn't get a drop of milk out of it, since the function yields just nothing (even not with the function's example).

How now can we get AMEs with standard errors / t-statistics of a multinomial logit model with R?

Data

df1 <- structure(list(Y = c(3, 4, 1, 2, 3, 4, 1, 5, 2, 3, 4, 2, 1, 4, 
1, 5, 3, 3, 3, 5, 5, 4, 3, 5, 4, 2, 5, 4, 3, 2, 5, 3, 2, 5, 5, 
4, 5, 1, 2, 4, 3, 1, 2, 3, 1, 1, 3, 2, 4, 2, 2, 4, 1, 5, 3, 1, 
5, 2, 3, 4, 2, 4, 5, 2, 4, 1, 4, 2, 1, 5, 3, 2, 1, 4, 4, 1, 5, 
1, 1, 1, 4, 5, 5, 3, 2, 3, 3, 2, 4, 4, 5, 3, 5, 1, 2, 5, 5, 1, 
2, 3), D = c(12, 8, 6, 11, 5, 14, 0, 22, 15, 13, 18, 3, 5, 9, 
10, 28, 9, 16, 17, 14, 26, 18, 18, 23, 23, 12, 28, 14, 10, 15, 
26, 9, 2, 30, 18, 24, 27, 7, 6, 25, 13, 8, 4, 16, 1, 4, 5, 18, 
21, 1, 2, 19, 4, 2, 16, 17, 23, 15, 13, 21, 24, 14, 27, 6, 20, 
6, 19, 8, 7, 23, 11, 11, 1, 22, 21, 4, 27, 6, 2, 9, 18, 30, 26, 
22, 10, 1, 4, 7, 26, 15, 26, 18, 30, 1, 11, 29, 25, 3, 19, 15
), x1 = c(13, 12, 4, 3, 16, 16, 15, 13, 1, 15, 10, 16, 1, 17, 
7, 13, 12, 6, 8, 16, 16, 11, 7, 16, 5, 13, 12, 16, 17, 6, 16, 
9, 14, 16, 15, 5, 7, 2, 8, 2, 9, 9, 15, 13, 9, 4, 16, 2, 11, 
13, 11, 6, 4, 3, 7, 4, 12, 2, 16, 14, 3, 13, 10, 11, 10, 4, 11, 
16, 8, 12, 14, 9, 4, 16, 16, 12, 9, 10, 6, 1, 3, 8, 7, 7, 5, 
16, 17, 10, 4, 15, 10, 8, 3, 13, 9, 16, 12, 7, 4, 11), x2 = c(12, 
19, 18, 19, 15, 12, 15, 16, 15, 11, 12, 16, 17, 14, 12, 17, 17, 
16, 12, 20, 11, 11, 15, 14, 18, 10, 14, 13, 10, 14, 18, 18, 18, 
17, 18, 14, 16, 19, 18, 16, 18, 14, 17, 10, 16, 12, 16, 15, 11, 
18, 19, 15, 19, 11, 16, 10, 20, 14, 10, 12, 10, 15, 13, 15, 11, 
20, 11, 12, 16, 16, 11, 15, 11, 11, 10, 10, 16, 11, 20, 17, 20, 
17, 16, 11, 18, 19, 18, 14, 17, 11, 16, 11, 18, 14, 15, 16, 11, 
14, 11, 13)), class = "data.frame", row.names = c(NA, -100L))

Allin answered 7/1, 2019 at 18:7 Comment(5)

Is the provided df1 complete? I'm getting t-values that are similar to the true ones but not the same. – Tabb 7/1, 2019 at 20:50

@JuliusVainora Yes, df1 is complete. One can get the "true" t-values using Stata's mlogit. – Allin 7/1, 2019 at 21:11

You may want to consider the marginaleffects package (disclaimer: I am the author). Here is a vignette specifically about these kinds of models: vincentarelbundock.github.io/marginaleffects/articles/… – Extortioner 24/11, 2022 at 13:15

@Extortioner Thanks, this (your) package can't handle interactions, though, tried marginaleffects::marginaleffects(nnet::multinom(Y ~ D + x1*x2, data=df1)) |> summary() – Allin 24/11, 2022 at 14:9

@jay-sf @Allin The package does in fact support marginal effects for nnet::multinom models with interactions. I posted a new answer showing that your specific example works. – Extortioner 24/11, 2022 at 17:34

We can do something very similar to what is done in your linked answer. In particular, first we want a function that would compute AMEs at a given vector of coefficients. For that we can define

AME.fun <- function(betas) {
  tmp <- ml.fit
  tmp$coefficients <- betas
  ME.mnl <- sapply(c.names, function(x) 
    effects(tmp, covariate = x, data = ml.d), simplify = FALSE)
  c(sapply(ME.mnl, colMeans))
}

where the second half is yours, while in the first one I use a trick to take the same ml.fit object and to change its coefficients. Next we find the jacobian with

require(numDeriv)
grad <- jacobian(AME.fun, ml.fit$coef)

and apply the delta method. Square roots of the diagonal of grad %*% vcov(ml.fit) %*% t(grad) is what we want. Hence,

(AME.mnl.se <- matrix(sqrt(diag(grad %*% vcov(ml.fit) %*% t(grad))), nrow = 3, byrow = TRUE))
#             [,1]        [,2]        [,3]        [,4]        [,5]
# [1,] 0.003269320 0.004788536 0.004995723 0.004009762 0.002527462
# [2,] 0.004375795 0.006348496 0.008168883 0.008844684 0.005763966
# [3,] 0.009233616 0.014048212 0.014713090 0.012702188 0.008261734
AME.mnl / AME.mnl.se
#            1          2          3         4        5
# D  -9.259050 -1.8389907 0.30847523 4.2861720 8.051269
# x1 -6.657611 -2.4808393 1.59847852 1.4969683 3.224159
# x2 -2.950794 -0.3902812 0.05828811 0.4197057 3.212458

which coincides with Stata's results.

Tabb answered 7/1, 2019 at 21:25 Comment(2)

This is really a great answer, muchas gracias! – Allin 7/1, 2019 at 21:34

Thank you also for this answer as I am trying to solve a similar problem. However, when I tried to reproduce the results, I couldn't get the jacobian to running as it shows the error Error in if (rhs %in% c(1, 3)) { : argument is of length zero. Even after following the solution provided in #25832229 to modify a line of code in the effects function, I had no success. Do you know what the problem is? – Overt 2/4, 2021 at 15:45

If you use vce="bootstraps" within margin function then it provides SE with Confidence interval as well summary(margins(ml.fit2,vce="bootstraps"))

Eggnog answered 24/11, 2022 at 10:2 Comment(1)

Thanks, this package can't handle interactions, though, which is a major caveat, try e.g. summary(margins::margins(nnet::multinom(Y ~ D + x1*x2, data=df1), vce="bootstrap")). – Allin 24/11, 2022 at 10:7

The terminology for “marginal effects” is very inconsistent across disciplines. Since you refer to the margins package, I assume that you use the expression “Average Marginal Effects” in the same that that the margins developers used it, which is the result of this procedure:

Compute the slope of the outcome with respect to D for every row in the original dataset (unit-level marginal effects).
Take the average of the unit-level slopes (average marginal effect)

In models like nnet::multinom, the slopes will be different for every level of the outcome variable. There will thus be one average marginal effect per level, per regressor.

Using the marginaleffects package and the data you supplied, we get:

library(nnet)
library(marginaleffects)

mod <- nnet::multinom(Y ~ D + x1*x2, data=df1, trace = FALSE)

marginaleffects(mod) |> summary()

       Group Term    Effect Std. Error z value   Pr(>|z|)      2.5 %    97.5 %
    1      1    D -0.027558   0.004183 -6.5878 4.4625e-11 -3.576e-02 -0.019359
    2      1   x1 -0.026789   0.003916 -6.8411 7.8596e-12 -3.446e-02 -0.019114
    3      1   x2 -0.026542   0.009812 -2.7051 0.00682871 -4.577e-02 -0.007311
    4      2    D -0.012115   0.004702 -2.5766 0.00997729 -2.133e-02 -0.002899
    5      2   x1 -0.018223   0.006017 -3.0287 0.00245619 -3.002e-02 -0.006430
    6      2   x2 -0.007045   0.013101 -0.5377 0.59078427 -3.272e-02  0.018633
    7      3    D  0.001536   0.005877  0.2614 0.79380433 -9.982e-03  0.013054
    8      3   x1  0.012451   0.008775  1.4189 0.15592516 -4.748e-03  0.029650
    9      3   x2  0.002193   0.015573  0.1408 0.88801728 -2.833e-02  0.032715
    10     4    D  0.016300   0.004325  3.7689 0.00016399  7.823e-03  0.024776
    11     4   x1  0.018111   0.008789  2.0606 0.03934167  8.845e-04  0.035338
    12     4   x2  0.013543   0.013266  1.0208 0.30733424 -1.246e-02  0.039544
    13     5    D  0.021837   0.003387  6.4479 1.1343e-10  1.520e-02  0.028475
    14     5   x1  0.014449   0.005402  2.6749 0.00747469  3.862e-03  0.025037
    15     5   x2  0.017851   0.009072  1.9677 0.04909878  7.048e-05  0.035631

    Model type:  multinom 
    Prediction type:  probs

Extortioner answered 24/11, 2022 at 17:33 Comment(0)

Best attempt

Failed attempts

Data

Recommended topics

Hot tags