multinomial mixed logit model mlogit r-package
Asked Answered
O

1

7

I discovered the mlogit-package for multinomial logit models in search of estimating a multinomial mixed logit model. After reading the excellent vignette I discovered that I could not apply my data on any of the described examples.

I now write in hope of help with my problem and created a minimal example to illustrate my situation.

The Problem is as follows: There are words with the consonant 'Q' somewhere. Now an experiment was conducted with people who were tasked to listen to these words and say if they heard a Q, an U or some OTHER consonant. This has to modeled in dependence of some factors like syllable position or real/non-real-word.

In the minimal example I created 4 people and their answers with the syllable position.

library(mlogit)
library(nnet)
set.seed(1234)
data <- data.frame(personID = as.factor(sample(1:4, 40, replace=TRUE)),
               decision = as.factor(sample(c("Q","U", "other"), 40, replace=TRUE)),
               syllable = as.factor(sample(1:4, 40, replace=TRUE)))
summary(data)
 personID  decision  syllable
 1:11     other:10   1:18    
 2:10     Q    :18   2: 9    
 3:10     U    :12   3: 5    
 4: 9                4: 8 

As far as I know nnet's multinomfunction does not cover mixed models.

modNnet1 <- multinom(decision ~ syllable, data=data)

First I used the mlogit.data-function to reshape the file. After discussion with a colleague we came to the conclusion that there is no alternative.specific.variable.

 dataMod <- mlogit.data(data, shape="wide", choice="decision", id.var="personID")

 mod1 <- mlogit(formula = decision ~ 0|syllable,
           data = dataMod,
           reflevel="Q", rpar=c(personID="n"), panel=TRUE)
  Error in names(sup.coef) <- names.sup.coef : 
    'names' attribute [1] must be the same length as the vector [0]

 mod2 <- mlogit(formula = decision ~ personID|syllable,
           data = dataMod,
           reflevel="Q", rpar=c(personID="n"), panel=TRUE)
  Error in solve.default(H, g[!fixed]) : 
     Lapack routine dgesv: system is exactly singular: U[3,3] = 0

No I do not know what to do, so I ask for help in here. But I believe this kind of problem can be solved with mlogit and I just don't see it yet ;)

Oaken answered 25/2, 2014 at 16:20 Comment(0)
M
5

The rpar argument accepts only alternative-specific variables. There is no need to specify the person-specific id in the model formula -- this is handled by including id.var = something in the mlogit.data command. For example, if you had an alternative specific covariate acov, you could allow random slopes for acov across a panel:

N = 200
dat <- data.frame(personID = as.factor(sample(1:4, N, replace=TRUE)),
               decision = as.factor(sample(c("Q","U", "other"), N, replace=TRUE)),
               syllable = as.factor(sample(1:4, N, replace=TRUE)),
               acov.Q = rnorm(N), acov.U = rnorm(N), acov.other = rnorm(N))
dataMod <- mlogit.data(dat, shape="wide", choice="decision", id.var="personID", varying = 4:6)
mlogit(formula = decision ~ acov|syllable, rpar = c(acov = "n"), panel = T, data = dataMod)

It seems you are trying to fit a model with a random, person-specific intercept for each alternative (not random slopes). Unfortunately, I don't think you can do so in mlogit (but see this post).

One option that would work to fit random intercepts in the absence of alternative-specific covariates is MCMCglmm.

library(MCMCglmm)
priors = list(R = list(fix = 1, V = 0.5 * diag(2), n = 2),
              G = list(G1 = list(V = diag(2), n = 2)))
m <- MCMCglmm(decision ~ -1 + trait + syllable,
              random = ~ idh(trait):personID,
              rcov = ~ us(trait):units,
              prior = priors,
              nitt = 30000, thin = 20, burnin = 10000,
              family = "categorical",
              data = dat)

Relevant issues are prior selection, convergence of Markov chains, etc. Florian Jaeger's lab's blog has a short tutorial on multinomial models via MCMCglmm that you might find helpful, in addition to the MCMCglmm documentation.

Mandamus answered 25/2, 2014 at 22:45 Comment(4)
Do you by chance know how to implement an effect coding of the syllable variable? I tried dat$syllableEff <- C(dat$syllable, sum,3) and used the formula decision ~ -1 + trait + syllableEff but it did not seem to work. If you don't know, I'll post a new topicOaken
@Oaken You could construct the contrasts manually. After defining syllableEff as in your comment: mmC <- model.matrix(decision ~ 0 + syllable, data = dat) %*% contrasts(dat$syllableEff); colnames(mmC) <- c("s1", "s2", "s3"); dat <- data.frame(dat, mmC) Then use the formula, decision ~ -1 + trait + s1 + s2 + s3 with data = dat.Mandamus
@Oaken Just to follow up with a (hopefully) clearer example, consider the following linear model based off of the simulated data above: dat$y <- rnorm(N); dat$trait <- factor(sample(1:3, N, replace = T)); mod1 <- lm(y ~ 0+trait+syllableEff, data = dat); mod2 <- lm(y ~ 0+trait+mmC, data = dat). The models are identical: coef(mod1) == coef(mod2). Thus creating dummy variables by hand just bypasses an automated step. Obviously the response is multinomial in your case, but the same linear model describes the latent variable (the linear predictor) underlying the multinomial response.Mandamus
Thank you very much Nate. The first comment helped me solving my problem and the second helped in understanding what goes on behind. (Even understood what trait does a little better ;) )Oaken

© 2022 - 2024 — McMap. All rights reserved.