Formatting data for mlogit

I am having a murderous time getting my data set into shape for a multinomial logit analysis with mlogit. My data set is available from the URL in the code below.

I'm getting the following error:

Error in row.names<-.data.frame(*tmp*, value = c("1.Accessible", "1.Accessible", : duplicate 'row.names' are not allowed

I've checked elsewhere and this problem seems to be a common one. I've tried using the alt.levels argument instead of alt.var, and that doesn't work either.

# Load packages
library(RCurl)
library(mlogit)
library(tidyr)
library(dplyr)
# URL where the data is stored
dat.url <- 'https://raw.githubusercontent.com/sjkiss/Survey/master/mlogit.out.csv'
# Get data
dat <- read.csv(dat.url)
# Complete cases only, as it seems mlogit cannot handle missing values or tied data, which you might get here because of median imputation
dat <- dat[complete.cases(dat), ]
# Tidy data to get it into long format
dat.out <- dat %>%
  gather(Open, Rank, -c(1, 9:12))
# Try to replicate the code on pp. 26-27 of http://cran.r-project.org/web/packages/mlogit/vignettes/mlogit.pdf
mlogit.out <- mlogit.data(dat.out, shape = 'long', alt.var = 'Open', choice = 'Rank', id.var = 'X', ranked = TRUE)
# Try this option as per a discussion on stackexchange
mlogit.out <- mlogit.data(dat.out, shape = 'long', alt.levels = 'Open', choice = 'Rank', id.var = 'X', ranked = TRUE)
Carbazole answered 15/6, 2015 at 18:15 Comment(1)
Ah. Use the reshape/reshape2/cast packages. You're giving me vague flashbacks to when I similarly spent a couple of days trying to massage my data into mlogit's form, wrangling with reshape/reshape2/cast. Finally I found out that on my particular problem, mlogit underperformed other algorithms. Oh how I laughed. Good times, good times. Unders
L
-1
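# Sort by respondent (X), then by alternative and rank, so that each respondent's rows are contiguous before calling mlogit.data()
dat.out <- dat %>%
  gather(Open, Rank, -c(1, 9:12)) %>%
  arrange(X, Open, Rank)
mlogit.out <- mlogit.data(dat.out, shape = 'long', alt.var = 'Open', choice = 'Rank', ranked = TRUE, child.var = 'X')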

head(mlogit.out)
              X economic gender  age                     Job        Open  Rank
1.Accessible  1        5   Male 1970 Professional journalist  Accessible FALSE
1.Information 1        5   Male 1970 Professional journalist Information FALSE
1.Responsive  1        5   Male 1970 Professional journalist  Responsive  TRUE
1.Debate      1        5   Male 1970 Professional journalist      Debate FALSE
1.Officials   1        5   Male 1970 Professional journalist   Officials FALSE
1.Social      1        5   Male 1970 Professional journalist      Social FALSE
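A minimal base-R sketch of why the sort seems to matter. It does not call mlogit at all. It assumes that, when no choice-situation id is supplied, row names are built positionally as "<block number>.<alternative>", with each consecutive block of J rows treated as one respondent. The toy values and the helper names (wide, items, positional_id) are made up for illustration.

```r
# Toy wide data: one row per respondent, one column per ranked item.
# (Hypothetical values; column names mirror the question's data.)
wide <- data.frame(
  X           = 1:3,
  Accessible  = c(2, 1, 3),
  Information = c(1, 3, 2),
  Responsive  = c(3, 2, 1)
)
items <- c("Accessible", "Information", "Responsive")

# Base-R stand-in for gather(): stack the item columns, so rows come out
# grouped by item (all "Accessible" rows first).
long <- data.frame(
  X    = rep(wide$X, times = length(items)),
  Open = rep(items, each = nrow(wide)),
  Rank = unlist(wide[items], use.names = FALSE)
)

# Assumption: each consecutive block of length(items) rows is one respondent.
positional_id <- rep(seq_len(nrow(wide)), each = length(items))

# Unsorted: the first block is three "Accessible" rows, so names collide.
names_unsorted <- paste(positional_id, long$Open, sep = ".")
anyDuplicated(names_unsorted) > 0   # TRUE

# Sorted by respondent: each block now really is one respondent.
sorted <- long[order(long$X, long$Open), ]
names_sorted <- paste(positional_id, sorted$Open, sep = ".")
anyDuplicated(names_sorted) > 0     # FALSE
```

In the unsorted case the first generated names are "1.Accessible", "1.Accessible", which matches the duplicate 'row.names' error in the question.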
Loricate answered 15/6, 2015 at 21:37 Comment(0)
K
0

My suggestion is that you try the multinom() function in the nnet package. It doesn't require the special format of mlogit or mnlogit.

library(RCurl)
library(nnet)

# Download the CSV as text, then parse it
csv.text <- getURL("https://raw.githubusercontent.com/sjkiss/Survey/master/mlogit.out.csv")
Data <- read.csv(text = csv.text, header = TRUE)
Data <- na.omit(Data) # Get rid of NAs
# The dependent variable must be a factor
Data$Job <- factor(Data$Job)
# Use "Online blogger" as the reference level; substitute your own choice
Data$Job <- relevel(Data$Job, ref = "Online blogger")
# Run the multinomial logistic regression
# (seems like an awful lot of variables btw)
# Store the model in its own object so the data frame is not overwritten
fit <- multinom(formula = Job ~ Accessible + Information + Responsive + Debate + Officials + Social + Trade.Offs + economic + gender + age, data = Data)
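To see what comes back from multinom() without downloading anything, here is a small sketch on simulated data. The data frame toy and its values are invented for illustration and are not the survey data; only the column names are borrowed from the question. Note that the input stays in wide format, one row per respondent.

```r
library(nnet)  # a recommended package bundled with standard R installs

set.seed(1)
n <- 300
# Simulated stand-in data (hypothetical; not the survey from the question)
toy <- data.frame(
  age    = sample(20:75, n, replace = TRUE),
  gender = factor(sample(c("Female", "Male"), n, replace = TRUE)),
  Job    = factor(sample(c("Online blogger", "Professional journalist", "Other"),
                         n, replace = TRUE))
)
# Same releveling step as above: pick the reference outcome explicitly
toy$Job <- relevel(toy$Job, ref = "Online blogger")

# trace = FALSE silences the optimizer's iteration log
fit <- multinom(Job ~ age + gender, data = toy, trace = FALSE)

coef(fit)                              # one row per non-reference outcome level
probs <- predict(fit, type = "probs")  # one column per outcome level
head(probs)
```

Each row of coef(fit) is a set of log-odds coefficients for one outcome level relative to the reference level, and each row of probs sums to 1.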
Kaikaia answered 15/6, 2015 at 19:16 Comment(7)
One of the reasons I wanted to use the mlogit package is that its vignette deals explicitly with ranked data. My data are ranked, although all the covariates are individual-level covariates. None are alternative-specific covariates. Would I be able to treat this as a straight multinomial logistic regression in that case? Carbazole
Do you mean that the dependent variable is ordinal? If so, you can still use multinomial logistic regression; there will just be a slight loss of power compared to ordinal logistic regression. If you want to run an easy ordinal logistic regression, you can use polr() in the MASS package. It doesn't matter if the independent variables are ordinal. Kaikaia
You just need to order by X, Open, Rank. See my answer. Loricate
Andy, it's ordinal, yes, but also ranked. So, if a person assigns 1 to an item, they cannot assign 1 to any other item. Carbazole
Andy, how would I format the data set above to use with multinom()? I cannot get the effects() command to work in the mlogit environment. Carbazole
@Carbazole see my modified answer. Kaikaia
Note that multinom() uses the Begg & Gray approximation, that is, it splits your multinomial regression with J categories into J-1 logits. This is known to be less efficient than the simultaneous fitting (mlogit() or mnlogit()) under some conditions. Collectanea
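For the polr() suggestion in the comments above, a minimal sketch on simulated data. The variables x, latent, and y are hypothetical. This only illustrates an ordinal response; it does not model the ranking constraint raised above, where each rank can be assigned to one item only.

```r
library(MASS)  # a recommended package bundled with standard R installs

set.seed(1)
n <- 300
x <- rnorm(n)
# Hypothetical latent score with logistic noise, cut into three ordered levels
latent <- 0.8 * x + rlogis(n)
y <- cut(latent, breaks = c(-Inf, -1, 1, Inf),
         labels = c("low", "mid", "high"), ordered_result = TRUE)
toy <- data.frame(x = x, y = y)

# polr() expects an ordered factor response; Hess = TRUE keeps the Hessian
# so that summary() can report standard errors
fit <- polr(y ~ x, data = toy, Hess = TRUE)

coef(fit)   # a single slope shared across the cut points
fit$zeta    # estimated cut points: low|mid and mid|high
```

Unlike multinom(), which estimates a separate coefficient vector for each non-reference level, polr() estimates one slope per predictor plus J-1 cut points.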