Bootstrap Multinomial regression in R
Asked Answered
C

2

6

I am trying to bootstrap a simple multinomial regression in R, and I am getting an error:

Error in is.data.frame(data) : object 'd' not found

What is really strange is that I am using the same code (adjusted to this particular problem) as in a tutorial for boot package at Quick-R, and that same code also worked when I am using different function (like lm()). For sure, I am doing something stupid, but I do not see what. Please, if anyone can help, I would appreciate a lot.

This is an example:

require(foreign)
require(nnet)
require(boot)

# an example for multinomial logistic regression
ml = read.dta('http://www.ats.ucla.edu/stat/data/hsbdemo.dta')
ml = ml[,c(5,7,3)]

bs <- function(formula, data, indices) {
    d = data[indices,] # allows boot to select sample
    fit = multinom(formula, data=d)
    s = summary(fit)
    return(list(fit$coefficients, fit$standard.errors))
}

# 5 replications
results = list()
results <- boot(
    data=ml, statistic=bs, R=5, parallel='multicore',
    formula=prog~write
)
Carrero answered 16/10, 2015 at 12:17 Comment(2)
Please post the output of traceback() in your question.Subjoinder
I did some sleuthing, and this might lead to a solution: there is something going in with the environment (seen by printing str(fit) from inside the bs-function. It is set to global, as you pass a formula and formula's have the enivronment in which they were created (see ?formula). Don't see how to fix it now though.Ruscio
D
2

The error happens in the summary() part, also the object returned by multinom() does not have coefficients and standard.errors. It seems, that summary.multinom() in turn calculates the hessian from your data, d, which for some reason (probably a scoping issue) cannot be found. A quick fix is to add Hess = TRUE:

bs <- function(formula, data, indices) {
  d = data[indices,] # allows boot to select sample
  fit = multinom(formula, data=d, Hess = TRUE)
  s = summary(fit)
  return( cbind(s$coefficients, s$standard.errors) )
}

# 5 replications
results = list()
results <- boot(
  data=ml, statistic=bs, R=5, parallel='multicore',
  formula=prog~write
)
Defense answered 16/10, 2015 at 15:33 Comment(0)
M
1

Multinomial logistic regression returns a matrix of coefficients using the coef() function. This differs from a lm or glm model which returns a vector of coefficients.

library(foreign)     # read.dta()
library(nnet)        # multinom()
require(boot)        # boot()

# an example for multinomial logistic regression
ml = read.dta('http://www.ats.ucla.edu/stat/data/hsbdemo.dta')
ml = ml[,c(5,7,3)]

names(ml)

bs <- function(formula, data, indices) {
  d = data[indices,] # allows boot to select sample
  fit = multinom(formula, data=d, maxit=1000, trace=FALSE)
  #s = summary(fit)
  #return(list(fit$coefficients, fit$standard.errors))

  estimates <- coef(fit)
  return(t(estimates))
}

# enable parallel

library(parallel)
cl <- makeCluster(2)
clusterExport(cl, "multinom")

# 10000 replications
set.seed(1984)

results <- boot(
  data=ml, statistic=bs, R=10000, parallel = "snow", ncpus=2, cl=cl,
  formula=prog~write
)

# label the estimates

subModelNames <- colnames(results$t0)
varNames <- rownames(results$t0)

results$t0

estNames <- apply(expand.grid(varNames,subModelNames),1,function(x) paste(x,collapse="_"))

estNames

colnames(results$t) <- estNames

# summary of results

library(car)

summary(results)

confint(results, level=0.95, type="norm")
confint(results, level=0.95, type="perc")
confint(results, level=0.95, type="bca")

# plot the results

hist(results, legend="separate")
Materialist answered 30/12, 2015 at 4:50 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.