I am running elastic net regularization in caret using glmnet.
I pass sequences of alpha and lambda values to train through tuneGrid,
then I run repeated cross-validation (repeatedcv)
to find the optimal values of alpha and lambda.
Here is an example where the optimal tunings for alpha and lambda are 0.7 and 0.5 respectively:
library(caret)  # provides train(), trainControl(), twoClassSummary()

# Toy data: predictors and a binary outcome (asthma)
age <- c(4, 8, 7, 12, 6, 9, 10, 14, 7, 6, 8, 11, 11, 6, 2, 10, 14, 7, 12, 6, 9, 10, 14, 7)
gender <- make.names(as.factor(c(1, 0, 1, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1)))
bmi_p <- c(0.86, 0.45, 0.99, 0.84, 0.85, 0.67, 0.91, 0.29, 0.88, 0.83, 0.48, 0.99, 0.80, 0.85,
           0.50, 0.91, 0.29, 0.88, 0.99, 0.84, 0.80, 0.85, 0.88, 0.99)
m_edu <- make.names(as.factor(c(0, 1, 1, 2, 2, 3, 2, 0, 1, 1, 0, 1, 2, 2, 1, 2, 0, 1, 1, 2, 2, 0, 1, 0)))
p_edu <- make.names(as.factor(c(0, 2, 2, 2, 2, 3, 2, 0, 0, 0, 1, 2, 2, 1, 3, 2, 3, 0, 0, 2, 0, 1, 0, 1)))
f_color <- make.names(as.factor(c("blue", "blue", "yellow", "red", "red", "yellow",
                                  "yellow", "red", "yellow", "blue", "blue", "yellow", "red", "red", "yellow",
                                  "yellow", "red", "yellow", "yellow", "red", "blue", "yellow", "yellow", "red")))
asthma <- make.names(as.factor(c(1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1)))
x <- data.frame(age, gender, bmi_p, m_edu, p_edu, f_color, asthma)

# Candidate tuning values and resampling scheme
tuneGrid <- expand.grid(alpha = seq(0, 1, 0.05), lambda = seq(0, 0.5, 0.05))
fitControl <- trainControl(method = 'repeatedcv', number = 3, repeats = 5,
                           classProbs = TRUE, summaryFunction = twoClassSummary)

# Tune alpha and lambda by ROC
model.test <- caret::train(asthma ~ age + gender + bmi_p + m_edu + p_edu + f_color, data = x, method = "glmnet",
                           family = "binomial", trControl = fitControl, tuneGrid = tuneGrid,
                           metric = "ROC")
My question:
When I run as.matrix(coef(model.test$finalModel)),
which I assumed would give me the coefficients of the best model, I get 100 different sets of coefficients.
So how do I get the coefficients corresponding to the best tuning?
I've seen this recommendation for getting the best model: coef(model.test$finalModel, model.test$bestTune$lambda)
However, this returns NULL coefficients, and in any case it would only account for the best value of lambda, not the best value of alpha as well.
After searching everywhere on the internet, the only thing I can find that points toward an answer is this blog post, which says that model.test$finalModel
returns the model corresponding to the best alpha tuning, and that coef(model.test$finalModel, model.test$bestTune$lambda)
returns the set of coefficients corresponding to the best value of lambda. If this is true, then it answers my question. However, since this is a single blog post and I can't find anything else to back up the claim, I am still skeptical. Can anyone confirm that model.test$finalModel
returns the model corresponding to the best alpha? If so, this question would be solved. Thanks!
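One way to check the blog post's claim empirically is to refit glmnet by hand at the selected alpha and compare coefficients. The sketch below is not the data from the question: it uses the built-in mtcars data, a small grid, plain 3-fold CV, and the default accuracy metric, all of which are stand-in assumptions, and the names xmat, grid, ctrl, fit, best, and refit are illustrative only. It relies on finalModel carrying the tuneValue and lambdaOpt fields that caret attaches to the final fit. If finalModel really was fitted at the best alpha, the final comparison should report agreement.

```r
library(caret)   # train(), trainControl()
library(glmnet)  # glmnet(), coef() method for glmnet fits

set.seed(1)

# Stand-in data (an assumption, not the data set from the question)
xmat <- as.matrix(mtcars[, c("mpg", "wt", "hp")])
y <- factor(ifelse(mtcars$am == 1, "manual", "automatic"))

# Small illustrative grid and resampling scheme
grid <- expand.grid(alpha = c(0, 0.5, 1), lambda = c(0.01, 0.05, 0.1))
ctrl <- trainControl(method = "cv", number = 3)

fit <- train(x = xmat, y = y, method = "glmnet", family = "binomial",
             trControl = ctrl, tuneGrid = grid)

# Tuning values chosen by resampling, and what caret stored on the final fit
best <- fit$bestTune
print(best)
print(fit$finalModel$tuneValue)
print(fit$finalModel$lambdaOpt)

# Coefficients of the final model at the selected lambda
best_coef <- coef(fit$finalModel, s = best$lambda)
print(best_coef)

# Manual refit on the full data at the selected alpha, for comparison
refit <- glmnet(xmat, y, family = "binomial", alpha = best$alpha)
print(all.equal(as.matrix(best_coef),
                as.matrix(coef(refit, s = best$lambda))))
```

The same comparison can be run on model.test from the question by building the design matrix with model.matrix() and dropping the intercept column before calling glmnet().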
coef(model.test$finalModel, model.test$finalModel$lambdaOpt)
gives the coefs corresponding to the best alpha in addition to lambda? Is that explicitly stated somewhere? It's strange and seems arbitrary that finalModel
would provide the coefficients for all sub-optimal fits of lambda but not the coefficients for all sub-optimal fits of alpha. – Seaton
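The asymmetry raised in this comment appears to come from glmnet itself rather than from caret: a single glmnet() call takes one alpha but fits a whole path of lambda values (nlambda defaults to 100), so any glmnet object holds one coefficient column per lambda on that path. A minimal sketch using only glmnet, again on stand-in mtcars data with an arbitrary alpha of 0.5 (both are assumptions for illustration):

```r
library(glmnet)

# Stand-in data (an assumption, not the data set from the question)
xmat <- as.matrix(mtcars[, c("mpg", "wt", "hp")])
y <- mtcars$am  # 0/1 outcome

# One call, one alpha, many lambdas
path <- glmnet(xmat, y, family = "binomial", alpha = 0.5)

n_lambda <- length(path$lambda)  # number of lambdas actually fitted (at most 100)
all_coefs <- coef(path)          # one column of coefficients per lambda
print(n_lambda)
print(dim(all_coefs))

# Passing s picks out (or interpolates) a single column from the path
one_coef <- coef(path, s = 0.05)
print(one_coef)
```

Under this reading, the many columns returned by coef(model.test$finalModel) are the lambda path for one alpha, and supplying s narrows that to a single set of coefficients.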