Why can't I pass only 1 column to glmnet when it is possible in the glm function in R?

Asked 24/3, 2015 at 11:14 Answered 19/12, 2019 at 17:55

Why is it not possible to pass only one explanatory variable to a model with the glmnet function from the glmnet package, when it is possible with the glm function from base R? The code and error are below:

> modelX<-glm( ifelse(train$cliks <1,0,1)~(sparseYY[,40]), family="binomial")
> summary(modelX)

Call:
glm(formula = ifelse(train$cliks < 1, 0, 1) ~ (sparseYY[, 40]), 
    family = "binomial")

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-0.2076  -0.2076  -0.2076  -0.2076   2.8641  

Coefficients:
               Estimate Std. Error  z value Pr(>|z|)    
(Intercept)    -3.82627    0.00823 -464.896   <2e-16 ***
sparseYY[, 40] -0.25844    0.15962   -1.619    0.105    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 146326  on 709677  degrees of freedom
Residual deviance: 146323  on 709676  degrees of freedom
AIC: 146327

Number of Fisher Scoring iterations: 6

> modelY<-glmnet( y =ifelse(train$cliks <1,0,1), x =(sparseYY[,40]), family="binomial"  )
Error in if (is.null(np) | (np[2] <= 1)) stop("x should be a matrix with 2 or more columns")

Biancabiancha answered 24/3, 2015 at 11:14 Comment(3)

It should be noted that you can bind an all 0 column to a one column x variable and glmnet will yield the appropriate 1st coefficient and a coefficient of zero for the all 0 column. x = cbind(sparseYY[, 40], 0) – Gothenburg 27/9, 2017 at 20:28
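The zero-padding workaround from the comment above can be sketched roughly as follows. The data here are simulated purely for illustration (`x1`, `y`, and `x_padded` are made-up names, not the asker's `sparseYY` or `train`), and the fit only runs if the glmnet package is installed. Treat it as a sketch of the idea, not a recommended modelling approach.

```r
# Simulated stand-in data; the names below are illustrative assumptions.
set.seed(1)
n  <- 200
x1 <- rnorm(n)
y  <- rbinom(n, 1, plogis(-0.5 + x1))

# Pad the single predictor with an all-zero column so that x has 2 columns.
x_padded <- cbind(x1 = x1, zero = 0)
dim(x_padded)

# Fit only if glmnet is available, so the snippet still runs without it.
if (requireNamespace("glmnet", quietly = TRUE)) {
  fit <- glmnet::glmnet(x = x_padded, y = y, family = "binomial")
  # Coefficients at the smallest penalty on the fitted path;
  # the padded column is expected to carry no weight.
  print(coef(fit, s = min(fit$lambda)))
}
```

Note that the fit is still penalized, so even at the smallest lambda on the path the estimate for `x1` need not match the unpenalized `glm` estimate exactly.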

The glmnet package implements regularization methods. What would be the purpose of applying LASSO or ridge to fit a model with only one explanatory variable? Why would you want to shrink your one coefficient (ridge) or set it equal to zero (LASSO)? These methods only start to make sense at k >= 2. – Quevedo 2/1, 2018 at 11:1

@AlvaroFuentes fair enough. My mind must have been limited that day. – Biancabiancha 11/1, 2018 at 11:51

Here is an answer I got to this question from the maintainer of the package (Trevor Hastie):

glmnet is designed to select variables from a (large) collection. Allowing for 1 variable would have created a lot of edge case programming, and I was not interested in doing that. Sorry!

Tolle answered 19/12, 2019 at 17:55 Comment(0)

I don't know why, but it's some kind of internal limitation. It does not have to do with the family as Roman claimed above.

glmnet(x = as.matrix(iris[2:4]), y = as.matrix(iris[1]))
## long output
glmnet(x = as.matrix(iris[2]), y = as.matrix(iris[1]))
Error in glmnet(x = as.matrix(iris[2]), y = as.matrix(iris[1])) : 
  x should be a matrix with 2 or more columns

It's a simple check in the code https://github.com/cran/glmnet/blob/master/R/glmnet.R#L20
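To see which inputs trip that check, here is a small base-R sketch. The helper `has_enough_columns` is hypothetical: it paraphrases the condition quoted in the error message above and is not copied from the glmnet source. It uses only the built-in `iris` data.

```r
# Hypothetical helper paraphrasing the condition shown in the error message;
# it is NOT the glmnet source code.
has_enough_columns <- function(x) {
  np <- dim(x)
  !(is.null(np) || np[2] <= 1)
}

m <- as.matrix(iris[1:4])

v  <- m[, 2]                # indexing one column drops to a plain vector: dim() is NULL
m1 <- m[, 2, drop = FALSE]  # still a matrix, but only 150 x 1
m2 <- cbind(m[, 2], 0)      # zero-padded to 150 x 2

has_enough_columns(v)   # FALSE
has_enough_columns(m1)  # FALSE
has_enough_columns(m2)  # TRUE
has_enough_columns(m)   # TRUE
```

So both a bare vector (what `sparseYY[, 40]` style indexing typically yields) and a genuine one-column matrix are rejected; only inputs with at least two columns get through.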

Colombia answered 20/9, 2016 at 14:27 Comment(0)

Because the documentation says so.

For family="binomial" should be either a factor with two levels, or a two-column matrix of counts or proportions (the second column is treated as the target class; for a factor, the last level in alphabetical order is the target class).

You have two options. Either construct a matrix where two columns represent counts, or, convert x into a factor with two levels.

Solecism answered 24/3, 2015 at 11:22 Comment(2)

Still doesn't work: > modelY<-glmnet( y =as.factor(ifelse(train$cliks <1,0,1)), x =as.factor(sparseYY[,40]), family="binomial" ) Error in if (is.null(np) | (np[2] <= 1)) stop("x should be a matrix with 2 or more columns") : – Biancabiancha 24/3, 2015 at 11:38

@MarcinKosinski without a reproducible example, I'm afraid I can't help out any further. Perhaps you could try constructing a full dataset prior to passing it to glmnet? – Dorice 25/3, 2015 at 8:1
