default lambda sequence in glmnet for cross-validation

According to Friedman, Hastie & Tibshirani (2010), the 'strategy is to select a minimum value lambda_min = epsilon * lambda_max, and construct a sequence of K values of lambda decreasing from lambda_max to lambda_min on the log scale. Typical values are epsilon = 0.001 and K = 100.'
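
Written out, the quoted construction amounts to a geometric sequence. A sketch of the closed form follows; the index k is notation introduced here for illustration and is not taken from the paper:

```latex
\lambda_k = \lambda_{\max}\,\epsilon^{(k-1)/(K-1)}, \qquad k = 1, \dots, K
```

This gives \lambda_1 = \lambda_{\max} and \lambda_K = \epsilon\,\lambda_{\max} = \lambda_{\min}, with a constant ratio \epsilon^{1/(K-1)} between consecutive values, which is what "equally spaced on the log scale" means.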

The following example generates data, calculates the lambda path and compares it to that of glmnet:

## Load library and generate some data to illustrate:
library("glmnet")
set.seed(1)
n <- 100
x <- matrix(rnorm(n*20), n, 20)
y <- rnorm(n)

## Standardize variables (use n rather than n - 1 as the variance denominator):
mysd <- function(z) sqrt(sum((z - mean(z))^2) / length(z))
sx <- scale(x, scale = apply(x, 2, mysd))
## scale() already returns a matrix, so no further conversion is needed.

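## Calculate lambda path (first get lambda_max, the smallest lambda
## at which all lasso coefficients are zero):
lambda_max <- max(abs(colSums(sx*y)))/n
## glmnet's default lambda.min.ratio is 1e-4 when nobs > nvars (0.01 otherwise),
## which differs from the 0.001 quoted from the paper:
epsilon <- .0001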
K <- 100
lambdapath <- round(exp(seq(log(lambda_max), log(lambda_max*epsilon), 
                            length.out = K)), digits = 10)
lambdapath

## Compare with glmnet's lambda path:
fitGLM <- glmnet(sx, y)
fitGLM$lambda
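
The lambda_max formula used above can be checked without glmnet. The following is a minimal base-R sketch, assuming the lasso case (alpha = 1) and the soft-thresholding coordinate update described in the paper; the names `grad`, `soft` and `beta_from_zero` are hypothetical helpers introduced here, not glmnet functions. It shows that starting from all-zero coefficients, no coefficient moves at lambda_max, while at least one becomes nonzero just below it.

```r
set.seed(1)
n <- 100
x <- matrix(rnorm(n * 20), n, 20)
y <- rnorm(n)

# Standardize columns with the 1/n variance convention.
sd_n <- function(z) sqrt(mean((z - mean(z))^2))
sx <- scale(x, scale = apply(x, 2, sd_n))

# Inner product of each column with the centered response, divided by n.
# Because the columns of sx are centered, this equals colSums(sx * y) / n
# up to floating-point error.
r <- y - mean(y)
grad <- colMeans(sx * r)
lambda_max <- max(abs(grad))

# Soft-thresholding operator S(z, g) = sign(z) * max(|z| - g, 0).
soft <- function(z, g) sign(z) * pmax(abs(z) - g, 0)

# Coordinate update for every column, starting from beta = 0
# (each standardized column has mean square 1, so no rescaling is needed).
beta_from_zero <- function(lambda) soft(grad, lambda)

all(beta_from_zero(lambda_max) == 0)          # all coefficients stay at zero
any(beta_from_zero(0.99 * lambda_max) != 0)   # one coefficient becomes active
all.equal(lambda_max, max(abs(colSums(sx * y))) / n)  # agrees with the formula above
```

For an elastic-net mixing parameter alpha below 1, this sketch does not apply as written, since the threshold then depends on alpha as well.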

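Note that glmnet does not necessarily compute solutions for all 100 (default) lambda values; it can stop earlier, so compare only the first length(fitGLM$lambda) entries of lambdapath. According to the glmnet.control documentation, the path is cut short when the fractional change in deviance between successive lambda values falls below fdev (default 1e-5) or when the fraction of explained deviance exceeds devmax (default 0.999).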

Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software, 33(1), 1-22.
