glmnet error (nulldev == 0) stop("y is constant; gaussian glmnet fails at standardization step")

I am running the following (truncated) code using glmnet in R

# do a lot of things to create the design matrix called x.design

> glmnet(x.design, y, thresh=1e-11)

where x.design is an n x p design matrix with n > p, and y is an n x 1 vector of responses obtained using kernel density estimation. Both x.design and y contain real entries. I get the following error message when I run my code:

Error in if (nulldev == 0) stop("y is constant; gaussian glmnet fails at 
standardization step") : missing value where TRUE/FALSE needed 

I have visited and read

Running glmnet package in R, getting error "missing value where TRUE/FALSE needed", maybe due to missing values?

however, I could not figure out a way to fix my issue.

Could someone suggest a solution please?

Valonia answered 18/2, 2019 at 21:46 Comment(4)
'dput(x.design)' and 'dput(y)' are too large to copy and paste. x.design is a 658 x 15 matrix and y is a 658 x 1 vector. - Valonia
Please copy and paste the output of str(x.design) and str(y). - Paquette
@MarcoSandri: Thank you, I figured out the issue with your help. After typing in 'str(y)' I discovered that the kernel density estimate had produced some NaNs. - Valonia
Glad to help! - Paquette

It seems that your response vector y is constant. glmnet tries to standardize it (probably by subtracting the mean and then dividing by the standard deviation), and cannot because the standard deviation is 0. Print y and its variance to be sure.

You should also check your kernel estimation procedure.
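
A quick way to run those checks is sketched below. The y here is a made-up stand-in (the real one comes from the asker's kernel density estimate), and the last step only illustrates how an NA condition inside if() leads to this kind of error message; it is not glmnet's actual code.

```r
# Hypothetical stand-in for the response vector; the real y is not shown in the question.
y <- c(0.12, 0.40, NaN, 0.33, 0.27)

str(y)                          # structure of y, as suggested in the comments
n_missing <- sum(is.na(y))      # is.na() is TRUE for both NA and NaN
v_all <- var(y)                 # missing as soon as y contains a missing value
v_obs <- var(y, na.rm = TRUE)   # variance of the observed values only

# Comparing a missing value with 0 gives NA, and if (NA) raises an error
# instead of choosing a branch, so tryCatch() returns the error message here.
msg <- tryCatch(
  if (v_all == 0) "constant" else "not constant",
  error = function(e) conditionMessage(e)
)
print(msg)
```

If n_missing is greater than zero, the problem is missing values rather than a constant y.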

Krisha answered 22/2, 2019 at 4:39 Comment(1)
Turns out the issue was that y had NaNs in it. - Valonia

Try removing the missing values (NA/NaN) from your data with na.omit(data).
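
Since the question keeps the predictors and the response in separate objects, one way to apply na.omit() is to bind them together first so that whole rows are dropped at once. This is only a sketch: the small x.design and y below are invented for illustration, and the names data, clean, x.clean and y.clean are placeholders.

```r
# Hypothetical small example; x.design and y stand in for the asker's objects.
x.design <- matrix(1:10, nrow = 5, ncol = 2)
y <- c(0.12, 0.40, NaN, 0.33, 0.27)

# Bind response and predictors so that na.omit() removes complete rows.
data <- cbind(y = y, x.design)
clean <- na.omit(data)          # drops every row containing an NA or NaN

# Split the cleaned matrix back into response and design matrix.
y.clean <- clean[, "y"]
x.clean <- clean[, -1, drop = FALSE]

dim(x.clean)                    # one row fewer than the original x.design
```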

Geostatic answered 5/10, 2019 at 3:12 Comment(0)

A more general answer to this question is that glmnet, unlike other regression functions in R, does not handle missing values of any kind (NAs, NaNs, or otherwise), as described here for instance. It only works with complete cases in that sense.

So, the solution I propose to the error message above is to remove all rows from the input matrix x.design that correspond to non-numeric values in the response vector y. Something like this would do, for instance:

x.design <- x.design[grep("\\d", y), ]

This code simply uses regular expressions to select rows of the response vector that contain digits (literal numbers) and subsets the input matrix according to those rows (rows that the glmnet function can actually use).

Then you also subset your response vector the same way and you are good to go (naturally, it is important to subset the response vector after the input matrix):

y <- y[grep("\\d", y)]
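
An alternative that avoids the detour through character strings is to build one logical mask and apply it to both objects, so the order of subsetting no longer matters. This is a sketch on simulated data, not the asker's; keep is a made-up name, and like the regular expression above it also drops Inf values, which may or may not be what you want. The glmnet call is guarded so the snippet still runs without the package.

```r
set.seed(1)

# Simulated stand-ins for the asker's objects (20 rows, 3 predictors).
x.design <- matrix(rnorm(60), nrow = 20, ncol = 3)
y <- c(rnorm(17), NaN, NA, Inf)

# Keep rows whose response is a finite number and whose predictors are complete.
keep <- is.finite(y) & complete.cases(x.design)

x.design <- x.design[keep, , drop = FALSE]
y <- y[keep]

# Fit only if glmnet is installed; thresh is taken from the question.
if (requireNamespace("glmnet", quietly = TRUE)) {
  fit <- glmnet::glmnet(x.design, y, thresh = 1e-11)
}
```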
Gypsophila answered 7/5, 2023 at 18:6 Comment(0)
