Singular gradient error during bootstrapped nls fit to bad data

I have a dataset containing an independent variable and a set of dependent variables. I'd like to fit a function to each set of independent variables, using a bootstrapped nonlinear least squares procedure. In some cases, the independent variables are 'good quality', i.e. fit the function reasonably well. In other cases, they're noisy.

In all cases, I can use nls() to get an estimate of the parameters. However, when the data are noisy, a bootstrap throws the error Error in nls(...) : singular gradient. I can understand why nls fitting to noisy data would fail, e.g. by failing to converge after too many iterations, but I don't understand why it is a singular gradient error, and why I only get it resampled datasets of poor quality.

Code:

require(ggplot2)
require(plyr)
require(boot)

# Data are in long form: columns are 'enzyme', 'x', and 'y'
enz <- read.table("http://dl.dropbox.com/s/ts3ruh91kpr47sj/SE.txt", header=TRUE)

# Nonlinear formula to fit to data
mmFormula <- formula(y ~ (x*Vmax) / (x + Km))

nls is perfectly capable of fitting the data (even if in some cases, like a, I doubt the model fits the data.

# Use nls to fit mmFormula to the data - this works well enough
fitDf <- ddply(enz, .(enzyme), function(x) coefficients(nls(mmFormula, x, start=list(Km=100, Vmax=0.5))))

# Create points to plot for the simulated fits
xGrid <- 0:200
simFits <- dlply(fitDf, .(enzyme), function(x) data.frame(x=xGrid, y=(xGrid * x$Vmax)/(xGrid + x$Km)))
simFits <- ldply(simFits, identity) 

ggplot() + geom_point(data=enz, aes(x=x, y=y)) + geom_line(data=simFits, aes(x=x, y=y)) + 
  facet_wrap(~enzyme, scales="free_y") + aes(ymin=0)

NLS fits of mmFormula to data

Bootstrapping works fine for the good quality data:

# Function to pass to bootstrap; returns coefficients of nls fit to formula
nlsCoef <- function(df, i) {
  KmGuess <- median(df$x)
  VmaxGuess <- max(df$y)
  dfSamp <- df[i,]
  nlsCoef <- coefficients(nls(mmFormula, dfSamp, start=list(Km=100, Vmax=0.5)))
}

eBoot <- boot(subset(enz, enzyme=="e"), nlsCoef, R=1000) #No error

But not for the poor quality data

dBoot <- boot(subset(enz, enzyme=="d"), nlsCoef, R=10)
> Error in nls(mmFormula, dfSamp, start = list(Km = KmGuess, Vmax = VmaxGuess)) : 
   singular gradient

What is causing this error? And what should I do about it, given that I'd like to use plyr to simultaneously perform lots of bootstrap simulations?

#modified function #returns NAs if fit is not sucessfull #does global assignment to store bootstrap permutations nlsCoef <- function(df, i) { KmGuess <- median(df$x) VmaxGuess <- max(df$y) dfSamp <- df[i,] fit <- NULL try(fit <- nls(mmFormula, dfSamp, start=list(Km=100, Vmax=0.5))) if(!is.null(fit)){ res <- coef(fit) } else{ res <- c(Km=NA,Vmax=NA) } istore[k,] <<- i k <<- k+1 res } n <- 100 istore <- matrix(nrow=n+1,ncol=9) k <- 1 dat <- subset(enz, enzyme=="d") dBoot <- boot(dat, nlsCoef, R=n) #permutations that create samples that cannot be fitted nais <- istore[-1,][is.na(dBoot$t[,1]),] #look at first bootstrap sample #that could not be fitted samp <- dat[nais[1,],] plot(y~x,data=samp) fit <- nls(mmFormula, samp, start=list(Km=100, Vmax=0.5)) #error

Recommended topics

Hot tags