I have a dataset containing an independent variable and a set of dependent variables. I'd like to fit a function to each set of independent variables, using a bootstrapped nonlinear least squares procedure. In some cases, the independent variables are 'good quality', i.e. fit the function reasonably well. In other cases, they're noisy.
In all cases, I can use nls()
to get an estimate of the parameters. However, when the data are noisy, a bootstrap throws the error Error in nls(...) : singular gradient
. I can understand why nls
fitting to noisy data would fail, e.g. by failing to converge after too many iterations, but I don't understand why it is a singular gradient error, and why I only get it resampled datasets of poor quality.
Code:
require(ggplot2)
require(plyr)
require(boot)
# Data are in long form: columns are 'enzyme', 'x', and 'y'
enz <- read.table("http://dl.dropbox.com/s/ts3ruh91kpr47sj/SE.txt", header=TRUE)
# Nonlinear formula to fit to data
mmFormula <- formula(y ~ (x*Vmax) / (x + Km))
nls
is perfectly capable of fitting the data (even if in some cases, like a
, I doubt the model fits the data.
# Use nls to fit mmFormula to the data - this works well enough
fitDf <- ddply(enz, .(enzyme), function(x) coefficients(nls(mmFormula, x, start=list(Km=100, Vmax=0.5))))
# Create points to plot for the simulated fits
xGrid <- 0:200
simFits <- dlply(fitDf, .(enzyme), function(x) data.frame(x=xGrid, y=(xGrid * x$Vmax)/(xGrid + x$Km)))
simFits <- ldply(simFits, identity)
ggplot() + geom_point(data=enz, aes(x=x, y=y)) + geom_line(data=simFits, aes(x=x, y=y)) +
facet_wrap(~enzyme, scales="free_y") + aes(ymin=0)
Bootstrapping works fine for the good quality data:
# Function to pass to bootstrap; returns coefficients of nls fit to formula
nlsCoef <- function(df, i) {
KmGuess <- median(df$x)
VmaxGuess <- max(df$y)
dfSamp <- df[i,]
nlsCoef <- coefficients(nls(mmFormula, dfSamp, start=list(Km=100, Vmax=0.5)))
}
eBoot <- boot(subset(enz, enzyme=="e"), nlsCoef, R=1000) #No error
But not for the poor quality data
dBoot <- boot(subset(enz, enzyme=="d"), nlsCoef, R=10)
> Error in nls(mmFormula, dfSamp, start = list(Km = KmGuess, Vmax = VmaxGuess)) :
singular gradient
What is causing this error? And what should I do about it, given that I'd like to use plyr
to simultaneously perform lots of bootstrap simulations?
KmGuess
) by first fitting Lineweaver-Burk usinglm
. – Fasciatenls
fits (without bootstrapping) work fine with relatively bad starting guesses, e.g. Km=100, Vmax=0.5; b.) when I change the bootstrap function to those same starting guesses I get the same error, and c.) I think bad starting guesses usually cause a failure-to-converge error rather than a singular gradient error. – Weldnls2
can help with that). A different optimization algorithm might also help. But if the data strongly violates the model, it is impossible to fit and that could happen during the bootstrap. – Fasciatex
distribution? – Kommunarsk