How to find good start values for nls function?
I don't understand why I can't fit an nls model to these data. I have tried many different start values and I always get the same error.

Here is what I have been doing:

library(ggplot2)

expFct2 <- function (x, a, b, c)
{
  a * (1 - exp(-x/b)) + c
}
vec_x <- c(77.87, 87.76, 68.6, 66.29)
vec_y <- c(1, 1, 0.8, 0.6)
dt <- data.frame(vec_x = vec_x, vec_y = vec_y)
ggplot(data = dt, aes(x = vec_x, y = vec_y)) + geom_point() +
     geom_smooth(data = dt, method = "nls", formula = y ~ expFct2(x, a, b, c),
       se = FALSE, start = list(a = 1, b = 75, c = -5))

I always get this error:

Error in method(formula, data = data, weights = weight, ...) : 
  singular gradient
Trujillo answered 13/3, 2012 at 21:45 Comment(0)

This can be written with two linear parameters (.lin1 and .lin2) and one nonlinear parameter (b) like this:

a*(1-exp(-x/b)) + c  
= (a+c) - a * exp(-x/b)
= .lin1 + .lin2 * exp(-x/b)

where .lin1 = a+c and .lin2 = -a (so a = -.lin2 and c = .lin1 + .lin2). This lets us use the "plinear" algorithm, which only requires a starting value for the single nonlinear parameter (eliminating the problem of choosing starting values for the other parameters) and which converges even though the starting value b = 75 is far from the solution:

nls(y ~ cbind(1, exp(-x/b)), start = list(b = 75), alg = "plinear")

Here is the result of a run; the size of .lin2 shows that the problem is badly scaled:

> x <- c(77.87,87.76,68.6,66.29)
> y <- c(1,1,0.8,0.6)
> nls(y ~ cbind(1, exp(-x/b)), start = list(b = 75), alg = "plinear")
Nonlinear regression model
  model:  y ~ cbind(1, exp(-x/b)) 
   data:  parent.frame() 
         b      .lin1      .lin2 
 3.351e+00  1.006e+00 -1.589e+08 
 residual sum-of-squares: 7.909e-05

Number of iterations to convergence: 9 
Achieved convergence tolerance: 9.887e-07 
> R.version.string
[1] "R version 2.14.2 Patched (2012-02-29 r58660)"
> win.version()
[1] "Windows Vista (build 6002) Service Pack 2"

EDIT: added sample run and comment on scaling.

Bradawl answered 13/3, 2012 at 23:43 Comment(2)
With this I get b = 3.351e+00, .lin1 = 1.006e+00, .lin2 = -1.589e+08, and when I calculate a and c and run nls(vec_y ~ expFct2(vec_x, a, b, c), start = list(a = 1.589e+08, b = 75, c = -158899999), control = nls.control(maxiter = 200)), I get this error: Error in nlsModel(formula, mf, start, wts) : singular gradient matrix at initial parameter estimates. I don't understand why. – Trujillo
Normally when running nonlinear optimization you want the parameters to be of roughly the same order of magnitude; I have added a sample run showing the problem. Transform your parameters so that this does not happen. The advantage of the plinear approach is that it's relatively clear how to transform to linearity, and now that we see what it gives, we know which parameters still have to be transformed further. Vincent has already shown how to do that. – Bradawl
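To make the back-transformation concrete, here is a short sketch (reusing the data and the plinear fit from the run above) that recovers a and c from the fitted coefficients; the huge magnitudes of the recovered a and c are exactly the scaling problem mentioned above:

```r
x <- c(77.87, 87.76, 68.6, 66.29)
y <- c(1, 1, 0.8, 0.6)

## plinear: only the nonlinear parameter b needs a starting value
fit <- nls(y ~ cbind(1, exp(-x/b)), start = list(b = 75), algorithm = "plinear")
co <- coef(fit)                      # named: b, .lin1, .lin2

a <- -co[[".lin2"]]                  # a = -.lin2         (about  1.6e8)
c <- co[[".lin1"]] + co[[".lin2"]]   # c = .lin1 + .lin2  (about -1.6e8)
```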

Fitting a three-parameter nonlinear model to four data points is going to be moderately challenging in any case, although in this case the data are well behaved. Point #1 is that your starting value for your c parameter (-5) was way off. Drawing a picture of the curve corresponding to your starting parameters (see below) would help you understand this (so would recognizing that the curve you get will range from c at its minimum to c+a at its maximum, and the range of your data is from 0.6 to 1 ...)

However, even with a better starting guess I found myself fussing with control parameters (i.e. control=nls.control(maxiter=200)), followed by more warnings -- nls is not known for its robustness. So I tried the SSasympOff model, which implements a self-starting version of the curve you want to fit.

start1 <- list(a=1, b=75, c=-5)
start2 <- list(a=0.5, b=75, c=0.5)  ## a better guess

pfun <- function(params) {
  data.frame(vec_x=60:90,
             vec_y=do.call(expFct2,c(list(x=60:90),params)))
}
library(ggplot2)
ggplot(data = dt,aes(x = vec_x, y = vec_y)) +  geom_point() +
  geom_line(data=pfun(start1))+
  geom_line(data=pfun(start2),colour="red")+
  geom_smooth(data=dt, method="nls", formula=y~SSasympOff(x, a, b, c),
              se=FALSE)

My advice in general is that it's easier to figure out what's going on and fix problems if you fit nls outside of geom_smooth and construct the curve you want to add using predict.nls ...
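A sketch of that workflow, assuming the same dt data frame as in the question (the prediction grid 60 to 90 is an arbitrary choice for plotting):

```r
library(ggplot2)

dt <- data.frame(vec_x = c(77.87, 87.76, 68.6, 66.29),
                 vec_y = c(1, 1, 0.8, 0.6))

## fit outside of ggplot, so warnings and coefficients are easy to inspect;
## SSasympOff supplies its own starting values
fit <- nls(vec_y ~ SSasympOff(vec_x, a, b, c), data = dt)

## build the fitted curve on a fine grid with predict() and add it as a layer
grid <- data.frame(vec_x = seq(60, 90, length.out = 101))
grid$vec_y <- predict(fit, newdata = grid)

ggplot(dt, aes(x = vec_x, y = vec_y)) +
  geom_point() +
  geom_line(data = grid)
```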

More generally, the way to get good starting parameters is to understand the geometry of the function you are fitting, and which parameters control which aspects of the curve. As I mentioned above, c is the minimum value of the shifted saturating-exponential curve, a is the range, and b is a scale parameter (you can see that when x=b, the curve is 1-exp(-1) or approximately 2/3 of the way from the minimum to the maximum). Either a bit of algebra and calculus (i.e. taking limits), or playing around with the curve() function, are good ways to gather this information.
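For example, a quick curve() session (the parameter values here are purely illustrative) shows how each parameter moves the curve:

```r
## the function from the question
expFct2 <- function(x, a, b, c) a * (1 - exp(-x/b)) + c

## c sets the value at x = 0, a the range, b the x-scale
curve(expFct2(x, a = 0.4, b = 5,  c = 0.6), from = 0, to = 90, ylab = "y")
curve(expFct2(x, a = 0.4, b = 20, c = 0.6), add = TRUE, lty = 2)  # larger b: slower rise
curve(expFct2(x, a = 0.8, b = 5,  c = 0.2), add = TRUE, lty = 3)  # larger a: wider range

## at x = b the curve has covered 1 - exp(-1), roughly 2/3, of the range a
expFct2(5, a = 0.4, b = 5, c = 0.6) - 0.6   # equals 0.4 * (1 - exp(-1))
```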

Fusty answered 13/3, 2012 at 23:16 Comment(1)
Thank you for your answer. I didn't know about the SSasympOff function. But how can I find the values of a, b and c for my function? If I do getInitial(vec_y ~ SSasympOff(vec_x, 0.5, 75, 0.5), data = dt), those aren't the values for my equation. – Trujillo

I struggle to find an interpretation of your parameters: a is a slope, b the speed of convergence, and a+c the limit, but c by itself does not seem to mean much. After reparametrizing your function, the problem disappears.

f <- function (x, a,b,c) a + c * exp(-x/abs(b))
nls(y~f(x, a, b, c), data=dt, start=list(a=1, b=75, c=-5), trace=TRUE)

However, the value of c looks very, very high: that is probably why the model initially failed to converge.

Nonlinear regression model
  model:  y ~ f(x, a, b, c) 
   data:  dt 
         a          b          c 
 1.006e+00  3.351e+00 -1.589e+08 
 residual sum-of-squares: 7.909e-05

Number of iterations to convergence: 9 
Achieved convergence tolerance: 2.232e-06 

Here is another, more reasonable parametrization of the same function.

g <- function (x, a,b,c) a * (1-exp(-(x-c)/abs(b)))
nls(y~g(x, a, b, c), data=dt, start=list(a=1, b=75, c=-5), trace=TRUE)

Nonlinear regression model
  model:  y ~ g(x, a, b, c) 
   data:  dt 
     a      b      c 
 1.006  3.351 63.257 
 residual sum-of-squares: 7.909e-05

Number of iterations to convergence: 10 
Achieved convergence tolerance: 1.782e-06 
Immitigable answered 13/3, 2012 at 23:41 Comment(4)
OK, but from here, how can I find start values for my function? If I do nls(vec_y ~ expFct2(vec_x, a, b, c), start = list(a = 1.006, b = 3.351, c = 63.257), control = nls.control(maxiter = 200)), I get this error: Error in nlsModel(formula, mf, start, wts) : singular gradient matrix at initial parameter estimates – Trujillo
My suggestion was to reparametrize your function, first to separate the effects of the various parameters, and second, more importantly, to ensure that the optimal values we are looking for have the same order of magnitude (if one is 100,000,000 times larger than the others, you should expect problems). After this reparametrization, the optimization is no longer sensitive to the initial values. – Immitigable
How did you get a * (1-exp(-(x-c)/abs(b))) from a*(1-exp(-x/b)) + c? – Trujillo
(Since I have changed the parameters, I should have renamed them: this is indeed confusing. Let us use primes for the new parameters.) If you expand those expressions, both a' * (1-exp(-(x-c')/abs(b'))) and a*(1-exp(-x/b)) + c are of the form A + B * exp(C*x). – Immitigable
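To make that expansion explicit: a' * (1 - exp(-(x-c')/b')) = a' - a' * exp(c'/b') * exp(-x/b'), while the original form is (a+c) - a * exp(-x/b). Matching terms gives b = b', a = a' * exp(c'/b') and c = a' - a. A quick check with the fitted values quoted in the answer (rounded, so the match is approximate; b' > 0 here, so abs() can be dropped):

```r
ap <- 1.006; bp <- 3.351; cp <- 63.257   # a', b', c' from the fit of g()

b <- bp
a <- ap * exp(cp / bp)   # about  1.6e8: the badly scaled parameter
c <- ap - a              # about -1.6e8

## both parametrizations trace the same curve on the data
x <- c(77.87, 87.76, 68.6, 66.29)
max(abs((a * (1 - exp(-x/b)) + c) - ap * (1 - exp(-(x - cp)/bp))))  # tiny, up to rounding
```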