I asked this question a while ago. I am not sure whether I should post this as an answer or a new question. I do not have an answer but I "solved" the problem by applying the Levenberg-Marquardt algorithm using nls.lm
in R and when the solution is at the boundary, I run the trust-region-reflective algorithm (TRR, implemented in R) to step away from it. Now I have new questions.
From my experience, doing this way the program reaches the optimal and is not so sensitive to the starting values. But this is only a practical method to step aside from the issues I encounterd using nls.lm
and also other optimization functions in R. I would like to know why nls.lm
behaves this way for optimization problems with boundary constraints and how to handle the boundary constraints when using nls.lm
in practice.
Following I gave an example illustrating the two issues using nls.lm
.
- It is sensitive to starting values.
- It stops when some parameter reaches the boundary.
A Reproducible Example: Focus Dataset D
library(devtools)
install_github("KineticEval","zhenglei-gao")
library(KineticEval)
data(FOCUS2006D)
km <- mkinmod.full(parent=list(type="SFO",M0 = list(ini = 0.1,fixed = 0,lower = 0.0,upper =Inf),to="m1"),m1=list(type="SFO"),data=FOCUS2006D)
system.time(Fit.TRR <- KinEval(km,evalMethod = 'NLLS',optimMethod = 'TRR'))
system.time(Fit.LM <- KinEval(km,evalMethod = 'NLLS',optimMethod = 'LM',ctr=kingui.control(runTRR=FALSE)))
compare_multi_kinmod(km,rbind(Fit.TRR$par,Fit.LM$par))
dev.print(jpeg,"LMvsTRR.jpeg",width=480)
The differential equations that describes the model/system is:
"d_parent = - k_parent * parent"
"d_m1 = - k_m1 * m1 + k_parent * f_parent_to_m1 * parent"
In the graph on the left is the model with initial values, and in the middle is the fitted model using "TRR"(similar to the algorithm in Matlab lsqnonlin
function ), on the right is the fitted model using "LM" with nls.lm
. Looking at the fitted parameters(Fit.LM$par
) you will find that one fitted parameter(f_parent_to_m1
) is at the boundary 1
. If I change the starting value for one parameter M0_parent
from 0.1 to 100, then I got the same results using nls.lm
and lsqnonlin
.I have many cases like this one.
newpars <- rbind(Fit.TRR$par,Fit.LM$par)
rownames(newpars)<- c("TRR(lsqnonlin)","LM(nls.lm)")
newpars
M0_parent k_parent k_m1 f_parent_to_m1
TRR(lsqnonlin) 99.59848 0.09869773 0.005260654 0.514476
LM(nls.lm) 84.79150 0.06352110 0.014783294 1.000000
Except for the above problems, it often happens that the Hessian returned by nls.lm
is not invertable(especially when some parameters are on the boundary) so I cannot get an estimation of the covariance matrix. On the other hand, the "TRR" algorithm(in Matlab) almost always give an estimation by calculating the Jacobian at the solution point. I think this is useful but I am also sure that R optimization algorithms(the ones I have tried) did not do this for a reason. I would like to know whether I am wrong by using the Matlab way of calculating the covariance matrix to get standard error for the parameter estimates.
One last note, I claimed in my previous post that the Matlab lsqnonlin
outperforms R's optimization functions in almost all cases. I was wrong. The "Trust-Region-Reflective" algorithm used in Matlab is in fact slower(sometimes much slower) if also implemented in R as you can see from the above example. However, it is still more stable and reaches a better solution than the R's basic optimization algorithms.