I'm using the SLSQP solver in scipy.minimize to solve a constrained optimization problem. Very often the solver tries parameter values that violate the constraints, and when that happens the objective function returns a nan. This would seem to pose problems, as my approximated Jacobian is full of nan's nearly every time it is recalculated. More often than not, the optimization terminates in exit mode 8: Positive directional derivative for linesearch. I suspect the nan's in the approximated Jacobian are the source of this. My question, then, is how does scipy.minimize handle nan's? Are they benign, or should they be converted to a large (or even infinite) number? To the best of my knowledge, this information is not covered anywhere in the SciPy documentation.
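For context, here is a minimal sketch of the kind of workaround being asked about: guarding the objective so it returns a large finite penalty instead of nan when the parameters wander outside the feasible region. The objective and constraint here are hypothetical, chosen only to make sqrt produce nan for negative inputs.

import numpy as np
from scipy.optimize import minimize

def objective(x):
    # sqrt is NaN for x[0] < 0, which SLSQP's finite-difference Jacobian may probe
    val = np.sqrt(x[0]) + x[1] ** 2
    if not np.isfinite(val):
        return 1e10  # large finite penalty instead of returning NaN
    return val

cons = ({'type': 'ineq', 'fun': lambda x: x[0]},)  # x[0] >= 0

res = minimize(objective, x0=[2.0, 1.0], method='SLSQP', constraints=cons)
print(res.status, res.fun)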
There are checks in scipy for nans depending on which search algorithm you use; you'll have to check the source of each search algorithm. It generally doesn't affect minimisers (unless you use non-discriminatory methods), but it really messes up maximisation. In general, scipy ends up using numpy arrays. The best way to understand what happens is with the following simple example:
>>> import numpy as np
>>> x = [-np.nan, np.nan, 1, 2, 3, np.nan]  # some random sequence of numbers and nans
>>> np.sort(x)
array([ 1.,  2.,  3., nan, nan, nan])
np.nan always sorts to the end of the array (it is treated as if it were the largest value), so you have to account for this explicitly in your search algorithm so that such solutions are rejected from future iterations. As to interpreting +/- nans, see this if the backend implementations are in Fortran, which is sometimes the case.
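A small sketch of what "rejecting them explicitly" can look like in practice (the candidate values below are made up): reductions like argmin do not skip NaN, so either mask the NaN entries first or use the NaN-aware variants.

import numpy as np

# Objective values from several trial points; some are NaN because those
# points violated the constraints.
values = np.array([3.2, np.nan, 1.7, np.nan, 2.5])

# Naive argmin is misled: every comparison with NaN is False, so it returns
# the index of the first NaN entry here.
print(np.argmin(values))      # -> 1 (a NaN entry)

# Explicitly reject NaN candidates before picking the best one:
valid = ~np.isnan(values)
best_index = np.flatnonzero(valid)[np.argmin(values[valid])]
print(best_index, values[best_index])   # -> 2 1.7

# Or use the NaN-aware reduction directly:
print(np.nanargmin(values))   # -> 2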
There is a very advanced minimization routine called Minuit, used in the particle physics community, that is similar to the routine you mention. Both use quasi-Newton methods to estimate the second derivative in order to try to "jump" to the minimum in as few iterations as possible.
These methods typically do not handle bound constraints on their own, and there is a whole different class of algorithms devoted to minimizing functions subject to constraints.
That being said, it is possible in Minuit to set parameter boundaries. The way this is achieved in Minuit is pretty clever. Essentially each parameter is mapped "internally" to:
p_int = arcsin(2*(p_ext-a)/(b-a)-1)
and
p_ext = a + ((b-a)/2)*(sin(p_int)+1)
where a and b are the lower and upper limits respectively, so the whole real line for p_int maps into the interval [a, b] for p_ext. See the Minuit manual here for more details.
I suspect you could do something similar assuming you have linear bounds on each of your parameters.
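A minimal sketch of that idea, assuming a single hypothetical parameter with box bounds a <= p_ext <= b: the unconstrained optimizer only ever sees the internal parameter, so it can never propose a value outside the box.

import numpy as np
from scipy.optimize import minimize

a, b = 0.0, 5.0  # assumed lower and upper bounds for the illustration

def to_external(p_int):
    # p_ext = a + ((b - a) / 2) * (sin(p_int) + 1): always lands in [a, b]
    return a + (b - a) / 2.0 * (np.sin(p_int) + 1.0)

def to_internal(p_ext):
    # p_int = arcsin(2 * (p_ext - a) / (b - a) - 1): inverse mapping
    return np.arcsin(2.0 * (p_ext - a) / (b - a) - 1.0)

def objective_external(p_ext):
    # Hypothetical objective whose unconstrained minimum (p_ext = -2) lies
    # outside the box, so the lower bound actually binds.
    return (p_ext + 2.0) ** 2

def objective_internal(p_int):
    # Minimize in internal coordinates with an unconstrained method.
    return objective_external(to_external(p_int[0]))

res = minimize(objective_internal, x0=[to_internal(2.5)], method='BFGS')
print(to_external(res.x[0]))   # close to the lower bound a = 0.0

A side effect of the transform is that the gradient with respect to p_int vanishes at the bounds (the cos factor goes to zero), which is how an active bound looks like an ordinary stationary point to the unconstrained minimizer.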