Jacobian and Hessian inputs in `scipy.optimize.minimize`

I am trying to understand how the "dogleg" method works in Python's scipy.optimize.minimize function. I am adapting the example at the bottom of the help page.

The dogleg method requires a Jacobian and Hessian argument according to the notes. For this I use the numdifftools package:

import numpy as np
from scipy.optimize import minimize
from numdifftools import Jacobian, Hessian

def fun(x,a):
    return (x[0] - 1)**2 + (x[1] - a)**2

x0 = np.array([2,0]) # initial guess
a = 2.5

res = minimize(fun, x0, args=(a), method='dogleg',
               jac=Jacobian(fun)([2,0]), hess=Hessian(fun)([2,0]))

print(res)

Edit:

If I make a change as suggested by a post below,

res = minimize(fun, x0, args=a, method='dogleg',
               jac=Jacobian(lambda x: fun(x,a)),
               hess=Hessian(lambda x: fun(x,a)))

I get an error TypeError: <lambda>() takes 1 positional argument but 2 were given. What am I doing wrong?

Also is it correct to calculate the Jacobian and Hessian at the initial guess x0?

Sleight answered 14/12, 2016 at 7:33 Comment(0)

That error is coming from the calls to Jacobian and Hessian, not from minimize. Replacing Jacobian(fun) with Jacobian(lambda x: fun(x, a)), and similarly for Hessian, should do the trick, since the function being differentiated then only has a single vector argument.

One other thing: (a) is just a; if you want it to be a tuple, use (a,).

import numpy as np
from scipy.optimize import minimize
from numdifftools import Jacobian, Hessian

def fun(x, a):
    return (x[0] - 1)**2 + (x[1] - a)**2

def fun_der(x, a):
    # flatten the Jacobian to the 1-D gradient array that minimize expects
    return Jacobian(lambda x: fun(x, a))(x).ravel()

def fun_hess(x, a):
    return Hessian(lambda x: fun(x, a))(x)

x0 = np.array([2, 0]) # initial guess
a = 2.5

res = minimize(fun, x0, args=(a,), method='dogleg', jac=fun_der, hess=fun_hess)
print(res)
Voltaic answered 14/12, 2016 at 7:41 Comment(6)
Thanks. If I make those changes to Jacobian and Hessian I get an error ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all(). Also if I want to specify more than one argument, would I then need to use something like args=(a,b,c)? – Sleight
In this particular case you actually want to pass in a callable Jacobian function, so you should leave off the ([2, 0]) (I edited my response to reflect this). – Voltaic
OK... If I do that, I get TypeError: <lambda>() takes 1 positional argument but 2 were given. Then if I try using lambda x,a: fun(x, a) as the Jacobian and Hessian inputs I get ValueError: incompatible dimensions. – Sleight
Thinking a bit more about this: there's really no reason you would ever want to do this anyway. If you don't have a closed-form derivative you should be using a method that doesn't require one; numerically computing the derivatives like this is extremely inefficient. – Voltaic
Fair point, but I specifically need to solve my problem using the "dogleg" algorithm in Python (which requires the Jacobian and Hessian). I'm not concerned with efficiency of the code yet, I just want to know how the scipy.optimize.minimize(method='dogleg') function works. – Sleight
I've updated the code to show exactly the steps you need to follow. The args parameter is passed to both the Jacobian and Hessian as well, so you have to wrap those in a function that properly handles that argument too. – Voltaic
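
A small illustration (not from the original thread) of the point in the last comment: minimize forwards args to the jac and hess callables too, and numdifftools in turn passes those extra arguments on to the wrapped function, which is consistent with the TypeError reported in the question's edit.

import numpy as np
from numdifftools import Jacobian

def fun(x, a):
    return (x[0] - 1)**2 + (x[1] - a)**2

a = 2.5
jac = Jacobian(lambda x: fun(x, a))

print(jac(np.array([2.0, 0.0])))   # fine: the wrapped lambda is called with x only
# jac(np.array([2.0, 0.0]), a)     # what minimize effectively does with args=(a,);
#                                  # raises TypeError: <lambda>() takes 1 positional
#                                  # argument but 2 were given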

I get that this is a toy example, but I would like to point out that using a tool like Jacobian or Hessian to calculate the derivatives numerically, instead of deriving the derivative functions yourself, is fairly costly. For example, with your method:

x0 = np.array([2, 0])
a = 2.5
%timeit minimize(fun, x0, args=(a,), method='dogleg', jac=fun_der, hess=fun_hess)
100 loops, best of 3: 13.6 ms per loop

But you could calculate the derivative functions as such:

def fun_der(x, a):
    # hand-derived gradient of fun with respect to x
    dx = 2 * (x[0] - 1)
    dy = 2 * (x[1] - a)
    return np.array([dx, dy])

def fun_hess(x, a):
    # the Hessian of this quadratic is constant
    dx = 2
    dy = 2
    return np.diag([dx, dy])

%timeit minimize(fun, x0, args=(a,), method='dogleg', jac=fun_der, hess=fun_hess)
1000 loops, best of 3: 279 µs per loop

As you can see, that is almost 50x faster, and it really starts to add up with complex functions. As such I always try to derive the functions explicitly myself, regardless of how difficult that may be. One fun example is the kernel-based implementation of Inductive Matrix Completion:

argmin --> sum((A - gamma_u(X) Z gamma_v(Y))**2 - lambda * ||Z||**2)
where gamma_u = (1/sqrt(m_x)) * [cos(UX), sin(UX)] and
gamma_v = (1/sqrt(m_y)) * [cos(VY), sin(VY)]
X.shape = n_x, p; Y.shape = n_y, q; U.shape = m_x, p; V.shape = m_y, q; Z.shape = 2m_x, 2m_y
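
(Not part of the original answer.) To make the shapes in the objective above concrete, here is a rough sketch of the feature maps gamma_u and gamma_v under one assumed convention, namely that gamma_u(X) is (n_x, 2*m_x) and gamma_v(Y) is (n_y, 2*m_y), so that gamma_u(X) Z gamma_v(Y).T has the same shape as A:

import numpy as np

def gamma(W, X):
    # random-feature map: W is (m, p), X is (n, p) -> features of shape (n, 2*m)
    m = W.shape[0]
    XW = X @ W.T
    return np.hstack([np.cos(XW), np.sin(XW)]) / np.sqrt(m)

rng = np.random.default_rng(0)
n_x, n_y, p, q, m_x, m_y = 6, 5, 3, 4, 10, 8
X = rng.standard_normal((n_x, p))
Y = rng.standard_normal((n_y, q))
U = rng.standard_normal((m_x, p))
V = rng.standard_normal((m_y, q))
Z = rng.standard_normal((2 * m_x, 2 * m_y))

A_hat = gamma(U, X) @ Z @ gamma(V, Y).T   # (n_x, n_y) reconstruction of A
print(A_hat.shape)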

Calculating the gradient and Hessian from this equation numerically is extremely unreasonable in comparison to explicitly deriving and using those functions. So, as @bnaul pointed out, if your function does have closed-form derivatives you really do want to calculate and use them.

Crasis answered 9/9, 2017 at 7:7 Comment(2)
Don't know numdifftools, but couldn't you calculate jacf = Jacobian(lambda x: fun(x, 2.5)) just once, then minimize( ... jac=jacf, hess=hessf )? – Logic
@Logic I think you meant to direct that towards bnaul; I don't use numdifftools or these methods. As I mention, I like to derive the functions directly if possible to avoid the overhead of repeatedly calculating them at each iteration. – Crasis
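
For what it's worth, the idea in Logic's comment above can be made to work if the precomputed numdifftools objects are wrapped so that the extra argument minimize forwards is ignored. A minimal sketch (not from the original thread; note it only saves rebuilding the Jacobian/Hessian objects, the finite differencing still happens at every iteration):

import numpy as np
from scipy.optimize import minimize
from numdifftools import Jacobian, Hessian

def fun(x, a):
    return (x[0] - 1)**2 + (x[1] - a)**2

a = 2.5
x0 = np.array([2.0, 0.0])

jacf = Jacobian(lambda x: fun(x, a))    # constructed once, outside minimize
hessf = Hessian(lambda x: fun(x, a))

res = minimize(fun, x0, args=(a,), method='dogleg',
               jac=lambda x, a: jacf(x).ravel(),   # swallow the forwarded a
               hess=lambda x, a: hessf(x))
print(res.x)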

You can use autograd instead:

import numpy as np
from scipy.optimize import minimize
from autograd import jacobian, hessian

def fun(x, a):
    return (x[0] - 1)**2 + (x[1] - a)**2

def fun_der(x, a):
    # autograd computes the exact gradient via automatic differentiation
    return jacobian(lambda x: fun(x, a))(x).ravel()

def fun_hess(x, a):
    # exact Hessian via automatic differentiation
    return hessian(lambda x: fun(x, a))(x)

x0 = np.array([2, 0]) # initial guess
a = 2.5

res = minimize(fun, x0, args=(a,), method='dogleg', jac=fun_der, hess=fun_hess)
print(res)
Shedd answered 30/11, 2021 at 8:7 Comment(1)
Would be great with some more explanation on this answer; just posting code has limited usefulness. – Weaken
