How to return cost, grad as tuple for scipy's fmin_cg function

How can I make scipy's fmin_cg use a single function that returns the cost and the gradient as a tuple? The problem with having f for the cost and fprime for the gradient is that I might have to perform the very costly operation that computes both the cost and the gradient twice. Sharing intermediate variables between the two functions can also be troublesome.
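
For example, in my case the cost and the gradient share one expensive pass over the data; a sketch of the kind of function I mean (the least-squares objective here is just for illustration):

import numpy

def cost_and_grad(theta, X, y):
    # One expensive pass over the data produces everything both outputs need.
    residual = X.dot(theta) - y
    cost = 0.5 * residual.dot(residual)
    grad = X.T.dot(residual)
    return cost, grad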

In MATLAB, however, the optimizers (e.g. fminunc) work with a single function that returns the cost and gradient together. I don't see why scipy's fmin_cg cannot provide the same convenience.

Thanks in advance...

Lucrative answered 2/7, 2013 at 16:38 Comment(0)

You can use scipy.optimize.minimize with jac=True; a minimal sketch of that approach is at the end of this answer. If that's not an option for some reason, you can look at how minimize handles this situation internally:

import numpy

class MemoizeJac(object):
    """Decorator that caches the value and gradient of a function each
    time it is called."""
    def __init__(self, fun):
        self.fun = fun
        self.jac = None
        self.x = None

    def __call__(self, x, *args):
        self.x = numpy.asarray(x).copy()
        fg = self.fun(x, *args)
        self.jac = fg[1]
        return fg[0]

    def derivative(self, x, *args):
        if self.jac is not None and numpy.alltrue(x == self.x):
            return self.jac
        else:
            self(x, *args)
            return self.jac

This class wraps a function that returns the function value and gradient together. It keeps a one-element cache of the last argument and checks it to see whether it already knows the result. Usage:

fmemo = MemoizeJac(f)   # f returns (cost, grad) as a tuple
xopt = fmin_cg(fmemo, x0, fmemo.derivative)
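
A complete run might look like this (a small, self-contained sketch; the quadratic cost is made up just to show the wiring):

import numpy
from scipy.optimize import fmin_cg

def f(x):
    # Compute the cost and the gradient in one pass and return them as a tuple.
    cost = numpy.sum((x - 3.0) ** 2)
    grad = 2.0 * (x - 3.0)
    return cost, grad

fmemo = MemoizeJac(f)
xopt = fmin_cg(fmemo, numpy.zeros(5), fmemo.derivative)
print(xopt)  # should be close to [3. 3. 3. 3. 3.]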

The strange thing about the caching class above is that it assumes f is always called before fprime (though not every f call is followed by an fprime call). I'm not sure whether scipy.optimize actually guarantees that, but the code can easily be adapted not to make that assumption. A more robust version of the above (untested):

class MemoizeJac(object):
    def __init__(self, fun):
        self.fun = fun
        self.value, self.jac = None, None
        self.x = None

    def _compute(self, x, *args):
        self.x = numpy.asarray(x).copy()
        self.value, self.jac = self.fun(x, *args)

    def __call__(self, x, *args):
        if self.value is not None and numpy.alltrue(x == self.x):
            return self.value
        else:
            self._compute(x, *args)
            return self.value

    def derivative(self, x, *args):
        if self.jac is not None and numpy.alltrue(x == self.x):
            return self.jac
        else:
            self._compute(x, *args)
            return self.jac
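
For completeness, the minimize route with jac=True mentioned at the top needs none of this machinery; a minimal sketch (again with a made-up quadratic cost):

import numpy
from scipy.optimize import minimize

def cost_and_grad(x):
    # A single function returning (cost, gradient) as a tuple.
    cost = numpy.sum((x - 3.0) ** 2)
    grad = 2.0 * (x - 3.0)
    return cost, grad

res = minimize(cost_and_grad, numpy.zeros(5), method='CG', jac=True)
print(res.x)  # should be close to [3. 3. 3. 3. 3.]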
Palmary answered 2/7, 2013 at 17:12 Comment(4)
+1 Nice way of caching the last call's return! It wouldn't be too hard to overcome that last potential limitation (f called before fprime), right?Moltke
@Jaime: no, just repeat the trick used in derivative. See updated answer.Palmary
Wow, this is such an amazing solution. I just tested my code with something like minimize(fun=self._cost_grad, x0=initial_theta, method='Newton-CG', options={'maxiter': 20, 'disp': True}, jac=True, args=(X, n_features, n_samples)), and I got awesome results. The parameter fun expects a function that returns (cost, grad) as a tuple, and method can simply be changed to L-BFGS, BFGS, CG or any other optimizer in scipy. Thanks a million! I'm surprised this answer isn't better known.Lucrative
@IssamLaradji: well, the SciPy folks already solved the problem by the new scipy.optimize.minimize wrapper.Palmary
