Can scipy.optimize minimize functions of complex variables at all, and if so, how?

I am trying to minimize a function of a complex (vector) variable using scipy.optimize. My results so far indicate that it may not be possible. To investigate the problem, I have implemented a simple example - minimize the 2-norm of a complex vector with an offset:

import numpy as np
from scipy.optimize import fmin

def fun(x):
    return np.linalg.norm(x - 1j * np.ones(2), 2)

sol = fmin(fun, x0=np.ones(2) + 0j)

The output is

Optimization terminated successfully.
         Current function value: 2.000000
         Iterations: 38
         Function evaluations: 69

>>> sol
array([-2.10235293e-05,  2.54845649e-05])

Clearly, the solution should be

array([0.+1.j, 0.+1.j])

Disappointed with this outcome, I have also tried scipy.optimize.minimize:

from scipy.optimize import minimize

def fun(x):
    return np.linalg.norm(x - 1j * np.ones(2), 1)

sol = minimize(fun, x0=np.ones(2) + 0j)

The output is

>>> sol
      fun: 2.0
 hess_inv: array([[ 9.99997339e-01, -2.66135332e-06],
       [-2.66135332e-06,  9.99997339e-01]])
      jac: array([0., 0.])
  message: 'Optimization terminated successfully.'
     nfev: 24
      nit: 5
     njev: 6
   status: 0
  success: True
        x: array([6.18479071e-09+0.j, 6.18479071e-09+0.j])

Not good either. I have tried specifying all of the possible methods for minimize (supplying the Jacobian and Hessian as necessary), but none of them reach the correct result. Most of them cause ComplexWarning: Casting complex values to real discards the imaginary part, indicating that they cannot handle complex numbers correctly.

Is this possible at all using scipy.optimize?

If so, I would very much appreciate it if someone could tell me what I am doing wrong.

If not, do you perhaps have suggestions for alternative optimization tools (for Python) that allow this?

Steamboat answered 6/7, 2018 at 13:3 Comment(0)

The minimization methods of SciPy work with real arguments only. But minimization on the complex space $\mathbb{C}^n$ amounts to minimization on $\mathbb{R}^{2n}$; the algebra of complex numbers never enters into consideration. Thus, by adding two wrappers for conversion from $\mathbb{C}^n$ to $\mathbb{R}^{2n}$ and back, you can optimize over complex numbers.

def real_to_complex(z):      # real vector of length 2n -> complex of length n
    return z[:len(z)//2] + 1j * z[len(z)//2:]

def complex_to_real(z):      # complex vector of length n -> real of length 2n
    return np.concatenate((np.real(z), np.imag(z)))

sol = minimize(lambda z: fun(real_to_complex(z)), x0=complex_to_real(np.ones(2) + 0j))
print(real_to_complex(sol.x))   # [-7.40376620e-09+1.j -8.77719406e-09+1.j]

You mention Jacobian and Hessian... but minimization only makes sense for real-valued functions, and a non-constant real-valued function is never holomorphic (complex-differentiable on an open set). The Jacobian and Hessian would have to be computed over $\mathbb{R}^{2n}$ anyway, treating the real and imaginary parts as separate variables.
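
As a minimal sketch of what supplying a Jacobian over $\mathbb{R}^{2n}$ could look like: the example below swaps the norm from the question for the squared 2-norm (an assumption made here so that the objective is smooth at the minimizer) and passes a hand-derived real gradient to minimize. The names c, fun_real and jac_real are illustrative and do not come from the question.

```python
import numpy as np
from scipy.optimize import minimize

c = 1j * np.ones(2)  # complex offset from the question

def real_to_complex(z):      # real vector of length 2n -> complex of length n
    n = len(z) // 2
    return z[:n] + 1j * z[n:]

def complex_to_real(z):      # complex vector of length n -> real of length 2n
    return np.concatenate((np.real(z), np.imag(z)))

def fun_real(z):
    # Squared 2-norm of (x - c), written as a function of the stacked real vector.
    d = real_to_complex(z) - c
    return np.sum(np.abs(d) ** 2)

def jac_real(z):
    # Gradient with respect to the real parts, then the imaginary parts:
    # d/du |d|^2 = 2 Re(d), d/dv |d|^2 = 2 Im(d).
    d = real_to_complex(z) - c
    return 2.0 * complex_to_real(d)

sol = minimize(fun_real, x0=complex_to_real(np.ones(2) + 0j),
               jac=jac_real, method="BFGS")
x_opt = real_to_complex(sol.x)
print(x_opt)
```

The gradient is an ordinary real gradient of length 2n; no complex derivative is involved at any point.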

Cachepot answered 6/7, 2018 at 16:20 Comment(3)
I am not sure I completely understand: are you saying that I cannot minimize $\|x - c\|_2$ if $x$ and/or $c$ are complex? The norm of a complex vector seems to make sense to me and as such the minimizer of this small example is $x = c$. - Steamboat
I'm not saying that. I'm saying the function being minimized is $f(x) = \|x - c\|$, which is real-valued, hence not differentiable in the complex sense. - Cachepot
You might want to look into the Wirtinger derivative. In, the author explained, "Due to the property that the non-trivial real-valued functions are not analytic, stationary points of the real-valued f(z) cannot be obtained by searching for points z where the derivative f'(z) is zero. However, we can detect stationary points z of f(z) by a vanishing differential df." As for optimization algorithms in complex variables, you can read this article:

I have needed to minimize the misfit between data and a complex-valued model function that depends on complex-valued parameters, over a real domain.

A toy example:

import numpy as np

def f(x, a, b):
    ab = complex(a, b)       # complex parameter a + jb
    return np.exp(x * ab)

And suppose that I have data DATA for x = np.arange(N). Note that x is real.

What I did was this:

def helper(x, a, b):
    idx = np.asarray(x, dtype=int)   # curve_fit may pass x as floats; DATA needs integer indices
    return np.abs(f(x, a, b) - DATA[idx])

and then I can use curve_fit():

from scipy.optimize import curve_fit
popt, pcov = curve_fit(helper, np.arange(N), np.zeros(N), p0=[1, 0])

What is happening is this: By subtracting the data from the model function, the new "ideal" output is all zeroes, which can be (must be) real in order for curve_fit() to work. The complex parameter ab = a + jb has been broken into its real and imaginary parts. The helper() function returns the absolute value of the difference between the model and the data.

A critical point is that curve_fit() evaluates the function only at the x values you give it. Otherwise the lookup into DATA would fail.

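Note that curve_fit() minimizes the sum of squared residuals, so returning abs() of the complex difference gives a least-squares (L2) fit of the complex misfit, not an L1 fit. Returning abs()**2 instead would minimize the sum of fourth powers of the misfit. Which norm to prefer is a topic for another day.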

You could fret, "suppose that the x[] are real but not integers?", since my code requires integers. Well, that's doable, simply by putting the x values into an array and passing the integer indices of that array to curve_fit() instead. There's probably some clever hack using a dictionary that would address this issue, too.
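
A different way to get a least-squares fit, sketched here as an alternative to the abs() helper rather than as the method used above, is to stack the real and imaginary parts of the complex residual and hand them to scipy.optimize.least_squares. The data are held in an array aligned with x, so no lookup by x is needed and x can be any real values. The synthetic data, the "true" parameters and the starting guess are made up for illustration.

```python
import numpy as np
from scipy.optimize import least_squares

def f(x, a, b):
    # Complex-valued model with complex parameter a + jb.
    return np.exp(x * complex(a, b))

# Synthetic, noise-free data (hypothetical values, for illustration only).
x = np.linspace(0.0, 9.5, 20)        # real, non-integer sample points
a_true, b_true = -0.1, 0.5
data = f(x, a_true, b_true)

def residuals(p):
    # Complex misfit, returned as a real vector of length 2 * len(x).
    r = f(x, p[0], p[1]) - data
    return np.concatenate((r.real, r.imag))

fit = least_squares(residuals, x0=[-0.12, 0.48])
a_fit, b_fit = fit.x
print(a_fit, b_fit)
```

The sum of squares of the stacked vector equals the sum of abs()**2 of the complex misfit, so this targets the same objective while staying smooth where the misfit passes through zero.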

Sorry about the code formatting; haven't figured out the markup yet.

Durant answered 12/8, 2022 at 1:2 Comment(0)
