Multivariate Bisection Method
I need an algorithm to perform a 2D bisection method for solving a 2x2 non-linear problem. Example: two equations f(x,y)=0 and g(x,y)=0 which I want to solve simultaneously. I am very familiar with the 1D bisection (as well as other numerical methods). Assume I already know the solution lies between the bounds x1 < x < x2 and y1 < y < y2.
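For reference, the 1-D scheme I mean, as a minimal sketch:

def bisect_1d(f, a, b, tol=1e-12):
    # Assumes f is continuous and f(a), f(b) have opposite signs.
    fa = f(a)
    while b - a > tol:
        m = 0.5*(a + b)
        if fa*f(m) <= 0:
            b = m
        else:
            a, fa = m, f(m)
    return 0.5*(a + b)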

In a grid the starting bounds are:

    ^
    |   C       D
y2 -+  o-------o
    |  |       |
    |  |       |
    |  |       |
y1 -+  o-------o
    |   A       B
    o--+------+---->
       x1     x2

and I know the values f(A), f(B), f(C) and f(D), as well as g(A), g(B), g(C) and g(D). To start the bisection, I guess we need to add division points along the edges as well as in the middle.

    ^
    |   C   F   D
y2 -+  o---o---o
    |  |       |
    |G o   o M o H
    |  |       |
y1 -+  o---o---o
    |   A   E   B
    o--+------+---->
       x1     x2

Now considering the possibilities of combinations such as checking if f(G)*f(M)<0 AND g(G)*g(M)<0 seems overwhelming. Maybe I am making this a little too complicated, but I think there should be a multidimensional version of bisection, just as Newton-Raphson can easily be extended to multiple dimensions using gradient operators.
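For instance, the kind of combined sign check I have in mind for one sub-rectangle (a sketch only; quad is a hypothetical list of its four corner points):

def may_contain_root(f, g, quad):
    # quad: the four corner points of one sub-rectangle, e.g. [A, E, M, G]
    fs = [f(x, y) for (x, y) in quad]
    gs = [g(x, y) for (x, y) in quad]
    # True only when both f and g change sign on this sub-rectangle's corners.
    return min(fs) < 0 < max(fs) and min(gs) < 0 < max(gs)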

Any clues, comments, or links are welcomed.

Dourine answered 18/8, 2010 at 15:18 Comment(3)
I don't quite understand. What are your equations? It can't be f(x,y)=0 since f(A)=f(B)=f(C)=f(D)=0 and the same for g.Pearsall
I am plotting the domain of the functions above in (x,y). Think of it as a surface map of where solutions might exist. The actual values of f and g are not shown in the diagram above. At each point on the map my functions f and g have a value and I am trying to find which point on the map makes them both zero at the same time.Dourine
@jalexiou: Okay, thanks. It really is two surfaces then. I get it now.Pearsall

I just stumbled upon the answer to this at geometrictools.com, with C++ code.

edit: the code is now on github.

Dourine answered 8/1, 2014 at 22:17 Comment(0)

Sorry, while bisection works in 1-d, it fails in higher dimensions. You simply cannot break a 2-d region into subregions using only information about the function at the corners of the region and a point in the interior. In the words of Mick Jagger, "You can't always get what you want".

Scratchy answered 18/8, 2010 at 16:23 Comment(4)
Can you elaborate more on "you can't". What kind of information is missing? Think of the fact that f(x,y)=0 is a curve on the grid above and g(x,y)=0 is another curve. Where the two curves intersect is my solution. Think of the functions as one-to-one monotonic but non-linear.Dourine
First of all, there is no reason to presume that for some general problem, that this pair of zero contours are in any sense monotonic. Perhaps you know that, but if so, then why did you not bother to tell us? As it is, I think it most likely that you have no such information. I can give you a very simple function where the zero contour is a circle. z(x,y) = x^2 + y^2 - 1.Scratchy
This seems wrong; 2D bisection is not impossible. See ja72's answer.Mosul
I think what @Scratchy is saying is that, given the stated constraints, there's no guarantee there is a solution, and it may not even be provable. If there's some indication (like the functions being continuous and having to cross somewhere) then that should be stated in the problem.Consignee

I would split the area along a single dimension only, alternating dimensions. The condition you have for the existence of a zero of a single function would be "you have two points of different sign on the boundary of the region", so I'd just check that for the two functions. However, I don't think it would work well, since zeros of both functions in a particular region don't guarantee a common zero (this might even exist in a different region that doesn't meet the criterion).

For example, look at this image:

[figure: zero contours of f and g over a grid of squares, including ABED and EFIH]

There is no way you can distinguish the squares ABED and EFIH given only f() and g()'s behaviour on their boundary. However, ABED doesn't contain a common zero and EFIH does.

This would be similar to region queries using e.g. kD-trees, if you could positively identify that a region doesn't contain a zero of e.g. f. Still, this can be slow under some circumstances.
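One concrete exclusion test of this kind, as a sketch under an assumption not in the original: a known Lipschitz bound L for f on the region.

import math

def region_excludes_zero(f_center, L, half_width, half_height):
    # If |f| at the region's center exceeds the largest change f can make
    # within the region (Lipschitz bound L times the farthest corner
    # distance), then f cannot vanish anywhere in the region.
    radius = math.hypot(half_width, half_height)
    return abs(f_center) > L * radius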

Vertex answered 18/8, 2010 at 16:4 Comment(1)
If the solution is guaranteed to be within the starting range (x1..x2, y1..y2), then it has to exist within one of the subregions (assuming the functions are continuous). There has to be an algorithm out there that finds in which quadrant the solution is.Dourine

If you can assume (per your comment to woodchips) that f(x,y)=0 defines a continuous monotone function y=f2(x), i.e. for each x1<=x<=x2 there is a unique solution for y (you just can't express it analytically due to the messy form of f), and similarly y=g2(x) is a continuous monotone function, then there is a way to find the joint solution.

If you could calculate f2 and g2, then you could use a 1-d bisection method on [x1,x2] to solve f2(x)-g2(x)=0. And you can do that by using 1-d bisection on [y1,y2] again to solve f(x,y)=0 for y at any given fixed x you need to consider (x1, x2, (x1+x2)/2, etc.) - that's where the continuous monotonicity is helpful - and similarly for g. You have to make sure to update x1-x2 and y1-y2 after each step.

This approach might not be efficient, but it should work. Of course, lots of two-variable functions don't intersect the z = 0 plane along continuous monotone curves.
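A minimal sketch of this nested bisection, assuming f(x,y1) and f(x,y2) have opposite signs for every x in [x1,x2] (and likewise for g), and that f2-g2 changes sign on [x1,x2]; it skips the bound-updating refinement mentioned above:

def bisect(h, a, b, tol=1e-12):
    # Standard 1-D bisection; assumes h(a) and h(b) have opposite signs.
    ha = h(a)
    while b - a > tol:
        m = 0.5*(a + b)
        if ha*h(m) <= 0:
            b = m
        else:
            a, ha = m, h(m)
    return 0.5*(a + b)

def solve_joint(f, g, x1, x2, y1, y2):
    f2 = lambda x: bisect(lambda y: f(x, y), y1, y2)  # y solving f(x, y) = 0
    g2 = lambda x: bisect(lambda y: g(x, y), y1, y2)  # y solving g(x, y) = 0
    x = bisect(lambda x: f2(x) - g2(x), x1, x2)       # outer 1-D bisection
    return x, f2(x)

Each outer function evaluation costs a full inner bisection, which is why the approach is slow but keeps bisection's guaranteed bracketing.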

Susa answered 18/8, 2010 at 22:58 Comment(1)
This is a great method, but I am not sure if it retains the properties of bisection. I want a method that guarantees convergence, albeit not very efficiently. Again, I know the problem is bounded because of physical laws. I will check it out though.Dourine

Let f_1(x,y), f_2(x,y) be two functions which are continuous and monotonic with respect to x and y. The problem is to solve the system f_1(x,y) = 0, f_2(x,y) = 0.

The alternating-direction algorithm is illustrated below. Here, the lines depict sets {f_1 = 0} and {f_2 = 0}. It is easy to see that the direction of movement of the algorithm (right-down or left-up) depends on the order of solving the equations f_i(x,y) = 0 (e.g., solve f_1(x,y) = 0 w.r.t. x then solve f_2(x,y) = 0 w.r.t. y OR first solve f_1(x,y) = 0 w.r.t. y and then solve f_2(x,y) = 0 w.r.t. x).

Given the initial guess, we don't know where the root is. So, in order to find all roots of the system, we have to move in both directions.
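A minimal sketch of the double step, assuming each 1-D sub-problem brackets a sign change on its interval (the names and the fixed brackets are illustrative simplifications):

def bisect(h, a, b, tol=1e-12):
    # 1-D bisection; assumes h(a) and h(b) have opposite signs.
    while b - a > tol:
        m = 0.5*(a + b)
        if h(a)*h(m) <= 0:
            b = m
        else:
            a = m
    return 0.5*(a + b)

def alternating_direction(f1, f2, x_lo, x_hi, y_lo, y_hi, n_steps=50):
    x, y = x_lo, y_lo
    for _ in range(n_steps):
        x = bisect(lambda t: f1(t, y), x_lo, x_hi)  # solve f1(x, y) = 0 w.r.t. x
        y = bisect(lambda t: f2(x, t), y_lo, y_hi)  # solve f2(x, y) = 0 w.r.t. y
    return x, y

As the comments below discuss, convergence depends on the assignment of equations to variables, and in general the 1-D brackets must be re-established at each step.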

Talbot answered 22/10, 2019 at 21:20 Comment(10)
Simple and effective, like a 2D version of the single point iteration method for roots.Dourine
Like the Gauss-Seidel method for linear equations, the convergence depends critically on picking a strongly dominant variable in the assignment of equation to variable to isolate for. Else you can as easily spiral away from the common root. Now finding the correct assignment can be as hard as finding the root itself. Also, if there is convergence, the rate of it is slow, linear. Applying Newton this close to the root will converge faster. And finally, how is this related to the bisection method or any other shrinking root enclosure?Ursa
@LutzL Thank you for your response. In the 2D case, the assignment of equation to variable at each double step of the algorithm is uniquely defined by 1) the choice of direction of searching for a root and 2) the signs of the functions f_1 and f_2, which are conserved during the iterations. Convergence of this algorithm might not be as fast as Newton's method. However, Newton's method requires computation of the derivatives of f_1 and f_2, and the functions may be known only with significant error (e.g. from a numerical solution of a PDE). So this generalization of 1-D bisection will be more stable.Talbot
What guarantees a sign change in a small interval in the chosen direction in any of the functions? How do you chose the interval without an excessive number of function evaluations?Ursa
@LutzL This algorithm lets us control the residuals: |f_i| < eps. The interval length can be estimated from the maximum value of the derivatives df_i/dx or df_i/dy. To obtain such an estimate, we may pick the interval length on similar problems with a known solution.Talbot
What is a "similar problem", how do you obtain it algorithmically, how do you guarantee that this similarity holds until a root is found? // There is no cheap way to find this maximum value, optimizing a function or its derivative is as difficult as finding the roots, you have thus "enhanced" the given problem by one that is at least as complicated.Ursa
Dear Dr. @LutzL, The problem of finding parameters (x_1, x_2, ..., x_n) corresponding to prescribed measurements {y_1, y_2, ..., y_m} of field u, M_i[u({x_j})] = y_i, leads to nonlinear algebraic system {f_i(x_1, x_2, ..., x_n) = 0, i=1..n} with monotonic and continuous functions f_i. So, if M_i are fixed, we may test this functional dependence in a number of points {x = (x_1, ..., x_n)} and get necessary information about it. (continued...)Talbot
@LutzL (...) There is a simpler way to control the error. We may start from some initial guess and go to a domain of specified signs of f_i, for example {f_1(x,y) > 0, f_2(x,y) < 0}. Then we move right (in the figure), and after achieving small residual we may do several steps of small length to go away from the domain, and we'll get a point in the other domain {f_1(x,y) < 0, f_2(x,y) > 0}. (continued...)Talbot
@LutzL (...) So, if x' belongs to the left domain and x'' belongs to the right domain, then we may iteratively refine the interval [x', x''] by using our algorithm; this is a way to control the accuracy of the approximation. Many thanks for your feedback, it was very useful for us.Talbot
@ja72 yes, the system f1(x,y)=0, f2(x,y)=0 may be written for convenience as g1(x)=g2(x) where f1(x,g1(x))≡0, f2(x,g2(x))≡0, so this method is just a generalization of method of successive iteration for x=g(x), here g1,g2 are decreasing and g is increasing. But in 2D case of monotonic f1 and f2 this algorithm is an overkill because this problem can be solved with common bisection method: if g1(x')<g2(x') and g1(x'')>g2(x'') then the interval [x',x''] may be shortened in a standard way. However, the monotonic algorithm may be used if the interval [x',x''] contains more than one root.Talbot

I don't have much experience with optimization, but I built a solution to this problem with a bisection algorithm like the question describes. I still need to fix a bug in my solution, because in some cases it computes the same root twice, but I think the fix is simple and I will try it later.

EDIT: I saw jpalecek's comment, and now I understand that some premises I assumed are wrong, but the method still works in most cases. More specifically, a zero is guaranteed only if the two functions change sign in opposite directions, but the cases of zeros at the vertices still need to be handled. I think it is possible to build a justified and satisfactory heuristic for that, but it is a little complicated, so for now I consider it more promising to take the function f_abs = abs(f, g) and build a heuristic to find the local minima, looking at the gradient direction at the midpoints of the edges.

Introduction

Consider the configuration in the question:

    ^
    |   C       D
y2 -+  o-------o
    |  |       |
    |  |       |
    |  |       |
y1 -+  o-------o
    |   A       B
    o--+------+---->
       x1     x2

There are many ways to do that, but I chose to use only the corner points (A, B, C, D) and not middle or center points like the question suggests. Assume I have two functions f(x,y) and g(x,y) as you describe. In truth it's generally one function (x,y) -> (f(x,y), g(x,y)).

The steps are the following, and there is a summary (with Python code) at the end.

Step by step explanation

  1. Calculate the product of each scalar function (f and g) with itself at adjacent points. Compute the minimum product for each function along each direction of variation (axis x and y).
Fx = min(f(C)*f(B), f(D)*f(A))
Fy = min(f(A)*f(B), f(D)*f(C))

Gx = min(g(C)*g(B), g(D)*g(A))
Gy = min(g(A)*g(B), g(D)*g(C))

This looks at the products across two opposite sides of the rectangle and computes the minimum of them; a negative value indicates a sign change. It's a bit redundant but works well. Alternatively, you can try other configurations, like using the points (E, F, G and H shown in the question), but I think it makes sense to use the corner points because they better account for the whole area of the rectangle; that is only an impression, though.

  2. Compute the minimum over the two axes for each function.
F = min(Fx, Fy)
G = min(Gx, Gy)

A negative value of F or G indicates the existence of a zero of the corresponding function, f or g, within the rectangle.

  3. Compute the maximum of them:
max(F, G)

If max(F, G) < 0, then there is a root inside the rectangle. Additionally, if f(C) = 0 and g(C) = 0, there is a root too and we do the same; but if the root is at another corner we ignore it, because another rectangle will compute it (I want to avoid double computation of roots). The statement below summarizes this:

guaranteed_contain_zeros = max(F, G) < 0 or (f(C) == 0 and g(C) == 0)

In this case we have to proceed by breaking the region recursively until the rectangles are as small as we want.

Otherwise, a root may still exist inside the rectangle. Because of that, we have to use some criterion to keep breaking regions until we reach a minimum granularity. The criterion I used is to assert that the largest dimension of the current rectangle is smaller than the smallest dimension of the original rectangle (delta in the code sample below).

Summary

This Python code summarizes the approach (f and g are passed in as callables taking x and y):

def balance_points(f, g, x_min, x_max, y_min, y_max, delta, eps=2e-32):

    # Corner points of the current rectangle:
    # A bottom-left, B bottom-right, C top-left, D top-right.
    A, B = (x_min, y_min), (x_max, y_min)
    C, D = (x_min, y_max), (x_max, y_max)

    width = x_max - x_min
    height = y_max - y_min
    x_middle = (x_min + x_max)/2
    y_middle = (y_min + y_max)/2

    # Minimum of the pairwise products of corner values; a negative
    # value signals a sign change of that function across the rectangle.
    Fx = min(f(*C)*f(*B), f(*D)*f(*A))
    Fy = min(f(*A)*f(*B), f(*D)*f(*C))
    Gx = min(g(*C)*g(*B), g(*D)*g(*A))
    Gy = min(g(*A)*g(*B), g(*D)*g(*C))

    F = min(Fx, Fy)
    G = min(Gx, Gy)

    largest_dim = max(width, height)
    guaranteed_contain_zeros = max(F, G) < 0 or (f(*C) == 0 and g(*C) == 0)

    if guaranteed_contain_zeros and largest_dim <= eps:
        return [(x_middle, y_middle)]
    elif guaranteed_contain_zeros or largest_dim > delta:
        # Split along the longer side and recurse on both halves.
        if width >= height:
            return (balance_points(f, g, x_min, x_middle, y_min, y_max, delta, eps)
                    + balance_points(f, g, x_middle, x_max, y_min, y_max, delta, eps))
        else:
            return (balance_points(f, g, x_min, x_max, y_min, y_middle, delta, eps)
                    + balance_points(f, g, x_min, x_max, y_middle, y_max, delta, eps))
    else:
        return []

Results

I have used similar code in a personal project (GitHub here) and it draws the rectangles of the algorithm and the root (the system has a balance point at the origin): [image: the rectangles explored by the algorithm around the root]

It works well.

Improvements

In some cases the algorithm computes the same zero twice. I think this can have two causes:

  • The case where the functions give exactly zero at neighbouring rectangles (because of numerical truncation). In this case the remedy is to increase eps (enlarge the rectangles). I chose eps=2e-32 because 32 bits is half of the precision (on a 64-bit architecture), so it is probable that the function doesn't give an exact zero... but that was more of a guess; I don't know if it is the best choice. However, if we decrease eps too much, we hit the recursion limit of the Python interpreter.
  • The case in which f(A), f(B), etc. are near zero and the product is truncated to zero. I think this can be reduced if we use the product of the signs of f and g in place of the product of the function values (see the sketch after this list).
  • I think it is possible to improve the criterion for discarding a rectangle. It could consider how much the functions vary over the region of the rectangle and how far the functions are from zero; perhaps a simple relation between the average and variance of the function values at the corners. Another (more complicated) way is to use a stack to store the values in each recursion instance and guarantee that these values are converging in order to stop the recursion.
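A minimal sketch of the sign-product idea from the second bullet (sign is a hypothetical helper, tol an assumed tolerance):

def sign(v, tol=1e-12):
    # Map a value to -1, 0 or +1; values within tol of zero count as zero.
    if abs(v) <= tol:
        return 0
    return 1 if v > 0 else -1

# e.g. replace f(*C)*f(*B) with sign(f(*C))*sign(f(*B)) in balance_points,
# so products of small values cannot be truncated to zero:
# Fx = min(sign(f(*C))*sign(f(*B)), sign(f(*D))*sign(f(*A)))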
Cranage answered 27/6, 2021 at 22:16 Comment(0)

This is a similar problem to finding critical points in vector fields (see http://alglobus.net/NASAwork/topology/Papers/alsVugraphs93.ps).

If you have the values of f(x,y) and g(x,y) at the vertices of your quadrilateral and you are in a discrete problem (such that you don't have an analytical expression for f(x,y) and g(x,y), nor the values at other locations inside the quadrilateral), then you can use bilinear interpolation to get two equations (for f and g). For the 2D case the analytical solution is a quadratic equation whose roots (1 root, 2 real roots, 2 imaginary roots) give you 1 solution, 2 solutions, or no solutions, which may lie inside or outside your quadrilateral.

If instead you have analytic expressions for f(x,y) and g(x,y) and want to use them, this is not useful. Instead you could divide your quadrilateral recursively; however, as was already pointed out by jpalecek (2nd post), you would need a way to stop your divisions by figuring out a test that assures you there are no zeros inside a quadrilateral.
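A minimal sketch of the bilinear route on the unit square, with illustrative names: fA = f(0,0), fB = f(1,0), fC = f(0,1), fD = f(1,1), and likewise for g.

import math

def bilinear_coeffs(vA, vB, vC, vD):
    # v(x, y) = a0 + a1*x + a2*y + a3*x*y on the unit square, with
    # v(0,0) = vA, v(1,0) = vB, v(0,1) = vC, v(1,1) = vD.
    return vA, vB - vA, vC - vA, vA - vB - vC + vD

def bilinear_intersections(fA, fB, fC, fD, gA, gB, gC, gD):
    a0, a1, a2, a3 = bilinear_coeffs(fA, fB, fC, fD)
    b0, b1, b2, b3 = bilinear_coeffs(gA, gB, gC, gD)
    # From f = 0: y = -(a0 + a1*x)/(a2 + a3*x); substituting into g = 0
    # gives the quadratic A*x^2 + B*x + C = 0.
    A = a3*b1 - a1*b3
    B = a2*b1 - a1*b2 + a3*b0 - a0*b3
    C = a2*b0 - a0*b2
    if abs(A) < 1e-15:
        xs = [-C/B] if abs(B) > 1e-15 else []    # degenerate: linear in x
    else:
        disc = B*B - 4*A*C
        if disc < 0:
            return []                            # two imaginary roots
        s = math.sqrt(disc)
        xs = [(-B + s)/(2*A), (-B - s)/(2*A)]
    sols = []
    for x in xs:
        den = a2 + a3*x
        if abs(den) > 1e-15:
            y = -(a0 + a1*x)/den
            if 0 <= x <= 1 and 0 <= y <= 1:      # keep solutions inside the cell
                sols.append((x, y))
    return sols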

Teapot answered 15/12, 2011 at 19:51 Comment(1)
Thanks, I will try the bi-linear solution. I will also look into my NR book for inspiration.Dourine
