How to interpolate a line between two other lines in python
Asked Answered
P

4

21

Note: I asked this question before but it was closed as a duplicate, however, I, along with several others believe it was unduely closed, I explain why in an edit in my original post. So I would like to re-ask this question here again.

Does anyone know of a python library that can interpolate between two lines. For example, given the two solid lines below, I would like to produce the dashed line in the middle. In other words, I'd like to get the centreline. The input is a just two numpy arrays of coordinates with size N x 2 and M x 2 respectively.

enter image description here

Furthermore, I'd like to know if someone has written a function for this in some optimized python library. Although optimization isn't exactly a necessary.

Here is an example of two lines that I might have, you can assume they do not overlap with each other and an x/y can have multiple y/x coordinates.

array([[ 1233.87375018,  1230.07095987],
       [ 1237.63559365,  1253.90749041],
       [ 1240.87500801,  1264.43925132],
       [ 1245.30875975,  1274.63795396],
       [ 1256.1449357 ,  1294.48254424],
       [ 1264.33600095,  1304.47893299],
       [ 1273.38192911,  1313.71468591],
       [ 1283.12411536,  1322.35942538],
       [ 1293.2559388 ,  1330.55873344],
       [ 1309.4817002 ,  1342.53074698],
       [ 1325.7074616 ,  1354.50276051],
       [ 1341.93322301,  1366.47477405],
       [ 1358.15898441,  1378.44678759],
       [ 1394.38474581,  1390.41880113]])

array([[ 1152.27115094,  1281.52899302],
       [ 1155.53345506,  1295.30515742],
       [ 1163.56506781,  1318.41642169],
       [ 1168.03497425,  1330.03181319],
       [ 1173.26135672,  1341.30559949],
       [ 1184.07110925,  1356.54121651],
       [ 1194.88086178,  1371.77683353],
       [ 1202.58908737,  1381.41765447],
       [ 1210.72465255,  1390.65097106],
       [ 1227.81309742,  1403.2904646 ],
       [ 1244.90154229,  1415.92995815],
       [ 1261.98998716,  1428.56945169],
       [ 1275.89219696,  1438.21626352],
       [ 1289.79440676,  1447.86307535],
       [ 1303.69661656,  1457.50988719],
       [ 1323.80994319,  1470.41028655],
       [ 1343.92326983,  1488.31068591],
       [ 1354.31738934,  1499.33260989],
       [ 1374.48879779,  1516.93734053],
       [ 1394.66020624,  1534.54207116]])

Visualizing this we have: enter image description here

So my attempt at this has been using the skeletonize function in the skimage.morphology library by first rasterizing the coordinates into a filled in polygon. However, I get branching at the ends like this:

enter image description here

Pallaten answered 28/2, 2018 at 20:21 Comment(4)
This is probably an underspecified problem - i.e. how is "middle" defined here? (Or worse, when the curves aren't nice and convex like this.)Wist
Do the lines every intersect? Will any x coordinate have multiple y coordinates?Bamford
Can we work with straight-only lines ?Scarify
I have updated the question to answer some of your questions. In terms of the definition of "middle", I'm not so sure how to define it myself. Are there multiple ways to interpret it?Pallaten
T
34

First of all, pardon the overkill; I had fun with your question. If the description is too long, feel free to skip to the bottom, I defined a function that does everything I describe.

Your problem would be relatively straightforward if your arrays were the same length. In that case, all you would have to do is find the average between the corresponding x values in each array, and the corresponding y values in each array.

So what we can do is create arrays of the same length, that are more or less good estimates of your original arrays. We can do this by fitting a polynomial to the arrays you have. As noted in comments and other answers, the midline of your original arrays is not specifically defined, so a good estimate should fulfill your needs.

Note: In all of these examples, I've gone ahead and named the two arrays that you posted a1 and a2.

Step one: Create new arrays that estimate your old lines

Looking at the data you posted:

the data

These aren't particularly complicated functions, it looks like a 3rd degree polynomial would fit them pretty well. We can create those using numpy:

import numpy as np

# Find the range of x values in a1
min_a1_x, max_a1_x = min(a1[:,0]), max(a1[:,0])
# Create an evenly spaced array that ranges from the minimum to the maximum
# I used 100 elements, but you can use more or fewer. 
# This will be used as your new x coordinates
new_a1_x = np.linspace(min_a1_x, max_a1_x, 100)
# Fit a 3rd degree polynomial to your data
a1_coefs = np.polyfit(a1[:,0],a1[:,1], 3)
# Get your new y coordinates from the coefficients of the above polynomial
new_a1_y = np.polyval(a1_coefs, new_a1_x)

# Repeat for array 2:
min_a2_x, max_a2_x = min(a2[:,0]), max(a2[:,0])
new_a2_x = np.linspace(min_a2_x, max_a2_x, 100)
a2_coefs = np.polyfit(a2[:,0],a2[:,1], 3)
new_a2_y = np.polyval(a2_coefs, new_a2_x)

The result:

Fitted Arrays

That's not bad so bad! If you have more complicated functions, you'll have to fit a higher degree polynomial, or find some other adequate function to fit to your data.

Now, you've got two sets of arrays of the same length (I chose a length of 100, you can do more or less depending on how smooth you want your midpoint line to be). These sets represent the x and y coordinates of the estimates of your original arrays. In the example above, I named these new_a1_x, new_a1_y, new_a2_x and new_a2_y.

Step two: calculate the average between each x and each y in your new arrays

Then, we want to find the average x and average y value for each of our estimate arrays. Just use np.mean:

midx = [np.mean([new_a1_x[i], new_a2_x[i]]) for i in range(100)]
midy = [np.mean([new_a1_y[i], new_a2_y[i]]) for i in range(100)]

midx and midy now represent the midpoint between our 2 estimate arrays. Now, just plot your original (not estimate) arrays, alongside your midpoint array:

plt.plot(a1[:,0], a1[:,1],c='black')
plt.plot(a2[:,0], a2[:,1],c='black')
plt.plot(midx, midy, '--', c='black')
plt.show()

And voilà:

final product

This method still works with more complex, noisy data (but you have to fit the function thoughtfully):

noisy data

As a function:

I've put the above code in a function, so you can use it easily. It returns an array of your estimated midpoints, in the format you had your original arrays in.

The arguments: a1 and a2 are your 2 input arrays, poly_deg is the degree polynomial you want to fit, n_points is the number of points you want in your midpoint array, and plot is a boolean, whether you want to plot it or not.

import matplotlib.pyplot as plt
import numpy as np

def interpolate(a1, a2, poly_deg=3, n_points=100, plot=True):

    min_a1_x, max_a1_x = min(a1[:,0]), max(a1[:,0])
    new_a1_x = np.linspace(min_a1_x, max_a1_x, n_points)
    a1_coefs = np.polyfit(a1[:,0],a1[:,1], poly_deg)
    new_a1_y = np.polyval(a1_coefs, new_a1_x)

    min_a2_x, max_a2_x = min(a2[:,0]), max(a2[:,0])
    new_a2_x = np.linspace(min_a2_x, max_a2_x, n_points)
    a2_coefs = np.polyfit(a2[:,0],a2[:,1], poly_deg)
    new_a2_y = np.polyval(a2_coefs, new_a2_x)

    midx = [np.mean([new_a1_x[i], new_a2_x[i]]) for i in range(n_points)]
    midy = [np.mean([new_a1_y[i], new_a2_y[i]]) for i in range(n_points)]

    if plot:
        plt.plot(a1[:,0], a1[:,1],c='black')
        plt.plot(a2[:,0], a2[:,1],c='black')
        plt.plot(midx, midy, '--', c='black')
        plt.show()

    return np.array([[x, y] for x, y in zip(midx, midy)])

[EDIT]:

I was thinking back on this question, and I overlooked a simpler way to do this, by "densifying" both arrays to the same number of points using np.interp. This method follows the same basic idea as the line-fitting method above, but instead of approximating lines using polyfit / polyval, it just densifies:

min_a1_x, max_a1_x = min(a1[:,0]), max(a1[:,0])
min_a2_x, max_a2_x = min(a2[:,0]), max(a2[:,0])

new_a1_x = np.linspace(min_a1_x, max_a1_x, 100)
new_a2_x = np.linspace(min_a2_x, max_a2_x, 100)

new_a1_y = np.interp(new_a1_x, a1[:,0], a1[:,1])
new_a2_y = np.interp(new_a2_x, a2[:,0], a2[:,1])

midx = [np.mean([new_a1_x[i], new_a2_x[i]]) for i in range(100)]
midy = [np.mean([new_a1_y[i], new_a2_y[i]]) for i in range(100)]

plt.plot(a1[:,0], a1[:,1],c='black')
plt.plot(a2[:,0], a2[:,1],c='black')
plt.plot(midx, midy, '--', c='black')
plt.show()

enter image description here

Tophet answered 1/3, 2018 at 1:8 Comment(3)
@sacul Any chance you would know the C# equivalent? I found the Math.NET package has most, if not all of the functions but they're not exactly the same so I'm not sure which will convert.Subdual
Sorry, I'm not so good in C#, but just from the Math.NET documentation, it looks like Fit type of PolynomialFunc in MathNet.Numerics might be able to take care of a lot of the function fitting, particularly the Polynomial function, which would essentially replace np.polyfit in my code. The rest is more straightforward, but I wouldn't be such a good source for help there...Tophet
Ok. That's a huge help! Thanks.Subdual
E
12

The "line between two lines" is not so well defined. You can obtain a decent though simple solution by triangulating between the two curves (you can triangulate by progressing from vertex to vertex, choosing the diagonals that produce the less skewed triangle).

Then the interpolated curve joins the middles of the sides.

enter image description here

Empressement answered 28/2, 2018 at 21:36 Comment(3)
Hmm this could work, I can just densify the points in each line to get a more accurate solution. What do you mean by "the less skewed triangle"?Pallaten
@JustinLiang: if the curves are known by equations, the answer will be different.Empressement
@JustinLiang "I can just densify the points": the median curve needn't be more accurate/smooth than the given curves. So there is no need to densify.Empressement
D
2

I work with rivers, so this is a common problem. One of my solutions is exactly like the one you showed in your question--i.e. skeletonize the blob. You see that the boundaries have problems, so what I've done that seems to work well is to simply mirror the boundaries. For this approach to work, the blob must not intersect the corners of the image.

You can find my implementation in RivGraph; this particular algorithm is in rivers/river_utils.py called "mask_to_centerline".

Here's an example output showing how the ends of the centerline extend to the desired edge of the object:

enter image description here

Dinka answered 5/6, 2020 at 17:8 Comment(0)
V
2

sacuL's solution almost worked for me, but I needed to aggregate more than just two curves. Here is my generalization for sacuL's solution:

def interp(*axis_list):
    min_max_xs = [(min(axis[:,0]), max(axis[:,0])) for axis in axis_list]

    new_axis_xs = [np.linspace(min_x, max_x, 100) for min_x, max_x in min_max_xs]
    new_axis_ys = [np.interp(new_x_axis, axis[:,0], axis[:,1]) for axis, new_x_axis in zip(axis_list, new_axis_xs)]

    midx = [np.mean([new_axis_xs[axis_idx][i] for axis_idx in range(len(axis_list))]) for i in range(100)]
    midy = [np.mean([new_axis_ys[axis_idx][i] for axis_idx in range(len(axis_list))]) for i in range(100)]

    for axis in axis_list:
        plt.plot(axis[:,0], axis[:,1],c='black')
    plt.plot(midx, midy, '--', c='black')
    plt.show()

If we now run an example:

a1 = np.array([[x, x**2+5*(x%4)] for x in range(10)])
a2 = np.array([[x-0.5, x**2+6*(x%3)] for x in range(10)])
a3 = np.array([[x+0.2, x**2+7*(x%2)] for x in range(10)])
interp(a1, a2, a3)

we get the plot:enter image description here

Varve answered 30/5, 2022 at 8:58 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.