Efficient matching of two arrays (how to use KDTree) - McMap

About

Efficient matching of two arrays (how to use KDTree)

Asked 20/3, 2013 at 13:52 Answered 17/10, 2014 at 12:51

python numpy pandas scipy kdtree

L

1

2

I have two 2d arrays, obs1 and obs2. They represent two independent measurement series, and both have dim0 = 2, and slightly different dim1, say obs1.shape = (2, 250000), and obs2.shape = (2, 250050). obs1[0] and obs2[0] signify time, and obs1[1] and obs2[1] signify some spatial coordinate. Both arrays are (more or less) sorted by time. The times and coordinates should be identical between the two measurement series, but in reality they aren't. Also, not each measurement from obs1 has a corresponding value in obs2 and vice-versa. Another problem is that there might be a slight offset in the times.

I'm looking for an efficient algorithm to associate the best matching value from obs2 to each measurement in obs1. Currently, I do it like this:

define dt = some_maximum_time_difference
define dx = 3
j = 0
i = 0
matchresults = np.empty(obs1.shape[1])
for j in obs1.shape[1]:
    while obs1[0, j] - obs2[0, j] < dt:
        i += 1
    matchresults[j] = i - dx + argmin(abs(obs1[1, i] - obs2[1, i-dx:i+dx+1]))

This yields good results. However, it is extremely slow, running in a loop.

I would be very thankful for ideas on how to improve this algorithm speed-wise, e.g. using KDtree or something similar.

Ludwog answered 20/3, 2013 at 13:52 Comment(5)

Can you please expand a bit on

first_obs2_index_with_time_difference_lessthan_delta_t     resulting_i = i + argmin(abs(obs1[1, i] - obs2[1, i-delta_x:i+delta_x+1])

? Assuming that you are not performing another loop here, I don't believe KDTrees would speed things up too much. – Folklore 20/3, 2013 at 13:58

@MikyDinescu thanks for your question; I updated the algorithm in the original post. – Ludwog 20/3, 2013 at 14:6

can you post a bit of data, and your expected results? – Mcleroy 20/3, 2013 at 17:57

I was wondering of anybody replied to you. Thanks for posting more details but unfortunately it's still unclear to me how your current algo. works.. I'm not a python developer so maybe somebody else will chime in. I would suggest though to present your algorithm in pseudo code instead. Or maybe ask on a more CS stack exchange.. – Folklore 21/3, 2013 at 1:24

@Ludwog did you check the answer below? – Staccato 14/3, 2015 at 10:50

S

1

Using cKDTree for this case would look like:

from scipy.spatial import cKDTree

obs2 = array with shape (2, m)
obs1 = array with shape (2, n)

kdt = cKDTree(obs2.T)
dist, indices = kdt.query(obs1.T)

where indices will contain the column indices in obs2 corresponding to each observation in obs1. Note that I had to transpose obs1 and obs2.

Staccato answered 17/10, 2014 at 12:51 Comment(0)

Recommended topics

#Godot #Unity #Godot 4.X #Mongodb

Hot tags

Godot Unity Godot Help Programming Godot 4.X GUI GDScript 3D 2D Physics CSharp Godot 3.X VR XR Projects C++

© 2022 - 2024 — McMap. All rights reserved.