KDTree for longitude/latitude
Asked Answered
B

2

17

Are there any packages in Python that allow one to do kdtree-like operations for longitude/latitudes on the surface of a sphere? (this would need to take into account the spherical distances properly, as well as the wraparound in longitude).

Beezer answered 11/5, 2012 at 10:8 Comment(0)
L
8

A binary search tree cannot handle the wraparound of the polar representation by design. You might need to transform the coordinates to a 3D cartesian space and then apply your favorite search algorithm, e.g., kD-Tree, Octree etc.

Alternatively, if you could limit the input range of coordinates to a small region on the surface, you could apply an appropriate map projection to this region, i.e., one that does not distort the shape of your area too much, and apply a standard binary search tree on these no-wrap-around cartesian map coordinates.

Lambent answered 15/4, 2013 at 12:58 Comment(2)
By "wraparound" are you talking about longitudinally? If so, can you explain why can binary tree not handle it but a 3D space could?Eelgrass
@Eelgrass I was referring to ordered binary trees like kd-tree, that split the dataset at hyperplanes into subspaces. When the coordinates wrap around, e.g., as angles in polar coordinates do, this doesn't work as relevant points might be at both ends. Apparantly there are binary trees that subdivide based on distance calculations, instead of splits at hyperplanes, e.g., Ball Tree apparantly. So this might actually be an option that I was not aware of at the time.Lambent
C
12

I believe that the BallTree from scikit-learn with the Haversine metric should do the trick for you.

As an example:

from sklearn.neighbors import BallTree
import numpy as np
import pandas as pd

cities = pd.DataFrame(data={
    'name': [...],
    'lat': [...],
    'lon': [...]
})

query_lats = [...]
query_lons = [...]

bt = BallTree(np.deg2rad(cities[['lat', 'lon']].values), metric='haversine')
distances, indices = bt.query(np.deg2rad(np.c_[query_lats, query_lons]))

nearest_cities = cities['name'].iloc[indices]

Note this returns distances assuming a sphere of radius 1 - to get the distances on the earth multiply by radius = 6371km

see:

Clavicorn answered 16/6, 2019 at 15:0 Comment(1)
Works great except for 1 thing, note that the distances calculated by the haversine metric assumes a sphere of radius 1, so you'll need to multiply by radius = 6371km to get the real distances. scikit-learn.org/stable/modules/generated/… I think the overall calculations/time saved may not be much different between this and the convert to 3D cartesian in moooeeeep's answer, but this should save some space in not having to store the 3D co-ordinatesSubak
L
8

A binary search tree cannot handle the wraparound of the polar representation by design. You might need to transform the coordinates to a 3D cartesian space and then apply your favorite search algorithm, e.g., kD-Tree, Octree etc.

Alternatively, if you could limit the input range of coordinates to a small region on the surface, you could apply an appropriate map projection to this region, i.e., one that does not distort the shape of your area too much, and apply a standard binary search tree on these no-wrap-around cartesian map coordinates.

Lambent answered 15/4, 2013 at 12:58 Comment(2)
By "wraparound" are you talking about longitudinally? If so, can you explain why can binary tree not handle it but a 3D space could?Eelgrass
@Eelgrass I was referring to ordered binary trees like kd-tree, that split the dataset at hyperplanes into subspaces. When the coordinates wrap around, e.g., as angles in polar coordinates do, this doesn't work as relevant points might be at both ends. Apparantly there are binary trees that subdivide based on distance calculations, instead of splits at hyperplanes, e.g., Ball Tree apparantly. So this might actually be an option that I was not aware of at the time.Lambent

© 2022 - 2024 — McMap. All rights reserved.