I had a similar problem, and combining SciPy's cKDTree for fast nearest-neighbour lookups with GeoPy for the geographic distance calculation works well.
In [1]: import numpy as np
In [2]: from scipy.spatial import cKDTree
In [3]: from geopy import Point, distance
In [4]: points = np.random.sample((100000, 2)) * [180, 360] - [90, 180] # 100k random points: lat in [-90, 90), long in [-180, 180)
In [5]: index = cKDTree(points)
In [6]: %time lat_long_dist, inds = index.query(points[234], 20)
CPU times: user 118 µs, sys: 164 µs, total: 282 µs
Wall time: 248 µs
In [7]: points_geopy = [Point(*p) for p in points]
In [8]: %time geo_dists = [distance.great_circle(points_geopy[234], points_geopy[i]) for i in inds]
CPU times: user 244 µs, sys: 218 µs, total: 462 µs
Wall time: 468 µs
In [9]: geo_dists
Out[9]:
[Distance(0.0),
Distance(29.661520907955524),
...
Distance(156.5471729956897),
Distance(144.7528417712309)]
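Note that the tree ranks neighbours by Euclidean distance in degree space, which is why the great-circle distances in Out[9] are not monotonically increasing. A minimal sketch of one way to handle this, assuming it is acceptable to over-fetch candidates from the tree and re-rank them by great-circle distance. The helpers `haversine_km` and `nearest_by_great_circle`, the `oversample` factor and the 6371 km Earth radius are my own illustrative choices, not part of SciPy or GeoPy, and I swapped in a NumPy haversine for GeoPy so the block needs no extra dependency. The result is still only approximate near the poles and across the antimeridian.

```python
import numpy as np
from scipy.spatial import cKDTree

EARTH_RADIUS_KM = 6371.0  # assumption: spherical Earth with mean radius


def haversine_km(origin, others):
    """Great-circle distance in km from one (lat, lon) to an (n, 2) array, all in degrees."""
    lat1, lon1 = np.radians(origin)
    lat2, lon2 = np.radians(others).T
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    # Clip guards against rounding pushing `a` marginally outside [0, 1].
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))


def nearest_by_great_circle(index, points, query, k=20, oversample=4):
    """Hypothetical helper: fetch k * oversample candidates, then re-rank by haversine."""
    n_candidates = min(k * oversample, len(points))
    _, cand = index.query(query, n_candidates)
    cand = np.atleast_1d(cand)
    dists = haversine_km(query, points[cand])
    order = np.argsort(dists)[:k]
    return cand[order], dists[order]


rng = np.random.default_rng(0)
# Random (lat, lon) pairs: lat in [-90, 90), lon in [-180, 180)
points = rng.random((100_000, 2)) * [180, 360] - [90, 180]
index = cKDTree(points)

inds, dists_km = nearest_by_great_circle(index, points, points[234], k=20)
```

Here `dists_km` comes back sorted in ascending order, with the query point itself first at distance 0.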
A bit of extra work is necessary to get all points within a radius.
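One possible way to do the radius search, as a sketch rather than a definitive recipe: map each (lat, lon) pair onto the 3-D unit sphere, build the tree on those coordinates, and turn the great-circle radius into the equivalent straight-line chord length for `query_ball_point`. The helper names `to_unit_sphere` and `points_within_km` and the 6371 km spherical Earth radius are assumptions I introduced for illustration.

```python
import numpy as np
from scipy.spatial import cKDTree

EARTH_RADIUS_KM = 6371.0  # assumption: spherical Earth with mean radius


def to_unit_sphere(latlon_deg):
    """Convert (lat, lon) in degrees to (n, 3) Cartesian points on the unit sphere."""
    lat, lon = np.radians(np.atleast_2d(latlon_deg)).T
    return np.column_stack((np.cos(lat) * np.cos(lon),
                            np.cos(lat) * np.sin(lon),
                            np.sin(lat)))


def points_within_km(tree, query_latlon, radius_km):
    """Hypothetical helper: indices of tree points within radius_km of the query."""
    # An arc of central angle theta corresponds to a chord of length 2*sin(theta/2),
    # so a Euclidean ball of that chord radius selects exactly the spherical cap.
    theta = min(radius_km / EARTH_RADIUS_KM, np.pi)
    chord = 2 * np.sin(theta / 2)
    hits = tree.query_ball_point(to_unit_sphere(query_latlon)[0], chord)
    return np.array(hits, dtype=int)


rng = np.random.default_rng(0)
# Random (lat, lon) pairs: lat in [-90, 90), lon in [-180, 180)
points = rng.random((100_000, 2)) * [180, 360] - [90, 180]
tree = cKDTree(to_unit_sphere(points))

nearby = points_within_km(tree, points[234], radius_km=500.0)
```

Because the chord length grows monotonically with the arc length, this also behaves correctly at the poles and across the antimeridian, which a tree built on raw degrees does not.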
I tried Shapely's STRtree, but got far worse performance (I installed it with pip install shapely[vectorized]).
points? Which is not quite as efficient as PostGIS indexing of points - perhaps the database approach would be more efficient. – Su