Python module for storing and querying geographical coordinates
Asked Answered
S

7

19

Is there a Python module where I can create objects with a geographical location coordinate (latitude and longitude), and query all the objects for ones which are within a 5km distance (i.e. radius) of a given coordinate?

I've been trying to store the latitude and longitude as keys in dictionaries (as they're indexed by key) and use some distance finding algorithms to query them. But this feels like a horrible hack.

Essentially something like PostGIS for PostgreSQL, but all within my Python app's memory.

Su answered 31/1, 2012 at 11:31 Comment(0)
M
20

Yes, try geopy.

import geopy
import geopy.distance

pt1 = geopy.Point(48.853, 2.349)
pt2 = geopy.Point(52.516, 13.378)

dist = geopy.distance.distance(pt1, pt2).km
# 878.25

afterwards you can query your lists of points:

[pt for pt in points if geopy.distance.distance(orig, pt).km < 5.]
Monocoque answered 31/1, 2012 at 11:36 Comment(2)
Thank you for the answer. I had skipped over goepy because I thought it was entirely for geocoding. With the query example, would it not loop through every object in points? Which is not quite as efficient as PostGIS indexing of points - perhaps the database approach would be more efficient.Su
@Jonathan - you're right, it will iterate over the whole list of points. I don't know about any possibility of indexing within geopy. Maybe you find something.Monocoque
P
6

I know this isn't exactly what you meant, but you could use GeoDjango with an in-memory SQLite database. It's a full set of GIS tools exposed as a Web application, which makes it a Swiss Army knife for rapidly developing GIS applications, especially for small ad hoc queries.

Prothallus answered 31/1, 2012 at 12:0 Comment(0)
V
2

The usual approach in GIS is to create a buffer around the point of interest and query the intersection. As @RyanDalton suggests, if you plan to do a lot of geolocation stuff, use Shapely, the GIS API for Python. It is good to know about Shapely even if you still want a spatial index (see below). Here is how to create buffers in Shapely:

distance = 3
center = Point(1, 1)
pts = [Point(1.1, 1.2),Point(1.2,1.2)]
center_buf = a.buffer(distance)
#filters the points list according to whether they are contained in the list
contained = filter(center_buf.contains,pts)

You can index your points yourself (let's say by longitude for example) if you don't have many. Otherwise you can also use the Rtree package, check the link called Using Rtree as a cheapo spatial database!

Volding answered 1/2, 2012 at 13:53 Comment(0)
S
1

Your dictionary idea doesn't sound that bad, though you will need to check points that fall under 'neighbouring' dictionary keys as well.

If you can't find the right tool, and like coding algorithms, you could implement a binary space partition tree which afaik is a less hacky way of achieving a similar thing.

Sulphurous answered 31/1, 2012 at 12:20 Comment(0)
M
1

You can use SQLite which has an Rtree extension for doing exactly that kind of storage and querying. This approach is useful if your data is larger than the memory you want to use or you want to save and manipulate data between program runs. The actual storage and query code is in C which means it has to be compiled, but the benefit is extra performance over pure Python solutions like geopy. Either pysqlite or APSW will work for the SQLite access. (Disclosure: I am the APSW author.)

Million answered 2/2, 2012 at 9:55 Comment(0)
I
1

I have a similar problem and it seems that using SciPy's cKDTree for fast nearest-points lookups together with GeoPy for geographic distance calculation works fine.

In [1]: import numpy as np
In [2]: from scipy.spatial import cKDTree
In [3]: from geopy import Point, distance
In [4]: points = np.random.sample((100000, 2)) * 180 - 90   # make 100k random lat-long points
In [5]: index = cKDTree(points)
In [6]: %time lat_long_dist, inds = index.query(points[234], 20)
CPU times: user 118 µs, sys: 164 µs, total: 282 µs
Wall time: 248 µs
In [7]: points_geopy = [Point(*p) for p in points]
In [8]: %time geo_dists = [distance.great_circle(points_geopy[234], points_geopy[i]) for i in inds]
CPU times: user 244 µs, sys: 218 µs, total: 462 µs
Wall time: 468 µs
In [9]: geo_dists
Out[9]: 
[Distance(0.0),
 Distance(29.661520907955524),
 ...
 Distance(156.5471729956897),
 Distance(144.7528417712309)]

A bit of extra work is necessary to get all points within a radius.

I tried Shapely's STRtree, but got far worse performance (I installed with pip install shapely[vectorized]).

Inandin answered 4/3, 2019 at 15:39 Comment(0)
D
0

Have you looked at Shapely? It has some methods to query objects within a distance. Take a look at the Binary Spatial Predicates. It might just be what you are looking for.

Doyenne answered 31/1, 2012 at 23:52 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.