Calculate all distances between two GeoDataFrame (of points) in GeoPandas

This is quite a simple case, but I have not found an easy way to do it so far. The idea is to get the set of distances between all the points defined in one GeoDataFrame and those defined in another GeoDataFrame.

import geopandas as gpd
import pandas as pd

# random coordinates
gdf_1 = gpd.GeoDataFrame(geometry=gpd.points_from_xy([0, 0, 0], [0, 90, 120]))
gdf_2 = gpd.GeoDataFrame(geometry=gpd.points_from_xy([0, 0], [0, -90]))
print(gdf_1)
print(gdf_2)

# distances are calculated element-wise
print(gdf_1.distance(gdf_2))

This produces the element-wise distance between the points in gdf_1 and gdf_2 that share the same index. It also raises a warning because the two GeoSeries do not have the same index, which will be the case for my data.

                geometry
0    POINT (0.000 0.000)
1   POINT (0.000 90.000)
2  POINT (0.000 120.000)
                    geometry
0    POINT (0.00000 0.00000)
1  POINT (0.00000 -90.00000)
/home/seydoux/anaconda3/envs/chelyabinsk/lib/python3.8/site-packages/geopandas/base.py:39: UserWarning: The indices of the two GeoSeries are different.
  warn("The indices of the two GeoSeries are different.")
0      0.0
1    180.0
2      NaN

The question is: how can I get a series of all point-to-point distances (or at least of the unique combinations of the indices of gdf_1 and gdf_2, since distance is symmetric)?

EDIT

  • In this post, a solution is given for a single pair of points, but I cannot find a straightforward way to combine all the points in two datasets.

  • In this post only element-wise operations are proposed.

  • An analogous question was also raised on the GitHub repo of geopandas. One of the proposed solutions is to use the apply method, but no detailed answer is given.

Schnapp answered 9/11, 2020 at 14:58 Comment(7)
Did you search? I recall many questions and answers about calculating distances between all combinations of two sets of coordinates (geodetic or otherwise) that reside in arrays, lists, or DataFrames. Your question is either too broad or probably a duplicate, and maybe off topic given the request for other libraries. - Lunula
Yes, I did. I will put all related posts in the question. None of them answers the combination case I raise here. - Penury
The problem you are trying to solve is applying a function to all combinations of coordinates between two dataframes? And the part you are stuck on is getting the combinations? - Lunula
That is correct @wwii. I am wondering (1) whether such a function already exists or (2) how to build all combinations of coordinates between two dataframes. - Penury
Related: Distance matrix between two point layers - Lunula
I don't have geopandas and I can't tell if the distance method will handle broadcasting, but try this: gdf_1['geometry'].distance(gdf_2['geometry'].values[:,None]) - Lunula
Thanks for your help. I tried it, but it does not work (it returns ValueError: 'data' should be a 1-dimensional array of geometry objects.). As in the answer provided by @martinfleis, a neat solution is to use the apply method. - Penury

You have to apply a function over each geometry in the first gdf to get its distance to all geometries in the second gdf.

import geopandas as gpd
import pandas as pd

# random coordinates
gdf_1 = gpd.GeoDataFrame(geometry=gpd.points_from_xy([0, 0, 0], [0, 90, 120]))
gdf_2 = gpd.GeoDataFrame(geometry=gpd.points_from_xy([0, 0], [0, -90]))

# each row is a point of gdf_1, each column a point of gdf_2
gdf_1.geometry.apply(lambda g: gdf_2.distance(g))
      0      1
0    0.0   90.0
1   90.0  180.0
2  120.0  210.0
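The question asks for a series of all point-to-point distances rather than a matrix. As a minimal sketch, assuming the apply call above returns a DataFrame as shown, the matrix can be reshaped with the pandas stack method into one value per index pair. The names matrix, pairs, idx_1 and idx_2 are illustrative and not part of the original answer.

```python
import geopandas as gpd

gdf_1 = gpd.GeoDataFrame(geometry=gpd.points_from_xy([0, 0, 0], [0, 90, 120]))
gdf_2 = gpd.GeoDataFrame(geometry=gpd.points_from_xy([0, 0], [0, -90]))

# distance matrix: rows follow gdf_1's index, columns follow gdf_2's index
matrix = gdf_1.geometry.apply(lambda g: gdf_2.distance(g))

# reshape to a Series with one distance per (gdf_1 index, gdf_2 index) pair
pairs = matrix.stack()
pairs.index.names = ["idx_1", "idx_2"]  # illustrative level names
print(pairs)
```

A single distance can then be looked up by its pair of labels, for example pairs.loc[(2, 1)].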
Abcoulomb answered 9/11, 2020 at 16:40 Comment(2)
This is suuuuper slow :/ - Coben
This was suuuper slow for polygons: I think it calculated the minimal distance between two areas (two boundaries), which is not what I needed. I changed the geometry to the centroid and then it ran instantly :) - Coben
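Regarding the speed concerns in the comments: when every geometry is a point (or has been reduced to its centroid), one possible alternative is to compute planar Euclidean distances directly from the coordinate arrays with NumPy broadcasting. This is only a sketch under those assumptions, not the method from the answer. The helper name pairwise_point_distances is hypothetical, and it assumes both frames contain only points in the same planar coordinate system.

```python
import geopandas as gpd
import numpy as np
import pandas as pd

gdf_1 = gpd.GeoDataFrame(geometry=gpd.points_from_xy([0, 0, 0], [0, 90, 120]))
gdf_2 = gpd.GeoDataFrame(geometry=gpd.points_from_xy([0, 0], [0, -90]))


def pairwise_point_distances(a, b):
    """Hypothetical helper: planar distances between all points of a and b.

    Assumes every geometry in both frames is a Point.
    """
    # column vectors for a, row vectors for b, so subtraction broadcasts
    # to an array of shape (len(a), len(b))
    ax = a.geometry.x.to_numpy()[:, None]
    ay = a.geometry.y.to_numpy()[:, None]
    bx = b.geometry.x.to_numpy()[None, :]
    by = b.geometry.y.to_numpy()[None, :]
    return pd.DataFrame(np.hypot(ax - bx, ay - by), index=a.index, columns=b.index)


dist = pairwise_point_distances(gdf_1, gdf_2)
print(dist)
```

For polygons, the frames could first be converted with something like gdf.set_geometry(gdf.centroid), keeping in mind that centroid-to-centroid distance is not the same as the minimal distance between boundaries.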
