What is the most efficient way to convert numpy arrays to Shapely Points?
Asked Answered
M

4

13

I have a function that outputs a grid of points as x and y numpy arrays for interpolation, but before I interpolate, I want to use Geopandas to perform an intersection with my research boundary (otherwise half of my interpolation points fall in the ocean).

I'm generating points like this:

import geopandas as gpd
import numpy as np
import matplotlib.pyplot as plt
from shapely.geometry import Point

x = np.linspace(0,100,100)
y = np.linspace(0,100,100)
x, y = np.meshgrid(x, y)
x, y = x.flatten(), y.flatten()


f, ax = plt.subplots()

plt.scatter(x, y)
plt.axis('equal')
plt.show()

Is there an efficient way to convert these numpy arrays to shapely.Point([x, y]) so they can be placed in a geopandas geodataframe?

This is my current approach:

interp_points = []
index = 0
y_list = yi.tolist()
for x in xi.tolist():
    interp_points.append(Point(x,y_list[index]))
    index += 1

But it seems like converting to lists and then iterating is likely not a good approach for performance, and I have approximately 160,000 points.

Mcatee answered 21/6, 2018 at 15:10 Comment(0)
P
6

There is no built-in way to do this with shapely, so you need to iterate through the values yourself. For doing that, this should be a rather efficient way:

In [4]: from geopandas import GeoSeries

In [5]: s = GeoSeries(map(Point, zip(x, y)))

In [6]: s.head()
Out[6]: 
0                    POINT (0 0)
1     POINT (1.01010101010101 0)
2     POINT (2.02020202020202 0)
3     POINT (3.03030303030303 0)
4    POINT (4.040404040404041 0)
dtype: object

In [6]: %timeit GeoSeries(map(Point, zip(x, y)))
114 ms ± 8.45 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

(or slight alternative GeoSeries(list(zip(x, y))).map(Point))

See here for some example doing this: http://geopandas.readthedocs.io/en/latest/gallery/create_geopandas_from_pandas.html

There is some (stalled) work to include this in geopandas directly: https://github.com/geopandas/geopandas/pull/75

Printery answered 21/6, 2018 at 15:29 Comment(2)
Thank you, I'd been trying to make a solution using map() and zip but my approach failed for some reason. This is exactly what I was looking for!Mcatee
Just to note that this has eventually been implemented as points_from_xy(): see geopandas.org/en/stable/gallery/… for an example. Reference hereCutaneous
S
2

I think this is a good way:

import numpy as np        
from shapely import geometry

points_np_array = np.random.rand(50,2)
polygon_1 = geometry.Polygon(np.squeeze(points_np_array))
Schatz answered 29/10, 2020 at 20:47 Comment(0)
S
0

Better use this list comprehention:

[tuple(x) for x in arr.tolist()]

Schatz answered 31/10, 2020 at 19:4 Comment(0)
C
0

As of geopandas version 0.5.0 (April 25, 2019), you can use points_from_xy for that purpose:

# continuing from your example:
df = gpd.GeoDataFrame(geometry = gpd.points_from_xy(x, y))
df.plot()
plt.show()

(Embedding in a GeoSeries, i.e., gpd.GeoSeries(gpd.points_from_xy(x, y)), would work equally well, but I wanted to replicate your plot.)

There is an example in the GeoPandas gallery, and the full documentation is here.

Cutaneous answered 10/5 at 9:38 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.