How can I do a spatial join with the sf package using st_join()
Asked Answered
I

3

26

Here's a toy example I've been wrestling with

# Make points
point1 <- c(.5, .5)
point2 <- c(.6, .6)
point3 <- c(3, 3)
mpt <- st_multipoint(rbind(point1, point2, point3))  # create multipoint

# Make polygons
square1 <- rbind(c(0, 0), c(1, 0), c(1,1), c(0, 1), c(0, 0))
square2 <- rbind(c(0, 0), c(2, 0), c(2,2), c(0, 2), c(0, 0))
square3 <- rbind(c(0, 0), c(-1, 0), c(-1,-1), c(0, -1), c(0, 0))
mpol <- st_multipolygon(list(list(square1), list(square2), list(square2)))  # create multipolygon

# Convert to class 'sf'
pts <- st_sf(st_sfc(mpt))
polys <- st_sf(st_sfc(mpol))

# Determine which points fall inside which polygons
st_join(pts, polys, join = st_contains)

The last line produces

Error in as.data.frame.default(x[[i]], optional = TRUE, stringsAsFactors = stringsAsFactors) : 
  cannot coerce class "c("sfc_MULTIPOINT", "sfc")" to a data.frame

How can I do a spatial join to determine which points fall inside which polygons?

Insatiate answered 5/5, 2017 at 1:5 Comment(3)
Could you clarify what you mean by "spatial join" ? What would be the expected result ?Handily
Given a set of polygons and a set of points, create the mapping (PointId, PolygonId) that states which points are contained by which polygons.Insatiate
I recently wrote this tutorial for the sf package to help myself and others understand the basic concepts. Understanding the fundamentals is the key to solving specific problems like the one I had here.Insatiate
H
21

I'm also working my way around the features of the sf package, so apologies if this is not correct or there are better ways. I think one problem here is that if building the geometries like in your example you are not obtaining what you think:

> pts
Simple feature collection with 1 feature and 0 fields
geometry type:  MULTIPOINT
dimension:      XY
bbox:           xmin: 0.5 ymin: 0.5 xmax: 3 ymax: 3
epsg (SRID):    NA
proj4string:    NA
                     st_sfc.mpt.
1 MULTIPOINT(0.5 0.5, 0.6 0.6...

> polys
Simple feature collection with 1 feature and 0 fields
geometry type:  MULTIPOLYGON
dimension:      XY
bbox:           xmin: 0 ymin: 0 xmax: 2 ymax: 2
epsg (SRID):    NA
proj4string:    NA
                    st_sfc.mpol.
1 MULTIPOLYGON(((0 0, 1 0, 1 ...

You can see that you have only one "feature" both in pts and in polys. This means that you are building one "multipolygon" feature (that is, a polygon constituted by 3 parts), instead thatn three different polygons. The same goes for the points.

After digging a bit, I found this different (and in my opinion easier) way to build the geometries, using WKT notation:

polys <- st_as_sfc(c("POLYGON((0 0 , 0 1 , 1 1 , 1 0, 0 0))",
                     "POLYGON((0 0 , 0 2 , 2 2 , 2 0, 0 0 ))", 
                     "POLYGON((0 0 , 0 -1 , -1 -1 , -1 0, 0 0))")) %>% 
  st_sf(ID = paste0("poly", 1:3))    

pts <- st_as_sfc(c("POINT(0.5 0.5)",
                   "POINT(0.6 0.6)",
                   "POINT(3 3)")) %>%
  st_sf(ID = paste0("point", 1:3))

> polys
Simple feature collection with 3 features and 1 field
geometry type:  POLYGON
dimension:      XY
bbox:           xmin: -1 ymin: -1 xmax: 2 ymax: 2
epsg (SRID):    NA
proj4string:    NA
     ID                              .
1 poly1 POLYGON((0 0, 0 1, 1 1, 1 0...
2 poly2 POLYGON((0 0, 0 2, 2 2, 2 0...
3 poly3 POLYGON((0 0, 0 -1, -1 -1, ...

> pts
Simple feature collection with 3 features and 1 field
geometry type:  POINT
dimension:      XY
bbox:           xmin: 0.5 ymin: 0.5 xmax: 3 ymax: 3
epsg (SRID):    NA
proj4string:    NA
      ID              .
1 point1 POINT(0.5 0.5)
2 point2 POINT(0.6 0.6)
3 point3     POINT(3 3)

you can see that now both polys and pts have three features.

We can now find the "intersection matrix" using:

# Determine which points fall inside which polygons
pi <- st_contains(polys,pts, sparse = F) %>% 
  as.data.frame() %>% 
  mutate(polys = polys$ID) %>% 
  select(dim(pi)[2],1:dim(pi)[1])
colnames(pi)[2:dim(pi)[2]] = levels(pts$ID)

> pi
  polys point1 point2 point3
1 poly1   TRUE   TRUE  FALSE
2 poly2   TRUE   TRUE  FALSE
3 poly3  FALSE  FALSE  FALSE

meaning (as pointed out @symbolixau in the comments) that polygons 1 and 2 contain points 1 and 2, while polygon 3 doesn't contain any points. Point 3 is instead not contained in any polygon.

HTH.

Handily answered 5/5, 2017 at 14:3 Comment(4)
Very interesting. Hopefully others with chime in on this. I wish the sf documentation had better examples for st_join and building simple features from scratch.Insatiate
I too find "building" the features as lists of lists a bit cumbersome. However, it's true that I rarely need to build "geometries" from scratch, and the speed increase with respect to sp is really great !Handily
I think the result is saying points 1 and 2 are in both polygons 1 and 2 (they overlap), and point 3 isn't in any polygon, and polygon 3 doesn't contain any points. (I tried the same question using a different approach other than sf to help interpret the results)Jessejessee
If you are looking for better documentation on what the spatial functions, you can look here postgis.net/docs/reference.htmlVenus
H
5

I see a different output:

> # Determine which points fall inside which polygons
> st_join(pts, polys, join = st_contains)
Simple feature collection with 1 feature and 0 fields
geometry type:  MULTIPOINT
dimension:      XY
bbox:           xmin: 0.5 ymin: 0.5 xmax: 3 ymax: 3
epsg (SRID):    NA
proj4string:    NA
                        geometry
1 MULTIPOINT(0.5 0.5, 0.6 0.6...

was this with the most recent CRAN version of sf?

Hultgren answered 2/6, 2017 at 14:56 Comment(2)
Thanks for the reply and awesome package. At the time I wrote this question, it was the most recent version on CRAN. Now I'm on 0.4-3 and I still get an error in the same spot (although a different message). The culprit appears to be this call pts <- st_sf(st_sfc(mpt)) and I guess it's because a simple-features can't just have a geometry - it has to have data as well. That's why something like pts <- st_sf(Dummy=1, st_sfc(mpt)) prevents the error. I've recently gone through the vignettes in detail which has cleared up a lot of my confusion.Insatiate
Thanks - this has been corrected in the development version a while ago.Hultgren
F
-1

Note, the original set of multipoint and multipolygon can be 'cast' to point and polygon, without creating new objects:

st_contains(polys %>% st_cast("POLYGON"), pts %>% st_cast("POINT"), sparse = F)
#      [,1]  [,2]  [,3]
#[1,]  TRUE  TRUE FALSE
#[2,]  TRUE  TRUE FALSE
#[3,] FALSE FALSE FALSE
Furnivall answered 1/3, 2018 at 18:44 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.