I'm using some spatial data in R, and I'm wondering whether to use packages/functions that rely on the old Spatial format (package sp) or the new package sf. I made this test, based on code found here.
The idea is to "identify all points falling within a maximum distance of xx meters with respect to each single point in a spatial points dataset".
library(tictoc)
# define a buffer distance and a toy data
maxdist <- 500
df <- data.frame(x = runif(10000, 0, 100000), y = runif(10000, 0, 100000), id = 1:10000)
# doing the analysis using sf
library(sf)
tic("sf")
pts <- st_as_sf(df, coords = c("x", "y"))
pts_buf <- st_buffer(pts, maxdist, nQuadSegs = 5)
int <- st_intersects(pts_buf, pts)
toc()
# doing the analysis using sp
library(sp)
library(rgeos)
tic("sp")
pts2 <- SpatialPointsDataFrame(df[, 1:2], as.data.frame(df[, 3]))
pts_buf2 <- gBuffer(pts2, byid = TRUE, width = maxdist)
int2 <- over(pts_buf2, pts2, returnList = TRUE)
toc()
# size of the objects
object.size(pts) < object.size(pts2)
object.size(pts_buf) < object.size(pts_buf2)
Using sf seems to be much better: it is faster (around 0.53 vs 2.1 seconds on my machine) and requires less memory. There is one exception, though: why is the object pts much larger than pts2? Is sf less efficient at storing a vector of points?
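To dig into where the size difference might come from, I also looked at how each package stores the point geometries internally. This is just a rough sketch on a tiny dataset (exact sizes will vary by platform and data); as far as I understand, sf keeps one small classed object per feature inside a list, while sp keeps all coordinates in a single numeric matrix:

```r
library(sf)
library(sp)

# a tiny toy dataset, just for inspecting storage
df <- data.frame(x = runif(10, 0, 100000),
                 y = runif(10, 0, 100000),
                 id = 1:10)

pts  <- st_as_sf(df, coords = c("x", "y"))
pts2 <- SpatialPointsDataFrame(df[, 1:2], data.frame(id = df$id))

# sf: the geometry column is a list (sfc) holding one sfg object per point
str(pts$geometry[[1]])

# sp: all coordinates live together in one numeric matrix slot
str(pts2@coords)

# compare the size of just the geometry storage
object.size(pts$geometry)
object.size(pts2@coords)
```

If that is right, the per-feature list-of-objects layout would carry more overhead per point than a single flat matrix, which could explain the size gap for plain point data.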