I have two dataframes, both of which contain latitude and longitude coordinates. The first dataframe holds observations of events, where the location and time were recorded. The second dataframe holds geographic features, where the location and information about each feature are recorded.
my_df_1 <- structure(list(START_LAT = c(-33.15, -35.6, -34.08333, -34.13333,
-34.31667, -47.38333, -47.53333, -34.08333, -47.38333, -47.15
), START_LONG = c(163, 165.18333, 162.88333, 162.58333, 162.76667,
148.98333, 148.66667, 162.9, 148.98333, 148.71667)), row.names = c(1175L,
528L, 1328L, 870L, 672L, 707L, 506L, 981L, 756L, 210L), class = "data.frame", .Names = c("START_LAT",
"START_LONG"))
my_df_2 <- structure(list(latitude = c(-42.7984, -34.195, -49.81, -35.417,
-28.1487, -44.657, -42.7898, -36.245, -39.1335, -31.8482), longitude = c(179.9874,
179.526, -176.68, 178.765, -168.0314, 174.695, -179.9873, 177.7873,
-170.0583, 173.2424), depth_top = c(935L, 2204L, 869L, 1973L,
4750L, 555L, 894L, 1500L, 4299L, 1303L)), row.names = c(580L,
1306L, 926L, 1102L, 60L, 1481L, 574L, 454L, 1168L, 144L), class = "data.frame", .Names = c("latitude",
"longitude", "depth_top"))
For every observation in my_df_1, I need to find out which feature in my_df_2 is geographically closest. Ideally, I'd get a new column appended to my_df_1 in which every row holds the closest feature from my_df_2.
I worked through the question "How to assign several names to lat-lon observations", but was unable to figure out how to adapt it to my data.
The real dataframes have 1000s of rows, which is why I can't do this by hand.
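To make the goal concrete, here is a minimal base R sketch of one possible approach: compute the great-circle (haversine) distance from each observation to every feature and keep the nearest one. It is not the sf-based solution mentioned in the comment below. The helper `haversine_km` and the new column names `nearest_row`, `nearest_km` and `nearest_depth_top` are hypothetical, the Earth radius of 6371 km is an assumed mean value, and only the first three rows of each dataframe are used to keep the example short.

```r
# Great-circle (haversine) distance in kilometres between one point and a
# vector of points. All inputs are in decimal degrees.
# radius_km = 6371 is an assumed mean Earth radius.
haversine_km <- function(lat1, lon1, lat2, lon2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  # pmin() guards against rounding pushing sqrt(a) slightly above 1
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

# First three rows of each dataframe from the question
obs <- data.frame(START_LAT = c(-33.15, -35.6, -34.08333),
                  START_LONG = c(163, 165.18333, 162.88333))
feat <- data.frame(latitude = c(-42.7984, -34.195, -49.81),
                   longitude = c(179.9874, 179.526, -176.68),
                   depth_top = c(935L, 2204L, 869L))

# For each observation: row index of the nearest feature and its distance
nearest <- vapply(seq_len(nrow(obs)), function(i) {
  d <- haversine_km(obs$START_LAT[i], obs$START_LONG[i],
                    feat$latitude, feat$longitude)
  c(which.min(d), min(d))
}, numeric(2))

# Append the result to the observations (hypothetical column names)
obs$nearest_row <- as.integer(nearest[1, ])
obs$nearest_km <- nearest[2, ]
obs$nearest_depth_top <- feat$depth_top[obs$nearest_row]
```

Because the haversine formula works on the sphere, features on either side of the 180° meridian (such as longitudes 179.9874 and -179.9873 in my_df_2) are treated as close together, which a plain Euclidean distance on raw degrees would get wrong. The loop compares every observation with every feature, which should be manageable for a few thousand rows on each side.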
sf and st_distance(). Works great. For anyone else reading this solution using Ubuntu 16.04, note that sf requires GDAL 2.x. You can follow the instructions here to install. – Oneida