Creating a distance matrix from a list of coordinates in R

I have a CSV file with a list of coordinate positions for over 2000 farms, with the following structure:

FarmID | Latitude | Longitude |
------ |----------|-----------|
   1   |    y1    |     x1    |
   2   |    y2    |     x2    |
   3   |    y3    |     x3    |

I want to create a Euclidean distance matrix from this data, showing the distance between all farm pairs, so that I get a resulting matrix like:

     1     |    2    |     3     |
-----------|---------|-----------|
1    0     |  2.236  |   3.162   |
2  2.236   |    0    |   2.236   |
3  3.162   |  2.236  |     0     |

With many more farms and coordinates in the data frame, I need to be able to iterate over all of the farm pairs and create a distance matrix like the one above. Any help on how to do this in R would be appreciated. Thank you!

Talkie answered 11/5, 2017 at 2:18 Comment(1)
A reproducible example dataset would be nice. - Outherod

Here's a simple example:

farms <- data.frame(lat = runif(3), lng = runif(3))  # three random example points
dist(farms, diag = TRUE, upper = TRUE)               # print the full symmetric matrix

          1         2         3
1 0.0000000 0.9275424 0.6092271
2 0.9275424 0.0000000 0.3891079
3 0.6092271 0.3891079 0.0000000
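
To get a labelled matrix like the one in the question, something along these lines may work. It is only a sketch: the data frame below is a hypothetical stand-in for the result of `read.csv()`, the column names `FarmID`, `Latitude` and `Longitude` are assumed from the question, and the coordinates are invented so that the output reproduces the question's numbers.

```r
# Hypothetical stand-in for read.csv("farms.csv"); the coordinates are
# invented so that the result matches the matrix shown in the question.
farms <- data.frame(
  FarmID    = 1:3,
  Latitude  = c(0, 2, 1),
  Longitude = c(0, 1, 3)
)

# dist() should only see the coordinate columns, so FarmID is dropped
coords <- farms[, c("Latitude", "Longitude")]

# as.matrix() turns the compact "dist" object into a full square matrix
dist_mat <- as.matrix(dist(coords))

# Label rows and columns with the farm IDs
dimnames(dist_mat) <- list(farms$FarmID, farms$FarmID)

round(dist_mat, 3)
```

Passing the whole data frame to `dist()` would treat `FarmID` as a third coordinate, which is why the ID column is removed first. For 2000 farms the full matrix has 4 million entries, which should still fit comfortably in memory.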
Timon answered 11/5, 2017 at 2:40 Comment(0)

You have a list of geographic coordinates given as latitude and longitude. These coordinates are measured in degrees, and one degree (especially of longitude) does not correspond to the same actual distance in meters everywhere on the globe: a degree of longitude is much longer at the equator than near the poles.
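
As a rough illustration of that effect, the sketch below computes the length of one degree of longitude at a few latitudes. It assumes a spherical earth with a mean radius of 6371 km (an approximation), and `km_per_degree_lon` is a hypothetical helper name, not a function from any package.

```r
# Spherical approximation; the mean earth radius is an assumed constant
earth_radius_km <- 6371

# Length in km of one degree of longitude at a given latitude (in degrees)
km_per_degree_lon <- function(lat_deg) {
  (pi / 180) * earth_radius_km * cos(lat_deg * pi / 180)
}

# Equator, mid-latitudes, and close to the pole
round(km_per_degree_lon(c(0, 45, 60, 89)), 1)
```

Under this approximation a degree of longitude is about 111 km at the equator and half that at 60 degrees latitude, while a degree of latitude stays roughly constant.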

@thc's solution does calculate Euclidean distances, but in degrees and on a flat XY plane. Such values become geographically meaningless once you no longer know where on the globe they were measured, so they can be misleading. There is rarely a situation where you would actually want them.

You probably want geodesic distances, calculated here with the geodist package on more realistic longitude and latitude values, and compared with the Euclidean degree distances:

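library(geodist)
# random points spread over the whole globe
farms <- data.frame(latitude = runif(3, min = -90, max = 90), longitude = runif(3, min = -180, max = 180))
# Euclidean distances in degrees:
dist(farms, diag = TRUE, upper = TRUE)
# geodesic distances in meters (the default "cheap" measure is inaccurate over long distances):
geodist(farms, measure = "geodesic")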
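
If installing a package is not an option, a great-circle distance matrix can be sketched in base R with the haversine formula. This is not what geodist does internally. It assumes a spherical earth with a 6371 km radius, so results will differ slightly from ellipsoidal geodesics, and `haversine_matrix` is a hypothetical helper name.

```r
# Pairwise great-circle distances (haversine formula) on an assumed sphere
haversine_matrix <- function(lat_deg, lon_deg, radius_m = 6371000) {
  lat <- lat_deg * pi / 180
  lon <- lon_deg * pi / 180
  # outer() builds all pairwise differences at once
  dlat <- outer(lat, lat, "-")
  dlon <- outer(lon, lon, "-")
  a <- sin(dlat / 2)^2 + outer(cos(lat), cos(lat)) * sin(dlon / 2)^2
  # pmin() guards against rounding pushing a slightly above 1
  2 * radius_m * asin(sqrt(pmin(a, 1)))
}

# Three points a quarter of the globe apart from each other
farms <- data.frame(latitude = c(0, 0, 90), longitude = c(0, 90, 0))
haversine_matrix(farms$latitude, farms$longitude) / 1000  # in km
```

Each pair in this example is separated by a quarter of the sphere's circumference, which makes the result easy to sanity-check by hand.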

PS: Straight-line Euclidean distances through the globe, rather than along the surface of the earth, are possible too, but they require yet another calculation.
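
One way to sketch that straight-line (chord) calculation is to convert each point to 3D Cartesian coordinates and reuse `dist()`. The spherical earth with a 6371 km radius is an assumption, and `chord_matrix` is a hypothetical helper name.

```r
# Straight-line distances through an assumed spherical earth
chord_matrix <- function(lat_deg, lon_deg, radius_m = 6371000) {
  lat <- lat_deg * pi / 180
  lon <- lon_deg * pi / 180
  # Cartesian coordinates of each point on the sphere
  xyz <- radius_m * cbind(
    x = cos(lat) * cos(lon),
    y = cos(lat) * sin(lon),
    z = sin(lat)
  )
  # Ordinary Euclidean distance is now meaningful, in meters
  as.matrix(dist(xyz))
}

# Two antipodal points on the equator: the chord is the earth's diameter
chord_matrix(c(0, 0), c(0, 180)) / 1000  # in km
```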

PPS: Note that over small areas that are approximately planar, these considerations matter far less. For global analyses, they do.
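
For such a small area, a common shortcut is the equirectangular approximation: scale the east-west offsets by the cosine of the mean latitude, convert to meters, and then use plain `dist()`. The sketch below assumes a spherical earth with a 6371 km radius. `planar_matrix` and the example coordinates are hypothetical.

```r
# Local planar approximation; only reasonable when all points are close together
planar_matrix <- function(lat_deg, lon_deg, radius_m = 6371000) {
  lat <- lat_deg * pi / 180
  lon <- lon_deg * pi / 180
  # Shrink east-west offsets by cos(mean latitude), then scale to meters
  x <- radius_m * lon * cos(mean(lat))
  y <- radius_m * lat
  as.matrix(dist(cbind(x, y)))
}

# Two made-up farms 0.01 degrees of latitude apart
planar_matrix(c(52, 52.01), c(5, 5))  # in meters
```

The error of this shortcut grows with the extent of the study area and with latitude, so it is worth comparing a few pairs against a geodesic calculation before relying on it.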

Outherod answered 18/11, 2021 at 1:45 Comment(0)
