Identifying common route segments from GPS tracks

Asked 13/1, 2012 at 17:55 Answered 23/10, 2021 at 8:56

Say I've got a bunch of recorded GPS tracks. Some are from repeated trips over the same route, some are from completely unique routes, and some are distinct routes yet have some segments in common.

Given all this data, I want to:

identify the repeated trips over the same route
identify segments which are shared by multiple routes

I suppose 1 is really a special case of 2.

To give a concrete example: suppose you had daily GPS tracks of a large number of bicycle commuters. It would be interesting to extract from this data the most popular bicycle commuting corridors based on actual riding rather than from the cycling maps that are produced by local governments.

Are there published algorithms for doing this? How do they work? Pointers to papers and/or code greatly appreciated.

Soule answered 13/1, 2012 at 17:55 Comment(1)

This would be a lot easier with street data to snap the paths to. Can you use that? – Griswold 16/1, 2012 at 2:43

You can use 3D histogram find the most visited points on the map. Using that you can derive the most used paths.

Detail: Keep a 2D matrix count and initialize it to 0, X[i,j]=0. For each track, increment X[i,j]s on the path. Once you have processed all the tracks, threshold this matrix to min threshold (what is the minimum number of tracks for it to be a repeated trip?).

Some practical details: Assuming you have set of points through which path goes. You can find the set of points on the path between two such points with http://en.wikipedia.org/wiki/Bresenham%27s_line_algorithm . You might want to draw a "thicker line" to account for the noisy nature of the data.

Scandent answered 13/1, 2012 at 18:55 Comment(4)

Geocoordinates (latitude, longitude) are continuous values, so X[i,j] will probably not work. – Sauceda 6/6, 2014 at 16:51

@Sauceda discretize. – Scandent 7/6, 2014 at 19:23

Your approach will not work if paths are not fairly straight lines. For example, suppose from biking data you find popular points A, B, C, where dist(A,B) < dist(A,C). However, the bike path may be physically A, C, and then B (like a "U"-shaped path). You need to keep track of the sequences, not just the points. – Sauceda 21/6, 2014 at 13:35

@Sauceda I am decently sure that gps data also contains time. After all gps satellites do is transmit time. That should alleviate the problem. – Scandent 24/6, 2014 at 7:50

This seems a very general question that pertains to many GIS problems.

After some search, it seems you would need to apply some algorithms for computing the similarity between any two routes. Wang et al. (2013) analyzes a number of such measures. However, all these measures are implemented by dynamic programming with time complexity of O(N1N2), where N1 and N2 are the number of points in the two routes.

A more efficient solution is provided by Mariescu-Istodor et al. (2017) in which each route is transformed into a set of cells in a predefined grid system. The paper also defines the measure of inclusion as the amount of one route contained inside the other, which seems to relate to the second point in your question.

Inelastic answered 23/10, 2021 at 8:56 Comment(0)

Recommended topics

Hot tags