data structure to support google/bing maps [closed]

Asked 22/1, 2010 at 4:22 Answered 30/12, 2011 at 8:9

algorithm data-structures driving-directions

I was wondering what the data structure is in an application like google/bing maps. How is it that the results are returned so quickly when searching for directions? what kind of algorithms are being used to determine this information?

thanks

Dich answered 22/1, 2010 at 4:22 Comment(0)

There are two parts to your question:

What kind of data structure is used to store the map information.
What kind of algorithm is used to "navigate" from source to destination.

To this, I would add another question:

How is Google/Bing able to "stream in" the data. So for example, you are able to zoom in from miles up to the ground level seamlessly, all the while maintaining the coordinate system.

I will attempt to address each question in order. Do note, that I do not work for the Google Maps or the Bing team, so quite obviously, this information might not be completely accurate. I am basing this off of the knowledge gained from a good CS course about data structures and algorithms.

Ans 1) The map is stored in an Edge Weighted Directed Graph. Locations on the map are Vertices and the path from one location to another (from one vertex to another) are the Edges.

Quite obviously, since there can be millions of vertices and an order of magnitude more edges, the really interesting thing would be the representation of this Edge Weighted Digraph.

I would say that this would be represented by some kind of Adjacency List and the reason I say so is because, if you imagine a map, it is essentially a sparse graph. There are only a few ways to get from one location to another. Think about your house! How many roads (edges in our case) lead to it? Adjacency Lists are good for representing sparse graphs, and adjacency matrix is good for representing dense graphs.

Of course, even though we are able to efficiently represent sparse graphs in memory, given the sheer number of Vertices and Edges, it would be impossible to store everything in memory at once. Hence, I would imagine some kind of a streaming library underneath.

To create an analogy for this, if you have ever played an open-world game like World of Warcraft / Syrim / GTA, you will observe that to a large part, there is no loading screen. But quite obviously, it is impossible to fit everything into memory at once. Thus using a combination of quad-trees and frustum culling algorithms, these games are able to dynamically load resources (terrain, sprites, meshes etc).

I would imagine something similar, but for Graphs. I have not put a lot of thought into this particular aspect, but to cook up a very basic system, one can imagine an in memory database, which they query and add/remove vertices and edges from the graph at run-time as needed. This brings us to another interesting point. Since vertices and edges need to be removed and added at run-time, the classic implementation of Adjacency List will not cut it.

In a classic implementation, we simply store a List (a Vector in Java) in each element of an array: Adj[]. I would imagine, a linked list in place of the Adj[] array and a binary search tree in place of List[Edge]. The binary search tree would facilitate O(log N) insertion and removal of nodes. This is extremely desirable since in the List implementation, while addition is O(1), removal is O(N) and when you are dealing with millions of edges, this is prohibitive.

A final point to note here is that until you actually start the navigation, there is "no" graph. Since there can be million of users, it doesn't make sense to maintain one giant graph for everybody (this would be impossible due to memory space requirement alone). I would imagine that as you stat the navigation process, a graph is created for you. Quite obviously, since you start from location A and go to location B (and possibly other locations after that), the graph created just for you should not take up a very large amount of memory (provided the streaming architecture is in place).

Ans 2) This is a very interesting question. The most basic algorithm for solving this problem would be Dijkstra Path Finding algorithm. Faster variations such as A* exist. I would imagine Dijkstra to be fast enough, if it could work properly with the streaming architecture discussed above. Dijkstra uses space proportional to V and time proportional to E lg V, which are very good figures, especially for sparse graphs. Do keep in mind, if the streaming architecture has not been nailed down, V and E will explode and the space and run-time requirements of Dijkstra will make it prohibitive.

Ans 1) Streaming question: Do not confuse this question with the streaming architecture discussed above. This is basically asking how the seamless zoom is achieved.

A good algorithm for achieving this is the Quad Tree algorithm (you can generalize this to n-tree). You store coarser images higher up in the tree and higher resolution images as you traverse down the tree. This is actually what KML (Keyhole) did with its mapping algorithm. Keyhole was a company that partnered with NVIDIA many years back to produce one of the first "Google Earth" like softwares.

The inspiration for Quad Tree culling, comes from modern 3D games, where it is used to quickly cull away parts of the scene which is not in the view frustum.

To further clarify this, imagine that you are looking at the map of USA from really high up. At this level, you basically split the map into 4 sections and make each section a child of the Quad Tree.

Now, as you zoom in, you zoom in on one of the sections (quite obviously you can zoom right in the center, so that your zoom actually touches all 4 sections, but for simplicity's sake, lets say you zoom in on one of the sections). So when you zoom in to one section, you traverse the 4 children of that section. These 4 children contain higher resolution data of its parent. You can then continue to zoom down till you hit a set of leaves, which contain the highest resolution data. To make the jump from one resolution to the next "seamless" a combination of blur and fading effects can be utilized.

As a follow-up to this post, I will try to add links to many of the concepts I put in here.

Carp answered 30/12, 2011 at 8:9 Comment(3)

Muhammad: if you want to add your email address(es) somewhere, I suggest you add it to your profile. You also seem to have two SO identities, which will confuse you and other people. – Zonked 30/12, 2011 at 17:49

Jon: You are right, I did accidentally create two profiles. How can I link the two? I got the "Teacher" badge on my unregistered profile :| – Carp 31/12, 2011 at 9:14

I suggest you send an email to [email protected] explaining your problem. – Zonked 31/12, 2011 at 15:35

For this sort of application, you would want some sort of database to represent map features and the connections between them, and would then need:

spatial indexing of the map feature database, so that it can be efficiently queried by 2D coordinates; and
a good way to search the connections to find a least-cost route, for some measure of cost (e.g. distance).

For 1, an example would be the R-tree data structure.

For 2, you need a graph search algorithm, such as A*.

Nib answered 23/1, 2010 at 23:0 Comment(0)

Look up a paper about Highway Dimension from google authors. The idea is to precompute the shortest path between important nodes and then route everything through those. You are not going to use residential streets to go from LA to Chicago save for getting on and off the freeway at both ends.

Raimondo answered 18/10, 2010 at 19:13 Comment(1)

In case people, would like a link dl.acm.org/citation.cfm?id=1873665 – Ferriferous 26/10, 2018 at 21:41

I'm not sure of the internal data structure, but it may be some kind of 2D coordinate based tree structure that only displays a certain number of levels. The levels would correspond to zoom factors, so you could ignore as insignificant things below, say, 5 levels below the current level, and things above the current level.

Regardless of how it's structured, here's how you can use it:

http://code.google.com/apis/maps/documentation/reference.html

Bevatron answered 22/1, 2010 at 4:51 Comment(0)

I would think of it as a computational geometry problem. When you click on a particular coordinate in the map and using that information, can get the latitude and longitude of that location. Based on the latitude and longitude and the level of zoom, the place can be identified.

Once you have identified the two places, the only problem for you is to identify the nearest route. Now this problem is finding the shortest path between two points, having polygon blocks between them(which correspond to the places which contains no roads) and the only possible connections are roads. This is a known problem and efficient algorithms exist to solve this.

I am not sure if this is what google is doing, but I hope they do something on these lines.

I am taking computational geometry this semester. Here is the course link: http://www.ams.sunysb.edu/~jsbm/courses/545/ams545.html. Check them if you are interested.

Wifeless answered 26/1, 2010 at 5:37 Comment(0)

-5

I was wondering what the data structure is in an application like google/bing maps.

To the user: XHTML/CSS/Javascript. Like any website.

On the server: who knows? Any Google devs around here? It certainly isn't PHP or ASP.net...

How is it that the results are returned so quickly when searching for directions?

Because Google spent years, manpower and millions of dollars on building up the architecture to get the fastest server reaction time possible?

What kind of algorithms are being used to determine this information?

A journey planner algorithm.

Kelikeligot answered 22/1, 2010 at 4:32 Comment(1)

Whoa. That Wikipedia article has Multiple Issues. – Certainty 22/1, 2010 at 13:43

Recommended topics

Hot tags