graph - What are the disadvantages if I replace each linked list in adjacency-list with hash table?

In CLRS Exercise 22.1-8 (I am self-learning, not at any university):

Suppose that instead of a linked list, each array entry Adj[u] is a hash table containing the vertices v for which (u,v) ∈ E. If all edge lookups are equally likely, what is the expected time to determine whether an edge is in the graph? What disadvantages does this scheme have? Suggest an alternate data structure for each edge list that solves these problems. Does your alternative have disadvantages compared to the hash table?

So, if I replace each linked list with a hash table, the following questions arise:

  1. What is the expected time to determine whether an edge is in the graph?
  2. What are the disadvantages?
  3. Suggest an alternate data structure for each edge list that solves these problems
  4. Does your alternative have disadvantages compared to the hash table?

I have the following partial answers:

  1. I think the expected time is O(1), because I can just look up Hashtable t = Adj[u] and then return t.get(v);
  2. I think the disadvantage is that a hash table takes more space than a linked list.

For the other two questions, I don't have a clue.

Can anyone give me a hint?
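For what it's worth, here is a minimal Python sketch of what I mean in my answer to question 1 (a dict of sets stands in for the array of hash tables; the helper names are just illustrative):

```python
def build_adj(edges):
    """Adjacency structure where Adj[u] is a hash table (a Python set) of u's neighbors."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def has_edge(adj, u, v):
    # Expected O(1): one hash lookup for Adj[u], one membership test for v.
    return v in adj.get(u, set())
```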

Mealtime answered 12/3, 2012 at 13:7 Comment(0)

It depends on the hash table and how it handles collisions. For example, assume that each entry of our hash table points to a list of the elements that hash to the same slot (chaining).

If the distribution of elements is sufficiently uniform, the average cost of a lookup depends only on the average number of elements per list, i.e. the load factor. That average is n/m, where m is the size of our hash table.

  1. The expected time to determine whether an edge is in the graph is O(n/m).
  2. It uses more space than a linked list and has a slower query time than an adjacency matrix. If our hash table supports dynamic resizing, we need extra time to move elements between the old and new tables; if it does not, we need O(n) space for each hash table to keep O(1) query time, which results in O(n^2) space overall. Also, we have only bounded the expected query time; in the worst case a query may degenerate to O(degree(u)), just like a linked list. So it can be better to use an adjacency matrix, which gives deterministic O(1) query time at the cost of O(n^2) space.
  3. See point 2: an adjacency matrix.
  4. Yes. For example, if we know that every vertex of the graph has at most d adjacent vertices, with d < n, then a hash table needs only O(nd) space instead of O(n^2) while keeping expected O(1) query time.
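A minimal Python sketch of the trade-off in point 2 (the tiny graph and variable names here are illustrative, not from the exercise):

```python
n = 4                              # number of vertices
edges = [(0, 1), (1, 2), (2, 3)]   # an illustrative undirected graph

# Adjacency matrix: O(n^2) space, deterministic O(1) edge query.
matrix = [[False] * n for _ in range(n)]
for u, v in edges:
    matrix[u][v] = matrix[v][u] = True

# Hash-table adjacency: O(n + |E|) space, expected O(1) edge query.
adj = {u: set() for u in range(n)}
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)

# Both answer the same membership question.
assert matrix[1][2] and 2 in adj[1]
assert not matrix[0][3] and 3 not in adj[0]
```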
Botts answered 17/3, 2012 at 16:26 Comment(0)

The answer to question 3 could be a binary search tree.

In an adjacency matrix, each vertex's row is an array of V entries. This O(V)-per-vertex space cost buys fast, O(1)-time edge searches.

In an adjacency list, each vertex is followed by a list containing only its n adjacent vertices. This space-efficient representation makes searching slow: O(n).

A hash table is a compromise between the array and the list. It uses less space than an array of V entries, but requires handling collisions during searches.

A binary search tree is another compromise: its space cost is as small as a list's, and its average search cost is O(lg n).
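The same O(lg n) bound can also be had by keeping each edge list as a sorted array and binary-searching it, as one of the comments suggests; a minimal Python sketch (the names are illustrative):

```python
from bisect import bisect_left

def has_edge_sorted(sorted_adj, u, v):
    """O(lg n) membership test in a sorted neighbor list,
    matching the average search cost of a balanced BST."""
    lst = sorted_adj.get(u, [])
    i = bisect_left(lst, v)       # binary search for v's position
    return i < len(lst) and lst[i] == v
```

Insertion into a sorted array is O(n), though, which is where a self-balancing tree still wins.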

Glaudia answered 16/12, 2012 at 6:41 Comment(2)
I like your answer. I was thinking along the same lines, i.e. replacing the list with a red-black/self-balancing tree. (Memoirs)
Or the list could be maintained in sorted order. (Schleiermacher)

Questions 3 and 4 are fairly open. Besides the ideas in the other two answers, one problem with a hash table is that it is not an efficient data structure for scanning its elements from beginning to end. In the real world it is quite common to enumerate all the neighbors of a given vertex (e.g. in BFS or DFS), and that somewhat undermines the use of a plain hash table.

One possible solution is to chain the occupied buckets of the hash table together so that they form a doubly-linked list. Every time a new element is added, append it to the end of the list; whenever an element is removed, unlink it and fix the neighboring pointers accordingly. When you want to do an overall scan, just walk this list.

The drawback of this strategy, of course, is extra space: a two-pointer overhead per element. Adding or removing an element also takes a little more time to build or fix the links.
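This is essentially what Java's LinkedHashMap does; a minimal Python sketch of the idea (the class is my own toy, not a real library type):

```python
class LinkedSet:
    """Hash set whose elements are also chained in a doubly-linked list,
    so a full scan (e.g. enumerating neighbors for BFS/DFS) is O(degree)."""
    def __init__(self):
        self._nodes = {}                 # key -> [prev_key, next_key]
        self._head = self._tail = None

    def add(self, key):
        if key in self._nodes:
            return
        self._nodes[key] = [self._tail, None]
        if self._tail is not None:
            self._nodes[self._tail][1] = key
        else:
            self._head = key
        self._tail = key

    def remove(self, key):
        prev, nxt = self._nodes.pop(key)  # unlink and repair neighbors
        if prev is not None:
            self._nodes[prev][1] = nxt
        else:
            self._head = nxt
        if nxt is not None:
            self._nodes[nxt][0] = prev
        else:
            self._tail = prev

    def __contains__(self, key):          # expected O(1), as before
        return key in self._nodes

    def __iter__(self):                   # the overall scan: walk the chain
        k = self._head
        while k is not None:
            yield k
            k = self._nodes[k][1]
```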

I'm not too worried about collisions. The hash table of a vertex stores its neighbors, each of which is unique. If its key is unique, there is no chance of collision.

Incommode answered 25/7, 2015 at 3:58 Comment(1)
"If its key is unique, there is no chance of collision." The whole point of a collision is that two different keys hash to the same value, not that the two keys are the same. (Bottom)

I wanted to add another option that no one mentioned in the other answers. If the graph is static, i.e. the vertices and edges do not change after the graph is created, you can use a hash table with perfect hashing for each vertex's edge list instead of an adjacency list. This lets you check in worst-case O(1) time whether there is an edge between two vertices, while using only O(V+E) memory, asymptotically the same as a normal adjacency list. The advantage is that the O(1) edge-lookup time is worst case, not just expected.
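A toy sketch of the two-level (FKS-style) perfect-hashing idea for one static edge list, assuming integer vertex ids (the class and helper names are my own invention):

```python
import random

_P = 2_147_483_647  # a Mersenne prime, large enough for the universal hash family

def _make_hash(m):
    """Random hash from the universal family h(x) = ((a*x + b) mod p) mod m."""
    a = random.randrange(1, _P)
    b = random.randrange(0, _P)
    return lambda x: ((a * x + b) % _P) % m

class PerfectEdgeSet:
    """Static neighbor set with worst-case O(1) membership via two-level hashing."""
    def __init__(self, neighbors):
        neighbors = list(set(neighbors))
        m = max(1, len(neighbors))
        self._h1 = _make_hash(m)
        buckets = [[] for _ in range(m)]
        for v in neighbors:
            buckets[self._h1(v)].append(v)
        self._tables, self._h2 = [], []
        for bucket in buckets:
            size = len(bucket) ** 2  # quadratic size: collision-free hash found fast
            while True:
                if not bucket:
                    h, table = None, []
                    break
                h = _make_hash(size)   # retry random hashes until no collisions
                table = [None] * size
                ok = True
                for v in bucket:
                    slot = h(v)
                    if table[slot] is not None:
                        ok = False
                        break
                    table[slot] = v
                if ok:
                    break
            self._h2.append(h)
            self._tables.append(table)

    def __contains__(self, v):
        i = self._h1(v)
        table = self._tables[i]
        return bool(table) and table[self._h2[i](v)] == v
```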

Monadism answered 27/8, 2022 at 22:52 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.