ConcurrentHashMap returns a weakly consistent iterator, why should we use it anyhow?
Asked Answered
S

3

11

I am reading the book Java Concurrecny in practice. On page 85 section 5.2.1 it talks about the ConcurrentHashMap and its advantages. However, in one part, the books claims that

the iterators returned by ConcurrentHashMap is weakly consistent. This means that this iterator can tolerate concurrent modification, traverses elements as they existed when iterator was constructed, and may (but not guaranteed to) reflect modifications to the collection after the construction of the iterator.

From why I understand the whole point of synchronization in concurrent programs is to allow thread accessing a shared resource in a consistent way, where as ConcurrentHashMap is not really fulfilling this. Then why using it at all?

Svelte answered 29/12, 2012 at 17:22 Comment(0)
H
19

The point is to avoid synchronization when you don't need it. If you don't mind seeing new elements in some cases and not seeing them in others, using ConcurrentHashMap's iterator can be significantly cheaper than either preventing other threads from adding items while you're iterating or taking a consistent snapshot when the iterator is created.

So yes, when you need synchronization and a consistent iterator, you'll need an alternative - but when you don't, you can take advantage of the more efficient iterator provided by ConcurrentHashMap.

Howe answered 29/12, 2012 at 17:25 Comment(5)
Does ConcurrentHashMap have synchronizations itself in its implementations?Svelte
@Hossein: I don't know offhand - but you can look at the source as easily as I can :) The docs suggest there may be synchronization in write access but not read: "However, even though all operations are thread-safe, retrieval operations do not entail locking, and there is not any support for locking the entire table in a way that prevents all access."Howe
@Hossein: IIRC the ConcurrentHashMap is lock-free implementation so it does not have synchronization in traditional sense but use atomics and special algorithms to avoid mutexes. It makes it faster as mutexes can be too heavyweight in some applications (setting asside the cost of it itself it allows only 1 thread running at a time) but it is not fully syncronized.Satterfield
@Maciej: ConcurrentHashMap is a lock striping implementation and the default concurrency level is 16.Bettinabettine
@PrinceJohnWesley: Ups. Sorry - my bad (I saw ConcurrentHashMap read ConcurrentSkipListMap).Satterfield
J
2

ConcurrentHashMap is supposed to be thread-safe, by contract. However, it is not supposed to be consistent across threads.

When you iterate over a ConcurrentHashMap, the iterator grabs a copy of the hash map at the moment you asked for it (and this copy is made in a thread-safe manner) and you iterate over that copy. And yes, nothing guarantees you that while you are iterating over that copy, some map entries won't be removed. But the map entries which would be removed in such a manner woud still exist in your iterator.

Johnsen answered 29/12, 2012 at 17:28 Comment(5)
Such entries might not would exists in iterator.Satterfield
No, "would". If it were "might" and it didn't, you'd get a ConcurrentModificationException.Johnsen
From Java SDK documentation "They do not throw ConcurrentModificationException`.".Satterfield
Yes, because at the time of the copy, all entries for a given hash bucket are copied over -- but they may be removed from the map while you iterate.Johnsen
The Iterator of ConcurrentHashMap doesn't work against a copy. It's possible that it will return entries that were added after the Iterator was created, and even after you have already received some elements from it. I have tried it and sometimes that happens in reality (Java 8). Otherwise it wouldn't be weakly consistent.Floribunda
S
2

Everything depends on how strong consistency you need. If you need strong one - than mutexes or other subset of operations would provide what you need.

However sometimes you need much weaker requirements but for example you need say lock-free property as you need to guarantee the throughput. On the other hand you may not need iterators at all or have much weaker restrictions on them.

So if all you need is map shared across threads ConcurrentHashMap (say place in theatre to person booking it) would provide what you need - and it may be much faster if you have many working threads as you avoid mutex synchronization. On the other hand the speed comes at a cost - the view provided by iterator does not necessary corresponds to state of collection at it's creation and it might miss some elements created afterwords (in general in multiprocessor systems happens-before relationship is tricky).

Edit: As Prince John Wesley pointed out I thought about ConcurrentSkipListMap while seeing and writing. ConcurrentHashMap. Most of the points still stand except the parts about being lock-free (as well as everything that comes from it like guaranteed throughput etc.).

Satterfield answered 29/12, 2012 at 17:30 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.