Is ConcurrentHashMap totally safe?
Asked Answered
G

6

60

this is a passage from JavaDoc regarding ConcurrentHashMap. It says retrieval operations generally do not block, so may overlap with update operations. Does this mean the get() method is not thread safe?

"However, even though all operations are thread-safe, retrieval operations do not entail locking, and there is not any support for locking the entire table in a way that prevents all access. This class is fully interoperable with Hashtable in programs that rely on its thread safety but not on its synchronization details.

Retrieval operations (including get) generally do not block, so may overlap with update operations (including put and remove). Retrievals reflect the results of the most recently completed update operations holding upon their onset."

Gastronomy answered 19/2, 2013 at 0:18 Comment(1)
If I'm reading that right, it means that the retrieval will always return the results of the last update to complete -at the time when the retrieval started-. That means that a full update can occur between the time when a retrieval starts and completes and it will not change the outcome of said retrieval.Hiedihiemal
T
72

The get() method is thread-safe, and the other users gave you useful answers regarding this particular issue.

However, although ConcurrentHashMap is a thread-safe drop-in replacement for HashMap, it is important to realize that if you are doing multiple operations you may have to change your code significantly. For example, take this code:

if (!map.containsKey(key)) 
   return map.put(key, value);
else
   return map.get(key);

In a multi-thread environment, this is a race condition. You have to use the ConcurrentHashMap.putIfAbsent(K key, V value) and pay attention to the return value, which tells you if the put operation was successful or not. Read the docs for more details.


Answering to a comment that asks for clarification on why this is a race condition.

Imagine there are two threads A, B that are going to put two different values in the map, v1 and v2 respectively, having the same key. The key is initially not present in the map. They interleave in this way:

  • Thread A calls containsKey and finds out that the key is not present, but is immediately suspended.
  • Thread B calls containsKey and finds out that the key is not present, and has the time to insert its value v2.
  • Thread A resumes and inserts v1, "peacefully" overwriting (since put is threadsafe) the value inserted by thread B.

Now thread B "thinks" it has successfully inserted its very own value v2, but the map contains v1. This is really a disaster because thread B may call v2.updateSomething() and will "think" that the consumers of the map (e.g. other threads) have access to that object and will see that maybe important update ("like: this visitor IP address is trying to perform a DOS, refuse all the requests from now on"). Instead, the object will be soon garbage collected and lost.

Tremblay answered 19/2, 2013 at 0:33 Comment(9)
"not a thread-safe drop-in replacement for HashMap". Uh, it certainly is. You are correct that there are race conditions with multiple operations but that doesn't stop ConcurrentHashMap from being thread-safe.Bugs
For drop-in replacement I mean: you change your HashMap to a ConcurrentHashMap and all common operations on the map (like the handcrafted "put if absent" - I've seen it zillions of times) are automagically thread-safe.Tremblay
Agreed. But your statement "ConcurrentHashMap is not a thread-safe drop-in replacement for HashMap" is extremely misleading at the very least. I'd say it's incorrect.Bugs
I'd rework it to say "although ConcurrentHashMap is a drop in thread-safe replacement for HashMap, it is important to realize that if you are doing multiple operations..."Bugs
why the above code snippets has a race condition? is ConcurrentHashmap.put() is not thread safe?i am confusedLucinalucinda
with the same interleaving what would happen if we use putIfAbsent? v1 will be discarded? what if we want to allow overwriting the value with the same key? Do we need to have a loop checking to make sure the write has been successful ? If so, how can that be efficient than "synchronized block" ?Hypocrisy
@Hypocrisy : synchronized block will make it lock for all the keys. We can get better performance if we make it granular atomic based on different keys. That is where ConcurrentHashMap can be useful. putIfAbsent method in ConcurrentHashMap is check-if-absent-then-set method. It's an atomic operation.So it will prevent the overriding of value by another thread if the previous had already written one.Approachable
@Tremblay " how a get operation is expected to perform atomically ? " declaring volatile value for a key doesn't guarantee atomicity , if its a 64 bit object there may be two instructions to read it completely right ?Wrapped
@Tremblay Thank you very much for your explainCarthy
I
23

It is thread-safe. However, the way it is being thread-safe may not be what you expect. There are some "hints" you can see from:

This class is fully interoperable with Hashtable in programs that rely on its thread safety but not on its synchronization details

To know the whole story in a more complete picture, you need to be aware of the ConcurrentMap interface.

The original Map provides some very basic read/update methods. Even I was able to make a thread-safe implementation of Map; there are lots of cases that people cannot use my Map without considering my synchronization mechanism. This is a typical example:

if (!threadSafeMap.containsKey(key)) {
   threadSafeMap.put(key, value);
}

This piece of code is not thread-safe, even though the map itself is. Two threads calling containsKey() at the same time could think there is no such key they both therefore insert into the Map.

In order to fix the problem, we need to do extra synchronization explicitly. Assume the thread-safety of my Map is achieved by synchronized keywords, you will need to do:

synchronized(threadSafeMap) {
    if (!threadSafeMap.containsKey(key)) {
       threadSafeMap.put(key, value);
    }
}

Such extra code needs you to know about the "synchronization details" of the map. In the above example, we need to know that the synchronization is achieved by "synchronized".

ConcurrentMap interface take this one step further. It defines some common "complex" actions that involves multiple access to map. For example, the above example is exposed as putIfAbsent(). With these "complex" actions, users of ConcurrentMap (in most case) don't need to synchronise actions with multiple access to the map. Hence, the implementation of Map can perform more complicated synchronization mechanism for better performance. ConcurrentHashhMap is a good example. Thread-safety is in fact maintained by keeping separate locks for different partitions of the map. It is thread-safe because concurrent access to the map will not corrupt the internal data structure, or cause any update lost unexpected, etc.

With all the above in mind, the meaning of Javadoc will be clearer:

"Retrieval operations (including get) generally do not block" because ConcurrentHashMap is not using "synchronized" for its thread-safety. The logic of get itself takes care of the thread-safeness; and If you look further in the Javadoc:

The table is internally partitioned to try to permit the indicated number of concurrent updates without contention

Not only is retrieval non-blocking, even updates can happen concurrently. However, non-blocking/concurrent-updates does not means that it is thread-UNsafe. It simply means that it is using some ways other than simple "synchronized" for thread-safety.

However, as the internal synchronization mechanism is not exposed, if you want to do some complicated actions other than those provided by ConcurrentMap, you may need to consider changing your logic, or consider not using ConcurrentHashMap. For example:

// only remove if both key1 and key2 exists
if (map.containsKey(key1) && map.containsKey(key2)) {
    map.remove(key1);
    map.remove(key2);
}
Ionia answered 19/2, 2013 at 2:41 Comment(0)
D
11

ConcurrentHashmap.get() is thread-safe, in the sense that

  • It will not throw any exception, including ConcurrentModificationException
  • It will return a result that was true at some (recent) time in past. This means that two back-to-back calls to get can return different results. Of course, this true of any other Map as well.
Durward answered 19/2, 2013 at 0:26 Comment(7)
What do you mean by "two back-to-back calls to get"?Gastronomy
Call 'get()' and save the result. Call 'get()' a second time and compare the results. That's what 'back-to-back' means.Aldous
Back to back calls to get can return different results if there has been an intervening put() or other operation that modifies the map.Bugs
I don't quite agree on your answer, because: 1. Not throwing any exception does not mean that it is thread-safe, and ConcurrentModificationException is not about thread-safety. 2. "same result for two back-to-back call" is never the contract for Map, and it has nothing to do with thread-safety as well.Ionia
@AdrianShum I hear you. I was trying to simplify for someone that seemed me a relatively novice developer.Durward
@MiserableVariable However even he is novice, it doesn't mean it is fine to provide incorrect answer to him (Sorry I cannot see the answer being "simplifying", but simply irrelevant and incorrect)Ionia
I think it has to do with value object being Volatile in HashEntry.Campo
H
9

HashMap is divided into "buckets" based on hashCode. ConcurrentHashMap uses this fact. Its synchronization mechanism is based on blocking buckets rather than on entire Map. This way few threads can simultaneously write to few different buckets (one thread can write to one bucket at a time).

Reading from ConcurrentHashMap almost doesn't use synchronization. Synchronization is used when while fetching value for key, it sees null value. Since ConcurrentHashMap can't store null as values (yes, aside from keys, values also can't be nulls) it suggests that fetching null while reading happened in the middle of initializing map entry (key-value pair) by another thread: when key was assigned, but value not yet, and it still holds default null.
In such case reading thread will need to wait until entry will be written fully.

So results from read() will be based on current state of map. If you read value of key that was in the middle of updating you will likely get old value since writing process hasn't finished yet.

Harelip answered 19/2, 2013 at 0:54 Comment(0)
C
5

get() in ConcurrentHashMap is thread-safe because It reads the value which is Volatile. And in cases when value is null of any key, then get() method waits till it gets the lock and then it reads the updated value.

When put() method is updating CHM, then it sets the value of that key to null, and then it creates a new entry and updates the CHM. This null value is used by get() method as signal that another thread is updating the CHM with the same key.

Campo answered 20/8, 2013 at 5:22 Comment(1)
When put() method is updating CHM, then it sets the value of that key to null. Clarification for the same please! Any links? Code?Glavin
A
4

It just means that when one thread is updating and one thread is reading there is no guarantee that the one that called the ConcurrentHashMap method first, in time, will have their operation occur first.

Think about an update on the item telling where Bob is. If one thread asks where Bob is at about the same time that another thread updates to say he came 'inside', you can't predict whether the reader thread will get Bob's status as 'inside' or 'outside'. Even if the update thread calls the method first, the reader thread might get the 'outside' status.

The threads will not cause each other problems. The code is ThreadSafe.

One thread won't go into an infinite loop or start generating wierd NullPointerExceptions or get "itside" with half of the old status and half of the new.

Aldous answered 19/2, 2013 at 0:30 Comment(1)
Would a double-checked locking pattern solve the problem? By forcing the thread that is looking for the 'inside' value to check twice, would the writer thread have a chance to update the value? This way you would block writes but not reads?Inexpressible

© 2022 - 2024 — McMap. All rights reserved.