TL;DR: Is it possible for a single enumeration of a ConcurrentDictionary to emit the same key twice? Does the current implementation of the ConcurrentDictionary class (.NET 5) allow this possibility?
I have a ConcurrentDictionary<string, decimal> that is mutated by multiple threads concurrently, and I want to periodically copy it to a normal Dictionary<string, decimal> and pass it to the presentation layer for updating the UI. There are two ways to copy it, with and without snapshot semantics:
var concurrent = new ConcurrentDictionary<string, decimal>();
var copy1 = new Dictionary<string, decimal>(concurrent.ToArray()); // Snapshot
var copy2 = new Dictionary<string, decimal>(concurrent); // On-the-go
I am pretty sure that the first approach is safe, because the ToArray method returns a consistent view of the ConcurrentDictionary:
Returns a new array containing a snapshot of key and value pairs copied from the ConcurrentDictionary<TKey,TValue>.
But I would prefer to use the second approach, because it generates less contention.
I am worried though about the possibility of getting an ArgumentException: An item with the same key has already been added.
The documentation doesn't seem to exclude this possibility:
The enumerator returned from the dictionary ... does not represent a moment-in-time snapshot of the dictionary. The contents exposed through the enumerator may contain modifications made to the dictionary after GetEnumerator was called.
Here is the scenario that makes me worried:
- Thread A starts enumerating the ConcurrentDictionary, and the key X is emitted by the enumerator. Then the thread is temporarily suspended by the OS.
- Thread B removes the key X.
- Thread C adds a new entry with the key X.
- Thread A resumes enumerating the ConcurrentDictionary, the enumerator observes the newly added X entry, and emits it.
- The constructor of the Dictionary class attempts to insert the key X twice into the newly constructed Dictionary, and throws an exception.
I tried to reproduce this scenario, without success. But this is not 100% reassuring, because the conditions that could cause this situation to emerge could be subtle. Maybe the values I added didn't have the "right" hashcodes, or didn't generate the "right" number of hashcode collisions. I tried to find an answer by studying the source code of the class, but unfortunately it's too complicated for me to understand.
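For reference, a reproduction attempt along these lines can be sketched as follows. This is only a minimal sketch under stated assumptions: the class name DuplicateKeyProbe, the method name, the key format, and the default parameter values are all hypothetical and chosen for illustration. Mutator tasks play the roles of threads B and C by removing and re-adding random keys, while the calling thread plays thread A and enumerates the dictionary directly, tracking the keys it has already seen.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class DuplicateKeyProbe
{
    // Enumerates the dictionary repeatedly while other tasks remove and
    // re-add keys, and counts the enumerations that yielded some key twice.
    // All names and default values here are hypothetical.
    public static int CountEnumerationsWithDuplicates(
        int keyCount = 1000, int enumerations = 200, int mutatorCount = 2)
    {
        var concurrent = new ConcurrentDictionary<string, decimal>();
        for (int i = 0; i < keyCount; i++) concurrent[$"K{i}"] = i;

        using var cts = new CancellationTokenSource();
        var mutators = new Task[mutatorCount];
        for (int m = 0; m < mutatorCount; m++)
        {
            int seed = m; // distinct seed per mutator
            mutators[m] = Task.Run(() =>
            {
                var random = new Random(seed);
                while (!cts.IsCancellationRequested)
                {
                    string key = $"K{random.Next(keyCount)}";
                    concurrent.TryRemove(key, out _); // role of thread B
                    concurrent.TryAdd(key, 0m);       // role of thread C
                }
            });
        }

        int withDuplicates = 0;
        var seen = new HashSet<string>();
        for (int e = 0; e < enumerations; e++)
        {
            seen.Clear();
            foreach (var pair in concurrent) // role of thread A, no snapshot
            {
                // HashSet.Add returns false if the key was already emitted.
                if (!seen.Add(pair.Key)) { withDuplicates++; break; }
            }
        }

        cts.Cancel();
        Task.WaitAll(mutators);
        return withDuplicates;
    }
}
```

Calling DuplicateKeyProbe.CountEnumerationsWithDuplicates() returns the number of enumerations in which a duplicate key was observed. A result of zero would not prove that duplicates are impossible; it would only mean this particular run, with these keys and this timing, did not hit one.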
My question is: is it safe, based on the current implementation (.NET 5), to create fast copies of my ConcurrentDictionary by enumerating it directly, or should I code defensively and take a snapshot every time I copy it?
Clarification: I would agree with anyone who says that using an API based on its undocumented implementation details is unwise. But alas, this is what this question is all about. It's a rather educational question, asked out of curiosity. I am not intending to use the acquired knowledge in production code, I promise.
[...] ConcurrentDictionary, to emit the same key twice. – Merrow
[...] ConcurrentDictionary then ToArray() is the documented way to do that. See ConcurrentDictionary in C#?. Enumerating the dictionary could definitely reflect modifications made during the enumeration, see ConcurrentDictionary enumeration and locking – Flats
[...] ConcurrentDictionary, and I would like to avoid taking expensive snapshots. I am OK with the copies reflecting modifications made during the enumeration. But I am not OK with receiving the same key twice, and I am asking whether this is a possible scenario. – Merrow
[...] ToSafeDictionary extension method that takes a ConcurrentDictionary. Then inside it either use ToArray (like you are now), or new up the ConcurrentDictionary and call TryAdd one by one. That way if the key does appear twice you'll be fine. This is similar to @MineR's earlier suggestion (although that suggestion will do last wins, while this suggestion does first wins). – Valentinevalentino
[...] ToSafeDictionary extension method would be an easy to implement security measure. Interestingly the ImmutableDictionary<K,V> class does not throw when instantiated by enumerables containing duplicates. It is instantiated safely by design. – Merrow
[...] ConcurrentDictionary using the native enumerator. – Merrow
Does the current implementation prevent double emitted keys? Which current implementation are you interested in? Framework (and which version)? Core (and which version)? Mono? etc etc I suspect you see the point I am making. ;) You need to program to the contract, not the implementation (particularly when there is more than one implementation). – Valentinevalentino
[...] ConcurrentDictionary. I think it's this source file. – Merrow
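The ToSafeDictionary idea suggested in the comments could look roughly like the sketch below. It is a hypothetical helper, not an existing framework API: only the method name comes from the comment, while the class name and the choice of a plain Dictionary as the target are assumptions. It enumerates the source directly (no snapshot) and relies on Dictionary.TryAdd, so a key emitted twice is skipped the second time instead of throwing (first wins).

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;

public static class ConcurrentDictionaryExtensions
{
    // Hypothetical helper: copies the entries into a plain Dictionary by
    // enumerating the source directly, without taking a snapshot.
    // Assumption: the copy uses the default equality comparer for TKey.
    public static Dictionary<TKey, TValue> ToSafeDictionary<TKey, TValue>(
        this ConcurrentDictionary<TKey, TValue> source)
    {
        var copy = new Dictionary<TKey, TValue>();
        foreach (var pair in source)
        {
            // TryAdd returns false instead of throwing when the key is
            // already present, so the first emitted value wins.
            copy.TryAdd(pair.Key, pair.Value);
        }
        return copy;
    }
}
```

Usage would be var copy = concurrent.ToSafeDictionary();. Replacing the TryAdd call with the indexer (copy[pair.Key] = pair.Value) would give the last-wins behavior mentioned in the same comment.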