Why are there no concurrent collections in C#?
Asked Answered
R

3

16

I am trying to get an overview of the thread safety theory behind the collections in C#.

Why are there no concurrent collections as there are in Java? (java docs). Some collections appear thread safe but it is not clear to me what the position is for example with regard to:

  • compound operations,
  • safety of using iterators,
  • write operations

I do not want to reinvent the wheel! (I am not a multi-threading guru and am definitely not underestimating how hard this would be anyway).

I hope the community can help.

Ratify answered 22/12, 2009 at 13:57 Comment(1)
Great response - I will leave this 'unanswered' for a short while longer to keep it on the radar. If anyone has any further links to articles on either pre or post .Net 4.0 state of play on this subject then please include. Thank you everyone.Ratify
M
29

.NET has had relatively "low level" concurrency support until now - but .NET 4.0 introduces the System.Collections.Concurrent namespace which contains various collections which are safe and useful.

Andrew's answer is entirely correct in terms of how to deal with collections before .NET 4.0 of course - and for most uses I'd just lock appropriately when accessing a "normal" shared collection. The concurrent collections, however, make it easy to use a producer/consumer queue, etc.

Melicent answered 22/12, 2009 at 14:7 Comment(1)
+1 thanks Jon. I guess that way .net does not lull people into a false sense of securty. I will check out the .Net 4.0 goodies. – AndrewRatify
L
20

C# offers several ways to work with collections across multiple threads. For a good write-up of these techniques I would recommend that you start with Collections and Synchronization (Thread Safety):

By default, Collections classes are generally not thread safe. Multiple readers can read the collection with confidence; however, any modification to the collection produces undefined results for all threads that access the collection, including the reader threads.

Collections classes can be made thread safe using any of the following methods:

  • Create a thread-safe wrapper using the Synchronized method, and access the collection exclusively through that wrapper.
  • If the class does not have a Synchronized method, derive from the class and implement a Synchronized method using the SyncRoot property.
  • Use a locking mechanism, such as the lock statement in C# (SyncLock in Visual Basic), on the SyncRoot property when accessing the collection.
Lumpish answered 22/12, 2009 at 13:59 Comment(3)
+1 that was fast! - that looks a little like the Collections.synchronizedList from Java. I will check out that link. ThanksRatify
And see also here: blogs.msdn.com/ericlippert/archive/2008/01/21/… where Eric Lippert discusses immutable collections in some depth, specifically with regard to concurrent access.Gaffney
Thanks Jeremy had not come accross this blog (now subscribed) there is some excellent material under the immutability tag.Ratify
T
6

As Jon Skeet mentioned, there are now "thread safe" collections in the System.Collections.Concurrent namespace in .NET 4.

One of the reason that no concurrent collections exist (at least my guess) in prior .NET Framework versions is that it is very hard to guarantee thread safety, even with a concurrent collection.

(This is not entirely true as some collections offer a Synchronized method to return a thread safe collection from a non-thread safe collection so there are some thread safe collections...)

For example assume one has a thread safe Dictionary - if one only want to to an insert if the Key does not exist one would first query the collection to see if the Key exists, then one would do an insert if the key does not exist. These two operation are not thread safe though, between the query of ContainsKey and the Add operation another thread could have done an insert of that key so there is a race condition.

Inother words the operations of the collection are thread safe - but the usage of it is not necessarily. In this case one would need to transition back to traditional locking techniques (mutex/monitor/semaphore...) to achieve thread safety so the concurrent collection has bought you nothing in terms of multi-threaded safety (but is probably worse for performance).

Twoedged answered 22/12, 2009 at 14:23 Comment(8)
I was thinking along the same lines. The Put-if-absent issue where you could end up with 2 objects in your collection where there was none before. Thanks +1.Ratify
@saret: How hard is it really, though, to provide the functions that are needed for true thread-safe usage? I would think if the dictionary didn't need to hold "Nothing" values, everything could be done with an enumerator, a Long change-count property, and a method that worked analogous to Threading.Interlocked.CompareExchange. Adding "Nothing" values creates a little more complexity, but not too much.Suburbia
@supercat, a possibly better way is to offer methods on the collection itself that do work within a lock (such as execute action/func within lock), or offer functionality such as InsertIfNotExistsTwoedged
@saret: Allowing exec within lock creates the possibility of deadlock if the supplied routine hangs. If "nothing" is used to mean "item does not/should not exist", operations like AddIfNotExists could be synthesized from the operations I listed. That's not to say having more primitives wouldn't be helpful, but I would think a collection with the listed functions could, with wrappers, do essentially everything one could want with a thread-safe collection provided contention wasn't too high (CompareExchange-based operations can starve).Suburbia
@supercat- if the supplied routine hangs you've got bigger problems on your hand - however as long as the lock used is consistent and supports recursive acquisition (in the event further lock methods are called) you wouldn't deadlock yourself (assuming the supplied func works). I agree that your route might be a good way to go, the real problem though is that framework designers like those on the .Net BCL team need to build general case code that have to support requirements that might not be suited to this route - a "dictionary didn't need to hold "Nothing" values" is not a safe assumptionTwoedged
@saret: If the dictionary needs to support "nothing" values, there are other ways of handling that, but the syntax is a little ickier (e.g. the CompareAndSwap-ish method can take a byref parameter to say whether the element existed, and by-value parameters to indicate whether it should have or should not exist). As for the routine hanging, if one thread tries to do something with CollectionA while enumerating collection B, and another thread tries to do something with CollectionB while enumerating Collection A, deadlock can arise.Suburbia
@Suburbia - thats the sort of problem that a concurrent collection can't necessarily fix though. Having to deal with two different datastructures (such as collections) like that would generally dictate that the user needs to do some higher level locking. The order of obtaining the higher level locks matters though and done inconsistently can lead to deadlocksTwoedged
@saret: Such problems can often be solved using a method which will take a snapshot of the collection and enumerate that. To minimize the overhead of cloning a collection at every enumeration, it may be possible to use copy-on-write, provided that the enumerator will be correctly disposed. To be sure, higher-level locking will be needed if it's necessary to keep multiple collections synchronized. My point was that if a collection is locked while outside code runs, the precise details of that locking become a concern for anyone using the collection.Suburbia

© 2022 - 2024 — McMap. All rights reserved.