I find myself very intrigued by the existence of a ConcurrentBag<T>
class in the upcoming .NET 4.0 framework:
Bags are useful for storing objects when ordering doesn't matter, and unlike sets, bags support duplicates.
My question is: how might this idea be implemented? Most collections I'm familiar with essentially amount to (under the hood) some form of array, in which order may not "matter," but there is an order (which is why, even though it doesn't need to, enumeration will pretty much always go through an unchanged collection, be it List
, Queue
, Stack
, etc. in the same sequence).
If I had to guess, I might suggest that internally it could be a Dictionary<T, LinkedList<T>>
; but that actually seems quite dubious considering it wouldn't make sense to use just any type T
as a key.
What I'm expecting/hoping is that this is actually an established object type that has already been "figured out" somewhere, and that somebody who knows of this established type can tell me about it. It's just so unusual to me--one of those concepts that's easy to understand in real life, but is difficult to translate into a usable class as a developer--which is why I'm curious as to the possibilities.
EDIT:
Some responders have suggested that a Bag
could be a form of a hashtable internally. This was my initial thought as well, but I foresaw two problems with this idea:
- A hashtable is not all that useful when you don't have a suitable hashcode function for the type in question.
- Simply tracking an object's "count" in a collection is not the same as storing the object.
As Meta-Knight suggested, perhaps an example would make this more clear:
public class ExpensiveObject() {
private ExpensiveObject() {
// very intense operations happening in here
}
public ExpensiveObject CreateExpensiveObject() {
return new ExpensiveObject();
}
}
static void Main() {
var expensiveObjects = new ConcurrentBag<ExpensiveObject>();
for (int i = 0; i < 5; i++) {
expensiveObjects.Add(ExpensiveObject.CreateExpensiveObject());
}
// after this point in the code, I want to believe I have 5 new
// expensive objects in my collection
while (expensiveObjects.Count > 0) {
ExpensiveObject expObj = null;
bool objectTaken = expensiveObjects.TryTake(out expObj);
if (objectTaken) {
// here I THINK I am queueing a particular operation to be
// executed on 5 separate threads for 5 separate objects,
// but if ConcurrentBag is a hashtable then I've just received
// the object 5 times and so I am working on the same object
// from 5 threads at the same time!
ThreadPool.QueueUserWorkItem(DoWorkOnExpensiveObject, expObj);
} else {
break;
}
}
}
static void DoWorkOnExpensiveObject(object obj) {
ExpensiveObject expObj = obj as ExpensiveObject;
if (expObj != null) {
// some work to be done
}
}
Bag
could be implemented as a HashTable with objects as keys and values set to the number of occurrences of the associated key. If this were the case, then "adding" an object would be the same as incrementing this value (the # of occurrences) by one, and "removing" an object would return the object and simply decrement this value. You are right that CreateExpensiveObject would guarantee that an object is created, but not that it is added to theBag
, if the implementation were a HashTable. – Modillion