How to create no-duplicates ConcurrentQueue?
Asked Answered
Q

3

16

I need a concurrent collection that doesn't allow duplicates (to use in BlockingCollection as Producer/Consumer). I don't need strict order of elements. From another hand i want to minimize the maximum time of element "live" in collection. I.e. collection mustn't be LIFO, ideally it should be FIFO.

Well I would say that I need ConcurrentQueue with no duplicates allowed, but ConcurrentBag with no duplicates also might work.

Why C# doesn't contain anything like that and probably someone already created it?

This question is result of my previous question What type of IProducerConsumerCollection<T> to use for my task?

Quent answered 30/4, 2011 at 23:12 Comment(4)
C# is just a language, you'd have to look for a library to get functionality like that. Like the .NET framework, the home of ConcurrentQueue/Bag. Nobody ever considered writing code like that, it is doomed to fail. Because you cannot predict exactly when the producer produces and the consumer consumes. Deciding when to eliminate duplicates is roughly analogous to making decisions based on the return value of Random.Next(). Whatever reason you have to implement something like that: it's doomed to fail.Umbra
I don't understand why it's doomed to fail. I can simulate Set using ConcurrentDictionary (i will use only key, value will be always null). I don't want to eliminate duplicates. There are should be no duplicates.Quent
moreover "C# in Nutshell" states that it's possible to write concurrent stack: "If you wrote your own concurrent collection that prohibited duplicates, however, you’d make TryAdd return false if the element already existed (an example would be if you wrote a concurrent set)"Quent
Related: Concurrent collections and unique elements.Mockingbird
C
3

There are no built-in .Net libraries that combine this set of rules for a collection. You have three options:

  1. Write your own collection class
  2. Use two collections: Write a custom class that uses one ConcurrentQueue and any Set-based collection that auto-checks for duplicates; have add to Set run and if successful, add to ConcurrentQueue; each add/remove would add to both collections when successful
  3. Use ConcurrentQueue but iterate through the entire list checking for a duplicate

The last two aren't very efficient (one with memory, the other with CPU, I/O, locking) and are messier because of the need for explicit locking, but would accomplish the task. They will be quicker to implement, but if the trade-offs don't meet your requirements you'll have to go with option #1.

Creatural answered 3/10, 2011 at 16:43 Comment(0)
P
1

Well if you strictly want to have no duplicates, you need 'Sets'. For instance NHibernate uses Iesi.Collections to provide such functionality. Taking Iesi, you can build your own functionality around the provided 'Set' classes (DictionarySet, HashSet, SortedSet). Source: http://www.codeproject.com/KB/recipes/sets.aspx

Petitionary answered 1/5, 2011 at 0:18 Comment(2)
.Net does have a HashSet<T>. As far as I encountered, NHibernate uses Iesi.Collections for ISet<T> (which has no good alternative in .Net libraries).Brominate
Yes, new with .net 3.5 and up.Petitionary
B
-2

You could simply use a ConcurrentQueue and before calling Enqueue check if the data is in the queue by calling the ConcurrentQueue.Contains<> method. I'm guessing the Contains<> extension method is fairly well optimized.

EDIT: As others have noted, for this to work it would you'd have to use a locking mechanism such as a mutex etc around the Contains<> method and the Enqueue method like this:

get mutex
if not Contains<>
{
    Enqueue
}
release mutex
Buonarroti answered 8/2, 2012 at 19:14 Comment(5)
I think that would not work because a race condition could allow another thread to Enqueue an "equal" object between the call to Contains and Enqueue. We'd really need a ConcurrentSet<> to do this right.Bowhead
...in which case you wouldn't need a concurrent version of the queue object. I think the original poster was looking for a ConcurrentSet that atomically prevented duplicates.Bowhead
Yes that is what the poster needs. But to get there he/she will have to write a class or cobble together a solution using provided classes and some thread safe logic as C# doesn't provide a ConcurrentSet at this point.Buonarroti
Your solution can still create duplicates in the queue.Sumba
My answer is edited to include the need for a lock mechanism between Contains<> and Enqueue.Buonarroti

© 2022 - 2024 — McMap. All rights reserved.