Check if element is already in a Queue [duplicate]
Asked Answered
C

1

21

I am using the Queue library in python and I want to keep queue entries unique.

As such I want to check 'something' isn't already in the queue before adding to it, essentially a function like this which works on the Queue library:

queue = Queue.Queue()
def in_queue(u):
  return u in queue

Or, should I be using a different library/method to achieve this?

Corned answered 12/5, 2013 at 10:29 Comment(1)
I'm not sure whether it's a dup. The question in that case is a bit more complicated, because he's trying to keep track of to-be-processed values and already-processed values together. However, the accepted answer there accidentally answers this question instead, before going on to answer the more complicated one.Melioration
M
60

The standard Queue class can't be iterated or otherwise checked.

However, it was built to be extended.

First, if you look at the source (which is linked from the docs), there are hook methods _init, _qsize, _put and _get that you can override to change the implementation. Look at the subclasses below the main class, and you can see how they do this.

So, one easy thing to do is replace the deque implementation with a set:

class SetQueue(Queue.Queue):
    def _init(self, maxsize):
        self.queue = set()
    def _put(self, item):
        self.queue.add(item)
    def _get(self):
        return self.queue.pop()

(I didn't implement _qsize because the default return len(self.queue) is fine.)

Now you don't have to check, just add it to the queue, and it'll be ignored if it's already there.

Of course this has the down side that the queue is no longer ordered. But you can solve that by using an OrderedSet (similar to the OrderedDict in collections). There's a recipe that's linked from the collections docs. Once you have that:

class OrderedSetQueue(Queue.Queue):
    def _init(self, maxsize):
        self.queue = OrderedSet()
    def _put(self, item):
        self.queue.add(item)
    def _get(self):
        return self.queue.pop()

If you actually want to be able to check values within a queue, you can add a method for that:

class CheckableQueue(Queue.Queue): # or OrderedSetQueue
    def __contains__(self, item):
        with self.mutex:
            return item in self.queue

However, this invites race conditions in your code. For example, if you do this:

if x not in my_queue:
    my_queue.put(x)

It's always possible that x was not in the queue when you checked, but was in the queue when you called put. In fact, the only use of this function which wouldn't be unsafe is some kind of optimistic checking (if the value isn't in the queue now, do some expensive work, then try to add it, accepting that the work is wasted if the value has been added in the meantime)—the same reason Queue.full() exists.

The only way to make this safe is to put both operations together under a lock:

with my_queue.mutex:
    if x not in my_queue:
        my_queue.put(x)

But at this point, you're defeating the purpose of using Queue in the first place. (You're also depending on the fact that Queue.mutex is a recursively-enterable mutex.) Better to add the operation as a method of your Queue subclass.

And if you always want to check first and add only if it's not there, OrderedSetQueue is a better way to do that.

Melioration answered 12/5, 2013 at 10:44 Comment(6)
Can you pls have a look at #30123991?Glycerinate
And #30132668 :)Glycerinate
Would extending Queue like this (e.g. OrderedSetQueue) still be thread safe? If you override the methods to use a set internally, do you bypass the built-in synchronization?Evaluate
Use self.queue.pop(last=False) if you want a FIFO queue with : orderedset.readthedocs.ioJusticeship
I just wanted to say this is an awesome answer (just used your advice to solve the same problem, works a treat)- thanks!Banker
@Evaluate Yes, it’s still threadsafe. The public methods of Queue do the synchronization; the hook methods implemented by subclasses are guaranteed to only be called under the lock. Look at the source that’s linked to in the answer.Melioration

© 2022 - 2024 — McMap. All rights reserved.