Test if python Counter is contained in another Counter
Asked Answered
E

5

13

How to test if a python Counter is contained in another one using the following definition:

A Counter a is contained in a Counter b if, and only if, for every key k in a, the value a[k] is less or equal to the value b[k]. The Counter({'a': 1, 'b': 1}) is contained in Counter({'a': 2, 'b': 2}) but it is not contained in Counter({'a': 2, 'c': 2}).

I think it is a poor design choice but in python 2.x the comparison operators (<, <=, >=, >) do not use the previous definition, so the third Counter is considered greater-than the first. In python 3.x, instead, Counter is an unorderable type.

Eurythmics answered 11/4, 2015 at 8:12 Comment(4)
You should properly define "contained" to avoid confusion.Coiffure
Counter doesn't actually support comparison operators.Rhymester
@JimDennis: We're supposed to be considering the Counter as a multiset, and that attempt doesn't take into account the multiplicity of elements.Rhymester
@JimDennis: No, I don't want to check if all the keys are present, I want to check also the multiplicity as user2347112 said: A Counter a is contained in a Counter b if, and only if, for every key k in a, the value a[k] is less or equal to the value b[k].Eurythmics
E
16

Update 2023: Counter supports rich comparison operators as of python 3.10, so this works:

container <= contained

Historical answer for python < 3.10:

The best I came up with is to convert the definition i gave in code:

def contains(container, contained):
    return all(container[x] >= contained[x] for x in contained)

But if feels strange that python don't have an out-of-the-box solution and I have to write a function for every operator (or make a generic one and pass the comparison function).

Eurythmics answered 11/4, 2015 at 8:12 Comment(1)
Agreed, given that Counter gives a couple of multiset-like operations, it's surprising it doesn't offer a subset of equivalent. I'd expect to just be able to do counter_a <= counter_b - Either way, this seems like the best solution possible to me.Winer
F
15

While Counter instances are not comparable with the < and > operators, you can find their difference with the - operator. The difference never returns negative counts, so if A - B is empty, you know that B contains all the items in A.

def contains(larger, smaller):
    return not smaller - larger
Fishtail answered 11/4, 2015 at 9:7 Comment(1)
I considered giving this answer, but I decided against it because this doesn't short-circuit and wastes a bunch of time and space building the Counter. Also, it has to iterate through both Counters due to how - handles negative counts in the second argument. The solution based on all is much more efficient.Rhymester
C
2

Another, fairly succinct, way to express this:

"Counter A is a subset of Counter B" is equivalent to (A & B) == A.

That's because the intersection (&) of two Counters has the counts of elements common to both. That'll be the same as A if every element of A (counting multiplicity) is also in B; otherwise it will be smaller.

Performance-wise, this seems to be about the same as the not A - B method proposed by Blckknght. Checking each key as in the answer of enrico.bacis is considerably faster.

As a variation, you can also check that the union is equal to the larger Counter (so nothing was added): (A | B) == B. This is noticeably slower for some largish multisets I tested (1,000,000 elements).

Cassaba answered 14/5, 2021 at 3:4 Comment(2)
collections.Counter("abc") & collections.Counter("b") == collections.counter("abc") returns False, so this doesn't seem to workStrafford
That's because "abc" is not a subset of "b".Cassaba
I
1

For all the keys in smaller Counter make sure that no value is greater than its counterpart in the bigger Counter:

def containment(big, small):
    return not any(v > big[k] for (k, v) in small.iteritems())

>>> containment(Counter({'a': 2, 'b': 2}), Counter({'a': 1, 'b': 1}))
True
>>> containment(Counter({'a': 2, 'c': 2, 'b': 3}), Counter({'a': 2, 'b': 2}))
True
>>> print containment(Counter({'a': 2, 'b': 2}), Counter({'a': 2, 'b': 2, 'c':1}))
False
>>> print containment(Counter({'a': 2, 'c': 2}), Counter({'a': 1, 'b': 1})
False
Iguanodon answered 12/4, 2015 at 7:46 Comment(7)
Which version of python are you using? In Python 2 it returns True for both the invocations, while in python 3 it throws an error because Counter is unorderable. Moreover you don't really need a function for that.Eurythmics
@Eurythmics I copied it wrong before. It works fine on Py 2.7 I'm on. Thanks for catching that!Iguanodon
As I said in the question, the comparison operator only check the total number of elements in the counter. Try with my example: Counter({'a': 2, 'b': 2}) >= Counter({'a': 1, 'b': 1}) gives True as expected, but Counter({'a': 2, 'c': 2}) >= Counter({'a': 1, 'b': 1}) also gives True even if the right hand side is not contained in the left hand side.Eurythmics
@Eurythmics I've added another check get around the case you mentioned.Iguanodon
Now it works but keep in mind that not any( predicate ) is exactly the same as all( not predicate ), so the answer is semantically the same as my answer above ;)Eurythmics
@Eurythmics True :DIguanodon
Another couple of suggestions, you are using .items() to get also the value v while you should prefer .iteritems() which does not create an intermediate list but a generator, since you are not storing the values. You are also not using v. To be more efficient, you should use not any(v > big[k] for (k, v) in small.iteritems()) or not any(small[k] > big[k] for k in small).Eurythmics
B
1

While of historical interest, all these answers are obsolete. Counter class objects are in fact comparable enter image description here

Birdella answered 20/2, 2023 at 23:52 Comment(1)
Thanks for the answer! I would write the actual code into your answer though instead of pasting a screenshot, otherwise it forces people to have to hand-type everything you've done here (versus just copy and pasting your code).Celiotomy

© 2022 - 2024 — McMap. All rights reserved.