Is there a more elegant way to express ((x == a and y == b) or (x == b and y == a))?
Asked Answered
M

11

115

I'm trying to evaluate ((x == a and y == b) or (x == b and y == a)) in Python, but it seems a bit verbose. Is there a more elegant way?

Manicure answered 17/10, 2019 at 15:5 Comment(9)
Depends on what type of objects x,y, a,b are: are they ints/floats/strings, arbitrary objects, or what? If they were builtin types and it was possible to keep both x,y and a,b in sorted order, then you could avoid the second branch. Note that creating a set will cause each of the four elements x,y, a,b to be hashed, which might or might not be trivial or have a performance implication depending entirely on what type of objects they are.Distill
Keep in mind the people you work with and the kind of project you're developing. I use a little Python here and there. If someone coded one of the answers here I would totally have to google what's going on. Your conditional is readable pretty much no matter what.Mccown
I wouldn't bother with any of the replacements. They're more terse, but not as clear (IMHO) and I think will all be slower.Embryogeny
This is a classic XYAB problem.Sublapsarianism
@CarlWitthoft I'm in. if (x+y == a+b) and x in {a,b}: ...Bennie
And in general if you're dealing with order-independent collections of objects, use sets/dicts/etc. This really is an XY problem, we'd need to see the larger codebase it's drawn from. I agree that ((x == a and y == b) or (x == b and y == a)) may look yukky, but 1) its intent is crystal clear and intelligible to all non-Python programmers, not cryptic 2) interpreters/compilers will always handle it well and it can essentially never result in non-performant code, unlike the alternatives. So, 'more elegant' can have serious downsides too.Distill
@Bennie Just imagine if the operator precedence put "==" ahead of "+" in your submission!Kisser
Try using DeMorgans rule: not((not(x == a) or not(y == b)) and (not(x == b) or not (y == a))Combine
@JoelKeene You should always consider DeMorgans Law in case it improves readability. But I don't think that's happening here.Hettiehetty
M
155

If the elements are hashable, you could use sets:

{a, b} == {y, x}
Marauding answered 17/10, 2019 at 15:8 Comment(15)
But that will also cover both equal to a or both equal to b. Perhaps the OP wants this, perhaps not.Marcellus
@Marcellus if both are equal to a, and a != b then the above statement is False, is True only if a == b, this implies (x == a and y == b).Marauding
@Marcellus no it doesn't, because there are exactly two items in the right hand set. If both are a, there's no b, and if both are b, there's no a, and in either case the sets can't be equal.Gulp
Ah, ok. My mistake, sorry.Marcellus
On the other hand, if we had three elements on each side and we needed to test whether they could be matched up one to one, then sets wouldn't work. {1, 1, 2} == {1, 2, 2}. At that point, you need sorted or Counter.Saltire
It's not just about whether there are two or more than two elements on each side, but about whether all the elements are distinct. If we compare {a, b, c} to {1, 2, 3} it still works;Profane
I find this tricky to read (don't read the "{}" as "()"). Enough to comment it - and then the purpose is lost.Loraine
I know it's python but creating two new sets just to compare the values seems like an overkill to me... Just define a function which does it and call when needed.Gurevich
@Gurevich A function would be even bigger overkill than 2 simple set constructions. And less readable with 4 arguments.Ambrosine
This is such a good solution, but it might be clearer for some people as set(x,y) == set(a,b)Bennie
@Andy: Except that the set constructor doesn't actually work like that, so you get a TypeError. set([x, y]) == set([a, b]) works, but compared to {x, y} == {a, b}, it runs slower, takes more typing, and involves more syntactic nesting (hurting clarity, particularly when the element expressions are more complex than x and y).Saltire
@marxin: Your intuition is probably based on statically or JIT-compiled language implementations, where functions can be easily inlined. In CPython, introducing a function almost doubles the cost of the test relative to doing it inline, and is sometimes more expensive than building sets.Saltire
@Ambrosine I've tested it and function is faster, not much but still. But regardless of that I would go for function, as its more explicit. Comparing sets is a nice trick but fundamentally feels wrong to do that, not only for the reason I mentioned before. Using sets compare the hashes, not the objects, so its not even doing the same thing.Gurevich
@Gurevich It's about readability for me. You'd need a function with a short readable name that takes 4 arguments, and makes clear what it does without having to look it up to get even close to the readability of a simple set comparison like this. Set comparison also fits really well conceptually. The OP wants to compare 2 groups of 2 numbers to check if they are the same. That's exactly what python sets do in comparisons. I'm fine with agreeing to disagree, though. I can see where you're coming from.Ambrosine
@marxin: Using sets does not just compare hashes. Hashes are only used as an optimization; set operations check for actual object equality after finding a hash match.Saltire
S
62

I think the best you could get is to package them into tuples:

if (a, b) == (x, y) or (a, b) == (y, x)

Or, maybe wrap that in a set lookup

if (a, b) in {(x, y), (y, x)}

Just since it was mentioned by a couple comments, I did some timings, and tuples and sets appear to perform identically here when the lookup fails:

from timeit import timeit

x = 1
y = 2
a = 3
b = 4

>>> timeit(lambda: (a, b) in {(x, y), (y, x)}, number=int(5e7))
32.8357742

>>> timeit(lambda: (a, b) in ((x, y), (y, x)), number=int(5e7))
31.6169182

Although tuples are actually faster when the lookup succeeds:

x = 1
y = 2
a = 1
b = 2

>>> timeit(lambda: (a, b) in {(x, y), (y, x)}, number=int(5e7))
35.6219458

>>> timeit(lambda: (a, b) in ((x, y), (y, x)), number=int(5e7))
27.753138700000008

I chose to use a set because I'm doing a membership lookup, and conceptually a set is a better fit for that use-case than a tuple. If you measured a significant different between the two structures in a particular use case, go with the faster one. I don't think performance is a factor here though.

Shoshonean answered 17/10, 2019 at 15:8 Comment(13)
The tuple method looks very clean. I'd be concerned about the performance impact of using a set. You could do if (a, b) in ((x, y), (y, x)), though?Deeann
@Deeann If you're concerned about performance impact, Python is not the language for you. :-DSpann
I really like the first method, two tuple comparisons. It's easy to parse (one clause at a time) and each clause is very simple. And top of which, it should be relatively efficient.She
Is there any reason to prefer the set solution in the answer to the tuple solution from @Brilliand?Karakul
A set requires that objects be hashable, so a tuple would be a more general solution with fewer dependencies. The performance, while probably not generally important, may also look very different when dealing with large objects with expensive hashing and equality checks.Crossways
@Dukeling True, which is why I noted at the bottom to address it in a case-by-case. I like sets, but yes, there may be cases where they aren't appropriate to use.Shoshonean
A single timeit call isn't a great source of 'proof'. Even more so when your proof that tuples are faster is on the best case for tuples and you happily ignore the worst case.Septuplet
@Septuplet It was repeatable, and was 50 million tests over 30 seconds. I was also arguing against using tuples as you can see above, so it's not like I'm inflating my own position by casting them in a good light. I can add another test for tuples if you specify the worst case scenario that you're referring to. I also never used the word "proof", because obviously a time test is never absolute proof of anything.Shoshonean
@Shoshonean Yes, 50million can suffer from CPU issues, say Spotify is loading a new song or some other background process. I have seen large fluctuations from single timeit calls. One isn't a reliable metric. I never said you said the word proof, ' != ".Septuplet
@Septuplet I'm not sure what point you're trying to make. Someone said there may be a performance issue and I did a test which showed that there seems to be a minimal difference (and again, I independently tested it multiple times, which is about as good as I can do). I then explicitly said at the bottom to do your own testing before coming to a conclusion if you're worried.Shoshonean
@Shoshonean I don't see how I can be anymore clear - I've said the same thing multiple times in multiple ways. Have a good day.Septuplet
@Spann Just because Python isn't relatively fast versus other languages doesn't mean there's no point in bothering to optimize the code you write. There's still sense in choosing the best solution given the speed constraints of the language. Besides, there's work on removing the GIL and speeding it up considerably in recent years.Retinoscope
@Retinoscope It means that useful optimization isn't usually based on microoptimizations like this. Futzing around with container types is virtually useless compared to amortizing overhead with Numpy or rewriting hot code as a native library.Spann
F
32

Tuples make it slightly more readable:

(x, y) == (a, b) or (x, y) == (b, a)

This gives a clue: we're checking whether the sequence x, y is equal to the sequence a, b but ignoring ordering. That's just set equality!

{x, y} == {a, b}
Fabrianne answered 17/10, 2019 at 15:9 Comment(2)
, creates a tuple, not a list. so (x, y) and (a, b) are tuples, same as x, y and a, b.Omnidirectional
I meant "list" in the sense of "ordered sequence of elements", not in the sense of the Python list type. Edited because indeed this was confusing.Fabrianne
K
28

If the items aren't hashable, but support ordering comparisons, you could try:

sorted((x, y)) == sorted((a, b))
Kelton answered 18/10, 2019 at 1:1 Comment(2)
Since this works with hashable items as well (right?), it's a more global solution.Kisser
@CarlWitthoft: No. There are types that are hashable but not sortable: complex, for example.Melpomene
F
28

The most elegant way, in my opinion, would be

(x, y) in ((a, b), (b, a))

This is a better way than using sets, i.e. {a, b} == {y, x}, as indicated in other answers because we don't need to think if the variables are hashable.

Foraminifer answered 18/10, 2019 at 19:27 Comment(2)
How does this differ from this earlier answer?Inglis
@Inglis It uses a tuple where the earlier answer uses a set. That earlier answer did consider this solution, but declined to list it as a recommendation.Deeann
S
25

If these are numbers, you can use (x+y)==(a+b) and (x*y)==(a*b).

If these are comparable items, you can use min(x,y)==min(a,b) and max(x,y)==max(a,b).

But ((x == a and y == b) or (x == b and y == a)) is clear, safe, and more general.

Schliemann answered 18/10, 2019 at 11:22 Comment(3)
Haha, symmetric polynomials ftw!Beriberi
This is the correct answer, in particular the last sentence. Keep it simple, this should not require multiple new iterables or anything like that. For those that want to use sets, take a look at the implementation of set objects and then imagine trying to run this in a tight loop...Cipango
@RazvanSocol OP didn't say what the data types are, and this answer does qualify the solutions that are type-dependent.Cipango
J
23

As a generalization to more than two variables we can use itertools.permutations. That is instead of

(x == a and y == b and z == c) or (x == a and y == c and z == b) or ...

we can write

(x, y, z) in itertools.permutations([a, b, c])

And of course the two variable version:

(x, y) in itertools.permutations([a, b])
Jongjongleur answered 18/10, 2019 at 20:45 Comment(2)
Great answer. It is worth pointing out here (for those that haven't done much with generators before) that this is very memory efficient, as only one permutation is created at a time, and the "in" check would stop and return True immediately after a match is found.Jaunitajaunt
It's also worth pointing out that the complexity of this method is O(N*N!); For 11 variables, this can take over a second to finish. (I posted a faster method, but it still takes O(N^2), and starts taking over a second on 10k variables; So it seems this can either be done fast or generally (wrt. hashability/orderability), but not both :P)Juna
P
15

You can use tuples to represent your data and then check for set inclusion, like:

def test_fun(x, y):
    test_set = {(a, b), (b, a)}

    return (x, y) in test_set
Pohai answered 17/10, 2019 at 15:10 Comment(2)
Written as a one-liner and with a list (for non-hashable items), I would consider this to be the best answer. (although I'm a complete Python newbie).Loraine
@Édouard Now there's another answer that's essentially that (just with a tuple instead of a list, which is more efficient anyway).Deeann
U
11

You already got the most readable solution. There are other ways to express this, perhaps with less characters, but they are less straight-forward to read.

Depending on what the values actually represent your best bet is to wrap the check in a function with a speaking name. Alternatively or in addition, you can model the objects x,y and a,b each in dedicated higher class objects that you then can compare with the logic of the comparison in a class equality check method or a dedicated custom function.

Uncinate answered 20/10, 2019 at 12:7 Comment(0)
J
5

It seems the OP was only concerned with the case of two variables, but since StackOverflow is also for those who search for the same question later, I'll try to tackle the generic case here in some detail; One previous answer already contains a generic answer using itertools.permutations(), but that method leads to O(N*N!) comparisons, since there are N! permutations with N items each. (This was the main motivation for this answer)

First, let's summarize how some of the methods in previous answers apply to the generic case, as motivation for the method presented here. I'll be using A to refer to (x, y) and B to refer to (a, b), which can be tuples of arbitrary (but equal) length.

set(A) == set(B) is fast, but only works if the values are hashable and you can guarantee that one of the tuples doesn't contain any duplicate values. (Eg. {1, 1, 2} == {1, 2, 2}, as pointed out by @user2357112 under @Daniel Mesejo's answer)

The previous method can be extended to work with duplicate values by using dictionaries with counts, instead of sets: (This still has the limitation that all values need to be hashable, so e.g. mutable values like list won't work)

def counts(items):
    d = {}
    for item in items:
        d[item] = d.get(item, 0) + 1
    return d

counts(A) == counts(B)

sorted(A) == sorted(B) doesn't require hashable values, but is slightly slower, and requires orderable values instead. (So e.g. complex won't work)

A in itertools.permutations(B) doesn't require hashable or orderable values, but like already mentioned, it has O(N*N!) complexity, so even with just 11 items, it can take over a second to finish.

So, is there a way to be as general, but do it considerably faster? Why yes, by "manually" checking that there's the same amount of each item: (The complexity of this one is O(N^2), so this isn't good for large inputs either; On my machine, 10k items can take over a second - but with smaller inputs, like 10 items, this is just as fast as the others)

def unordered_eq(A, B):
    for a in A:
        if A.count(a) != B.count(a):
            return False
    return True

To get the best performance, one might want to try the dict-based method first, fall back to the sorted-based method if that fails due to unhashable values, and finally fall back to the count-based method if that too fails due to unorderable values.

Juna answered 21/10, 2019 at 6:48 Comment(1)
counts is equivalent to collections.Counter, which is implemented in C in Python 3 so should be faster, hopefully.Gens
R
0

Your original code:

((x == a and y == b) or (x == b and y == a))

is already the optimal solution for most cases. It's readable and easy to understand, and doesn't involve allocating tuples or sets, or performing sorting. These approaches are arguably less clear since they introduce a layer of indirection on top of the raw variables, and are certainly less performant since they allocate and garbage collect heap memory.

Now, it's important not to prematurely optimize performance, but that doesn't mean fancier solutions are necessarily optimized for readability, either. In this case, I would keep it simple and boring.

If you plan to introduce another variable, then the set solution may begin to make sense, although you're likely going to want to keep these variables in some sort of long-term data structure that makes sense for your use case. In general, it's an antipattern to pack loose variables into data structures just for a single condition test.

Retinoscope answered 31/1 at 18:6 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.