Why are python sets "sorted" in ascending order? [duplicate]
Asked Answered
L

2

5

Let's run the following code:

st = {3, 1, 2}
st
>>> {1, 2, 3}
st.pop()
>>> 1
st.pop()
>>> 2
st.pop()
>>> 3

Although sets are said to be unordered, this set behaves as if it was sorted in ascending order. The method pop(), that should return an 'arbitrary element', according to the documentation, returns elements in ascending order as well. What is the reason for this?

Loretaloretta answered 21/1, 2022 at 17:7 Comment(5)
afaik it's completely random and according to the hash of the objectAlpert
"unordered" means it cannot be relied upon to be ordered, not that is can be relied on to not be ordered. Same idea for "arbitrary element".Pompey
.pop removes elements in the order they exist in the set. The documentation for set.pop says "arbitrary" because the underlying order of a set is undefined. This is because the mathematical (and Python) definition of a set is only concerned with membership in the set, not order within the set. The ordering within a set is based on hashcode, which is unpredictable.Pacifistic
Items in a set have to de stored (and come out) in some order, it's just that you cannot rely on the order.Kinship
See the second linked duplicate's accepted answer It should be illuminating. But suffice it to say, when people say that "sets are unordered", they mean "you can't rely on any particular order"Flameproof
A
6

The order correlates to the hash of the object, size of the set, binary representation of the number, insertion order and other implementation parameters. It is completely arbitrary and shouldn't be relied upon:

>>> st = {3, 1, 2,4,9,124124,124124124124,123,12,41,15,}
>>> st
{1, 2, 3, 4, 9, 41, 12, 15, 124124, 123, 124124124124}
>>> st.pop()
1
>>> st.pop()
2
>>> st.pop()
3
>>> st.pop()
4
>>> st.pop()
9
>>> st.pop()
41
>>> st.pop()
12
>>> {1, 41, 12}
{1, 12, 41}
>>> {1, 9, 41, 12}
{1, 12, 9, 41}  # Looks like 9 wants to go after 12.
>>> hash(9)
9
>>> hash(12)
12
>>> hash(41)
41
>>> {1, 2, 3, 4, 9, 41, 12}
{1, 2, 3, 4, 9, 12, 41}  # 12 before 41
>>> {1, 2, 3, 4, 9, 41, 12, 15}  # add 15 at the end
{1, 2, 3, 4, 9, 41, 12, 15}  # 12 after 41
Alpert answered 21/1, 2022 at 17:11 Comment(1)
@MichaelSzczesny I didn't mean actually random. I just meant it's completely implementation dependent.Alpert
P
4

You are never supposed to rely on a set's ordering. Once you put more than one thing into a set, expect to never be able to rely on a stable ordering between elements. That is the contract that Python needs you to adhere to.

The implementation details may show you behavior that may lead you to believe that you sometimes can predict the ordering of a set. For example, as you have discovered, if you put some low-order descending integers into a set and print the set, they will show up in ascending order. This is probably because the hash of an integer is equal to itself.

hash(1) == 1
hash(2) == 2

(You can make a Python program to confirm this).

An implementation detail of set is that internally it uses a hashing and bucket storage strategy -- the same as Python's dict does. An implementation detail of this is that items are traversed in an ordering based loosely on how the hash codes have been partitioned internally. Thus for some implementations of Python, e.g. CPython, you will sometimes see what looks like deterministic sorting order for integers, because the hash of an integer is so tightly bound to the integer itself (they are the same value), and this interacts with the implementation details of hash traversal.

The contract with set tells you to never rely on this behavior for future-proofing and cross-Python interpreter compatibility.

Another implementation detail is that .pop set will (for some implementations of Python) return the elements of the set in the order which you see the elements. So, if you print a set to see its contents, and then pop all the elements out of that set, they will pop in an order consistent with what you saw when you printed the set.

As an implementation detail, the ordering of a set is "stable" as long as the contents of the set have not changed. You're not supposed to rely on this. Popping is allowed to select an arbitrary element from the set, but not required to.

Pacifistic answered 21/1, 2022 at 17:22 Comment(2)
it has many more implementation details, the hash of small numbers is only a small factor as evident by my last example.Alpert
Yes, I saw that 12 came after 41. But numbers 0 through 9 will be grouped with each other, at least in the implementation of Python that I have. I don't know enough about the implementation of the hashing strategy to know why this happens, but again, we shouldn't rely on it.Pacifistic

© 2022 - 2024 — McMap. All rights reserved.