Are sets ordered like dicts in Python 3.6?

Due to changes in the dict implementation in Python 3.6, dicts now preserve insertion order by default. Do sets preserve order as well now?

I could not find any information about it, but since both data structures work very similarly under the hood, I thought it might be the case.

I know there is no guarantee that dicts are ordered in all cases, but they are most of the time. As stated in the Python docs:

The order-preserving aspect of this new implementation is considered an implementation detail and should not be relied upon

Hurley answered 9/8, 2017 at 4:52 Comment(2)
@byxor You should not depend on random order; sets are arbitrarily ordered but far from random due to the hashing. - Satinet
If you are interested in why sets are not insertion ordered, see "Why don't Python sets preserve insertion order?" - Cephalonia

No, sets are still unordered.

You can verify this by displaying a set that has a "well-defined hash order" [1], so that we don't accidentally pick a set that looks ordered but actually isn't:

>>> a_set = {3, 2, 1}
>>> a_set
{1, 2, 3}
>>> list(a_set)
[1, 2, 3]

If sets were insertion-ordered, you would expect {3, 2, 1} and [3, 2, 1] as the results of these examples.

Dicts, by contrast, are actually ordered (the same example, slightly modified):

>>> a_dict = {3: 3, 2: 2, 1: 1}
>>> a_dict
{3: 3, 2: 2, 1: 1}
>>> list(a_dict)
[3, 2, 1]
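
If what you actually need is set-like deduplication that keeps insertion order, the ordered dict shown above can stand in for a set. This is a minimal sketch, assuming Python 3.7+ (where dict ordering is part of the language) or CPython 3.6; the variable names are only illustrative:

```python
items = [3, 2, 1, 3, 2]

# dict keys are unique and keep insertion order, so this deduplicates
# while remembering the order in which each value first appeared.
ordered_unique = list(dict.fromkeys(items))
print(ordered_unique)  # [3, 2, 1]

# A set also deduplicates, but makes no promise about iteration order,
# so only compare it by content, never by position.
print(set(items) == {1, 2, 3})  # True
```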

[1] "well-defined hash order":

For integers that satisfy 0 <= integer < sys.hash_info.modulus, the hash is just the number itself. So if the set is ordered by hash (rather than by insertion time) and the hash values don't collide (which is why I used small numbers that differ by only one), the order should be deterministic, because the elements occupy neighboring slots inside the set:

  • either from smallest to highest,
  • or from a specific value to the highest and then from the smallest up to that specific value. This case happens when the next (in the sense of neighboring) free slot in the set is the first one.

As an example for the latter:

>>> a_set = {6, 7, 8, 9}
>>> a_set
{8, 9, 6, 7}
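
The wrap-around described in the footnote can be imitated in pure Python. This is only a rough model, assuming a hash table with 8 slots (the smallest table CPython uses for a set, as far as I know) and no collisions; `simulated_order` is a hypothetical helper written for this illustration, not part of any library:

```python
import sys


def simulated_order(values, table_size=8):
    """Order values by the slot index their hash would map to.

    Rough model only: assumes no collisions and a fixed table size.
    """
    return sorted(values, key=lambda v: hash(v) % table_size)


# Small non-negative ints hash to themselves (CPython behaviour).
print(all(hash(i) == i for i in range(100)))  # True
# The modulus itself is the first value that wraps around to 0.
print(hash(sys.hash_info.modulus))  # 0

print(simulated_order([6, 7, 8, 9]))  # [8, 9, 6, 7]
print(simulated_order([3, 2, 1]))     # [1, 2, 3]
```

Under these assumptions, 8 and 9 land in slots 0 and 1, ahead of 6 and 7, which matches the output shown above.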
Bashan answered 9/8, 2017 at 11:37 Comment(4)
Negative ints also hash to themselves (apart from -1), although I'm not sure what the exact lower boundary is. - Satinet
@Satinet Yes, but because -1 and -2 both hash to -2, there's a collision. :) - Bashan
Yes, and -1 behaves this way because it's an error code in C; I believe the boundary is (sys.maxsize // 4) - 1, at least this is what Martijn Pieters told me previously. - Satinet
That's good to know. But it makes sense if one wants to catch errors during hash. I also found the maximum value. It's sys.hash_info.modulus. :) - Bashan

Sets are not ordered in Python 3.6, not even as a CPython implementation detail. A simple example illustrates this:

>>> import string
>>> string.digits
'0123456789'
>>> set(string.digits)
{'7', '0', '2', '8', '6', '9', '1', '5', '4', '3'}

The Python 3 docs are clear on this:

A set is an unordered collection with no duplicate elements.
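
When a predictable order is needed, it has to be imposed explicitly rather than read off the set. A short sketch of that idea follows; note that string hashes are randomized per interpreter process unless PYTHONHASHSEED is fixed, so the raw iteration order of a set of strings like the one above can differ from run to run:

```python
import string

digits = set(string.digits)

# Sorting gives a deterministic order regardless of how the set iterates.
print("".join(sorted(digits)))  # 0123456789

# Equality between sets ignores order entirely.
print(digits == set(reversed(string.digits)))  # True
```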

Satinet answered 9/8, 2017 at 11:46 Comment(6)
But the docs on dict also say that "It is best to think of a dictionary as an unordered set of key: value pairs, with the requirement that the keys are unique (within one dictionary)." (source). That's just the language spec; the implementation could be ordered... - Bashan
@Bashan "best to think of" is the key phrasing, I think; there is no such caveat in the set docs. - Satinet
I thought that's just "fluff" (or included because PyPy had an ordered dictionary for a long time). But, like I said, that's just the language spec. It doesn't mean an implementation couldn't implement it in an ordered fashion (i.e. with buckets or similar). - Bashan
Comments are outdated; the word "unordered" has been removed from that section of the docs now. - Cephalonia
The wording remains the same in the Python 3.6 docs, I think: docs.python.org/3.6/tutorial/datastructures.html#dictionaries - Satinet
Makes sense, because dict ordering was not officially ratified until 3.7. - Cephalonia
