Is the order of a Python dictionary guaranteed over iterations?
Asked Answered
U

6

67

I'm currently implementing a complex microbial food-web in Python using SciPy.integrate.ode. I need the ability to easily add species and reactions to the system, so I have to code up something quite general. My scheme looks something like this:

class Reaction(object):
    def __init__(self):
        #stuff common to all reactions
    def __getReactionRate(self, **kwargs):
        raise NotImplementedError

... Reaction subclasses that 
... implement specific types of reactions


class Species(object):
    def __init__(self, reactionsDict):
        self.reactionsDict = reactionsDict
        #reactionsDict looks like {'ReactionName':reactionObject, ...}
        #stuff common to all species

    def sumOverAllReactionsForThisSpecies(self, **kwargs):
        #loop over all the reactions and return the 
        #cumulative change in the concentrations of all solutes

...Species subclasses where for each species
... are defined and passed to the superclass constructor

class FermentationChamber(object):
    def __init__(self, speciesList, timeToSolve, *args):
        #do initialization

    def step(self):
        #loop over each species, which in turn loops 
        #over each reaction inside it and return a 
        #cumulative dictionary of total change for each 
        #solute in the whole system


if __name__==__main__:
    f = FermentationChamber(...)

    o  = ode(...) #initialize ode solver

    while o.successful() and o.t<timeToSolve:
         o.integrate()

    #process o.t and o.y (o.t contains the time points
    #and o.y contains the solution matrix)

So, the question is, when I iterate over the dictionaries in Species.sumOverAllReactionsForThisSpecies() and FermentationChamber.step(), is the iteration order of the dictionaries guaranteed to be the same if no elements are added or removed from the dictionaries between the first and the last iteration? That is, can I assume that the order of the numpy array created at each iteration from the dictionary will not vary? For example, if a dictionary has the format {'Glucose':10, 'Fructose':12}, if an Array created from this dictionary will always have the same order (it doesn't matter what that order is, as long as it's deterministic).

Sorry for the mega-post, I just wanted to let you know where I'm coming from.

Utley answered 12/1, 2010 at 22:42 Comment(2)
@ChinmayKanchi do you mind if I heavily edit this question? All the detail about food webs and integrating ODEs has nothing to do with the question, which is a very good and important one.Fervor
Python 3.6+ is well covered in #39980823Tab
S
3

Python 3.1 has a collections.OrderedDict class that can be used for this purpose. It's very efficient, too: "Big-O running times for all methods are the same as for regular dictionaries."

The code for OrderedDict itself is compatible with Python 2.x, though some inherited methods (from the _abcoll module) do use Python 3-only features. However, they can be modified to 2.x code with minimal effort.

Shorttempered answered 13/1, 2010 at 5:30 Comment(1)
This does not actually answer the question, and using OrderedDict will, while also guaranteeing deterministic ordering, increase resource usage (not sure in what way exactly, though). As other answers indicates, a normal dict already has this guarantee, so no need to use OrderedDict (for this particular usecase).Nasal
P
80

Yes, the same order is guaranteed if it is not modified.

See the docs here.

Edit:

Regarding if changing the value (but not adding/removing a key) will affect the order, this is what the comments in the C-source says:

/* CAUTION: PyDict_SetItem() must guarantee that it won't resize the
 * dictionary if it's merely replacing the value for an existing key.
 * This means that it's safe to loop over a dictionary with PyDict_Next()
 * and occasionally replace a value -- but you can't insert new keys or
 * remove them.
 */

It seems that its not an implementation detail, but a requirement of the language.

Polymeric answered 12/1, 2010 at 22:43 Comment(7)
Ah, excellent! I wasn't sure I'd interpreted that correctly. Just to be sure, it doesn't matter if the values themselves are modified, right, as long as the keys aren't?Utley
I'm pretty sure "no modifications" means no modifications, period. Changing the values may alter dictionary sort order.Outing
Damnation! Looks like I'll have to rethink that algorithm. Is there a ordered map-type datastructure in scipy/numpy or the python standard library? I'd prefer to not have to depend on more libraries than I have to.Utley
@Chinmay, make sure you understand what "changing the value" means here though. If those values are really instances, not primitives, then if you're only changing attributes on those instances but not replacing the instances in the dictionary with other instances, then you're not actually "changing the values" in the dictionary and you won't be affecting the order. Clear?Firewarden
There should be a data structure in Python that gives the properties of a binary tree: a specified ordering, logarithmic insertions and deletions, and constant-time sorted iteration in both directions. I've been bit by the lack of this in Python a couple times.Arbor
That said, a common approach in Python is to use a sorted array, and binary searching for lookups. Python's sort algorithm is very good at sorting partially-sorted lists, and the bisect module handles searching (don't implement binary searching yourself; it's easy to get wrong). This isn't a general replacement for a binary tree, but it may be all you need.Arbor
"Yes, the same order is guaranteed if it is not modified."- What about Python 3 and the PYTHONHASHSEED docs.python.org/3/using/cmdline.html#envvar-PYTHONHASHSEED? (although Python 3.6 has dict order the same as insertion order of course)Dayton
T
45

It depends on the Python version.

Python 3.7+

Dictionary iteration order is guaranteed to be in order of insertion.

Python 3.6

Dictionary iteration order happens to be in order of insertion in CPython implementation, but it is not a documented guarantee of the language.

Prior versions

Keys and values are iterated over in an arbitrary order which is non-random, varies across Python implementations, and depends on the dictionary’s history of insertions and deletions. If keys, values and items views are iterated over with no intervening modifications to the dictionary, the order of items will directly correspond. https://docs.python.org/2/library/stdtypes.html#dictionary-view-objects

The -R option

Python 2.6 added the -R option as (insufficient, it turned out) protection against hash flooding attacks. In Python 2 turning this on affected dictionary iteration order (the properties specified above were still maintained, but the specific iteration order would be different from one execution of the program to the next). For this reason, the option was off by default.

In Python 3, the -R option is on by default since Python 3.3, which adds nondeterminism to dict iteration order, as every time Python interpreter is run, the seed value for hash computation is generated randomly. This situation lasts until CPython 3.6 which changed dict implementation in a way so that the hash values of entries do not influence iteration order.

Source

  • Changed in version 3.7: Dictionary order is guaranteed to be insertion order. This behavior was an implementation detail of CPython from 3.6. https://docs.python.org/3.8/library/stdtypes.html

  • What’s New In Python 3.6: The order-preserving aspect of this new implementation is considered an implementation detail and should not be relied upon (this may change in the future, but it is desired to have this new dict implementation in the language for a few releases before changing the language spec to mandate order-preserving semantics for all current and future Python implementations; this also helps preserve backwards-compatibility with older versions of the language where random iteration order is still in effect, e.g. Python 3.5). https://docs.python.org/3/whatsnew/3.6.html#whatsnew36-compactdict

Tab answered 20/3, 2020 at 13:55 Comment(3)
Thanks for making this compilation! Actually came here looking something like your answer.Rhetorician
There is a nice two-page discussion of dict iteration guarantees in "Software Engineering at Google" by T. WInters and others [1]. Nothing surprising, but it is to the point and in references hash flooding attacks [2]. [1] books.google.cz/… [2] youtube.com/watch?v=Vdrab3sB7MUTab
Do you have an official source for the decision in Python 3.7?Metaphrase
O
8

Provided no modifications are made to the dictionary, the answer is yes. See the docs here.

However, dictionaries are unordered by nature in Python. In general, it's not the best practice to rely on dictionaries for sensitive sorted data.

An example of an a more robust solution would be Django's SortedDict data structure.

Outing answered 12/1, 2010 at 22:49 Comment(0)
P
7

If you want the order to be consistent, I would do something to force a particular order. Although you might be able to convince yourself that the order is guaranteed, and you might be right, it seems fragile to me, and it will be mysterious to other developers.

For example, you emphasize always in your question. Is it important that it be the same order in Python 2.5 and 2.6? 2.6 and 3.1? CPython and Jython? I wouldn't count on those.

Periostitis answered 12/1, 2010 at 22:49 Comment(1)
Good point. I wasn't sure how fragile it would be when I asked this question. A rethink of this algorithm is definitely in order.Utley
R
6

I also would recommend not relying on the fact the dictionaries order is non-random.

If you want a built in solution to sorting you dictionary read http://www.python.org/dev/peps/pep-0265/

Here is the most relevant material:

This PEP is rejected because the need for it has been largely fulfilled by Py2.4's sorted() builtin function:

    >>> sorted(d.iteritems(), key=itemgetter(1), reverse=True)
    [('b', 23), ('d', 17), ('c', 5), ('a', 2), ('e', 1)]

or for just the keys:

    >>> sorted(d, key=d.__getitem__, reverse=True)
    ['b', 'd', 'c', 'a', 'e']

Also, Python 2.5's heapq.nlargest() function addresses the common use
case of finding only a few of the highest valued items:

    >>> nlargest(2, d.iteritems(), itemgetter(1))
    [('b', 23), ('d', 17)]
Rainer answered 12/1, 2010 at 22:54 Comment(0)
S
3

Python 3.1 has a collections.OrderedDict class that can be used for this purpose. It's very efficient, too: "Big-O running times for all methods are the same as for regular dictionaries."

The code for OrderedDict itself is compatible with Python 2.x, though some inherited methods (from the _abcoll module) do use Python 3-only features. However, they can be modified to 2.x code with minimal effort.

Shorttempered answered 13/1, 2010 at 5:30 Comment(1)
This does not actually answer the question, and using OrderedDict will, while also guaranteeing deterministic ordering, increase resource usage (not sure in what way exactly, though). As other answers indicates, a normal dict already has this guarantee, so no need to use OrderedDict (for this particular usecase).Nasal

© 2022 - 2024 — McMap. All rights reserved.