Can I get JSON to load into an OrderedDict?

6

466

Ok so I can use an OrderedDict in json.dump. That is, an OrderedDict can be used as an input to JSON.

But can it be used as an output? If so how? In my case I'd like to load into an OrderedDict so I can keep the order of the keys in the file.

If not, is there some kind of workaround?

Generalist answered 3/8, 2011 at 4:38 Comment(6)
Yes, in my case I am bridging the gap between different languages and applications, and JSON works very well. But the ordering of keys is a bit of an issue. It would be awesome to have a simple option in json.load to use OrderedDicts instead of dicts in Python.Generalist
That is pretty annoying. In Javascript (of which json is a subset) order of keys is also not preserved...Gilbye
JSON spec defines object type as having unordered keys... expecting specific key order is a mistakeGallery
Key ordering isn't usually for any sort of functional requirements. It's mainly just for human readability. If I just want my json to be pretty-printed, I do not expect any of the document order to change at all.Levitation
It also helps avoid large git diffs!Playacting
See also: stackoverflow.com/questions/43789439Distortion
G
670

Yes, you can, by specifying the object_pairs_hook argument to JSONDecoder. In fact, this is the exact example given in the documentation.

>>> import json, collections
>>> json.JSONDecoder(object_pairs_hook=collections.OrderedDict).decode('{"foo":1, "bar": 2}')
OrderedDict([('foo', 1), ('bar', 2)])
>>> 

You can pass this parameter to json.loads (if you don't need a Decoder instance for other purposes) like so:

>>> import json
>>> from collections import OrderedDict
>>> data = json.loads('{"foo":1, "bar": 2}', object_pairs_hook=OrderedDict)
>>> print(json.dumps(data, indent=4))
{
    "foo": 1,
    "bar": 2
}
>>> 

Using json.load is done in the same way:

>>> data = json.load(open('config.json'), object_pairs_hook=OrderedDict)
Gilbye answered 3/8, 2011 at 4:48 Comment(7)
I am perplexed. The docs say the object_pairs_hook gets called for each literal that gets decoded into pairs. Why doesn't this create a new OrderedDict for each record in the JSON?Marrilee
Hmm... the docs are somewhat ambiguously phrased. What they mean is that the "whole result of decoding all the pairs" will be passed, in order, as a list, to object_pairs_hook, rather than "each pair will be passed to object_pairs_hook."Gilbye
But does it lose the original order of the input JSON?Shuttle
Was surprised to see that json.load does not keep it ordered by default, but looks like it is only mirroring what json itself does - the {} are unordered, but the [] in the json are ordered as described hereWisla
Does adding the OrderedDict hook also preserve key order in dicts nested deeper in the hierarchy?Quartering
@RandomCertainty yes, every time a JSON object is encountered while parsing source, OrderedDict will be used to build up the resulting python value.Gilbye
Thank you. This solves my problem. Only, data = json.load(open('config.json'), object_pairs_hook=OrderedDict) works for me though. The others do not work. I am using Ubuntu 18.04 with Python 2.7.15+Coauthor
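The two points raised in the comments above (the hook receives the whole list of pairs for an object, and nested objects are covered too) can be checked with a small sketch. It assumes Python 2.7 or later; the `recording_hook` function, the `calls` list and the sample JSON string are made up for illustration.

```python
import json
from collections import OrderedDict

calls = []  # hypothetical log of what the hook receives


def recording_hook(pairs):
    # The decoder calls this once per JSON object, passing all of that
    # object's (key, value) pairs as a single list, in document order.
    calls.append(pairs)
    return OrderedDict(pairs)


text = '{"z": 1, "a": {"y": 2, "b": 3}}'
data = json.loads(text, object_pairs_hook=recording_hook)

print(len(calls))                # one call per JSON object in the text
print(list(data.keys()))         # top-level keys, in document order
print(type(data["a"]).__name__)  # the nested object goes through the hook too
print(list(data["a"].keys()))    # nested keys, in document order
```

The inner object is decoded first, so its pairs show up in `calls` before those of the enclosing object.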
A
129

Simple version for Python 2.7+

import json, collections
my_ordered_dict = json.loads(json_str, object_pairs_hook=collections.OrderedDict)

Or for Python 2.4 to 2.6

import simplejson as json
import ordereddict

my_ordered_dict = json.loads(json_str, object_pairs_hook=ordereddict.OrderedDict)
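If the same code has to run on both old and new interpreters, the two variants can be combined with a fallback import. This is only a sketch: `loads_ordered` is a hypothetical helper name, and the fallback branch assumes the separately installed simplejson and ordereddict packages are available (with a simplejson version that supports object_pairs_hook).

```python
# Prefer the standard library (Python 2.7+); fall back to the separately
# installed back-ports on older interpreters.
try:
    from collections import OrderedDict
    import json
except ImportError:
    from ordereddict import OrderedDict
    import simplejson as json


def loads_ordered(json_str):
    """Hypothetical helper: parse JSON, keeping object keys in file order."""
    return json.loads(json_str, object_pairs_hook=OrderedDict)


result = loads_ordered('{"b": 1, "a": 2}')
print(list(result.items()))
```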
Ar answered 3/8, 2011 at 5:0 Comment(10)
Ahhh, but it doesn't include the object_pairs_hook -- which is why you still need simplejson in 2.6. ;)Ar
Want to note that simplejson and ordereddict are separate libraries that you need to install.Marc
for python 2.7+: "import json, collections" in code, for python2.6- "aptitude install python-pip" and "pip install ordereddict" in the systemJansson
This is much easier and more straightforward than the previous method with JSONDecoder.Ronaldronalda
Oddly, in pypy, the included json will fail to loads('{}', object_pairs_hook=OrderedDict).Bullheaded
@Ar any reason I would receive a TypeError: 'OrderedDict' object is not callable?Nor
@Nor -- Not AFAIK, though I haven't looked at this in a long time.Ar
@Nor - maybe you are adding an extra pair of parentheses? It should be object_pairs_hook=OrderedDict, not object_pairs_hook=OrderedDict()Diatropism
FYI centos-6 has an older version of python-simplejson v2.0.9 which does not have the required "object_pairs_hook". So a workaround/hack is to use json.dumps of your ordered dictionary object's pairs collections.OrderedDict(json.loads(json.dumps(x.items()))).Ina
M
51

Some great news! Since version 3.6, the CPython implementation has preserved the insertion order of dictionaries (https://mail.python.org/pipermail/python-dev/2016-September/146327.html). This means that the json library is now order-preserving by default. Observe the difference in behaviour between Python 3.5 and 3.6. The code:

import json
data = json.loads('{"foo":1, "bar":2, "fiddle":{"bar":2, "foo":1}}')
print(json.dumps(data, indent=4))

In Python 3.5 the resulting order is arbitrary, for example:

{
    "fiddle": {
        "bar": 2,
        "foo": 1
    },
    "bar": 2,
    "foo": 1
}

In the CPython implementation of Python 3.6:

{
    "foo": 1,
    "bar": 2,
    "fiddle": {
        "bar": 2,
        "foo": 1
    }
}

The really great news is that this has become part of the language specification as of Python 3.7 (as opposed to an implementation detail of CPython 3.6+): https://mail.python.org/pipermail/python-dev/2017-December/151283.html

So the answer to your question now becomes: upgrade to Python 3.6 or later! :)
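A quick way to check this without reading pretty-printed output is to look at the key order directly. The sketch below assumes Python 3.7+ (or CPython 3.6) and reuses the sample document from above; it also shows one difference that remains between a plain dict and an OrderedDict, namely that only the latter compares order-sensitively.

```python
import json
from collections import OrderedDict

data = json.loads('{"foo": 1, "bar": 2, "fiddle": {"bar": 2, "foo": 1}}')

# A plain dict keeps insertion order, so the keys come back in the order
# in which they appear in the document.
print(list(data))
print(list(data["fiddle"]))

# Plain dict equality ignores order ...
same_as_dict = data["fiddle"] == {"foo": 1, "bar": 2}
# ... while equality between two OrderedDicts is order-sensitive.
same_as_ordered = OrderedDict(data["fiddle"]) == OrderedDict([("foo", 1), ("bar", 2)])
print(same_as_dict, same_as_ordered)
```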

Modestine answered 19/12, 2017 at 6:40 Comment(6)
Although I see the same behavior as you in the given example, in the CPython implementation of Python 3.6.4, json.loads('{"2": 2, "1": 1}') becomes {'1': 1, '2': 2} for me.Deuno
@Deuno it looks like dict.__repr__ sorts keys while the underlying ordering is preserved. In other words, json.loads('{"2": 2, "1": 1}').items() is dict_items([('2', 2), ('1', 1)]) even if repr(json.loads('{"2": 2, "1": 1}')) is "{'1': 1, '2': 2}".Grolier
@SimonCharette Hm, could be; I'm actually unable to reproduce my own observation in conda's pkgs/main/win-64::python-3.6.4-h0c2934d_3, so this will be tough to test.Deuno
This doesn't really help much though, since "renaming" keys will still ruin the order of keys.Don
Python documentation link -- The documentation mentions that "Starting with Python 3.7, the regular dict became order preserving, so it is no longer necessary to specify collections.OrderedDict for JSON generation and parsing.", which implies that by default the load inserts into the dict in the correct order.Beni
@fuglede: That sounds like you were using an IPython version that sorts dict keys for display. (It's definitely not dict.__repr__ - that doesn't sort in any Python version.)Abadan
M
8

The normally used load command will work if you specify the object_pairs_hook parameter:

import json
from collections import OrderedDict
with open('foo.json', 'r') as fp:
    metrics_types = json.load(fp, object_pairs_hook=OrderedDict)
Mymya answered 16/11, 2017 at 8:22 Comment(0)
F
7

You could always write out the list of keys in addition to dumping the dict, and then reconstruct the OrderedDict by iterating through the list?
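A minimal sketch of that idea, assuming a flat dictionary with string keys. The function names and the "order"/"data" field names are invented for the example; nested dictionaries would need the same treatment applied recursively.

```python
import json
from collections import OrderedDict


def dump_with_order(od):
    # Store the key order explicitly next to the data itself.
    return json.dumps({"order": list(od.keys()), "data": dict(od)})


def load_with_order(text):
    payload = json.loads(text)
    data = payload["data"]
    # Rebuild the mapping by walking the saved key list.
    result = OrderedDict((key, data[key]) for key in payload["order"])
    # Keep any keys that are in the data but missing from the order list,
    # appended after the explicitly ordered ones.
    for key in data:
        if key not in result:
            result[key] = data[key]
    return result


original = OrderedDict([("zeta", 1), ("alpha", 2), ("mid", 3)])
restored = load_with_order(dump_with_order(original))
print(list(restored.items()))
```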

Fetal answered 3/8, 2011 at 4:41 Comment(3)
+1 for low-tech solution. I've done that when dealing with the same issue with YAML, but having to duplicate is kinda lame, especially when the underlying format preserves order. Might also make sense to avoid losing key-value pairs that are in the dict but missing from the list of keys, tacking them on after all the explicitly ordered items.Ovenware
The low tech solution also preserves context that isn't otherwise necessarily preserved in the exported format (IOW; someone sees JSON and there's nothing there explicitly stating "these keys should remain in this order" if they do manipulations on it).Fetal
What determines that the list of keys "dumped" is in the right order? What about nested dicts? Seems like the dumping would need to handle that, and the reconstruction would need to be done recursively using OrderedDicts.Rumen
C
5

In addition to dumping the ordered list of keys alongside the dictionary, another low-tech solution, which has the advantage of being explicit, is to dump the (ordered) list of key-value pairs ordered_dict.items(); loading is a simple OrderedDict(<list of key-value pairs>). This handles an ordered dictionary despite the fact that JSON does not have this concept (JSON dictionaries have no order).

It is indeed nice to take advantage of the fact that json dumps the OrderedDict in the correct order. However, it is in general unnecessarily heavy and not necessarily meaningful to have to read all JSON dictionaries as an OrderedDict (through the object_pairs_hook argument), so an explicit conversion of only the dictionaries that must be ordered makes sense too.
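As a rough sketch of the pairs approach, using a made-up `settings` dictionary: the items are written out as a JSON array of [key, value] arrays, and only that one structure is converted back on load.

```python
import json
from collections import OrderedDict

settings = OrderedDict([("zeta", 1), ("alpha", 2)])

# Dump the pairs as a JSON array of [key, value] arrays; JSON arrays
# are ordered, so the order survives any conforming parser.
text = json.dumps(list(settings.items()))

# Loading is an explicit conversion of just this one structure.
restored = OrderedDict(json.loads(text))
print(text)
print(restored == settings)
```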

Chlorohydrin answered 24/2, 2015 at 5:44 Comment(0)
