What are the differences between json and simplejson Python modules?
Asked Answered
C

13

415

I have seen many projects using simplejson module instead of json module from the Standard Library. Also, there are many different simplejson modules. Why would use these alternatives, instead of the one in the Standard Library?

Chit answered 3/4, 2009 at 6:56 Comment(0)
C
424

json is simplejson, added to the stdlib. But since json was added in 2.6, simplejson has the advantage of working on more Python versions (2.4+).

simplejson is also updated more frequently than Python, so if you need (or want) the latest version, it's best to use simplejson itself, if possible.

A good practice, in my opinion, is to use one or the other as a fallback.

try:
    import simplejson as json
except ImportError:
    import json
Conatus answered 3/4, 2009 at 6:59 Comment(12)
Now if I could only get pyflakes to stop complaining about redefinition of unused 'json'Handsome
They are not the same nor compatible, simplejson has a JSONDecodeError and json has a ValueErrorFelizio
@BjornTipling JSONDecodeError is a subclass of ValueErrorAdvertise
I disagree with the above answer assuming you have an up to date Python. The built-in (great plus!!!) Json library in Python 2.7 is as fast as simplejson and has less refused-to-be-fixed unicode bugs. See answer https://mcmap.net/q/86176/-what-are-the-differences-between-json-and-simplejson-python-modulesEnnead
It seems the Python2.7 json adopted simplejson v2.0.9 which is far behind the current simplejson v3.6.5 as of writing. There are lots of improvements worth the import simplejsonStolzer
About JSONDecodeError, it's a good decision not to use ValueError: being more specific, it's much more handy for detect errors coming from outside versus your own errors. No wonders the requests library uses simplejson.Hollington
One bug that continues to exist in json is that named tuple are not serialized to a dict, but to a list. This has been an outstanding bug since 2011 (bugs.python.org/issue12657). Simplejson has fixed this in 2.2.0., which was release back in Sep 4, 2011.Hedgehop
I have tested with latest versions; and in both cases of loads and dumps json is almost 10x faster than simplejsonAttorneyatlaw
Note entirely that out of the box though, you can't really swap one for the other and move along. For example, if you want to use json.dumps() and use a generator in there, native json module won't work and it'll give an error of TypeError: <generator object xxx> is not JSON serializable. So you would have to extend from class list and override iter and len and pass that along to the json.dumps() if using native module. But with simplejson you can just pass the generator with the option iterable_as_array=True and you are good to go. I couldn't find that option for the native jsonmoduleStpierre
try this with Python 2.7: json.dumps(Decimal('1.642')) and then the same with simplejson: simplejson.dumps(Decimal('1.642')) spoiler alert: the former will fail miserably while the latter will just work conclusion: use simplejson as much as possibleMeggy
I think it's not a good practice, the simplejson has a different api with built-in json and can not replace directlyAlgo
It would be nice to have an up-to-date answer for this question. The comparisons are always with Python 2.7! How about json in the Python 10 and 11 libraries? How does it compare to current simplejson?Farlie
E
88

I have to disagree with the other answers: the built in json library (in Python 2.7) is not necessarily slower than simplejson. It also doesn't have this annoying unicode bug.

Here is a simple benchmark:

import json
import simplejson
from timeit import repeat

NUMBER = 100000
REPEAT = 10

def compare_json_and_simplejson(data):
    """Compare json and simplejson - dumps and loads"""
    compare_json_and_simplejson.data = data
    compare_json_and_simplejson.dump = json.dumps(data)
    assert json.dumps(data) == simplejson.dumps(data)
    result = min(repeat("json.dumps(compare_json_and_simplejson.data)", "from __main__ import json, compare_json_and_simplejson", 
                 repeat = REPEAT, number = NUMBER))
    print "      json dumps {} seconds".format(result)
    result = min(repeat("simplejson.dumps(compare_json_and_simplejson.data)", "from __main__ import simplejson, compare_json_and_simplejson", 
                 repeat = REPEAT, number = NUMBER))
    print "simplejson dumps {} seconds".format(result)
    assert json.loads(compare_json_and_simplejson.dump) == data
    result = min(repeat("json.loads(compare_json_and_simplejson.dump)", "from __main__ import json, compare_json_and_simplejson", 
                 repeat = REPEAT, number = NUMBER))
    print "      json loads {} seconds".format(result)
    result = min(repeat("simplejson.loads(compare_json_and_simplejson.dump)", "from __main__ import simplejson, compare_json_and_simplejson", 
                 repeat = REPEAT, number = NUMBER))
    print "simplejson loads {} seconds".format(result)


print "Complex real world data:" 
COMPLEX_DATA = {'status': 1, 'timestamp': 1362323499.23, 'site_code': 'testing123', 'remote_address': '212.179.220.18', 'input_text': u'ny monday for less than \u20aa123', 'locale_value': 'UK', 'eva_version': 'v1.0.3286', 'message': 'Successful Parse', 'muuid1': '11e2-8414-a5e9e0fd-95a6-12313913cc26', 'api_reply': {"api_reply": {"Money": {"Currency": "ILS", "Amount": "123", "Restriction": "Less"}, "ProcessedText": "ny monday for less than \\u20aa123", "Locations": [{"Index": 0, "Derived From": "Default", "Home": "Default", "Departure": {"Date": "2013-03-04"}, "Next": 10}, {"Arrival": {"Date": "2013-03-04", "Calculated": True}, "Index": 10, "All Airports Code": "NYC", "Airports": "EWR,JFK,LGA,PHL", "Name": "New York City, New York, United States (GID=5128581)", "Latitude": 40.71427, "Country": "US", "Type": "City", "Geoid": 5128581, "Longitude": -74.00597}]}}}
compare_json_and_simplejson(COMPLEX_DATA)
print "\nSimple data:"
SIMPLE_DATA = [1, 2, 3, "asasd", {'a':'b'}]
compare_json_and_simplejson(SIMPLE_DATA)

And the results on my system (Python 2.7.4, Linux 64-bit):

Complex real world data:
json dumps 1.56666707993 seconds
simplejson dumps 2.25638604164 seconds
json loads 2.71256899834 seconds
simplejson loads 1.29233884811 seconds

Simple data:
json dumps 0.370109081268 seconds
simplejson dumps 0.574181079865 seconds
json loads 0.422876119614 seconds
simplejson loads 0.270955085754 seconds

For dumping, json is faster than simplejson. For loading, simplejson is faster.

Since I am currently building a web service, dumps() is more important—and using a standard library is always preferred.

Also, cjson was not updated in the past 4 years, so I wouldn't touch it.

Ennead answered 21/4, 2013 at 12:53 Comment(3)
This is misleading. My answer below explains why.Crankcase
On my Win7 PC (i7 CPU), json (CPython 3.5.0) is 68%|45% faster on simple|complex dumps and 35%|17% on simple|complex loads w.r.t. simplejson v3.8.0 with C speedups using your benchmark code. Therefore, I would not use simplejson anymore with this setup.Annular
I just ran this on Python 3.6.1 and json wins or is the same for all the tests. In fact json is a little under twice as fast of the complex real world data dumps test!Visibility
C
29

All of these answers aren't very helpful because they are time sensitive.

After doing some research of my own I found that simplejson is indeed faster than the builtin, if you keep it updated to the latest version.

pip/easy_install wanted to install 2.3.2 on ubuntu 12.04, but after finding out the latest simplejson version is actually 3.3.0, so I updated it and reran the time tests.

  • simplejson is about 3x faster than the builtin json at loads
  • simplejson is about 30% faster than the builtin json at dumps

Disclaimer:

The above statements are in python-2.7.3 and simplejson 3.3.0 (with c speedups) And to make sure my answer also isn't time sensitive, you should run your own tests to check since it varies so much between versions; there's no easy answer that isn't time sensitive.

How to tell if C speedups are enabled in simplejson:

import simplejson
# If this is True, then c speedups are enabled.
print bool(getattr(simplejson, '_speedups', False))

UPDATE: I recently came across a library called ujson that is performing ~3x faster than simplejson with some basic tests.

Crankcase answered 24/7, 2013 at 1:48 Comment(5)
Thanks for mentioning ujson. This one lead me to another library RapidJSON which looks better maintainedJenisejenkel
don't use ujson, it's littered with bugs and memory leaks and crashers and hasn't been updated in quite some time. We've ditched it and switched to simplejson as it has more functionality than json and is updatedAmusement
@Amusement Your comment has fortunately not aged well ;-) ujson looks to be quite alive and active since… March 2020 :D So yes, when you wrote that it was true, but it seems (not having taken a closer looks since I'm quite happy with built-in json at this time) to have gotten better.Syllabize
ya, software is rather dynamic, glad someone picked up the torch!Amusement
Exactly they are time sensitive, but so is this answer! What about simplejson now compared to json for Python 10?Farlie
B
20

I've been benchmarking json, simplejson and cjson.

  • cjson is fastest
  • simplejson is almost on par with cjson
  • json is about 10x slower than simplejson

http://pastie.org/1507411:

$ python test_serialization_speed.py 
--------------------
   Encoding Tests
--------------------
Encoding: 100000 x {'m': 'asdsasdqwqw', 't': 3}
[      json] 1.12385 seconds for 100000 runs. avg: 0.011239ms
[simplejson] 0.44356 seconds for 100000 runs. avg: 0.004436ms
[     cjson] 0.09593 seconds for 100000 runs. avg: 0.000959ms

Encoding: 10000 x {'m': [['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19]], 't': 3}
[      json] 7.76628 seconds for 10000 runs. avg: 0.776628ms
[simplejson] 0.51179 seconds for 10000 runs. avg: 0.051179ms
[     cjson] 0.44362 seconds for 10000 runs. avg: 0.044362ms

--------------------
   Decoding Tests
--------------------
Decoding: 100000 x {"m": "asdsasdqwqw", "t": 3}
[      json] 3.32861 seconds for 100000 runs. avg: 0.033286ms
[simplejson] 0.37164 seconds for 100000 runs. avg: 0.003716ms
[     cjson] 0.03893 seconds for 100000 runs. avg: 0.000389ms

Decoding: 10000 x {"m": [["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19]], "t": 3}
[      json] 37.26270 seconds for 10000 runs. avg: 3.726270ms
[simplejson] 0.56643 seconds for 10000 runs. avg: 0.056643ms
[     cjson] 0.33007 seconds for 10000 runs. avg: 0.033007ms
Basile answered 28/1, 2011 at 22:39 Comment(3)
Please add a pastie for the actual test module.Ennead
which versions of Python and the libs in question?Medievalism
This is not true anymore. json in python2.7 is performance improvements.Twinscrew
I
14

Some values are serialized differently between simplejson and json.

Notably, instances of collections.namedtuple are serialized as arrays by json but as objects by simplejson. You can override this behaviour by passing namedtuple_as_object=False to simplejson.dump, but by default the behaviours do not match.

>>> import collections, simplejson, json
>>> TupleClass = collections.namedtuple("TupleClass", ("a", "b"))
>>> value = TupleClass(1, 2)
>>> json.dumps(value)
'[1, 2]'
>>> simplejson.dumps(value)
'{"a": 1, "b": 2}'
>>> simplejson.dumps(value, namedtuple_as_object=False)
'[1, 2]'
Install answered 18/3, 2016 at 18:7 Comment(0)
P
7

An API incompatibility I found, with Python 2.7 vs simplejson 3.3.1 is in whether output produces str or unicode objects. e.g.

>>> from json import JSONDecoder
>>> jd = JSONDecoder()
>>> jd.decode("""{ "a":"b" }""")
{u'a': u'b'}

vs

>>> from simplejson import JSONDecoder
>>> jd = JSONDecoder()
>>> jd.decode("""{ "a":"b" }""")
{'a': 'b'}

If the preference is to use simplejson, then this can be addressed by coercing the argument string to unicode, as in:

>>> from simplejson import JSONDecoder
>>> jd = JSONDecoder()
>>> jd.decode(unicode("""{ "a":"b" }""", "utf-8"))
{u'a': u'b'}

The coercion does require knowing the original charset, for example:

>>> jd.decode(unicode("""{ "a": "ξηθννββωφρες" }"""))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xce in position 8: ordinal not in range(128)

This is the won't fix issue 40

Phosphaturia answered 14/10, 2013 at 18:19 Comment(0)
E
6

Another reason projects use simplejson is that the builtin json did not originally include its C speedups, so the performance difference was noticeable.

Emissive answered 3/4, 2009 at 16:42 Comment(0)
W
5

The builtin json module got included in Python 2.6. Any projects that support versions of Python < 2.6 need to have a fallback. In many cases, that fallback is simplejson.

Wintry answered 3/4, 2009 at 6:57 Comment(0)
G
4

Here's (a now outdated) comparison of Python json libraries:

Comparing JSON modules for Python (archive link)

Regardless of the results in this comparison you should use the standard library json if you are on Python 2.6. And.. might as well just use simplejson otherwise.

Gentlemanly answered 3/4, 2009 at 7:5 Comment(0)
A
3

json seems faster than simplejson in both cases of loads and dumps in latest version

Tested versions:

  • python: 3.6.8
  • json: 2.0.9
  • simplejson: 3.16.0

Results:

>>> def test(obj, call, data, times):
...   s = datetime.now()
...   print("calling: ", call, " in ", obj, " ", times, " times") 
...   for _ in range(times):
...     r = getattr(obj, call)(data)
...   e = datetime.now()
...   print("total time: ", str(e-s))
...   return r

>>> test(json, "dumps", data, 10000)
calling:  dumps  in  <module 'json' from 'C:\\Users\\jophine.antony\\AppData\\Local\\Programs\\Python\\Python36-32\\lib\\json\\__init__.py'>   10000  times
total time:  0:00:00.054857

>>> test(simplejson, "dumps", data, 10000)
calling:  dumps  in  <module 'simplejson' from 'C:\\Users\\jophine.antony\\AppData\\Local\\Programs\\Python\\Python36-32\\lib\\site-packages\\simplejson\\__init__.py'>   10000  times
total time:  0:00:00.419895
'{"1": 100, "2": "acs", "3.5": 3.5567, "d": [1, "23"], "e": {"a": "A"}}'

>>> test(json, "loads", strdata, 1000)
calling:  loads  in  <module 'json' from 'C:\\Users\\jophine.antony\\AppData\\Local\\Programs\\Python\\Python36-32\\lib\\json\\__init__.py'>   1000  times
total time:  0:00:00.004985
{'1': 100, '2': 'acs', '3.5': 3.5567, 'd': [1, '23'], 'e': {'a': 'A'}}

>>> test(simplejson, "loads", strdata, 1000)
calling:  loads  in  <module 'simplejson' from 'C:\\Users\\jophine.antony\\AppData\\Local\\Programs\\Python\\Python36-32\\lib\\site-packages\\simplejson\\__init__.py'>   1000  times
total time:  0:00:00.040890
{'1': 100, '2': 'acs', '3.5': 3.5567, 'd': [1, '23'], 'e': {'a': 'A'}}

For versions:

  • python: 3.7.4
  • json: 2.0.9
  • simplejson: 3.17.0

json was faster than simplejson during dumps operation but both maintained the same speed during loads operations

Attorneyatlaw answered 28/11, 2019 at 7:49 Comment(0)
H
2

simplejson module is simply 1,5 times faster than json (On my computer, with simplejson 2.1.1 and Python 2.7 x86).

If you want, you can try the benchmark: http://abral.altervista.org/jsonpickle-bench.zip On my PC simplejson is faster than cPickle. I would like to know also your benchmarks!

Probably, as said Coady, the difference between simplejson and json is that simplejson includes _speedups.c. So, why don't python developers use simplejson?

Hah answered 23/8, 2010 at 1:1 Comment(0)
A
2

In python3, if you a string of b'bytes', with json you have to .decode() the content before you can load it. simplejson takes care of this so you can just do simplejson.loads(byte_string).

Agamete answered 24/6, 2016 at 15:16 Comment(1)
Changed in version 3.6: s can now be of type bytes or bytearray. The input encoding should be UTF-8, UTF-16 or UTF-32.Proline
C
0

I came across this question as I was looking to install simplejson for Python 2.6. I needed to use the 'object_pairs_hook' of json.load() in order to load a json file as an OrderedDict. Being familiar with more recent versions of Python I didn't realize that the json module for Python 2.6 doesn't include the 'object_pairs_hook' so I had to install simplejson for this purpose. From personal experience this is why i use simplejson as opposed to the standard json module.

Cubage answered 7/7, 2015 at 12:51 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.