After answering a question about how to parse a text file containing arrays of floats, I ran the following benchmark:
from __future__ import print_function  # so the prints run on both Python 2 and 3
import random
import timeit

# Build a 1000-float list and embed its repr in a double-quoted string,
# so both parsers are fed exactly the same text.
line = [random.random() for _ in range(1000)]
n = 10000

json_setup = 'line = "{}"; import json'.format(line)
json_work = 'json.loads(line)'
json_time = timeit.timeit(json_work, json_setup, number=n)
print("json:", json_time)

ast_setup = 'line = "{}"; import ast'.format(line)
ast_work = 'ast.literal_eval(line)'
ast_time = timeit.timeit(ast_work, ast_setup, number=n)
print("ast:", ast_time)

print("time ratio ast/json:", ast_time / json_time)
I ran this code several times and consistently got results like these:
$ python json-ast-bench.py
json: 4.3199338913
ast: 28.4827561378
time ratio ast/json: 6.59333148483
So it appears that json is almost an order of magnitude faster than ast for this use case.
I had the same results with both Python 2.7.5+ and Python 3.3.2+.
Questions:
- Why is json.loads so much faster? This question seems to imply that ast is more flexible regarding the input data (double or single quotes); the first sketch after this list illustrates the difference.
- Are there use cases where I would prefer to use ast.literal_eval over json.loads although it's slower? (See the second sketch below.)
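On the quoting point: JSON only accepts double-quoted strings, while ast.literal_eval accepts Python's own literal syntax. A minimal sketch (the sample string here is made up for illustration):
import ast
import json
s = "{'a': 1}"  # repr() of a Python dict uses single quotes
print(ast.literal_eval(s))  # {'a': 1}
try:
    json.loads(s)  # JSON requires double-quoted keys and strings
except ValueError as exc:  # json.JSONDecodeError subclasses ValueError
    print("json.loads rejected it:", exc)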
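And on use cases: ast.literal_eval understands literals that have no JSON spelling, so it is the natural choice when the input was produced by str() or repr() of a Python object rather than by json.dumps. Another small sketch with made-up sample data:
import ast
# Tuples, None and complex numbers are valid Python literals but not valid JSON.
print(ast.literal_eval("(1, 2.5, None, 1+2j)"))  # (1, 2.5, None, (1+2j))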
Edit: Anyway, if performance matters I would recommend UltraJSON (it's what I use at work; ~4 times faster than json on the same mini-benchmark).
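Swapping it in is a one-line change; a minimal sketch, assuming the third-party ujson package is installed (pip install ujson):
import ujson  # pip install ujson
line = "[0.25, 0.5, 0.75]"  # made-up sample input
print(ujson.loads(line))  # [0.25, 0.5, 0.75]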
ast.literal_eval is so lightly used that nobody felt it was worth the time to work (& work, & work) at speeding it. In contrast, the JSON libraries are routinely used to parse gigabytes of data. – Gean
I used str(python_list) rather than JSON; JSON just didn't spring to mind immediately. – Pickerelweed
If you want to see how Python parses an expression like -2 or 1+2j or [i+1 for i in range(5)], just feed it to ast.dump(ast.parse(s)). – Stash
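For example, a minimal sketch of that suggestion (mode='eval' is my addition to keep the tree small; the exact dump format differs across Python versions):
import ast
tree = ast.parse("[i+1 for i in range(5)]", mode="eval")
print(ast.dump(tree))  # prints the node structure of the parsed expression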