UPDATE 12/30/2014
The easiest way to achieve this would be to use the object_hook callback of the JSONDecoder, as described in my old answer below. But since this requires an extra function call for each JSON object in the data (plus a loop over that object's keys), it might have an impact on performance.
So, if you truly want to change how json handles None, you need to dig a little deeper. The JSONDecoder uses a scanner to find certain tokens in the JSON input. Unfortunately, the scanner is a function and not a class, so subclassing is not that easy. The scanner function is called py_make_scanner and can be found in json/scanner.py. It is basically a function that takes a JSONDecoder as an argument and returns a scan_once function. The scan_once function receives a string and the index of the current scanner position, and returns the parsed value together with the index just past it.
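To make that contract concrete, here is a small sketch that calls the pure-Python scanner directly. It assumes Python 3, where `json.scanner.py_make_scanner` still exists but is an undocumented internal, so details may differ between versions.

```python
import json
import json.scanner

# Build the pure-Python scanner from an ordinary decoder instance.
decoder = json.JSONDecoder()
scan_once = json.scanner.py_make_scanner(decoder)

# scan_once(string, idx) parses one JSON value starting at idx and
# returns a (value, end_index) tuple.
value, end = scan_once('null', 0)
print(value, end)          # None 4

# The index lets the caller resume in the middle of a larger string.
number, number_end = scan_once('  42', 2)
print(number, number_end)  # 42 4
```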
A simple customized scanner function could look like this:
    # Written for Python 2 (the json internals there take an encoding argument).
    import json

    def make_my_scanner(context):
        # reference to actual scanner
        internal_scanner = json.scanner.py_make_scanner(context)

        # some references for the _scan_once function below
        parse_object = context.parse_object
        parse_array = context.parse_array
        parse_string = context.parse_string
        encoding = context.encoding
        strict = context.strict
        object_hook = context.object_hook
        object_pairs_hook = context.object_pairs_hook

        # customized _scan_once
        def _scan_once(string, idx):
            try:
                nextchar = string[idx]
            except IndexError:
                raise StopIteration

            # override some parse_** calls with the correct _scan_once
            if nextchar == '"':
                return parse_string(string, idx + 1, encoding, strict)
            elif nextchar == '{':
                return parse_object((string, idx + 1), encoding, strict,
                                    _scan_once, object_hook, object_pairs_hook)
            elif nextchar == '[':
                return parse_array((string, idx + 1), _scan_once)
            elif nextchar == 'n' and string[idx:idx + 4] == 'null':
                return 'Cat', idx + 4

            # invoke default scanner
            return internal_scanner(string, idx)

        return _scan_once
Now we just need a JSONDecoder subclass that will use our scanner instead of the default scanner:
    class MyJSONDecoder(json.JSONDecoder):
        def __init__(self, encoding=None, object_hook=None, parse_float=None,
                     parse_int=None, parse_constant=None, strict=True,
                     object_pairs_hook=None):
            json.JSONDecoder.__init__(self, encoding, object_hook, parse_float,
                                      parse_int, parse_constant, strict,
                                      object_pairs_hook)
            # override scanner
            self.scan_once = make_my_scanner(self)
And then use it like this:
    decoder = MyJSONDecoder()
    print decoder.decode('{"field1":null, "field2": "data!"}')
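The code above targets Python 2 (note the `encoding` arguments and the `print` statement). For readers on Python 3, a rough port might look like the sketch below. It is an adaptation, not part of the original answer, and it leans on undocumented internals of CPython's `json` module (the decoder's `memo` attribute, the argument order of `parse_object` and `parse_string`, and `StopIteration(idx)`), any of which may change between versions. The names `make_cat_scanner` and `CatJSONDecoder` are made up for this example.

```python
import json
import json.scanner


def make_cat_scanner(context):
    # Fall back to the stock pure-Python scanner for everything not overridden.
    default_scanner = json.scanner.py_make_scanner(context)

    def _scan_once(string, idx):
        try:
            nextchar = string[idx]
        except IndexError:
            # The decoder reads the failing position from StopIteration.value.
            raise StopIteration(idx) from None

        # Strings and containers are re-dispatched here so that nested
        # values come back through this _scan_once, not the default one.
        if nextchar == '"':
            return context.parse_string(string, idx + 1, context.strict)
        elif nextchar == '{':
            return context.parse_object(
                (string, idx + 1), context.strict, _scan_once,
                context.object_hook, context.object_pairs_hook, context.memo)
        elif nextchar == '[':
            return context.parse_array((string, idx + 1), _scan_once)
        elif nextchar == 'n' and string[idx:idx + 4] == 'null':
            return 'Cat', idx + 4
        return default_scanner(string, idx)

    def scan_once(string, idx):
        try:
            return _scan_once(string, idx)
        finally:
            # Mirror the stdlib: drop cached keys after each top-level scan.
            context.memo.clear()

    return scan_once


class CatJSONDecoder(json.JSONDecoder):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        # Replace the scanner that the base class installed.
        self.scan_once = make_cat_scanner(self)


result = CatJSONDecoder().decode('{"field1": null, "field2": [null, 1]}')
print(result)  # {'field1': 'Cat', 'field2': ['Cat', 1]}
```

Keep in mind that in both versions the replacement scanner runs in pure Python instead of the C-accelerated scanner that CPython normally uses, so it is worth timing both approaches on real data before assuming this one is faster.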
Old answer, but still valid if you do not care about the performance impact of another function call:
You need to create a JSONDecoder object with a special object_hook function:
    import json

    def parse_object(o):
        for key in o:
            if o[key] is None:
                o[key] = 'Cat'
        return o

    decoder = json.JSONDecoder(object_hook=parse_object)
    print decoder.decode('{"field1":null, "field2": "data!"}')
    # that will print: {u'field2': u'data!', u'field1': u'Cat'}
According to the Python documentation of the json module:
object_hook is an optional function that will be called with the result of any object literal decoded (a dict). The return value of object_hook will be used instead of the dict.
So parse_object will get a dictionary that can be manipulated by replacing all None values with 'Cat'. The returned object/dictionary will then be used in the output.
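As a quick check of that behaviour, the sketch below (written for Python 3; `replace_none` is a name chosen for this example) shows that the hook runs once for every decoded object, including nested ones, but only ever sees dicts. A `null` inside a list, or at the top level, is therefore left as `None`.

```python
import json


def replace_none(obj):
    # obj is the dict built for one JSON object; return its replacement.
    return {key: 'Cat' if value is None else value
            for key, value in obj.items()}


text = '{"a": null, "b": {"c": null}, "d": [null]}'
decoded = json.loads(text, object_hook=replace_none)
print(decoded)
# {'a': 'Cat', 'b': {'c': 'Cat'}, 'd': [None]}
```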
dict.values() after you decode the JSON, and convert None to 'Cat'? If not, see this: taketwoprogramming.blogspot.com/2009/06/… – Aubrey
for key in old_dict: old_dict[key] = 'Cat' if old_dict[key] == None It's not bad at all to do this right after. – Aubrey