Python JSONDecoder custom translation of null type

In Python, the JSONDecoder performs the translation of null to None by default, as seen below. How can I change that translation of null -> None to something different, e.g. null -> 'Cat'?

class json.JSONDecoder([encoding[, object_hook[, parse_float[, parse_int[, parse_constant[, strict[, object_pairs_hook]]]]]]])

Simple JSON decoder.

Performs the following translations in decoding by default:
  JSON            Python
  object          dict
  array           list
  string          unicode
  number (int)    int, long
  number (real)   float
  true            True
  false           False
  null            None

I would like json.loads('{"field1":null, "field2": "data!"}')

to return {u'field2': u'data!', u'field1': u'Cat'}

Harwell answered 29/12, 2014 at 20:58 Comment(3)
Can you just spin through dict.values() after you decode the JSON, and convert None to 'Cat'? If not, see this: taketwoprogramming.blogspot.com/2009/06/… - Aubrey
@mattm, I could just spin through all the values after they have been translated, but I would like to avoid the additional processing of a comparison for each value. I would like to extend JSONDecoder somehow to change the initial translation so that it produces 'Cat' instead of None. - Harwell
for key in old_dict: if old_dict[key] is None: old_dict[key] = 'Cat' It's not bad at all to do this right after. - Aubrey

UPDATE 12/30/2014

The easiest way to achieve this would be to use the object_hook callback of the JSONDecoder as described in my old answer below. But, since this would require an extra function call for each key-value pair in the data, this might have an impact on performance.

So, if you truly want to change how json handles null, you need to dig a little deeper. The JSONDecoder uses a scanner to find certain tokens in the JSON input. Unfortunately, the scanner is a function and not a class, so subclassing it is not that easy. The pure-Python scanner factory is called py_make_scanner and can be found in json/scanner.py. It takes a JSONDecoder as an argument and returns a scan_once function. The scan_once function receives a string and the index of the current scanner position, and returns the decoded value together with the index just past it.
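To make the scanner contract concrete, here is a minimal sketch. It assumes Python 3 (the answer itself targets Python 2, where the scanner has the same general shape), and the variable names are only illustrative. It builds the pure-Python scanner for a default decoder and unpacks the (value, end index) pair it returns:

```python
import json
import json.scanner

# Build the pure-Python scanner for a default decoder (assumes Python 3).
scan_once = json.scanner.py_make_scanner(json.JSONDecoder())

# scan_once(string, idx) parses one JSON value starting at idx and
# returns the decoded value together with the index just past it.
value, end = scan_once('null', 0)
print(value, end)

# Containers are parsed recursively through the same function.
nested, nested_end = scan_once('[1, null]', 0)
print(nested, nested_end)
```

The branch that turns the four characters null into None lives inside this function, which is why changing the mapping means supplying a different scanner.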

A simple customized scanner function could look like this:

import json
import json.scanner  # make the submodule used below explicit

def make_my_scanner(context):
    # reference to the default pure-Python scanner, used as a fallback
    internal_scanner = json.scanner.py_make_scanner(context)

    # some references for the _scan_once function below
    parse_object = context.parse_object
    parse_array = context.parse_array
    parse_string = context.parse_string
    encoding = context.encoding
    strict = context.strict
    object_hook = context.object_hook
    object_pairs_hook = context.object_pairs_hook

    # customized _scan_once
    def _scan_once(string, idx):
        try:
            nextchar = string[idx]
        except IndexError:
            raise StopIteration

        # handle containers here so that nested values are scanned
        # with this _scan_once and not with the default scanner
        if nextchar == '"':
            return parse_string(string, idx + 1, encoding, strict)
        elif nextchar == '{':
            return parse_object((string, idx + 1), encoding, strict,
                _scan_once, object_hook, object_pairs_hook)
        elif nextchar == '[':
            return parse_array((string, idx + 1), _scan_once)
        elif nextchar == 'n' and string[idx:idx + 4] == 'null':
            # the custom translation: null -> 'Cat'
            return 'Cat', idx + 4

        # everything else (numbers, true, false, ...) goes to the default scanner
        return internal_scanner(string, idx)

    return _scan_once

Now we just need a JSONDecoder subclass that will use our scanner instead of the default scanner:

class MyJSONDecoder(json.JSONDecoder):
    def __init__(self, encoding=None, object_hook=None, parse_float=None,
            parse_int=None, parse_constant=None, strict=True,
            object_pairs_hook=None):

        json.JSONDecoder.__init__(self, encoding, object_hook, parse_float, parse_int, parse_constant, strict, object_pairs_hook)

        # override scanner
        self.scan_once = make_my_scanner(self)

And then use it like this:

decoder = MyJSONDecoder()
print decoder.decode('{"field1":null, "field2": "data!"}')
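The code above is written for Python 2. As a rough sketch of how the same idea might look on Python 3, the version below drops the encoding argument (which no longer exists there) and passes the decoder's memo dict to parse_object. The names make_cat_scanner, CatDecoder and null_value are hypothetical, chosen only for this illustration:

```python
import json
import json.scanner


def make_cat_scanner(context, null_value='Cat'):
    # Pure-Python default scanner, used for everything not handled below.
    default_scan = json.scanner.py_make_scanner(context)

    parse_object = context.parse_object
    parse_array = context.parse_array
    strict = context.strict
    object_hook = context.object_hook
    object_pairs_hook = context.object_pairs_hook
    memo = context.memo

    def scan_once(string, idx):
        try:
            nextchar = string[idx]
        except IndexError:
            # The decoder expects the failing index on the exception.
            raise StopIteration(idx) from None

        # Containers are handled here so nested values use this scanner.
        if nextchar == '{':
            return parse_object((string, idx + 1), strict, scan_once,
                                object_hook, object_pairs_hook, memo)
        elif nextchar == '[':
            return parse_array((string, idx + 1), scan_once)
        elif nextchar == 'n' and string[idx:idx + 4] == 'null':
            return null_value, idx + 4

        # Strings, numbers, true, false, etc. use the default scanner.
        return default_scan(string, idx)

    return scan_once


class CatDecoder(json.JSONDecoder):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        # Replace the scanner chosen by the base class with the custom one.
        self.scan_once = make_cat_scanner(self)


result = json.loads('{"field1": null, "field2": "data!"}', cls=CatDecoder)
print(result)
```

One design caveat: this routes all parsing through the pure-Python scanner rather than the C-accelerated one that CPython normally uses when it is available, so it is worth measuring against the object_hook approach before assuming it is faster.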

Old answer, but still valid if you do not care about the performance impact of another function call:

You need to create a JSONDecoder object with a custom object_hook function:

import json

def parse_object(o):
    for key in o:
        if o[key] is None:
            o[key] = 'Cat'
    return o

decoder = json.JSONDecoder(object_hook=parse_object)

print decoder.decode('{"field1":null, "field2": "data!"}')
# that will print: {u'field2': u'data!', u'field1': u'Cat'}

According to the Python documentation of the json module:

object_hook is an optional function that will be called with the result of any object literal decoded (a dict). The return value of object_hook will be used instead of the dict.

So parse_object will get a dictionary that can be manipulated by exchanging all None values with 'Cat'. The returned object/dictionary will then be used in the output.
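Because object_hook is only called for decoded objects (dicts), a hook like the one above leaves null values inside arrays untouched. The following is a small sketch of one way to cover arrays nested in objects as well, assuming Python 3; replace_null and cat_hook are hypothetical helper names, not part of the json module:

```python
import json


def replace_null(value, replacement='Cat'):
    # Swap None for the replacement, descending into lists.
    # Dicts are returned unchanged: the hook has already processed them.
    if value is None:
        return replacement
    if isinstance(value, list):
        return [replace_null(item, replacement) for item in value]
    return value


def cat_hook(obj):
    # Called by the decoder with every decoded JSON object (a dict).
    return {key: replace_null(val) for key, val in obj.items()}


text = '{"a": null, "b": [null, {"c": null}], "d": 1}'
decoded = json.loads(text, object_hook=cat_hook)
print(decoded)
```

A document whose top-level value is an array or a bare null never reaches the hook at all, so that case would still need a separate pass over the result.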

Boresome answered 29/12, 2014 at 21:41 Comment(3)
Timo D, this is a great solution, but not exactly what I am trying to accomplish. During the initial translation within the JSONDecoder there is a mapping that equates null to None, and I want to change that mapping by extending JSONDecoder somehow. There would be a performance advantage to using a custom translation initially over post-processing with the object_hook. I am assuming the object_hook function is called after the initial translation has occurred, and only then changes None -> 'Cat'. So how can I change the translation rule by extending JSONDecoder? - Harwell
Fair enough, I updated my answer to include a way that should not impact performance. - Boresome
Timo D, most excellent explanation, thank you. Nice to see how to specify a custom scanner; this opens a new world of possibilities. - Harwell