Pretty print json but keep inner arrays on one line python

I am pretty printing a json in Python using this code:

json.dumps(json_output, indent=2, separators=(',', ': '))

This prints my json like:

{    
    "rows_parsed": [
        [
          "a",
          "b",
          "c",
          "d"
        ],
        [
          "e",
          "f",
          "g",
          "i"
        ]
    ]
}

However, I want it to print like:

{    
    "rows_parsed": [
        ["a","b","c","d"],
        ["e","f","g","i"]
    ]
}

How can I keep the arrays that are in arrays all on one line like above?

Fennessy answered 8/10, 2014 at 19:18 Comment(7)
Note that your desired output does not keep all arrays on one line. - Sewing
Great point. Let me clarify my question. - Fennessy
(Easy:) consider pprint. (Hard:) consider writing a custom JSONEncoder and passing it as the cls argument to dumps. (Obligatory:) think again about why you need all this. - Cranach
Possible duplicate of "JSON dumps custom formatting". - Glut
Do you want to keep "arrays that are in arrays" all on one line, or do you really want to keep arrays that don't contain other arrays or dicts on one line? The latter seems like a more natural thing to want. - Dialogist
@Cranach pprint does not produce valid json. - Easternmost
@mara004: Indeed, it uses single quotes. I can't edit my comment though :( - Cranach

Here is a way to do it with as few modifications as possible:

import json
from json import JSONEncoder


class MarkedList:
    """Wrapper marking a list that should be printed on a single line."""

    def __init__(self, l):
        self._list = l

z = {    
    "rows_parsed": [
        MarkedList([
          "a",
          "b",
          "c",
          "d"
        ]),
        MarkedList([
          "e",
          "f",
          "g",
          "i"
        ]),
    ]
}

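class CustomJSONEncoder(JSONEncoder):
    def default(self, o):
        if isinstance(o, MarkedList):
            # Note: str.format() embeds the Python repr of the list, so strings
            # come out single-quoted and None is not converted to null.
            return "##<{}>##".format(o._list)
        # Let the base class raise TypeError for any other unsupported type.
        return super().default(o)

# Serialize, then strip the markers together with the quotes dumps put around them.
b = json.dumps(z, indent=2, separators=(',', ':'), cls=CustomJSONEncoder)
b = b.replace('"##<', "").replace('>##"', "")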

print(b)

Basically, you wrap each list that you want formatted this way in a MarkedList. The encoder serializes it as a string surrounded by a marker sequence that is hopefully unique enough, and the markers are later stripped from the output of dumps together with the quotes that JSON puts around a string.
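
The marker idea above can be made to emit valid JSON. What follows is only a minimal sketch under some assumptions, not the answer's own code: instead of embedding the list's Python repr in the marker string, each wrapped list is serialized with json.dumps and stored under a random placeholder, which is swapped back in afterwards. The names OneLineEncoder and dumps_one_line are made up for this illustration, and the MarkedList here is a simplified stand-in for the class above.

```python
import json
import uuid


class MarkedList:
    """Hypothetical wrapper: marks a list that should stay on one line."""

    def __init__(self, items):
        self.items = items


class OneLineEncoder(json.JSONEncoder):
    """Emits a unique placeholder string for every MarkedList it meets."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.replacements = {}  # placeholder -> compact JSON text

    def default(self, o):
        if isinstance(o, MarkedList):
            token = "@@{}@@".format(uuid.uuid4().hex)
            # Let json itself serialize the wrapped list, so quotes and
            # null/true/false are valid JSON rather than Python repr.
            self.replacements[token] = json.dumps(o.items, separators=(",", ":"))
            return token
        return super().default(o)


def dumps_one_line(obj, indent=2):
    encoder = OneLineEncoder(indent=indent)
    text = encoder.encode(obj)
    for token, compact in encoder.replacements.items():
        # The placeholder was written as a JSON string, so replace it
        # together with its surrounding quotes.
        text = text.replace('"{}"'.format(token), compact)
    return text


z = {"rows_parsed": [MarkedList(["a", "b", "c", "d"]), MarkedList(["e", None, 1])]}
print(dumps_one_line(z))
```

Because the placeholder contains only ASCII letters, digits and "@", the outer dumps does not escape it, so no unescaping step is needed. A MarkedList nested inside another MarkedList is not handled by this sketch.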

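Another way, much more efficient but also much uglier, is to monkey-patch the _iterencode function inside json.encoder._make_iterencode. Since _iterencode is a nested function, in practice this means replacing _make_iterencode with a modified copy whose _iterencode looks something like: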

def _iterencode(o, _current_indent_level):
    if isinstance(o, str):
        yield _encoder(o)
    elif o is None:
        yield 'null'
    elif o is True:
        yield 'true'
    elif o is False:
        yield 'false'
    elif isinstance(o, int):
        # see comment for int/float in _make_iterencode
        yield _intstr(o)
    elif isinstance(o, float):
        # see comment for int/float in _make_iterencode
        yield _floatstr(o)
    elif isinstance(o, MarkedList):
        yield _my_custom_parsing(o)
    elif isinstance(o, (list, tuple)):
        yield from _iterencode_list(o, _current_indent_level)
    elif isinstance(o, dict):
        yield from _iterencode_dict(o, _current_indent_level)
    else:
        if markers is not None:
            markerid = id(o)
            if markerid in markers:
                raise ValueError("Circular reference detected")
            markers[markerid] = o
        o = _default(o)
        yield from _iterencode(o, _current_indent_level)
        if markers is not None:
            del markers[markerid]
Knobby answered 11/9, 2018 at 19:45 Comment(1)
"##<{}>##".format(o._list) doesn't work if the list contains None, and also renders single quotes which aren't valid json. - Whirl

I don't see how you could do it with json.dumps alone. After a bit of searching I came across a few options. One option is to post-process the output with a custom function:

def fix_json_indent(text, indent=3):
    # Column at which the lines to be joined start (indent levels of 4 spaces each)
    space_indent = indent * 4
    initial = " " * space_indent
    json_output = []
    current_level_elems = []
    all_entries_at_level = None  # holder for consecutive entries at exact space_indent level
    for line in text.splitlines():
        if line.startswith(initial):
            if line[space_indent] == " ":
                # line indented further than the level
                if all_entries_at_level:
                    current_level_elems.append(all_entries_at_level)
                    all_entries_at_level = None
                item = line.strip()
                current_level_elems.append(item)
                if item.endswith(","):
                    current_level_elems.append(" ")
            elif current_level_elems:
                # line on the same space_indent level
                # no more sublevel_entries 
                current_level_elems.append(line.strip())
                json_output.append("".join(current_level_elems))
                current_level_elems = []
            else:
                # line at the exact space_indent level but no items indented further
                if all_entries_at_level:
                    # last pending item was not the start of a new sublevel_entries.
                    json_output.append(all_entries_at_level)
                all_entries_at_level = line.rstrip()
        else:
            if all_entries_at_level:
                json_output.append(all_entries_at_level)
                all_entries_at_level = None
            if current_level_elems:
                json_output.append("".join(current_level_elems))
            json_output.append(line)
    return "\n".join(json_output)

Another possibility is a regex but it is quite ugly and depends on the structure of the code you posted:

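import re

def fix_json_indent(text):
    # Inserts line breaks around the outer braces and brackets;
    # the inner arrays are left untouched. Raw strings avoid invalid-escape warnings.
    return re.sub(r'{"', '{\n"', re.sub(r'\[\[', '[\n[', re.sub(r'\]\]', ']\n]', re.sub(r'}', '\n}', text))))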
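
A regex can also work in the other direction, starting from the indented output of json.dumps and collapsing every array that contains no nested array or object. This is a different technique from the two functions above, shown only as a rough sketch; the name collapse_innermost_arrays is made up, and it assumes that no string in the data contains bracket or brace characters.

```python
import json
import re

# Matches an array whose body contains no further brackets or braces,
# i.e. an innermost array (possibly spread over several lines).
_INNERMOST_ARRAY = re.compile(r'\[[^\[\]{}]*\]')


def collapse_innermost_arrays(text):
    def _collapse(match):
        try:
            # Re-serialize the matched array on a single line.
            return json.dumps(json.loads(match.group(0)))
        except ValueError:
            # Not a complete JSON array (e.g. a bracket inside a string):
            # leave the text as it was.
            return match.group(0)

    return _INNERMOST_ARRAY.sub(_collapse, text)


pretty = json.dumps({"rows_parsed": [["a", "b", "c", "d"], ["e", "f", "g", "i"]]}, indent=4)
print(collapse_innermost_arrays(pretty))
```

The collapsed arrays are re-encoded with the default settings of json.dumps, so non-ASCII characters inside them come out as \u escapes even if the original text was produced with ensure_ascii=False.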
Eme answered 10/7, 2019 at 12:51 Comment(0)

I modified @Martin Gergov's answer a bit to make things simpler and more JSON-friendly.

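import json

def transform(json_obj, indent=4):
    def inner_transform(o):
        if isinstance(o, (list, tuple)):
            # A list that directly holds a dict stays multi-line and is processed recursively.
            if any(isinstance(v, dict) for v in o):
                return [inner_transform(v) for v in o]
            # elif any(isinstance(v, list) for v in o): # check note on the bottom
            #     ...
            # Otherwise serialize the list right away and wrap it in markers.
            return "##<{}>##".format(json.dumps(o))
        elif isinstance(o, dict):
            return {k: inner_transform(v) for k, v in o.items()}
        return o

    if isinstance(json_obj, dict):
        transformed = {k: inner_transform(v) for k, v in json_obj.items()}
    elif isinstance(json_obj, (list, tuple)):
        transformed = [inner_transform(v) for v in json_obj]
    else:
        # Scalars need no transformation.
        transformed = json_obj

    # ',' (without a trailing space) avoids trailing spaces at line ends when indent is set.
    transformed_json = json.dumps(transformed, separators=(',', ': '), indent=indent)
    # Strip the markers and their quotes, then undo the escaping of the inner quotes.
    # Caveat: this simple unescaping breaks for strings that themselves contain quotes or backslashes.
    transformed_json = transformed_json.replace('"##<', "").replace('>##"', "").replace('\\"', "\"")

    return transformed_json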

Test it with this

data = [
    [
        [1,2,3],
        {
            "a": ["a", 'b', "c", "d"],
            "b": {
                "x": [1, 2, 3, None],
                "y": "value"
            },
            "c": [1, 2, 3]
        }
    ]
]

pretty_json = transform(data)
print(pretty_json)

Result:

[
    [
        [1, 2, 3],
        {
            "a": ["a", "b", "c", "d"],
            "b": {
                "x": [1, 2, 3, null],
                "y": "value"
            },
            "c": [1, 2, 3]
        }
    ]
]

Note: only the direct elements of a list are checked for dicts. If a dict sits more than one list deep, as in [[1,2,[2, {"a": 0}]]], the whole outer list, dict included, ends up on one line; to handle that case you'd have to modify the function yourself...
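
The nesting limitation mentioned above comes from the marker trick itself. A different technique, not the marker approach of this answer, is a small recursive formatter that builds the text directly and puts a list on one line only when it contains no list, tuple or dict. The sketch below is only an illustration: the name dumps_compact_leaves is made up, and it assumes that all dict keys are strings.

```python
import json


def dumps_compact_leaves(obj, indent=4, _level=0):
    """Pretty-print JSON, keeping lists of scalars on a single line."""
    pad = " " * (indent * (_level + 1))   # indentation of child items
    end_pad = " " * (indent * _level)     # indentation of the closing bracket

    if isinstance(obj, dict):
        if not obj:
            return "{}"
        # Assumption: keys are strings, so json.dumps(k) yields a quoted key.
        items = [
            "{}{}: {}".format(pad, json.dumps(k), dumps_compact_leaves(v, indent, _level + 1))
            for k, v in obj.items()
        ]
        return "{\n" + ",\n".join(items) + "\n" + end_pad + "}"

    if isinstance(obj, (list, tuple)):
        # A list holding only scalars is written on one line by json.dumps.
        if not any(isinstance(v, (dict, list, tuple)) for v in obj):
            return json.dumps(obj)
        items = [pad + dumps_compact_leaves(v, indent, _level + 1) for v in obj]
        return "[\n" + ",\n".join(items) + "\n" + end_pad + "]"

    # Scalars: strings, numbers, booleans, None.
    return json.dumps(obj)


print(dumps_compact_leaves([[1, 2, [2, {"a": 0}]]]))
```

Since every scalar and every leaf list goes through json.dumps, quoting and escaping stay valid without any marker replacement or unescaping step.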

Theobald answered 13/7, 2024 at 15:7 Comment(0)
