Create nested JSON from flat csv
Asked Answered
C

1

3

Trying to create a 4 deep nested JSON from a csv based upon this example:

Region,Company,Department,Expense,Cost
Gondwanaland,Bobs Bits,Operations,nuts,332
Gondwanaland,Bobs Bits,Operations,bolts,254
Gondwanaland,Maureens Melons,Operations,nuts,123

At each level I would like to sum the costs and include it in the outputted JSON at the relevant level.

The structure of the outputted JSON should look something like this:

    {
          "id": "aUniqueIdentifier", 
          "name": "usually a nodes name", 
          "data": [
                {
                      "key": "some key", 
                      "value": "some value"
                }, 
                {
                      "key": "some other key", 
                      "value": "some other value"
                }
          ], 
          "children": [/* other nodes or empty */ ]
    }

(REF: http://blog.thejit.org/2008/04/27/feeding-json-tree-structures-to-the-jit/)

Thinking along the lines of a recursive function in python but have not had much success with this approach so far... any suggestions for a quick and easy solution greatly appreciated?

UPDATE: Gradually giving up on the idea of the summarised costs because I just can't figure it out :(. I'not much of a python coder yet)! Simply being able to generate the formatted JSON would be good enough and I can plug in the numbers later if I have to.

Have been reading, googling and reading for a solution and on the way have learnt a lot but still no success in creating my nested JSON files from the above CSV strucutre. Must be a simple solution somewhere on the web? Maybe somebody else has had more luck with their search terms????

Chowchow answered 30/10, 2011 at 23:39 Comment(4)
If this is homework, please label it as such.Moonshine
nope its experimentations with data visualisation of government budgets by a newbie... Trying to put 15,000 records into a nice treemap with on demand nodes (thejit.org) . Have been googling and experimenting all day without luck... lots of similar things but nothing working for me yet.Chowchow
Fair enough. I could show you how to write the grouping of rows into a nested data structure, but I'm unclear on how the fields in your sample input map to the target structure. Can you post the exact json output expected given the sample input you've shown?Moonshine
Thanks for the pointers! First experience with js and python. I have some learning to do before I solve this oneChowchow
M
8

Here are some hints.

Parse the input to a list of lists with csv.reader:

>>> rows = list(csv.reader(source.splitlines()))

Loop over the list to buildi up your dictionary and summarize the costs. Depending on the structure you're looking to create the build-up might look something like this:

>>> summary = []
>>> for region, company, department, expense, cost in rows[1:]:
    summary.setdefault(*region, company, department), []).append((expense, cost))

Write the result out with json.dump:

>>> json.dump(summary, open('dest.json', 'wb'))

Hopefully, the recursive function below will help get you started. It builds a tree from the input. Please be aware of what type you want your leaves to be in, which we label as the "cost". You'll need to elaborate on the function to build-up the exact structure you intend:

import csv, itertools, json

def cluster(rows):
    result = []
    for key, group in itertools.groupby(rows, key=lambda r: r[0]):
        group_rows = [row[1:] for row in group]
        if len(group_rows[0]) == 2:
            result.append({key: dict(group_rows)})
        else:
            result.append({key: cluster(group_rows)})
    return result

if __name__ == '__main__':
    s = '''\
Gondwanaland,Bobs Bits,Operations,nuts,332
Gondwanaland,Bobs Bits,Operations,bolts,254
Gondwanaland,Maureens Melons,Operations,nuts,123
'''
    rows = list(csv.reader(s.splitlines()))
    r = cluster(rows)
    print json.dumps(r, indent=4)
Moonshine answered 31/10, 2011 at 0:6 Comment(2)
It is only showing: { "Gondwanaland": [ { "Bobs Bits": [ { "Operations": [ "nuts" ] } ], "Maureens Melons": [ { "Operations": [ "nuts" ] } ] } ] } The second line `Gondwanaland,Bobs Bits,Operations,bolts,254' is not showing. Also, how to add the last column as value(nuts:332)? Can you please help?Agee
Fixed-up the last level of grouping. Thanks for noticing the issue.Moonshine

© 2022 - 2024 — McMap. All rights reserved.