Using Cypher to return nested, hierarchical JSON from a tree

Asked 11/12, 2015 at 23:42 Answered 1/7, 2022 at 17:45

I'm currently using the example data on console.neo4j.org to write a query that outputs hierarchical JSON.

The example data is created with

create (Neo:Crew {name:'Neo'}), (Morpheus:Crew {name: 'Morpheus'}), (Trinity:Crew {name: 'Trinity'}), (Cypher:Crew:Matrix {name: 'Cypher'}), (Smith:Matrix {name: 'Agent Smith'}), (Architect:Matrix {name:'The Architect'}),
(Neo)-[:KNOWS]->(Morpheus), (Neo)-[:LOVES]->(Trinity), (Morpheus)-[:KNOWS]->(Trinity),
(Morpheus)-[:KNOWS]->(Cypher), (Cypher)-[:KNOWS]->(Smith), (Smith)-[:CODED_BY]->(Architect)

The ideal output is as follows

{
  name: "Neo",
  children: [
    {
      name: "Morpheus",
      children: [
        {name: "Trinity", children: []},
        {name: "Cypher", children: [
          {name: "Agent Smith", children: []}
        ]}
      ]
    }
  ]
}

Right now, I'm using the following query

MATCH p =(:Crew { name: "Neo" })-[q:KNOWS*0..]-m
RETURN extract(n IN nodes(p)| n)

and getting this

[(0:Crew {name:"Neo"})]
[(0:Crew {name:"Neo"}), (1:Crew {name:"Morpheus"})]
[(0:Crew {name:"Neo"}), (1:Crew {name:"Morpheus"}), (2:Crew {name:"Trinity"})]
[(0:Crew {name:"Neo"}), (1:Crew {name:"Morpheus"}), (3:Crew:Matrix {name:"Cypher"})]
[(0:Crew {name:"Neo"}), (1:Crew {name:"Morpheus"}), (3:Crew:Matrix {name:"Cypher"}), (4:Matrix {name:"Agent Smith"})]

Any tips to figure this out? Thanks

Beefeater answered 11/12, 2015 at 23:42 Comment(0)

In Neo4j 3.x, once the APOC plugin is installed on the Neo4j server, you can call the apoc.convert.toTree procedure to generate a similar result.

For example:

MATCH p=(n:Crew {name:'Neo'})-[:KNOWS*]->(m)
WITH COLLECT(p) AS ps
CALL apoc.convert.toTree(ps) yield value
RETURN value;

... would return a result row that looks like this:

    {
      "_id": 127,
      "_type": "Crew",
      "name": "Neo",
      "knows": [
        {
          "_id": 128,
          "_type": "Crew",
          "name": "Morpheus",
          "knows": [
            {
              "_id": 129,
              "_type": "Crew",
              "name": "Trinity"
            },
            {
              "_id": 130,
              "_type": "Crew:Matrix",
              "name": "Cypher",
              "knows": [
                {
                  "_id": 131,
                  "_type": "Matrix",
                  "name": "Agent Smith"
                }
              ]
            }
          ]
        }
      ]
    }
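
The result above nests child nodes under a key named after the relationship type (`knows`) rather than under `children`, and leaf nodes carry no such key at all. If the client needs the exact shape from the question, one option is a small post-processing step. Here is a minimal Python sketch, assuming the row has already been decoded into a plain dict. The function name `to_children_shape` and its `rel_key` parameter are hypothetical and not part of APOC.

```python
def to_children_shape(node, rel_key="knows"):
    """Recursively rewrite a toTree-style dict into {name, children} form."""
    return {
        "name": node["name"],
        # Leaves have no rel_key entry in the result shown above, so default to [].
        "children": [to_children_shape(child, rel_key)
                     for child in node.get(rel_key, [])],
    }


# Copy of the result row shown above.
row = {
    "_id": 127, "_type": "Crew", "name": "Neo",
    "knows": [{
        "_id": 128, "_type": "Crew", "name": "Morpheus",
        "knows": [
            {"_id": 129, "_type": "Crew", "name": "Trinity"},
            {"_id": 130, "_type": "Crew:Matrix", "name": "Cypher",
             "knows": [{"_id": 131, "_type": "Matrix", "name": "Agent Smith"}]},
        ],
    }],
}

tree = to_children_shape(row)
```

This sketch drops `_id` and `_type` and follows only one relationship key. A tree built from several relationship types would need one pass per key or a list of keys.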

Ecdysiast answered 26/7, 2016 at 16:40 Comment(3)

The APOC repo gives the history of the name. – Ecdysiast 27/7, 2016 at 19:27

Yes. I was referring to the fact that the sample in this question is also from the Matrix. – Humfried 29/7, 2016 at 5:47

This is by far the most powerful feature (well, for me at least) of the APOC plugin, and it is very easy to bring into Docker with the Neo4j image (see documentation). If you use docker-compose, add volumes: - <directory where you put the .jar>:/plugins, then "build" and "up". Thanks for sharing this @cybersam, changed my life and free time! – Quantitative 3/5, 2017 at 16:1

This was such a useful thread on an important topic that I thought I'd add a few thoughts after digging into it a bit further.

First off, the APOC "toTree" procedure has some limits or, better said, dependencies: it really matters how "tree-like" your architecture is. For example, the LOVES relationship is missing from the APOC call above, and I understand why. That relationship is hard to include with "toTree" because it acts like an attribute in the hierarchy but is modeled as a relationship. It is not hard to add, but it confounds the simple KNOWS tree. So a good question to ask is "how do I handle such challenges?", and this reply is about that.

I do recommend improving one's JSON skills, as this gives you much more granular control. Personally, I found my initial exploration somewhat painful, perhaps because I'm an XML person :), but once you figure out all the [, {, and ('s, it is a powerful and efficient way to pull what is best described as a report on your data. And since the JSON can easily be mapped onto a class, it is a nice way to push the result back to your app.

I have also found performance to be a challenge with "toTree" compared with just asking for the JSON directly. Below is a very simplistic look at what your RETURN could look like, starting with a BNF-style template of its structure. I'd love to see this worked out more maturely, as the possibilities are quite varied, but I would have found even this useful, so I’ll post this immature version for now. As they say, “a deeper dive is left up to the readers” 😊

I've obfuscated the values, but this is an actual query on what I’ll term a very poor example of a graph architecture, whose many design “mistakes” cause significant performance headaches when you try to pull a holistic report from the graph. The initial report query I inherited took many minutes on a server and could not run on my laptop at all. Using this strategy, the updated query now runs in about 5 seconds on my rather wimpy laptop, on a database of about 200K nodes and 0.5M relationships. I added the “persons” grouping alias as a reminder that "persons" will be different in each array element, while the parent construct is repeated over and over again. Where you put that in your hand-grown tree matters, but having the ability to do it is powerful.

Bottom line: a mature use of JSON-style maps in the RETURN statement (Cypher calls them map projections) gives you powerful control over the results of a Cypher query.

RETURN STATEMENT CONTENT:    
<cypher_alias> 
      {.<cypher_alias_attribute>,
        ...,
        <grouping_alias>:
          (<cypher_alias>
            {.<cypher_alias_attribute>,
              ...
            }
          )
        ...
      }

MATCH (j:J{uuid:'abcdef'})-[:J_S]->(s:S)<-[:N_S]-(n:N)-[:N_I]->(i:I), (j)-[:J_A]->(a:P)
WHERE i.title IN ['title1', 'title2']
WITH a, j, s, i, collect(n.description) AS desc
RETURN j{.title,
         persons:(a{.email,.name}),
         s_i_note:(s{.title, i_notes:(i{.title,desc})})}

Lounging answered 14/6, 2020 at 18:56 Comment(0)

If you know how deep your tree is, you can write something like this:

MATCH p =(:Crew { name: "Neo" })-[q:KNOWS*0..]-(m)

WITH nodes(p)[0] AS a, nodes(p)[1] AS b, nodes(p)[2] AS c, nodes(p)[3] AS d, nodes(p)[4] AS e
WITH (a{.name}) AS ab, (b{.name}) AS bb, (c{.name}) AS cb, (d{.name}) AS db, (e{.name}) AS eb

WITH ab, bb, cb, db{.*,children:COLLECT(eb)} AS ra
WITH ab, bb, cb{.*,children:COLLECT(ra)} AS rb
WITH ab, bb{.*,children:COLLECT(rb)} AS rc
WITH ab{.*,children:COLLECT(rc)} AS rd

RETURN rd

Line 1 is your query. It stores every path from Neo to m in p.
In line 2, p is split into a, b, c, d and e. For paths with fewer than five nodes, the missing positions are null.
Line 3 keeps just the names of the nodes. If you want all properties, you can write (a{.*}) AS ab. This step is optional; you can also work with the nodes themselves if you want to.

In line 4, db and eb are replaced by a map containing all properties of db plus the new property children, which holds all entries of eb for the same db.
Lines 5, 6 and 7 do basically the same thing, each one level higher. Every step reduces the result list by grouping.
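
To make the grouping steps concrete, here is a rough Python emulation of what each of those WITH lines does to the rows. It is a sketch under stated assumptions, not how Neo4j executes the query: rows are modeled as tuples of names padded with None, `group_step` is a hypothetical helper, and it relies on two Cypher behaviours, namely that COLLECT() skips nulls and that a map projection on null yields null.

```python
def group_step(rows):
    """Emulate one `WITH <keys>, parent{.*, children: COLLECT(child)}` step."""
    groups = {}
    for *keys, parent, child in rows:
        collected = groups.setdefault((*keys, parent), [])
        if child is not None:  # COLLECT() skips nulls
            collected.append(child)
    result = []
    for (*keys, parent), children in groups.items():
        # A map projection on null yields null, so missing levels stay None.
        node = None if parent is None else {"name": parent, "children": children}
        result.append((*keys, node))
    return result


# The five paths from the question, as lists of names.
paths = [
    ["Neo"],
    ["Neo", "Morpheus"],
    ["Neo", "Morpheus", "Trinity"],
    ["Neo", "Morpheus", "Cypher"],
    ["Neo", "Morpheus", "Cypher", "Agent Smith"],
]
depth = 5  # a, b, c, d, e
rows = [tuple(p + [None] * (depth - len(p))) for p in paths]
# Like eb in the query, the deepest level is a map without a children key.
rows = [(*r[:-1], None if r[-1] is None else {"name": r[-1]}) for r in rows]

for _ in range(depth - 1):  # one pass per grouping WITH line
    rows = group_step(rows)

tree = rows[0][0]
```

After the four passes a single row is left, holding the nested map rooted at Neo.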

Finally you return the tree. It looks like this:

{
  "name": "Neo",
  "children": [
    {
      "name": "Morpheus",
      "children": [
        {"name": "Trinity", "children": []},
        {"name": "Cypher","children": [
            {"name": "Agent Smith","children": []}
          ]
        }
      ]
    }
  ]
}

Unfortunately, this solution only works when you know how deep your tree is, and you have to extend the query (one more variable and one more WITH line) for every additional level.
If someone has an idea how to solve this with dynamic tree depth, please comment.
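
One way around the fixed depth, if post-processing outside Cypher is acceptable, is to return the paths unchanged and fold them into a tree on the client. Here is a minimal Python sketch, assuming each result row has been reduced to a list of names from the root to the end node (as in the path rows in the question), that all paths share the same root, and that names are unique among siblings. `build_tree` is a hypothetical helper name.

```python
def build_tree(paths):
    """Fold root-to-node name paths into one nested {name, children} dict."""
    root = None
    for path in paths:
        if not path:
            continue
        if root is None:
            root = {"name": path[0], "children": []}
        node = root
        for name in path[1:]:
            # Reuse the child if an earlier path already created it.
            child = next((c for c in node["children"] if c["name"] == name), None)
            if child is None:
                child = {"name": name, "children": []}
                node["children"].append(child)
            node = child
    return root


paths = [
    ["Neo"],
    ["Neo", "Morpheus"],
    ["Neo", "Morpheus", "Trinity"],
    ["Neo", "Morpheus", "Cypher"],
    ["Neo", "Morpheus", "Cypher", "Agent Smith"],
]
tree = build_tree(paths)
```

Because the loop simply walks each path, the depth does not have to be known in advance. The apoc.convert.toTree call from the earlier answer also accepts paths of any length, at the cost of relationship-named keys instead of `children`.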

Yellowweed answered 1/7, 2022 at 17:45 Comment(0)
