Is it possible to iterate through all nodes with py2neo
C

3

12

Is there a way to iterate through every node in a Neo4j database using py2neo?

My first thought was to iterate over GraphDatabaseService directly, but that didn't work. If there isn't a way to do it with py2neo, is there another Python interface that would let me?

Edit: I'm accepting @Nicholas's answer for now, but I'll update it if someone can give me a way that returns a generator.

Conviction answered 18/6, 2012 at 2:9 Comment(0)
M
13

I would suggest doing that with asynchronous Cypher, something like:

    from py2neo import neo4j, cypher

    graph_db = neo4j.GraphDatabaseService()

    def handle_row(row):
        node = row[0]
        # do something with `node` here

    cypher.execute(graph_db, "START z=node(*) RETURN z", row_handler=handle_row)

Of course you might want to exclude the reference node or otherwise tweak the query.

Nige

Merth answered 18/6, 2012 at 14:51 Comment(4)
Thanks, looks like this works. I'm assuming for a large graph it won't load all of them into Python memory at once, correct? - Conviction
Correct. The asynchronous Cypher execution submits each row for handling as it's received from the HTTP response stream. - Merth
As of py2neo 1.6 (due for release October 2013) this will be possible with a streamed set of Cypher query results and standard Python iteration. - Merth
It gives me the error TypeError: <function handle_row at 0x7eff53168578> is not JSON serializable. What would be the equivalent for recent versions of py2neo? - Inessive
A
4

Two solutions come to mind. One is to run a Cypher query:

    START n=node(*) RETURN n

The other, which I'll show in Java because I'm not familiar with Python, is:

    GlobalGraphOperations.at(graphDatabaseService).getAllNodes()

which is the replacement that the old, deprecated graphDatabaseService.getAllNodes() recommends.

Andyane answered 18/6, 2012 at 2:51 Comment(2)
Thanks. Executing the Cypher query START n=node(*) RETURN n returns a list, but I couldn't find an analog to your second answer. Now accepting answers that return generators. - Conviction
I have considered several options for implementing a generator to iterate through all nodes in the database. Unfortunately, I don't think there is a way to achieve this without either (i) keeping the HTTP connection open until the application code has iterated through all items or (ii) loading all items into memory beforehand. The key issue with the generator approach is that traversal is necessarily controlled by the code using the generator rather than by the code providing it. This is why I feel the callback mechanism is preferable for this purpose. - Merth
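To make the trade-off in the comment above concrete, here is a minimal sketch of how a callback-driven producer can be wrapped in a generator using a background thread and a bounded queue. It does not use py2neo at all: `rows_as_generator` and `fake_query` are hypothetical names invented for this illustration, and `fake_query` merely stands in for something like a Cypher call that accepts a row handler.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the stream


def rows_as_generator(run_with_handler, maxsize=100):
    """Wrap a callback-driven producer in a generator.

    `run_with_handler` is a hypothetical callable that takes a row handler
    and calls it once per row (for example, a streaming query function).
    """
    buffer = queue.Queue(maxsize=maxsize)

    def produce():
        try:
            # put() blocks when the buffer is full, so the producer can
            # never run more than `maxsize` rows ahead of the consumer.
            run_with_handler(buffer.put)
        finally:
            buffer.put(_DONE)

    threading.Thread(target=produce, daemon=True).start()

    while True:
        row = buffer.get()
        if row is _DONE:
            return
        yield row


def fake_query(handler):
    """Stand-in for a streaming query: emits five single-column rows."""
    for node_id in range(5):
        handler(["node-%d" % node_id])


rows = [row[0] for row in rows_as_generator(fake_query)]
```

Note that the producer stays blocked for as long as the consumer is slow or stops iterating, which is essentially option (i) above: a real query would hold its connection open for the whole iteration.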
F
4

For newer versions of py2neo the accepted answer no longer works. Instead use:

    from py2neo import Graph

    graph = Graph("http://user:pass@localhost:7474/db/data/")

    for n in graph.cypher.stream("START z=node(*) RETURN z"):
        # do something with node here
        print(n)
Faker answered 4/2, 2016 at 21:46 Comment(1)
Looks like this one doesn't work in py2neo 4; it gives the error AttributeError: 'Graph' object has no attribute 'cypher' :( - Sheriff
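For later py2neo releases, a possible equivalent is sketched below. It assumes that `Graph.run()` (available from py2neo v3 onwards, as far as I know) returns a cursor that can be iterated record by record, and that each record can be indexed by column name. The helper `iter_nodes` and the `FakeGraph` stand-in are hypothetical names added only so the sketch runs without a database. The query uses `MATCH (n) RETURN n` because the `START n=node(*)` form was removed from newer Cypher versions.

```python
def iter_nodes(graph, query="MATCH (n) RETURN n"):
    """Yield nodes one at a time from a py2neo-style graph object.

    Assumes graph.run(query) returns an iterable of records that can be
    indexed by column name, which is how py2neo's cursor is expected to
    behave in v3 and later.
    """
    for record in graph.run(query):
        yield record["n"]


class FakeGraph:
    """Stand-in for py2neo.Graph so the sketch runs without a database."""

    def run(self, query):
        # A real cursor would stream records from the server instead.
        return iter([{"n": "alice"}, {"n": "bob"}])


names = list(iter_nodes(FakeGraph()))

# With a real database (connection details depend on your setup):
#   from py2neo import Graph
#   graph = Graph(...)
#   for node in iter_nodes(graph):
#       print(node)
```

Because `iter_nodes` is a generator, nodes are handed to the caller as the cursor produces them rather than being collected into a list first.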
