Choosing an appropriate way to use Neo4j in Python

Asked 22/5, 2012 at 13:5 Answered 24/5, 2012 at 19:26

I am currently using the embedded Python bindings for Neo4j. I have no issues so far because my graph is very small (sparse, with up to 100 nodes). The algorithm I am developing involves a lot of graph traversals, specifically DFS on the whole graph as well as on various subgraphs. In the future I intend to run the algorithm on large graphs (presumably sparse, with millions of nodes).
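For concreteness, the kind of traversal meant here can be sketched without reference to any particular binding. This is only an illustration under assumptions: the graph is a plain adjacency dict, and the function name `dfs_preorder` and its `allowed` parameter are made up for this example. It is iterative rather than recursive because Python's default recursion limit would be hit long before a deep path through millions of nodes is exhausted.

```python
def dfs_preorder(adjacency, start, allowed=None):
    """Iterative depth-first preorder over an adjacency mapping.

    adjacency: dict mapping node -> iterable of neighbour nodes
    allowed:   optional set of nodes restricting the walk to a subgraph
    (Both names are hypothetical; no Neo4j API is used here.)
    """
    if allowed is not None and start not in allowed:
        return []
    visited = set()
    order = []
    stack = [start]
    while stack:
        node = stack.pop()
        if node in visited:
            continue
        visited.add(node)
        order.append(node)
        # Push neighbours in reverse so the first listed neighbour is explored first.
        for neighbour in reversed(list(adjacency.get(node, ()))):
            if neighbour in visited:
                continue
            if allowed is None or neighbour in allowed:
                stack.append(neighbour)
    return order


graph = {1: [2, 3], 2: [4], 3: [4], 4: []}
print(dfs_preorder(graph, 1))                     # whole graph
print(dfs_preorder(graph, 1, allowed={1, 3, 4}))  # restricted to a subgraph
```

Keeping the algorithm behind a small interface like this (a function that only needs "neighbours of a node") is one way to delay the choice between the embedded bindings and a REST client.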

Having read several threads about the performance of the Python/Neo4j bindings, I wonder whether I should switch to a REST API client for Python (such as bulbflow, py2neo, or neo4jrestclient) now, before I am too far along to change all the code.

Unfortunately, I did not find any comprehensive source of information to compare different approaches.

Could anyone provide some further insight into this issue? Which criteria should I take into account when choosing one of the options?

Appendicular answered 22/5, 2012 at 13:5 Comment(0)

Django is an MVC web framework so you may be interested in that if yours is to be a web application.

From the point of view of py2neo (of which I am the author), I am trying to focus hard on performance by using the batch execution mechanism automatically where appropriate as well as providing strong Cypher support. I have also recently put a lot of work into providing good options for uniqueness management within indexes - specifically, the get_or_create and add_if_none methods.

Homeward answered 23/5, 2012 at 17:38 Comment(0)

The easiest way to run algorithms from Python is to use Gremlin (https://github.com/tinkerpop/gremlin/wiki).

With Gremlin you can bundle everything into one HTTP request to reduce round-trip overhead.

Here's how to execute Gremlin scripts from Bulbs (http://bulbflow.com):

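>>> from bulbs.neo4jserver import Graph
>>> g = Graph()  # connect to the Neo4j server using the default configuration
>>> # Friends of friends: two 'knows' hops starting from vertex `id`
>>> script = "g.v(id).out('knows').out('knows')"
>>> params = dict(id=3)  # bound to `id` inside the script
>>> g.gremlin.execute(script, params)  # sent to the server as a single request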

The Bulbs Gremlin API docs are here: http://bulbflow.com/docs/api/bulbs/gremlin/
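To make the round-trip point concrete, here is a small simulation. It is a sketch only: `CountingServer` and its methods are hypothetical stand-ins that count requests in memory, not part of Bulbs, Gremlin, or Neo4j. It compares walking two 'knows' hops one lookup at a time with asking the server to do both hops in one call.

```python
class CountingServer:
    """Hypothetical stand-in for a remote graph server; it only counts requests."""

    def __init__(self, edges):
        self.edges = edges    # node id -> list of 'knows' neighbours
        self.requests = 0

    def neighbours(self, node):
        # Client-side traversal: one request per single-hop lookup.
        self.requests += 1
        return list(self.edges.get(node, []))

    def friends_of_friends(self, node):
        # Bundled traversal: both hops are evaluated "server side" in one request.
        self.requests += 1
        return [fof
                for friend in self.edges.get(node, [])
                for fof in self.edges.get(friend, [])]


edges = {3: [4, 5], 4: [6], 5: [6, 7]}

step_by_step = CountingServer(edges)
result_a = [fof
            for friend in step_by_step.neighbours(3)
            for fof in step_by_step.neighbours(friend)]

bundled = CountingServer(edges)
result_b = bundled.friends_of_friends(3)

print(result_a == result_b)                      # same answer either way
print(step_by_step.requests, bundled.requests)   # request counts differ
```

In the step-by-step version the number of requests grows with the number of nodes visited, which is why a traversal-heavy algorithm over REST benefits from pushing the whole traversal into one script.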

Luannaluanne answered 24/5, 2012 at 19:26 Comment(4)

Thanks for the recommendation. I have already read some comparisons of Gremlin vs Cypher, so I guess I will have to try both to decide which is more appropriate for my use case. There seems to be a problem with the bulbflow website. Do you know whether it will be up soon? – Appendicular 30/5, 2012 at 9:46

There was a DNS issue, and the DNS records are still updating. For now you can access it here: bulbflow.herokuapp.com – Luannaluanne 30/5, 2012 at 21:17

Is it possible to execute Cypher queries against Neo4j using bulbflow? The documentation on this seems obscure. Is it actually better (faster) to stick to Gremlin when working with bulbflow? – Appendicular 10/1, 2013 at 21:51

Yes, see this answer for examples of how to execute Cypher queries in Bulbs: #14281751 – Luannaluanne 6/8, 2013 at 1:42

I am not really sure, as I am not an expert, but I think it also depends on whether you expect to use Django and on how much of a framework you need. Py2neo is very pragmatic and slim, Bulbflow seems to build up a whole mapping stack, and neo4jrestclient concentrates on Django (though I may be wrong about that).

Bield answered 23/5, 2012 at 8:50 Comment(1)

I must confess I am not acquainted with Django. Is it not something related to web applications? I am doing everything locally on 1 machine right now. Should I still check it? – Appendicular 23/5, 2012 at 15:52
