Gremlin Python returning empty graph

I've started playing around with the gremlin-python wrapper to interact with my Gremlin Server.

I did the following steps:

./bin/gremlin.sh

Once the Gremlin Console opened, I loaded the configuration using:

graph = JanusGraphFactory.open('conf/gremlin-server/janusgraph-cassandra-es.properties')
g = graph.traversal()
saturn = g.V().has('name', 'saturn')

The code above works fine in the Gremlin shell, and I can see the vertices listed, but when I try to do the same in Python I get an empty graph. Here is my Python code:

graph = Graph()
g = graph.traversal().withRemote(DriverRemoteConnection('ws://localhost:8182/gremlin','g'))
print(g)

It prints: graphtraversalsource[graph[empty]]

Why am I getting an empty graph? As far as I can tell, it is unable to connect to the same graph source. Is there something I'm missing?

Note that in:

JanusGraphFactory.open('conf/gremlin-server/janusgraph-cassandra-es.properties')

the config file provided is the one used to start Gremlin Server.

Any help is really appreciated.

Thanks

Spokeswoman answered 6/9, 2017 at 8:25 Comment(0)

You are seeing graph[empty] because that is the actual string representation of the Python graph object (see the code here). The graph may actually contain data though, so it would be better if it were something like graph[remote] or graph[] instead. I've opened up an issue to address this.
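To make the point concrete, here is a toy sketch of an object whose printed form is a fixed label. It is not the gremlin-python source; `ToyGraph` and `ToyTraversalSource` are made-up names used only for illustration.

```python
class ToyGraph:
    """Stand-in for a client-side graph handle; it stores no vertices or edges."""

    def __repr__(self):
        # Fixed label: it is never computed from whatever the server holds.
        return "graph[empty]"


class ToyTraversalSource:
    """Stand-in for the traversal source bound to a graph handle."""

    def __init__(self, graph):
        self.graph = graph

    def __repr__(self):
        return "graphtraversalsource[" + repr(self.graph) + "]"


graph = ToyGraph()
g = ToyTraversalSource(graph)
print(graph)  # graph[empty]
print(g)      # graphtraversalsource[graph[empty]]
```

The printed text is the same whether the remote graph holds zero vertices or millions, so it is not a useful connectivity check.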

Out of the box, JanusGraph isn't configured for Python. You can find docs on how to do this in the Apache TinkerPop docs. First install gremlin-python. Here's the command, assuming you're using JanusGraph 0.1.1, which uses TinkerPop 3.2.3:

bin/gremlin-server.sh -i org.apache.tinkerpop gremlin-python 3.2.3

Next, modify conf/gremlin-server/gremlin-server.yaml to add the gremlin-python script engine:

scriptEngines: {
  gremlin-groovy: {
    imports: [java.lang.Math],
    staticImports: [java.lang.Math.PI],
    scripts: [scripts/empty-sample.groovy]},
  gremlin-jython: {},
  gremlin-python: {}
}

To use Gremlin Python, you need to go through a Gremlin Server, so start the JanusGraph pre-packaged distribution:

bin/janusgraph.sh start

From the Gremlin Console:

gremlin> graph = JanusGraphFactory.open('conf/janusgraph-cassandra-es.properties')
==>standardjanusgraph[cassandrathrift:[127.0.0.1]]
gremlin> GraphOfTheGodsFactory.load(graph)
==>null
gremlin> g = graph.traversal()
==>graphtraversalsource[standardjanusgraph[cassandrathrift:[127.0.0.1]], standard]
gremlin> g.V().count()
14:51:58 WARN  org.janusgraph.graphdb.transaction.StandardJanusGraphTx  - Query requires iterating over all vertices [()]. For better performance, use indexes
==>12

Install the Gremlin-Python driver, again matching on the TinkerPop version:

pip install gremlinpython==3.2.3

From the Python 3 shell:

>>> from gremlin_python import statics
>>> from gremlin_python.structure.graph import Graph
>>> from gremlin_python.process.graph_traversal import __
>>> from gremlin_python.process.strategies import *
>>> from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
>>> graph = Graph()
>>> g = graph.traversal().withRemote(DriverRemoteConnection('ws://localhost:8182/gremlin','g'))
>>> print(graph)
graph[empty]
>>> print(g)
graphtraversalsource[graph[empty]]
>>> g.V().count().next()
12
>>> g.addV('god').property('name', 'mars').property('age', 3500).next()
v[4280]
>>> g.V().count().next()
13

Keep in mind that in the Python shell, graph traversals are not iterated automatically, so you need to iterate each traversal yourself with iterate(), next(), or toList().
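The following toy model may help show why an un-iterated traversal returns nothing. It is only a sketch and not how gremlin-python is implemented: `ToyTraversal` and `fake_server` are hypothetical names, and the "server" is a local function that always answers 12.

```python
class ToyTraversal:
    """Toy model of a lazy traversal: steps are recorded, nothing runs until iterated."""

    def __init__(self, submit):
        self._submit = submit  # callable standing in for the round trip to the server
        self.steps = []        # recorded steps, loosely like the bytecode

    def V(self):
        self.steps.append(["V"])
        return self

    def count(self):
        self.steps.append(["count"])
        return self

    def __repr__(self):
        return repr(self.steps)

    def toList(self):
        # Only here is the recorded traversal actually sent.
        return list(self._submit(self.steps))

    def next(self):
        return self.toList()[0]


calls = []


def fake_server(steps):
    """Hypothetical server: logs the request and always answers with a count of 12."""
    calls.append(list(steps))
    return [12]


t = ToyTraversal(fake_server).V().count()
print(t)           # [['V'], ['count']]
print(len(calls))  # 0
print(t.next())    # 12
print(len(calls))  # 1
```

Building the traversal sends nothing; the request only goes out when next() or toList() is called.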

Embrace answered 6/9, 2017 at 18:55 Comment(11)
Thanks for the detailed steps, but I'm still facing an issue. When I run bin/janusgraph.sh start, it is able to connect to Cassandra and ES but times out on gremlin-server. I went through the logs, but there was no stack trace pointing to the exact error, just the timeout. I increased the wait time from the default 60 to 120, but the issue remains. Is that expected? Thanks - Spokeswoman
Following up on my comment about the connection timeout, I just made a discovery. If I add gremlin-jython: {}, gremlin-python: {} to scriptEngines in conf/gremlin-server/gremlin-server.yaml, I get the timeout error; without them I don't. But without them I'm still unable to fetch any results. Even g.V().count().next() throws KeyError: None - Spokeswoman
I've edited my post above to add a couple more steps to install the gremlin-python plugin. If you are getting a timeout on the Gremlin Server, try killing its process, then start it again with bin/gremlin-server.sh and share the output in your original question. - Embrace
I was unable to get it working with bin/janusgraph.sh start, but got it working with bin/gremlin-server.sh. The problem of nothing being fetched is also solved now that I'm using Gremlin-Python 3.2.3. I was using 3.3.0 before, so it may have been a version mismatch. But now I have another question: how do you commit changes? I was able to add a vertex with g.addV('god').property('name', 'mars').property('age', 3500), and my result shows my vertex. But how do I commit? I tried g.addV(label, 'god', 'name', 'mars', 'age', 3000).tx().commit() and that failed. Do I need to create my own Traversal()? - Spokeswoman
Adding to my previous point, how do we load GraphSON into gremlin-python? I went to tinkerpop.apache.org/docs/current/reference/#gremlin-python -> Custom Serialization but couldn't understand it. Sorry for bugging you so much; I can add this to the existing question if required, but any help is greatly appreciated. - Spokeswoman
You're asking so many questions here in the Stack Overflow comments section that are no longer related to your original question. You should start a top-level post on the gremlin-users mailing list. - Embrace
Quick answers: 1. A traversal like g.addV('god').property('name', 'mars').property('age', 3500).next() would get auto-committed. You need to make sure to iterate. 2. There is no provided API that can load GraphSON with gremlin-python at the moment. You'd have to parse the document yourself and construct the graph using traversals. - Embrace
Hi, thanks for the response. I wasn't calling next(), and that is why my records weren't getting updated. As for loading GraphSON, if there is no API provided, is there any way to load data containing 1 million rows efficiently? Manually iterating over each node and adding it to the graph doesn't seem like an optimal way forward, if I'm right. - Spokeswoman
As for the gremlin mailing list, for some reason my questions aren't getting approved by the moderators! Thanks for your patience; hopefully this will be the last question! :-) - Spokeswoman
Can you help me with another question, posted at #46139953? Thanks - Spokeswoman
I have published a complete walkthrough on how to connect to JanusGraph from Python: medium.com/@BGuigal/janusgraph-python-9e8d6988c36c - Lynnet
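On the suggestion in the comments to parse the document yourself and construct the graph using traversals, a minimal sketch might look like the following. The record shape (a 'label' key plus a 'properties' dict) is an assumption, not GraphSON itself, and `RecordingSource` is a hypothetical dry-run stand-in so the example runs without a server. With a real connection you would pass the remote `g` instead.

```python
def add_vertices(g, records):
    """Add one vertex per record through a traversal source `g`.

    `records` is an assumed shape: dicts with a 'label' key and a
    'properties' dict, e.g. produced by your own parsing code.
    """
    created = []
    for record in records:
        t = g.addV(record["label"])
        for key, value in record["properties"].items():
            t = t.property(key, value)
        # next() iterates the traversal, which is what actually sends it.
        created.append(t.next())
    return created


class RecordingSource:
    """Dry-run stand-in for `g` (hypothetical): records steps instead of sending them."""

    def __init__(self):
        self.sent = []

    def addV(self, label):
        return _RecordingTraversal(self, [("addV", label)])


class _RecordingTraversal:
    def __init__(self, source, steps):
        self._source = source
        self._steps = steps

    def property(self, key, value):
        self._steps.append(("property", key, value))
        return self

    def next(self):
        self._source.sent.append(self._steps)
        return len(self._source.sent)


dry_run = RecordingSource()
add_vertices(dry_run, [{"label": "god", "properties": {"name": "mars", "age": 3500}}])
print(dry_run.sent)
```

This issues one traversal per vertex, so it does not address the question about loading a million rows efficiently.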

Your local "g" in the Gremlin Console is an embedded instance of a graph. It therefore "contains" something and is not empty. Your "g" in Python is "empty" in the sense that on its own there are no vertices/edges within it; the vertices/edges are in the remote graph on Gremlin Server that it reflects. I assume that if you were to do a g.V().count() in Python you would get the same vertex count back as you would if you did the same in Java. If not, then there is some other problem, but do not expect a "remote" graph instance to show vertices/edges of any sort (unless a day comes when gremlin-python is written as a Gremlin virtual machine that has its own Python-native graph databases attached to it; in such a case, "g" would be embedded, would own vertices/edges, and would likely no longer print as "empty").

Agnola answered 6/9, 2017 at 10:41 Comment(9)
So do you mean that Python's Gremlin wrapper is unable to fetch the data/graph stored on the remote server? If that is the case, getting an empty graph doesn't seem like an issue. But then how do we fetch the graph stored in the DB, query it, and fetch results using Python? - Spokeswoman
No. It is perfectly capable of getting data from the remote graph. All I'm saying is that it says "empty" because the data is not local. It is analogous to EmptyGraph.instance() in Java. You only use it as a reference to a remote graph that actually holds the data. Basically, don't be confused by the label "empty"; it bears no significance to the data that is actually available remotely. - Agnola
Correct me if I'm wrong: you mean it shows empty because it doesn't store any data locally but rather references my remote dataset? If that is the case then, as you suggested, g.V().count() should give me some results, the count of the remote object, right? But even that comes back empty, as [['V'], ['count']] - Spokeswoman
What you have said is correct, and g.V().count() should return something. The output that you mention as "empty as [['V'], ['count']]" is the Gremlin bytecode representation of that traversal. With x = g.V().count(), "x" is just the traversal instance and not the result. You need to iterate that traversal in some way; in this case you would want "x = g.V().count().next()". - Agnola
So I did g.V().count().next(), and now it's throwing an exception: KeyError: None. A possible reason might be that my graph instance is actually empty. Any ideas regarding this? - Spokeswoman
I'm not sure what that error means. I would expect you to get a zero if there were no vertices, not an error. If you do g.V().count() in the Gremlin Console via a JanusGraph instance, does it show you have vertices? Perhaps do an addV() from Python and then a count to see what happens? - Agnola
So, I ran g.V().count() from the Gremlin Console, and that works like a charm. I also did addV() and then tried printing it back, though I didn't commit, and the result stayed the same! - Spokeswoman
I'm not sure what else to try. If I were you, I would probably simplify. Set up just Gremlin Server without JanusGraph (use TinkerGraph) and get gremlin-python connected to that. Make sure you can add vertices and get counts. If it works with TinkerGraph and doesn't with JanusGraph, then that helps isolate the problem. - Agnola
The KeyError: None error is most probably caused by a TinkerPop version mismatch between JanusGraph and gremlin-python. - Lynnet
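Since the thread traces the KeyError to a client/server version mismatch, a small check like the one below could catch it early. This is a sketch under assumptions: the helper names are made up, the server-side TinkerPop version has to be supplied by you (3.2.3 for JanusGraph 0.1.1, per the answer above), and an exact match is used because the answer recommends matching on the TinkerPop version.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_client_version(package="gremlinpython"):
    """Return the installed version of the pip package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def versions_match(server_tinkerpop, client):
    """True only when the client version equals the server's TinkerPop version."""
    return client is not None and client.strip() == server_tinkerpop.strip()


# The two versions mentioned in the thread: 3.3.0 client against a 3.2.3 server.
print(versions_match("3.2.3", "3.3.0"))  # False
print(versions_match("3.2.3", "3.2.3"))  # True
```

In practice you would call versions_match("3.2.3", installed_client_version()) before opening the remote connection.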
