I am using embedded JanusGraph in my Java backend. My code depends on a JanusGraph instance created with graph = JanusGraphFactory.open(conf).
AFAIK this connects to Cassandra and Elasticsearch directly and runs the JanusGraph engine inside my backend application's JVM. But if I want to scale JanusGraph, I need to run separate JanusGraph servers on a cluster and connect to them as a client from my backend.
According to the remote JanusGraph example on GitHub, this is accomplished by instantiating an EmptyGraph with graph = EmptyGraph.instance(), which is not an instance of JanusGraph but of org.apache.tinkerpop.gremlin.structure.util.empty.EmptyGraph.
From that example I understand that I can only run Gremlin queries by submitting them to the JanusGraph server, and that I cannot use the management APIs directly unless I submit the code to the server as a string.
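To make the "management code as a string" idea concrete, here is a minimal sketch of how I imagine building such a script on the client side. The class ManagementScript and its wrap method are hypothetical helpers of mine, not part of JanusGraph, and I am assuming the server exposes the graph under the binding name graph (that depends on the server configuration).

```java
// Hypothetical helper (not a JanusGraph API): wraps schema statements in a
// single management transaction expressed as a Groovy script string.
final class ManagementScript {

    // Builds a script that opens management, runs the given statements,
    // commits on success and rolls back on any exception.
    static String wrap(String... statements) {
        StringBuilder sb = new StringBuilder();
        // Assumption: the server binds the JanusGraph instance to "graph".
        sb.append("mgmt = graph.openManagement()\n");
        sb.append("try {\n");
        for (String statement : statements) {
            sb.append("  ").append(statement).append('\n');
        }
        sb.append("  mgmt.commit()\n");
        sb.append("} catch (Exception e) {\n");
        sb.append("  mgmt.rollback()\n");
        sb.append("  throw e\n");
        sb.append("}\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Example schema change: the property key "name" is only illustrative.
        System.out.println(
            wrap("mgmt.makePropertyKey('name').dataType(String.class).make()"));
    }
}
```

As far as I understand, the resulting string would then be sent with the gremlin-driver Client (for example client.submit(script)), so the management calls run inside the server JVM rather than in my backend.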
Finally, I understand that running JanusGraph Server separately is better for scalability, but I will lose direct access to the JanusGraph APIs in my code. So I want to know whether I have misunderstood something, what the pros and cons of the remote deployment approach are, and what I will lose compared with the embedded approach.
Edit:
Based on this answer (please correct me if I am wrong):
Pros/cons of connecting to a remote Gremlin Server
Pros
- The server has much more control, and all queries are centralized.
- Since everyone runs traversals/queries through the remote Gremlin Server, they are all transactionally protected. By default, the remote Gremlin Server runs each traversal/query in a transaction.
- Central strategy management
- Central schema management
Cons
- Manual transaction management is hard.
- You have to write your code as a Groovy script string and send it to the remote server (Cluster submit) to execute it transactionally.
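A sketch of what I think that last point means in practice: if several mutations must succeed or fail together, they go into one script, because (per the point above) the server wraps each request in a transaction. The class KnowsRequest, the labels, and the parameter names are hypothetical, and I am assuming the server binds its traversal source to g.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical example (names are mine): one script, one server-side transaction.
final class KnowsRequest {

    // Both vertices and the edge are created in a single request, so the
    // server commits them together or rolls them all back on failure.
    // Assumption: "g" is the traversal source bound on the server.
    static final String SCRIPT =
        "a = g.addV('person').property('name', nameA).next()\n" +
        "b = g.addV('person').property('name', nameB).next()\n" +
        "g.addE('knows').from(a).to(b).iterate()\n";

    // Values are passed as script parameters instead of being concatenated
    // into the Groovy source.
    static Map<String, Object> bindings(String nameA, String nameB) {
        Map<String, Object> params = new LinkedHashMap<>();
        params.put("nameA", nameA);
        params.put("nameB", nameB);
        return params;
    }

    public static void main(String[] args) {
        System.out.println(SCRIPT);
        System.out.println(bindings("alice", "bob"));
    }
}
```

With gremlin-driver, the script and its parameters would presumably be sent together via client.submit(KnowsRequest.SCRIPT, KnowsRequest.bindings("alice", "bob")). Compared with embedded mode, where I can call graph.tx().commit() and graph.tx().rollback() wherever I like in Java, the transaction boundary here is the request itself.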