TinkerPop does not provide much in the way of bulk-loading tools; it relies on the native features of the underlying graph database to expose that functionality. The only bulk-loading tool TinkerPop itself has is BulkLoaderVertexProgram, which you can use to load massive graphs in a parallel, distributed fashion. Otherwise (especially if your graph is not large), you would simply write a Gremlin script that reads your source data and uses Gremlin mutation steps (e.g. addV() and addE()) to load it into your graph. If this is a one-time load, I would just execute such a script from the Gremlin Console to generate your graph.
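To make the addV()/addE() pattern concrete, here is a minimal sketch. For a script run in the Gremlin Console you would write this in Groovy; it is shown in Python with gremlin-python only because that is the client mentioned here. The CSV column names (id, label, name, out, in) and the helper names read_rows and load_graph are made up for illustration, and g is assumed to be a traversal source that is already connected to your graph. Step names use the camelCase spellings of the TinkerPop 3.x line; newer releases may prefer snake_case variants.

```python
import csv
import io


def read_rows(text):
    """Parse CSV text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def load_graph(g, vertex_rows, edge_rows):
    """Create one vertex per vertex row and one edge per edge row.

    g is assumed to be a Gremlin traversal source. The column names
    ("id", "label", "name", "out", "in") are hypothetical; adapt them
    to your own source data.
    """
    vertices = {}
    for row in vertex_rows:
        # addV() creates the vertex; next() returns it so that edges
        # can refer to it later by the id used in the source file.
        vertices[row["id"]] = (
            g.addV(row["label"]).property("name", row["name"]).next()
        )
    for row in edge_rows:
        # addE() creates an edge from the "out" vertex to the "in" vertex.
        (g.V(vertices[row["out"]])
          .addE(row["label"])
          .to(vertices[row["in"]])
          .iterate())
    return len(vertices), len(edge_rows)
```

Issuing one traversal per element like this is only reasonable for small graphs; for anything large, the other two options are the better fit.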
So, again, three options:
- Write a Gremlin Script to execute in the Gremlin Console to load your data.
- If you have an especially large graph, then consider BulkLoaderVertexProgram and Hadoop/Spark.
- Consider the bulk loading tools available to the graph database you have chosen.
Whichever choice you make, do the load first and then connect that graph to Gremlin Server. At that point you can query your loaded data with gremlin-python.
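Once the graph is loaded and served, querying it from gremlin-python might look roughly like the sketch below. The URL ws://localhost:8182/gremlin and the traversal source name "g" are common Gremlin Server defaults but are assumptions here, as are the "name" property and the helper function names. The camelCase step names match the TinkerPop 3.x line; check which spellings your driver version expects.

```python
def connect(url="ws://localhost:8182/gremlin", source="g"):
    """Return a remote traversal source for a running Gremlin Server.

    The URL and source name are assumed defaults; use your own server's
    values. Imports are local so the query helpers below can be used
    with any traversal source.
    """
    from gremlin_python.driver.driver_remote_connection import (
        DriverRemoteConnection,
    )
    from gremlin_python.process.anonymous_traversal import traversal

    return traversal().withRemote(DriverRemoteConnection(url, source))


def count_vertices(g):
    """Count all vertices, a quick sanity check after a load."""
    return g.V().count().next()


def names_with_label(g, label):
    """Return the "name" property (hypothetical) of vertices with a label."""
    return g.V().hasLabel(label).values("name").toList()
```

A typical session would call g = connect() once and then pass g to the query helpers, for example count_vertices(g) to confirm that the number of vertices matches your source data.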
You might find the slide deck from Jason Plurad's talk "Powers of Ten Redux" helpful. It builds on the original work I did with Daniel Kuppitz in the "Powers of Ten" blog post series on data loading.