What is the advantage of RDF and triple stores over Neo4j?

Asked 28/4, 2012 at 18:1 Answered 4/7, 2022 at 12:40

Neo4j is a really fast and scalable graph database. It seems that it can be used in business projects, and it is free, too!

At the same time, there are no RDF triple stores that work well with large data or deliver high-speed access. What is more, free RDF triple stores perform even worse.

So what is the advantage of RDF and RDF triple stores over Neo4j?

Refine answered 28/4, 2012 at 18:1 Comment(2)

"really fast", can you quantify this? For example, loading speed... how many vertex|edges per second is 'really fast'? "scale graph database", can you quantify this? For example, how many vertex|edges on a server with X GB of RAM? – Gamesmanship 29/4, 2012 at 12:36

@castagna: insertion or retrieval? With the Python bindings, triple insertion is only about twice as slow as with an optimized SQLAlchemy / SQLite stack. For traversal, if I remember correctly, it was well over 1M edges/second on my personal machine (6 GB RAM), but I think it can go beyond that. For pure queries on relations (vertices, relations, etc.), the Neo4j server on my machine handles well over 1k transactions/s, even though the database is getting close to 1M indexed properties, with hundreds of thousands of nodes and close to 1M relations – Ain 17/7, 2013 at 22:17

The advantage of using a triple store for RDF rather than Neo4j is that that's what they're designed for. Neo4j is pretty good for many use cases, but in my experience its performance for loading and querying RDF is well below all dedicated RDF databases.

It's a fallacy that RDF databases don't scale or are not fast. Sure, they're not yet up to the performance and scale levels of relational databases, but relational databases have a 50-year head start. Many triple stores scale into the billions of triples, provide 'standard' enterprise features, and provide great performance for many use cases.

If you're going to use RDF for a project, use a triple store; it's going to provide the best performance and set of features/APIs for working with RDF to build your application.

Unchancy answered 29/4, 2012 at 3:7 Comment(1)

Neo4j supports SPARQL and Gremlin, see blog.neo4j.org/2010/02/top-10-ways-to-get-to-know-neo4j.html: "Query languages: Beyond using Neo4j programmatically, you can also issue queries using a query language. These are the supported options at the moment: SPARQL: Neo4j can be used as a triple- or quadstore, and has SAIL and SPARQL implementations. Go to the components site to find out more about the related components. Gremlin: a graph-based programming-language with different backend implementations in the works as well as a supporting toolset." – Paralogism 26/4, 2013 at 0:34

RDF and SPARQL are standards, so you have a choice of multiple implementations, and can migrate your data from one RDF store to another.
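As a rough illustration of that portability: N-Triples is a W3C-standardized, line-based RDF serialization, so a dump in that format is not tied to one vendor. Below is a minimal sketch of writing it by hand, assuming only IRIs and plain string literals. The `to_ntriples` helper and the example.org data are hypothetical, and a real project would use an RDF library instead.

```python
def escape_literal(text: str) -> str:
    """Escape the characters that must be escaped inside an N-Triples literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def to_ntriples(triples):
    """Serialize (subject, predicate, object) tuples as N-Triples lines.

    Assumption for this sketch: an object starting with "http://" or
    "https://" is an IRI; anything else is a plain string literal.
    """
    lines = []
    for s, p, o in triples:
        if o.startswith(("http://", "https://")):
            obj = f"<{o}>"
        else:
            obj = f'"{escape_literal(o)}"'
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


# Made-up example data.
data = [
    ("http://example.org/alice", "http://xmlns.com/foaf/0.1/knows", "http://example.org/bob"),
    ("http://example.org/alice", "http://xmlns.com/foaf/0.1/name", "Alice"),
]

print(to_ntriples(data))
```

Each output line is one self-contained statement, which is what makes the format easy to move between stores.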

Additionally, version 1.1 of the SPARQL query language is quite sophisticated (more expressive than most SQL implementations) and can do all kinds of queries that would require a lot of code to be written in Neo4J.
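To make the expressiveness point a bit more concrete: SPARQL 1.1 property paths let a single pattern such as `ex:alice foaf:knows+ ?friend` match chains of any length. The sketch below shows the traversal that pattern stands for, written by hand in Python over an in-memory edge list. The data and the `reachable` helper are made up for illustration, and this says nothing about how any particular store evaluates the query.

```python
from collections import deque

# The SPARQL 1.1 query this sketch mimics (shown for comparison only).
QUERY = """
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX ex:   <http://example.org/>
SELECT ?friend WHERE { ex:alice foaf:knows+ ?friend }
"""

# Hypothetical "knows" edges as (subject, object) pairs.
KNOWS = [
    ("alice", "bob"),
    ("bob", "carol"),
    ("carol", "alice"),
    ("dave", "erin"),
]


def reachable(start, edges):
    """Return every node reachable from `start` via one or more edges."""
    adjacency = {}
    for source, target in edges:
        adjacency.setdefault(source, []).append(target)

    seen = set()
    queue = deque(adjacency.get(start, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue  # already visited, which also guards against cycles
        seen.add(node)
        queue.extend(adjacency.get(node, []))
    return seen


print(sorted(reachable("alice", KNOWS)))
```

Because of the cycle back to `alice`, she is reachable from herself by a path of length three, which matches the one-or-more semantics of `+`.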

Crunch answered 29/4, 2012 at 7:54 Comment(0)

If you are going for graph mining (e.g., graph traversal) over triples, Neo4j is a good choice. For large sets of triples, you might want to use its batchInserter, which is fairly fast.

Hydrolyte answered 20/9, 2012 at 3:58 Comment(0)

So I think it's all about your use case. Both technologies can and do overlap.

In my mind, it's mostly about the use case. Do you want a full knowledge graph, including the whole semantic web ecosystem? Then go for a triple store. If you need a general-purpose graph (e.g. to store big data as a graph), use the property graph model. My reasoning is that the underlying philosophies are very different, starting with how the data is stored, which has implications for your usage scenario.
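Here is a small sketch of that storage difference, using plain Python structures rather than either database's real API. In a property graph, a node carries its properties much like a document, while in RDF every property becomes its own triple. The `EX` namespace and the two helper functions are hypothetical.

```python
EX = "http://example.org/"  # made-up namespace for this sketch

# Property-graph style: a node is a small document, an edge has a type.
node = {"id": "alice", "labels": ["Person"], "properties": {"name": "Alice", "age": 30}}
edge = {"from": "alice", "type": "KNOWS", "to": "bob"}


def node_to_triples(node):
    """Flatten one node into (subject, predicate, object) triples."""
    subject = EX + node["id"]
    triples = [(subject, "rdf:type", EX + label) for label in node["labels"]]
    triples += [(subject, EX + key, value) for key, value in node["properties"].items()]
    return triples


def edge_to_triple(edge):
    """Turn one typed edge into a single triple."""
    return (EX + edge["from"], EX + edge["type"].lower(), EX + edge["to"])


triples = node_to_triples(node) + [edge_to_triple(edge)]
for triple in triples:
    print(triple)
```

Edge properties are left out on purpose: a plain triple has no slot for them, which is one of the places where the two models diverge.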

Here are some off-the-top-of-my-head bullet points to compare. Please take them with a grain of salt: this is not a benchmark paper, just an experience-based five-minute write-up.

Property graph (Neo4j):

- Think of nodes/edges as documents.
- Implemented on top of e.g. linked lists or key-value stores (deep searches, large data, e.g. via Gremlin).
- Supports OWL/RDF, but not natively (as I see it, that sits on a meta layer).
- Really great when you have the data in the graph and want to do ML (it stores the data as linked lists, which gives you nice vectors, which is cool for ML out of the box).
- Made for large data at scale.
- Use cases (the focus is on the data entities, not their classes):
  - Social graphs and other scenarios where you need deep traversal.
  - Large data graphs, where you have a lot of documents that need to be searched in a schema-free, graph-like manner.
  - Analyzing customer funnels from click data etc. You want to move out of your relational schema because you are actually in a graph use case...

Triple store (e.g. RDF4J):

- Think of data in maximum normal form as triples (no redundant data at all).
- Triples are stored as context triples. Relies heavily on indexes.
- Suited to broad searches and specific knowledge extraction. Deep searches are sometimes cumbersome.
- Scale is impressive: it can scale to trillions of nodes with fast performance. But I would not recommend storing big data in the graph, e.g. time series. The reason is the special way indexes are used; to scale horizontally, you may need to consider working with subgraphs ...
- Support for the whole ecosystem (SPARQL, SHACL, SWRL, etc.). This is a big plus.
- Use cases:
  - It's really about knowledge graphs. Do you need shape testing, rule evaluation, inference, and reasoning? Go for it, because you have to focus on the ontology and class structure!
  - Also, e.g., you have IoT and want to configure relations for logistics and a smart factory, while the telemetry is stored somewhere else and only referenced in the graph.
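To illustrate the point about indexes in the triple-store list: a common way to answer triple patterns quickly is to keep the same triples under several orderings, for example subject-predicate-object, predicate-object-subject and object-subject-predicate. A toy in-memory sketch of that idea follows. The `TripleIndex` class is hypothetical and is not how RDF4J or any particular store is implemented.

```python
from collections import defaultdict


class TripleIndex:
    """Toy triple index that keeps three orderings of the same data."""

    def __init__(self):
        self.spo = defaultdict(lambda: defaultdict(set))
        self.pos = defaultdict(lambda: defaultdict(set))
        self.osp = defaultdict(lambda: defaultdict(set))

    def add(self, s, p, o):
        # Every triple is written three times, once per ordering.
        self.spo[s][p].add(o)
        self.pos[p][o].add(s)
        self.osp[o][s].add(p)

    def match(self, s=None, p=None, o=None):
        """Yield triples matching the pattern; None acts as a wildcard."""
        if s is not None:
            # Subject is bound: use the SPO ordering.
            for p2, objects in self.spo.get(s, {}).items():
                if p is None or p2 == p:
                    for o2 in objects:
                        if o is None or o2 == o:
                            yield (s, p2, o2)
        elif p is not None:
            # Only predicate (and maybe object) bound: use POS.
            for o2, subjects in self.pos.get(p, {}).items():
                if o is None or o2 == o:
                    for s2 in subjects:
                        yield (s2, p, o2)
        elif o is not None:
            # Only object bound: use OSP.
            for s2, predicates in self.osp.get(o, {}).items():
                for p2 in predicates:
                    yield (s2, p2, o)
        else:
            # Nothing bound: full scan over SPO.
            for s2, predicates in self.spo.items():
                for p2, objects in predicates.items():
                    for o2 in objects:
                        yield (s2, p2, o2)


index = TripleIndex()
index.add("alice", "knows", "bob")
index.add("bob", "knows", "carol")
index.add("alice", "name", "Alice")

print(sorted(index.match(p="knows")))
print(sorted(index.match(o="carol")))
```

The sketch trades extra storage for lookups that start from whichever position of the pattern is bound.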

Backache answered 4/7, 2022 at 12:40 Comment(0)

I have heard rumors that it takes a whole day to load 10M triples into Neo4j (it is actually the slowest one because it's not built primarily for RDF).

Sesame and 4Store are the fastest ones, but Jena has a powerful API.

Blesbok answered 28/4, 2012 at 18:16 Comment(2)

where did you hear these rumors? – Pelag 9/8, 2013 at 13:58

Please provide references – Cabbala 14/10, 2015 at 12:44
