Triple Stores vs Relational Databases [closed]

Asked 6/2, 2012 at 11:3 Answered 10/7, 2013 at 15:20

Solved relational-database sparql semantic-web jena

I was wondering: what are the advantages of using triple stores over a relational database?

Dornick answered 6/2, 2012 at 11:3 Comment(4)

They're pretty different things; can you be more specific? – Plante 6/2, 2012 at 11:17

That's kind of like asking about the advantages of using a screwdriver over an apple. Both useful things, but hardly interchangeable. – Adhesion 6/2, 2012 at 12:32

@MikeSherrill'CatRecall' Then explaining why that is the case is the makings of an excellent answer? I, for one, certainly don't know. And welcome our triply relational overlords. – Stanford 14/10, 2017 at 21:32

I've nominated for a re-open. I don't think this is opinion based. Certainly, the advantages of particular technological implementations have a degree of subjective opinion. However, the advantages of specific pieces of computer science are not simple opinion. There are likely to be highly salient objective facts here, and an answer could provide them. – Stanford 14/10, 2017 at 21:36

The viewpoint of the CTO of a company that extensively uses RDF Triplestores commercially:

Schema flexibility - it's possible to do the equivalent of a schema change to an RDF store live, and without any downtime, or redesign - it's not a free lunch, you need to be careful with how your software works, but it's a pretty easy thing to do.

More modern - RDF stores are typically queried over HTTP, so it's very easy to fit them into service architectures without hacky bridging solutions or performance penalties. They also handle internationalised content better than typical SQL databases - e.g. you can have multiple values in different languages.

Standardisation - the level of standardisation of implementations using RDF and SPARQL is much higher than SQL. It's possible to swap out one triplestore for another, though you have to be careful you're not stepping outside the standards. Moving data between stores is easy, as they all speak the same language.

Expressivity - it's much easier to model complex data in RDF than in SQL, and the query language makes it easier to do things like LEFT JOINs (called OPTIONAL in SPARQL). Conversely though, if your data is very tabular, then SQL is much easier.

Provenance - SPARQL lets you track where each piece of information came from, and you can store metadata about it, letting you easily do sophisticated queries that only take into account data from certain sources, or with a certain trust level, or from some date range etc.
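To make the schema flexibility and expressivity points above a bit more concrete, here is a minimal sketch in plain Python. It is a toy in-memory set of triples, not a real triple store, and every `ex:` name and the helper functions are made up for illustration. It shows that adding a new property is just adding a triple, and that an OPTIONAL pattern behaves roughly like a LEFT JOIN.

```python
# Toy illustration only: a set of (subject, predicate, object) tuples.
# All ex: identifiers are hypothetical.
triples = {
    ("ex:alice", "ex:name", "Alice"),
    ("ex:bob", "ex:name", "Bob"),
}

# The "schema change": start recording e-mail addresses by adding a triple.
# Nothing comparable to ALTER TABLE is needed and existing data is untouched.
triples.add(("ex:alice", "ex:email", "alice@example.org"))


def objects(subject, predicate):
    """Return all objects stored for a subject/predicate pair, sorted."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


def names_with_optional_email():
    """Roughly equivalent to the SPARQL pattern:
    ?s ex:name ?name . OPTIONAL { ?s ex:email ?email }
    """
    rows = []
    for s, p, name in sorted(triples):
        if p != "ex:name":
            continue
        # None stands in for an unbound variable, like NULL in a LEFT JOIN.
        emails = objects(s, "ex:email") or [None]
        rows.extend((name, email) for email in emails)
    return rows


print(names_with_optional_email())
```

Bob still appears in the result even though he has no e-mail triple, which is the LEFT JOIN behaviour described above.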

There are downsides though. SQL databases are generally much more mature, and have more features than typical RDF databases. Things like transactions are often much more crude, or non-existent. Also, the cost per unit of information stored in RDF vs. SQL is noticeably higher. It's hard to generalise, but it can be significant if you have a lot of data - though at least in our case it's an overall benefit financially given the flexibility and power.

Pelaga answered 6/2, 2012 at 16:43 Comment(1)

+1 for all of Steve's points wrt the advantages of using a triple store (and the disadvantages). I'd include reasoning as an advantage, though that's not a ubiquitous feature, so maybe that's half an advantage =) – Pergolesi 6/2, 2012 at 17:1

Both commenters are correct, especially since the Semantic Web is not a database; it's a bit more general than that.

But I guess you might mean triple store, rather than Semantic Web in general, as triple store v. relational database is a somewhat more meaningful comparison. I'll preface the rest of my answer by noting that I'm not an expert in relational database systems, but I have a little bit of knowledge about triple stores.

Triple (or quad) stores are basically databases for data on the semantic web, particularly RDF. That's kind of where the similarity between triple stores and relational databases ends. Both store data, both have query languages, and both can be used to build applications on top of, so I guess if you squint your eyes, they're pretty similar. But the type of data each stores is quite different, and the two technologies optimize for different use cases and data structures, so they're not really interchangeable.

A lot of people have done work on overlaying a triples view of the world on top of a relational database. That can work, but it will generally be slower than a system dedicated to storing and retrieving triples. Part of the problem is that SPARQL, the standard query language used by triple stores, can require a lot of self joins, something relational databases are not optimized for. If you look at benchmarks, such as SP2B, you can see that Oracle, which just overlays SPARQL support on its relational system, runs in the middle or at the back of the pack when compared with systems that more natively support RDF.
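To show where the self joins come from, here is a small sketch using Python's built-in `sqlite3`. It assumes the simplest possible layout, a single three-column `triples` table. Real RDF-on-relational systems use more elaborate schemas, and the table name and `ex:` data here are hypothetical. Each triple pattern in the SPARQL query becomes one more alias of the same table.

```python
import sqlite3

# Assumed layout: one table holding every triple. Data is made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [
        ("ex:alice", "ex:name", "Alice"),
        ("ex:alice", "ex:worksFor", "ex:acme"),
        ("ex:acme", "ex:locatedIn", "ex:london"),
        ("ex:bob", "ex:name", "Bob"),
        ("ex:bob", "ex:worksFor", "ex:initech"),
        ("ex:initech", "ex:locatedIn", "ex:austin"),
    ],
)

# SPARQL graph pattern being emulated (three triple patterns):
#   ?person ex:name ?name .
#   ?person ex:worksFor ?org .
#   ?org ex:locatedIn ?city .
# Over a single triples table this needs the table joined to itself twice.
query = """
SELECT t1.o AS name, t3.o AS city
FROM triples AS t1
JOIN triples AS t2 ON t2.s = t1.s
JOIN triples AS t3 ON t3.s = t2.o
WHERE t1.p = 'ex:name'
  AND t2.p = 'ex:worksFor'
  AND t3.p = 'ex:locatedIn'
ORDER BY name
"""
rows = conn.execute(query).fetchall()
print(rows)
```

A pattern with ten triple patterns would need nine self joins in this layout, which is the kind of query shape the paragraph above is talking about.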

Of course, the RDF systems would probably get crushed by Oracle if they were doing SQL queries over relational data. But that's kind of the point: you pick the tool that's well suited for the application you want to build.

So if you're thinking about building a semantic web application, or just trying to get some familiarity in the area, I'd recommend ultimately going with a dedicated triple store.

I won't delve into reasoning and how that plays into query answering in triple stores, as that's yet another discussion, but it's another important distinction between relational systems and triple stores that do reasoning.

Pergolesi answered 6/2, 2012 at 14:10 Comment(0)

Some triplestores (Virtuoso, Jena SDB) are based on relational databases and simply provide an RDF / SPARQL interface. So, to rephrase the question slightly: are triplestores built from the ground up as triplestores more performant than those that aren't? @steve-harris definitely knows the answer to that ;) but I wager a yes.

Secondly, what features do triplestores have that RDBMSs don't? The simple answer is support for SPARQL, RDF, OWL etc. (i.e. the Semantic Web technology stack), and to make it a fair fight, it's better to define the value of SPARQL based on SPARQL 1.1 (it has considerably more features than 1.0). This provides support for federation (so, so cool), property path expressions and entailment regimes, along with a standard set of update and graph management protocols (which SPARQL 1.0 didn't have and sorely lacked). Also, @steve-harris points out that transactions are not part of the standard (can of worms), although many vendors provide non-standardised mechanisms for transactions (Virtuoso supports JDBC and Hibernate-compliant connection pooling and management, along with all the transactional features of Hibernate).
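As a rough idea of what a property path expression buys you, here is a plain-Python sketch of what a one-or-more path such as `ex:partOf+` computes: everything reachable by following the same predicate one or more times. The data and the `one_or_more` helper are hypothetical, and this only illustrates the semantics, not how any particular store implements it.

```python
from collections import deque

# Hypothetical (subject, predicate, object) data.
triples = [
    ("ex:wheel", "ex:partOf", "ex:axle"),
    ("ex:axle", "ex:partOf", "ex:chassis"),
    ("ex:chassis", "ex:partOf", "ex:car"),
    ("ex:radio", "ex:partOf", "ex:dashboard"),
]


def one_or_more(start, predicate):
    """Nodes reachable from `start` via one or more `predicate` edges,
    i.e. the matches for ?x in: start predicate+ ?x
    """
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for s, p, o in triples:
            if s == node and p == predicate and o not in seen:
                seen.add(o)
                queue.append(o)
    return seen


print(sorted(one_or_more("ex:wheel", "ex:partOf")))
```

In SPARQL 1.1 the whole traversal is a single triple pattern, whereas in SQL you would typically reach for a recursive query.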

The big drawback in my mind is that not many triplestores support all of SPARQL 1.1 (since it is still not in recommendation) and this is where the real benefits lie.

Having said that, I am and always have been an advocate of substituting RDBMSs with triplestores, and the platforms I deliver run entirely off triplestores (Volkswagen in my last role was an example of this), removing the need for an RDBMS. An additional advantage is that object-to-RDF mapping is more flexible and provides more options than traditional ORM (also known as putting a square peg in a round hole).

Eccles answered 7/2, 2012 at 14:28 Comment(1)

SPARQL 1.1 is in recommendation now, AFAIK. – Recipience 28/4, 2013 at 12:28

Also, you can still use a relational database but use RDF as the data exchange format, which is very flexible.
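A minimal sketch of that idea, assuming a hypothetical `person` table and a made-up `http://example.org/` namespace: the data stays in SQLite (via Python's built-in `sqlite3`) and is exported as N-Triples lines for exchange. The URI scheme and the basic literal escaping are my own simplifications, not a standard mapping.

```python
import sqlite3

BASE = "http://example.org/"  # hypothetical namespace

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?)",
    [(1, "Alice", "London"), (2, "Bob", None)],
)


def literal(value):
    """Quote a value as a plain N-Triples string literal (basic escapes only)."""
    text = (
        str(value)
        .replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{text}"'


def rows_to_ntriples(connection):
    """Turn each non-NULL cell of the person table into one N-Triples line."""
    lines = []
    cursor = connection.execute("SELECT id, name, city FROM person ORDER BY id")
    for pid, name, city in cursor:
        subject = f"<{BASE}person/{pid}>"
        for column, value in (("name", name), ("city", city)):
            if value is None:
                continue  # a NULL simply produces no triple
            lines.append(f"{subject} <{BASE}schema/{column}> {literal(value)} .")
    return lines


ntriples = rows_to_ntriples(conn)
print("\n".join(ntriples))
```

The receiving side does not need to know the table layout, only the predicates used, which is where the flexibility comes from.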

Intersidereal answered 10/7, 2013 at 15:20 Comment(0)
