Versioned RDF store [closed]

Asked 27/12, 2008 at 12:16 Answered 30/7, 2012 at 10:15

Let me try rephrasing this:

I am looking for a robust RDF store or library with the following features:

Named graphs, or some other form of reification.
Version tracking (probably at the named graph level).
Privacy between groups of users, either at named graph or triple level.
Human-readable data input and output, e.g. TriG parser and serialiser.

I've played with Jena, Sesame, Boca, RDFLib, Redland and one or two others some time ago but each had its problems. Have any improved in the above areas recently? Can anything else do what I want, or is RDF not yet ready for prime-time?

Reading around the subject a bit more, I've found that:

Jena, nothing further
Sesame, nothing further
Boca does not appear to be maintained any more and seems only really designed for DB2. OpenAnzo, an open-source fork, appears more promising.
RDFLib, nothing further
Redland, nothing further
Talis Platform appears to support changesets (wiki page and reference in Kniblet Tutorial Part 5) but it's a hosted-only service. Still may look into it though.
SemVersion sounded promising, but appears to be stale.

Chromatograph answered 27/12, 2008 at 12:16 Comment(2)

Thanks for the update. OpenAnzo does look very cool. – Gao 1/1, 2009 at 13:19

Also a little outline of what is lacking in Jena/Sesame/Boca would be really useful. – Gao 1/1, 2009 at 13:19

Talis is the obvious choice, but privacy may be an issue, or perceived issue anyway, since its a SaaS offering. I say obvious because the three emboldened features in your list are core features of their platform IIRC.

They don't have a features list as such - which makes it hard to back up this answer, but they do say that stores of data can be individually secured. I suppose you could - at a pinch - sign up to a separate store on behalf of each of your own users.

Human readable input is often best supported by writing custom interfaces for each user-task, so you best be prepared to do that as needs demand.

Regarding prime-time readiness. I'd say yes for some applications but otherwise "not quite". Mostly the community needs to integrate with existing developer toolsets and write good documentation aimed at "ordinary" developers - probably OO developers using Java, .NET and Ruby/Groovy - and then I predict it will snowball.

See also Temporal Scope for RDF triples

Keturahkeung answered 8/1, 2009 at 18:24 Comment(0)

From: http://www.semanticoverflow.com/questions/453/how-to-implement-semantic-data-versioning/748#748

I personally quite like the pragmatic approach which Freebase has adopted.

Browse and edit views for humans:

http ://www.freebase.com/view/guid/9202a8c04000641f80000000041ecebd
http ://www.freebase.com/edit/topic/guid/9202a8c04000641f80000000041ecebd

The data model exposed here:

http ://www.freebase.com/tools/explore/guid/9202a8c04000641f80000000041ecebd

Stricly speaking, it's not RDF (it's probably a superset of it), but part of it can be exposed as RDF:

http ://rdf.freebase.com/rdf/guid.9202a8c04000641f80000000041ecebd

Since it's a community driven website, not only they need to track who said what, when... but they are probably keeping the history as well (never delete anything):

http ://www.freebase.com/history/view/guid/9202a8c04000641f80000000041ecebd

To conclude, the way I would tackle your problem is very similar and pragmatic. AFAIK, you will not find a solution which works out-of-the-box. But, you could use a "tuple" store (3 or 4 aren't enough to keep history at the finest granularity (i.e. triples|quads)).

I would use TDB code as a library (since it gives you B+Trees and a lot of useful things you need) and I would use a data model which allows me to: count quads, assign an ownership to a quad, a timestamp and previous/next quad(s) if available:

[ id | g | s | p | o | user | timestamp | prev | next ]

Where:

   id - long (unique identifier, same (g,s,p,o) will have different id... 
        a lot of space, but you can count quads... and when you have a 
        community driven website (like this one) counting things it's 
        important.
    g - URI (or blank node?|absent (i.e. default graph))
    s - URI|blank node
    p - URI
    o - URI|blank node|literal
 user - URI

timestamp - when the quad was created prev - id of the previous quad (if present) next - id of the next quad (if present)

Then, you need to think about which indexes you need and this would depend on the way you want to expose and access your data.

You do not need to expose all your internal structures/indexes to external users/people/applications. And, when (and if), RDF vocabularies or ontologies for representing versioning, etc. will emerge, you are able to quickly expose your data using them (if you want to).

Be warned, this is not common practice and it you look at it with your "semantic web glasses" it's probably wrong, bad, etc. But, I am sharing the idea, since I believe it's not harmful, it allows to provide a solution to your question (it will be slower and use more space than a quad store), part of it can be exposed to the semantic web as RDF / LinkedData.

My 2 (heretic) cents.

Crosswind answered 16/4, 2010 at 7:17 Comment(0)

LMF comes with a versioning module: http://code.google.com/p/lmf/wiki/ModuleVersioning

The Linked Media Framework is an easy-to-setup server application developed in JavaEE that bundles core Semantic Web technologies to offer many advanced services.

Couchant answered 22/5, 2012 at 17:56 Comment(0)

Take a look to see if Virtuoso's RDF support meets your needs, it sounds as though it might go quite a way, and it plays nice with XML and web services too. There's a commercial and a GPL'd version.

Shun answered 2/1, 2009 at 12:24 Comment(0)

Mulgara/Fedora-Commons might fit the bill. I belive that privacy is currently a major project, and I understand that it supports versioning, but it might be too much in that is is an object-store too.

Seavir answered 15/4, 2009 at 10:44 Comment(0)

(years later)

I think both Oracle's RDF store:

http://www.oracle.com/technetwork/database/options/semantic-tech/index.html

and the recently announced graph store in IBMs DB2 supports much of this:

http://www-01.ibm.com/software/data/db2/linux-unix-windows/graph-store.html

Allenaallenby answered 30/7, 2012 at 10:15 Comment(0)

Recommended topics

Hot tags