tdbloader vs SPARQL INSERT - Why different behaviour with named graphs?

The ARQ and TDB command-line tools behave strangely with named graphs. Data imported into a named graph with tdbloader cannot be queried through a GRAPH clause that names that graph in a SPARQL SELECT query. The same query does work when the data is inserted into the same graph with SPARQL INSERT.

I have the following assembler description file, tdb.ttl:

@prefix rdfs:   <http://www.w3.org/2000/01/rdf-schema#> .
@prefix rdf:    <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix ja:     <http://jena.hpl.hp.com/2005/11/Assembler#> .
@prefix tdb:     <http://jena.hpl.hp.com/2008/tdb#> .


[] ja:loadClass "com.hp.hpl.jena.tdb.TDB" .
tdb:DatasetTDB  rdfs:subClassOf  ja:RDFDataset .
tdb:GraphTDB    rdfs:subClassOf  ja:Model .

[] rdf:type         tdb:DatasetTDB ;
    tdb:location "DB" ;
.

There is a dataset in the file data.ttl:

<a> <b> <c>.

Now I load this data with tdbloader, and then insert another triple with SPARQL INSERT, both into the named graph data:

tdbloader --desc tdb.ttl --graph data data.ttl
update --desc tdb.ttl "INSERT DATA {GRAPH <data> {<d> <e> <f>.}}"

Now, the data can be queried with SPARQL via:

$ arq --desc tdb.ttl "SELECT *  WHERE{ GRAPH ?g {?s ?p ?o.}}"
----------------------------
| s   | p   | o   | g      |
============================
| <a> | <b> | <c> | <data> |
| <d> | <e> | <f> | <data> |
----------------------------

Everything seems perfect. But now I want to query only this specific named graph, data:

$ arq --desc tdb.ttl "SELECT *  WHERE{ GRAPH <data> {?s ?p ?o.}}"
-------------------
| s   | p   | o   |
===================
| <d> | <e> | <f> |
-------------------

Why is the data imported from tdbloader missing? What is wrong with this query? How can I get results back from both imports?

Herd answered 19/9, 2013 at 10:0 Comment(5)
I don't know why this is happening, but thank you for a very well thought out question with a minimal working example that we can use to reproduce the problem! - Curative
For three more interesting results, see pastebin.com/3cT4fagi. In short, SELECT * WHERE { values ?g { <data> } graph ?g { ?s ?p ?o }} returns only one value, but SELECT * WHERE { values ?s { <a> <d> } graph ?g { ?s ?p ?o }} returns two, and SELECT * WHERE { values ?g { <data> UNDEF } graph ?g { ?s ?p ?o }} returns three (there's a duplicate result). - Curative
Interestingly, if you use an absolute URI instead of a relative one, as I've shown in pastebin.com/ViWmNmWT, you'll get the kinds of results that you want. - Curative
I wonder if what's going on here is that the relative URIs in tdbloader --desc tdb.ttl --graph data data.ttl and update --desc tdb.ttl "INSERT DATA {GRAPH <data> {<d> <e> <f>.}}" are getting resolved differently, so that you end up with two different URIs that happen to be printed in the same way in these results. - Curative
Yes - thank you for a complete, minimal example. I have spent more time answering this than I might otherwise have done. - Tanker

Try this query:

PREFIX : <data>
SELECT * { { ?s ?p ?o } UNION { GRAPH ?g { ?s ?p ?o } } }

and the output is

----------------------------
| s   | p   | o   | g      |
============================
| <a> | <b> | <c> | <data> |
| <d> | <e> | <f> | :      |
----------------------------

or try:

 tdbquery --loc DB --file Q.rq -results srj

to get the results in a different form.

The text output makes things look nice, but two different things end up printed as <data>.

What you are seeing is that

tdbloader --desc tdb.ttl --graph data data.ttl

used data exactly as is to name the graph. But

INSERT DATA {GRAPH <data> {<d> <e> <f>.}}

does a full SPARQL parse and resolves <data> against the base URI, which probably looks like file://*currentdirectory*.

In text output, URIs get abbreviated, including relative to the base. So both the original data (from tdbloader) and file:///path/data appear as <data>.
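
The mismatch can be reproduced outside Jena. The following is a minimal Python sketch, not Jena's actual code: the base file:///home/afs/tmp/ is an assumed working directory, and abbreviate is a hypothetical stand-in for the text formatter that simply drops the base prefix.

```python
from urllib.parse import urljoin

# Assumed base URI: the directory the commands were run from.
BASE = "file:///home/afs/tmp/"

def resolve(ref: str, base: str = BASE) -> str:
    """Resolve a possibly relative reference against the base URI."""
    return urljoin(base, ref)

def abbreviate(uri: str, base: str = BASE) -> str:
    """Hypothetical text formatter: drop the base prefix when present."""
    return "<" + (uri[len(base):] if uri.startswith(base) else uri) + ">"

loader_graph = "data"           # tdbloader: name stored exactly as typed
update_graph = resolve("data")  # INSERT DATA: <data> resolved by the parser

print(abbreviate(loader_graph), abbreviate(update_graph))  # <data> <data>
print(loader_graph == update_graph)                        # False
```

Both names print as <data>, yet they are different strings, which is why GRAPH <data> in a parsed query matches only the graph created by INSERT DATA.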

PREFIX : <data>

gives the text output another way to write the resolved URI, the prefixed name :.

Finally try:

BASE <http://example/>
SELECT * { { ?s ?p ?o } UNION { GRAPH ?g { ?s ?p ?o } } }

which sets the base URI to something nowhere near your data URIs, so switching off the nice formatting by base URI:

----------------------------------------------------------------------------------------------------------------
| s                        | p                        | o                        | g                           |
================================================================================================================
| <file:///home/afs/tmp/a> | <file:///home/afs/tmp/b> | <file:///home/afs/tmp/c> | <data>                      |
| <file:///home/afs/tmp/d> | <file:///home/afs/tmp/e> | <file:///home/afs/tmp/f> | <file:///home/afs/tmp/data> |
----------------------------------------------------------------------------------------------------------------
Tanker answered 19/9, 2013 at 13:19 Comment(1)
Thanks for the detailed explanation of this issue. I will use absolute URIs then, so the UNION is not necessary. - Herd
