Alternative for OPTIONAL Keyword in SPARQL-Queries?

PREFIX mbo: <http://creativeartefact.org/ontology/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT *
WHERE {
    ?uri a mbo:LiveMusicEvent .
    OPTIONAL { ?uri rdfs:label ?label } .
    OPTIONAL { ?uri mbo:organisedBy ?organiser } .
    OPTIONAL { ?uri mbo:takesPlaceAt ?venue } .
    OPTIONAL { ?uri mbo:begin ?begin } .
    OPTIONAL { ?uri mbo:end ?end } .
}

OPTIONAL patterns are generally more expensive for a SPARQL engine to evaluate than "normal" join patterns. In this case, the error indicates that Virtuoso's query planner estimates the query to be too complex to complete within the configured time limit. Note that this is only an estimate, so the actual execution time may well differ from it.

You have several alternatives, though most of them involve running more than one query. A common one is the "retrieve-and-iterate" pattern: you first run a query that retrieves all instances of mbo:LiveMusicEvent:

 SELECT ?uri WHERE { ?uri a mbo:LiveMusicEvent }

and then you iterate over the result and retrieve each instance's optional properties:

SELECT *
WHERE { VALUES ?uri { <http://example.org/instance1> }
        OPTIONAL { ?uri rdfs:label ?label } .
        OPTIONAL { ?uri mbo:organisedBy ?organiser } .
        OPTIONAL { ?uri mbo:takesPlaceAt ?venue } .
        OPTIONAL { ?uri mbo:begin ?begin } .
        OPTIONAL { ?uri mbo:end ?end } .
}

As you can see, I use a VALUES clause to insert the instance ids returned by the first query into this second query. In this example I assume you iterate one by one and therefore run one query per instance. As a further optimization, you could experiment with putting several instances into the VALUES clause at once (obviously not all of them, as that would make the query as complex as the original one).
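
A batched variant of the second query might look like the sketch below. The three instance IRIs are placeholders standing in for one batch of results from the first query, and the batch size is something you would have to tune against your own endpoint.

```sparql
PREFIX mbo: <http://creativeartefact.org/ontology/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT *
WHERE {
    # Placeholder IRIs: substitute one batch of ?uri results from the first query.
    VALUES ?uri {
        <http://example.org/instance1>
        <http://example.org/instance2>
        <http://example.org/instance3>
    }
    OPTIONAL { ?uri rdfs:label ?label } .
    OPTIONAL { ?uri mbo:organisedBy ?organiser } .
    OPTIONAL { ?uri mbo:takesPlaceAt ?venue } .
    OPTIONAL { ?uri mbo:begin ?begin } .
    OPTIONAL { ?uri mbo:end ?end } .
}
```

Each result row carries its own ?uri binding, so the rows of a batch can be regrouped per instance on the client side.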

By the way, VALUES is a SPARQL 1.1 feature, and I am not certain that your version of Virtuoso supports it. If it does not, you can achieve the same effect either with a FILTER clause or by 'manually' replacing every occurrence of the variable ?uri with the instance id in each iteration.
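
As a rough sketch of the FILTER fallback, using only SPARQL 1.0 syntax: the instance IRI is again a placeholder, and the type triple is kept so that ?uri is bound by a regular pattern before the filter and the OPTIONAL patterns are applied. Whether an engine evaluates this as efficiently as the VALUES form is not guaranteed.

```sparql
PREFIX mbo: <http://creativeartefact.org/ontology/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT *
WHERE {
    # Bind ?uri with a regular pattern first, then restrict it to one instance.
    ?uri a mbo:LiveMusicEvent .
    FILTER (?uri = <http://example.org/instance1>)
    # For a small batch, chain comparisons instead:
    # FILTER (?uri = <http://example.org/instance1> || ?uri = <http://example.org/instance2>)
    OPTIONAL { ?uri rdfs:label ?label } .
    OPTIONAL { ?uri mbo:organisedBy ?organiser } .
    OPTIONAL { ?uri mbo:takesPlaceAt ?venue } .
    OPTIONAL { ?uri mbo:begin ?begin } .
    OPTIONAL { ?uri mbo:end ?end } .
}
```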

Another way to handle it is to first do a CONSTRUCT query that retrieves a relevant subset of data from the larger source, and then do your more complex query with optionals on that subset. For example:

 CONSTRUCT 
 WHERE { 
    ?uri a mbo:LiveMusicEvent; 
         ?p ?o . 
 }

will retrieve all data about the LiveMusicEvent instances as an RDF graph. Pop that graph into a local RDF model (e.g. a Sesame Model or in-memory Repository if you're working in Java), and query it further from there.
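
If you only need the properties used in the original query, the subset can be narrowed further. The following is a sketch under that assumption, written in the full CONSTRUCT form with a SPARQL 1.0-style FILTER; rdf:type is kept so the local query can still match on mbo:LiveMusicEvent. Whether this is actually cheaper than fetching everything depends on your data and on the engine.

```sparql
PREFIX mbo: <http://creativeartefact.org/ontology/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

CONSTRUCT { ?uri ?p ?o }
WHERE {
    ?uri a mbo:LiveMusicEvent ;
         ?p ?o .
    # Keep the type triple plus the five properties the original query asks for.
    FILTER (?p = rdf:type || ?p = rdfs:label || ?p = mbo:organisedBy ||
            ?p = mbo:takesPlaceAt || ?p = mbo:begin || ?p = mbo:end)
}
```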
