Use a SPARQL subquery inside a SPARQL IN clause

Asked 30/11, 2012 at 18:17 Answered 12/2, 2020 at 22:4

I'm using SPARQL and I wonder whether I can put a SPARQL query inside an IN clause. To be more specific, I need the entities (s1, s2) returned by the query below, restricted to those where s1 meets a specific condition (an aggregate value over one of s1's fields is greater than, say, 5).

SELECT ?s1 ?x ?s2
WHERE {
         {?s1 rdf:type dbpedia-owl:Scientist.}
         {?s2 rdf:type dbpedia-owl:Scientist.}
         {?s2 dbpedia-owl:field ?x.}
         {?s1 dbpedia-owl:field ?x.}
}

So I need to add an extra IN clause, like this:

SELECT
?s1 ?x ?s2.
WHERE {
      {?s1 rdf:type dbpedia-owl:Scientist.}
      {?s2 rdf:type dbpedia-owl:Scientist.}
      {?s2 dbpedia-owl:field ?x.}
      {?s1 dbpedia-owl:field ?x.}
      {?s1 IN 
            {
             SELECT  ?s1  WHERE {
                                    SELECT ?s1 (COUNT(?p) AS ?prizes) {
                                    ?s1 dbpprop:prizes ?p.
                                    } group by (?s1) 
                                 }FILTER (?prizes > 2) 
             }
       }
 }

But the SPARQL query parser reports an error. Does anybody know how to fix it?

Adalie answered 30/11, 2012 at 18:17 Comment(0)

IN has a somewhat different usage in SPARQL than in SQL: it is an operator that takes an explicit list of expressions, not a subquery, and is normally used within a FILTER, like so:

FILTER(?s IN (<this>, <that>, <another>))

However, just using the sub-query on its own should give you the desired result, because of the bottom-up join semantics of SPARQL evaluation:

SELECT ?s1 ?x ?s2
WHERE 
{
  ?s1 rdf:type dbpedia-owl:Scientist.
  ?s2 rdf:type dbpedia-owl:Scientist.
  ?s2 dbpedia-owl:field ?x.
  ?s1 dbpedia-owl:field ?x.
  {
    SELECT ?s1 WHERE 
    {      
      ?s1 dbpprop:prizes ?p.
    }
    GROUP BY ?s1 
    HAVING (COUNT(?p) > 2) 
  }
}

You may notice I simplified some other parts of your query as well. There is no need to use two nested sub-queries because you can specify an aggregate condition using the HAVING clause.

Also, you do not need to put { } around each individual triple pattern; in fact, doing so may significantly harm performance.
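For comparison, here is a minimal sketch of the two-level form that the HAVING clause replaces: the inner query computes the count per subject, and the outer query filters on the alias. It assumes the usual DBpedia namespace for dbpprop: (the queries in this thread rely on the endpoint predefining it), and it has not been run against a live endpoint.

```sparql
# Assumed prefix; DBpedia's public endpoint normally predefines it.
PREFIX dbpprop: <http://dbpedia.org/property/>

SELECT ?s1
WHERE {
  {
    # Inner query: one row per subject, with its number of prizes.
    SELECT ?s1 (COUNT(?p) AS ?prizes)
    WHERE { ?s1 dbpprop:prizes ?p . }
    GROUP BY ?s1
  }
  # Outer filter: ?prizes is in scope here because the inner query projects it.
  FILTER (?prizes > 2)
}
```

Both forms should select the same subjects; HAVING simply expresses the condition on the aggregate without the extra nesting level.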

Spendthrift answered 30/11, 2012 at 18:34 Comment(3)

good answer but I beg to differ with "because of the bottom up join semantics". The "join" does the work, the "bottom up" is a curse because it never binds vars into the subquery, so if the subquery is expensive, you're basically stuck. But that's for another question – Koan 25/1, 2017 at 7:36

Not entirely true. A good optimiser should still be able to determine whether it is safe to pass variable bindings into the inner query in order to perform a more efficient join – Spendthrift 25/1, 2017 at 10:13

Just checked with Apache Jena, and it correctly spots that this particular query can be evaluated using a linear index join, i.e. by passing bindings from the first BGP into the inner query – Spendthrift 25/1, 2017 at 10:21

As per the official W3C documentation (SPARQL 1.1 Query Language):

boolean  rdfTerm IN (expression, ...)

The IN operator tests whether the RDF term on the left-hand side is found in the values of list of expressions on the right-hand side. The test is done with "=" operator, which tests for the same value, as determined by the operator mapping.

A list of zero terms on the right-hand side is legal.

Errors in comparisons cause the IN expression to raise an error if the RDF term being tested is not found elsewhere in the list of terms.

The IN operator is equivalent to the SPARQL expression:

(lhs = expression1) || (lhs = expression2) || ...

Examples:
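A few cases, written as SELECT expressions, illustrate the rules quoted above. IN is an ordinary expression operator, so it can appear in a SELECT expression as well as in a FILTER. The variable names are made up for illustration, and the results in the comments follow from the quoted semantics (a SPARQL 1.1 engine is assumed).

```sparql
SELECT
  (2 IN (1, 2, 3) AS ?found)       # true
  (2 IN ()        AS ?emptyList)   # false: an empty list is legal and never matches
  (2 IN (1/0, 2)  AS ?errThenHit)  # true: the match on 2 outweighs the division error
  (2 IN (3, 1/0)  AS ?errNoHit)    # error: no match, so ?errNoHit is left unbound
WHERE {}
```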

So the IN operator accepts a list of values, whereas a nested SPARQL query using SELECT (as in your example) returns a result set, which you can think of as a list of rows rather than a list of values. Hence you cannot use it that way.

However, below is an example that you can try, using the SPARQL FILTER + IN syntax:

SELECT ?s1 ?x ?s2
WHERE {
      {?s1 rdf:type dbpedia-owl:Scientist.}
      {?s2 rdf:type dbpedia-owl:Scientist.}
      {?s2 dbpedia-owl:field ?x.}
      {?s1 dbpedia-owl:field ?x.}
      FILTER (?s1 IN (<http://example.com/#1>,<http://example.com/#2>, <http://example.com/#3>)) 
 }
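To connect this with the equivalence quoted from the specification, the same query can be sketched with the IN list expanded into a chain of "=" comparisons. The prefix declarations are assumptions (the usual RDF and DBpedia ontology namespaces), and the example.com IRIs are the same placeholders as above.

```sparql
# Assumed prefixes; DBpedia's public endpoint normally predefines them.
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>

SELECT ?s1 ?x ?s2
WHERE {
  ?s1 rdf:type dbpedia-owl:Scientist .
  ?s2 rdf:type dbpedia-owl:Scientist .
  ?s2 dbpedia-owl:field ?x .
  ?s1 dbpedia-owl:field ?x .
  # Expanded form of FILTER (?s1 IN (...)) with the three placeholder IRIs.
  FILTER (?s1 = <http://example.com/#1> ||
          ?s1 = <http://example.com/#2> ||
          ?s1 = <http://example.com/#3>)
}
```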

Erminna answered 29/3, 2018 at 4:6 Comment(0)

You might want to try FILTER(EXISTS ...):

SELECT
  ?s1 ?x ?s2
WHERE {
      {?s1 rdf:type dbpedia-owl:Scientist.}
      {?s2 rdf:type dbpedia-owl:Scientist.}
      {?s2 dbpedia-owl:field ?x.}
      {?s1 dbpedia-owl:field ?x.}
      FILTER (EXISTS {
        SELECT ?sx
        WHERE {
          {
            SELECT ?sx (COUNT(?p) AS ?prizes) 
            {?sx dbpprop:prizes ?p.}
            GROUP BY ?sx
            HAVING (COUNT(?p) > 2)
          } .
          FILTER(?sx = ?s1)
        }
      })
 }

Bolduc answered 12/2, 2020 at 22:4 Comment(0)
