SPARQL queries with round brackets throw an exception

I am trying to extract labels from DBpedia for some persons. This partly works, but I am stuck on the following problem. The following code works:

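import com.hp.hpl.jena.query.Query;
import com.hp.hpl.jena.query.QueryExecution;
import com.hp.hpl.jena.query.QueryExecutionFactory;
import com.hp.hpl.jena.query.QueryFactory;
import com.hp.hpl.jena.query.QuerySolution;
import com.hp.hpl.jena.query.ResultSet;

public class DbPediaQueryExtractor {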
    public static void main(String [] args) {
        String entity = "Aharon_Barak";
        String queryString ="PREFIX dbres: <http://dbpedia.org/resource/> SELECT * WHERE {dbres:"+ entity+ "<http://www.w3.org/2000/01/rdf-schema#label> ?o FILTER (langMatches(lang(?o),\"en\"))}";
        //String queryString="select *     where { ?instance <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person>;  <http://www.w3.org/2000/01/rdf-schema#label>  ?o FILTER (langMatches(lang(?o),\"en\"))  } LIMIT 5000000";
        QueryExecution qexec = getResult(queryString);
        try {
            ResultSet results = qexec.execSelect();
            for ( ; results.hasNext(); )
            {
                QuerySolution soln = results.nextSolution();
                System.out.print(soln.get("?o") + "\n");
            }
        }
        finally {
            qexec.close();
        }
    }

    public static QueryExecution getResult(String queryString){
        Query query = QueryFactory.create(queryString);
        //VirtuosoQueryExecution vqe = VirtuosoQueryExecutionFactory.create (sparql, graph);
        QueryExecution qexec = QueryExecutionFactory.sparqlService("http://dbpedia.org/sparql", query);
        return qexec;
    }
}

However, when the entity contains brackets, it does not work. For example,

String entity = "William_H._Miller_(writer)";

leads to this exception:

Exception in thread "main" com.hp.hpl.jena.query.QueryParseException: Encountered " "(" "( "" at line 1, column 86.

What is the problem?

Lacerta answered 14/8, 2013 at 12:39
Comments (2):
Can you give the content of line 86? It also looks a bit like a syntax error to me. - Hyperemia
Round brackets (or parentheses, in my local dialect) are used to surround function arguments in SPARQL, e.g., in concat('[',?x,']'), so I'd expect that this is a syntax error. You'll probably need to use the full form of your URI, surrounded by < and >. - Nosedive

It took some copying and pasting to see what exactly was going on. I'd suggest that you put newlines in your query for easier readability. The query you're using is:

PREFIX dbres: <http://dbpedia.org/resource/>
SELECT * WHERE
{
  dbres:??? <http://www.w3.org/2000/01/rdf-schema#label> ?o 
  FILTER (langMatches(lang(?o),"en"))
}

where ??? is replaced by the contents of the string entity. There is no input validation here to ensure that the value of entity is legal at that position in the query. In your failing case, entity is William_H._Miller_(writer), and unescaped parentheses are not allowed in the local part of a prefixed name, so you end up with this query:

PREFIX dbres: <http://dbpedia.org/resource/>
SELECT * WHERE
{
  dbres:William_H._Miller_(writer) <http://www.w3.org/2000/01/rdf-schema#label> ?o 
  FILTER (langMatches(lang(?o),"en"))
}

You can paste that into the public DBpedia endpoint, and you'll get a similar parse error message:

Virtuoso 37000 Error SP030: SPARQL compiler, line 6: syntax error at 'writer' before ')'

SPARQL query:
define sql:big-data-const 0 
#output-format:text/html
define sql:signal-void-variables 1 define input:default-graph-uri <http://dbpedia.org> PREFIX dbres: <http://dbpedia.org/resource/>
SELECT * WHERE
{
  dbres:William_H._Miller_(writer) <http://www.w3.org/2000/01/rdf-schema#label> ?o 
  FILTER (langMatches(lang(?o),"en"))
}

Rather than hitting DBpedia's endpoint with bad queries, you can check the query with the SPARQL query validator, which reports for that query:

Syntax error: Lexical error at line 4, column 34. Encountered: ")" (41), after : "writer"
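
If you want to keep building the query by string concatenation, a minimal sketch of the missing validation could look like the following. It writes the subject in the full <...> form and rejects the characters that the SPARQL grammar forbids inside an IRI reference (the IRIREF production). The class and method names are hypothetical, and the check covers only the SPARQL lexical rule, not general IRI validity.

```java
final class IriRefBuilder {
    // Characters the SPARQL grammar forbids inside <...> (IRIREF production),
    // in addition to control characters and space (code points up to 0x20).
    private static final String FORBIDDEN = "<>\"{}|^`\\";

    // Hypothetical helper: joins namespace and local name, checks the result,
    // and wraps it in angle brackets so it can be pasted into a query.
    static String toIriRef(String namespace, String localName) {
        String iri = namespace + localName;
        for (int i = 0; i < iri.length(); i++) {
            char c = iri.charAt(i);
            if (c <= 0x20 || FORBIDDEN.indexOf(c) >= 0) {
                throw new IllegalArgumentException(
                        "Character not allowed in a SPARQL IRI at index " + i + ": " + iri);
            }
        }
        return "<" + iri + ">";
    }

    public static void main(String[] args) {
        String entity = "William_H._Miller_(writer)";
        String subject = toIriRef("http://dbpedia.org/resource/", entity);
        String query =
                "SELECT * WHERE {\n" +
                "  " + subject + " <http://www.w3.org/2000/01/rdf-schema#label> ?o\n" +
                "  FILTER (langMatches(lang(?o),\"en\"))\n" +
                "}";
        System.out.println(query);
    }
}
```

Parentheses are fine inside the angle brackets, so the subject becomes <http://dbpedia.org/resource/William_H._Miller_(writer)> and the query parses.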

In Jena, you can use the ParameterizedSparqlString to avoid these sorts of issues. Here's your example, reworked to use a parameterized string:

import com.hp.hpl.jena.query.ParameterizedSparqlString;

public class PSSExample {
    public static void main( String[] args ) {
        // Create a parameterized SPARQL string for the particular query, and add the 
        // dbres prefix to it, for later use.
        final ParameterizedSparqlString queryString = new ParameterizedSparqlString(
                "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n" +
                "SELECT * WHERE\n" +
                "{\n" +
                "  ?entity rdfs:label ?o\n" +
                "  FILTER (langMatches(lang(?o),\"en\"))\n" +
                "}\n"
                ) {{
            setNsPrefix( "dbres", "http://dbpedia.org/resource/" );
        }};

        // Entity is the same. 
        final String entity = "William_H._Miller_(writer)";

        // Now retrieve the URI for dbres, concatenate it with entity, and use
        // it as the value of ?entity in the query.
        queryString.setIri( "?entity", queryString.getNsPrefixURI( "dbres" )+entity );

        // Show the query.
        System.out.println( queryString.toString() );
    }
}

The output is:

PREFIX dbres: <http://dbpedia.org/resource/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT * WHERE
{
  <http://dbpedia.org/resource/William_H._Miller_(writer)> rdfs:label ?o
  FILTER (langMatches(lang(?o),"en"))
}

You can run this query at the public endpoint and get the expected results. Notice that if you use an entity that doesn't need special escaping, e.g.,

final String entity = "George_Washington";

then the query output will use the prefixed form:

PREFIX dbres: <http://dbpedia.org/resource/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT * WHERE
{
  dbres:George_Washington rdfs:label ?o
  FILTER (langMatches(lang(?o),"en"))
}

This is very convenient, because you don't have to check whether your suffix, i.e., entity, contains any characters that need to be escaped; Jena takes care of that for you.
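
If you would rather keep the prefixed form for names like this one, the SPARQL 1.1 grammar also allows certain characters in the local part to be backslash-escaped (the PN_LOCAL_ESC production). The sketch below is a hypothetical helper, not part of Jena. It approximates the set of characters that are legal without escaping using Character.isLetterOrDigit, and whether a particular parser version or endpoint accepts these escapes is an assumption you should verify.

```java
final class LocalNameEscaper {
    // Characters that SPARQL 1.1 allows in a local name when preceded by a
    // backslash (PN_LOCAL_ESC), minus '_' which is legal without escaping.
    private static final String ESCAPABLE = "~.-!$&'()*+,;=/?#@%";

    // Hypothetical helper: escapes a local name so it can follow "prefix:".
    static String escapeLocalName(String local) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < local.length(); i++) {
            char c = local.charAt(i);
            if (ESCAPABLE.indexOf(c) >= 0) {
                // Escaping is always legal for these characters.
                sb.append('\\').append(c);
            } else if (Character.isLetterOrDigit(c) || c == '_' || c == ':') {
                // Conservative approximation of the unescaped character set.
                sb.append(c);
            } else {
                throw new IllegalArgumentException(
                        "Cannot represent character at index " + i + " in a local name: " + local);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints dbres:William_H\._Miller_\(writer\)
        System.out.println("dbres:" + escapeLocalName("William_H._Miller_(writer)"));
    }
}
```

The full <...> form is usually the simpler choice; escaping is only worth it when the prefixed form matters for readability.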

Nosedive answered 14/8, 2013 at 13:17
Comments (5):
Thank you so much, Joshua Taylor, for taking the time and helping. I ran the generated query and it worked. However, running your code throws another exception: Exception in thread "main" java.lang.NoClassDefFoundError: com/hp/hpl/jena/graph/NodeFactory at com.hp.hpl.jena.query.ParameterizedSparqlString.setIri(ParameterizedSparqlString.java:720) at nl.cwi.kba2013.apps.PssExample.main(PssExample.java:26) at com.hp.hpl.jena.query.ParameterizedSparqlString.setIri(ParameterizedSparqlString.java:720) .... - Lacerta
@use1967220 If you get that error, that suggests a classpath issue (likely multiple versions of ARQ on your classpath). What version(s) of ARQ (and any other Jena libraries) are you using? - Planchette
@RobV, Thank you so much. That was indeed the problem. - Lacerta
@Joshua_Taylor the "expected results" hyperlink is generating an error. - Kidderminster
@Kidderminster Interesting; it was copied and pasted from the DBpedia search results, but something was breaking in between. I fixed the link (by linking to the results of an equivalent, but slightly modified, query). - Nosedive
