I need to perform a query against DBpedia:
SELECT DISTINCT ?poi ?lat ?long ?photos ?template ?type ?label WHERE {
?poi <http://www.w3.org/2000/01/rdf-schema#label> ?label .
?poi <http://www.w3.org/2003/01/geo/wgs84_pos#lat> ?lat .
?poi <http://www.w3.org/2003/01/geo/wgs84_pos#long> ?long .
?poi <http://dbpedia.org/property/hasPhotoCollection> ?photos .
OPTIONAL {?poi <http://dbpedia.org/property/wikiPageUsesTemplate> ?template } .
OPTIONAL {?poi <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?type } .
FILTER ( ?lat > x && ?lat < y &&
?long > z && ?long < ω &&
langMatches( lang(?label), "EN" ))
}
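The bounds x, y, z and ω above are placeholders for the corners of a bounding box. A minimal sketch of how I would assemble the query string in Java, assuming made-up bounds (the class name PoiQueryBuilder, the method buildQuery and the sample coordinates are hypothetical and only for illustration):

```java
import java.util.Locale;

public class PoiQueryBuilder {

    // Builds the bounding-box query. The four arguments replace the
    // placeholders x, y, z and ω from the query text.
    static String buildQuery(double minLat, double maxLat,
                             double minLong, double maxLong) {
        String template = String.join("\n",
            "SELECT DISTINCT ?poi ?lat ?long ?photos ?template ?type ?label WHERE {",
            "  ?poi <http://www.w3.org/2000/01/rdf-schema#label> ?label .",
            "  ?poi <http://www.w3.org/2003/01/geo/wgs84_pos#lat> ?lat .",
            "  ?poi <http://www.w3.org/2003/01/geo/wgs84_pos#long> ?long .",
            "  ?poi <http://dbpedia.org/property/hasPhotoCollection> ?photos .",
            "  OPTIONAL { ?poi <http://dbpedia.org/property/wikiPageUsesTemplate> ?template } .",
            "  OPTIONAL { ?poi <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?type } .",
            "  FILTER ( ?lat > %s && ?lat < %s &&",
            "           ?long > %s && ?long < %s &&",
            "           langMatches( lang(?label), \"EN\" ))",
            "}");
        // %s on a Double uses Double.toString, so the decimal separator is always '.'
        return String.format(Locale.ROOT, template, minLat, maxLat, minLong, maxLong);
    }

    public static void main(String[] args) {
        // Hypothetical bounding box, chosen only to show the substitution.
        System.out.println(buildQuery(37.9, 38.1, 23.6, 23.9));
    }
}
```

The resulting string is what would take the place of "my big fat query" in the code further down.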
I'm guessing this information is scattered across several dump (.nt) files, and that the SPARQL endpoint somehow combines them to serve a result set. I need to download only the relevant .nt files (not all of DBpedia), run my query once, and store the results locally (I don't want to use the SPARQL endpoint).
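To work out which dump files I actually need, I suppose I could scan each file for the predicates that the required (non-OPTIONAL) patterns of the query use. A rough sketch, assuming every line of an .nt file holds one "subject predicate object ." triple; the class name DumpPredicateScanner, the method requiredPredicatesIn and the sample lines are my own illustration, not part of Jena or DBpedia:

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class DumpPredicateScanner {

    // Predicates used by the required triple patterns of the query.
    static final Set<String> REQUIRED = new LinkedHashSet<>(Arrays.asList(
        "<http://www.w3.org/2000/01/rdf-schema#label>",
        "<http://www.w3.org/2003/01/geo/wgs84_pos#lat>",
        "<http://www.w3.org/2003/01/geo/wgs84_pos#long>",
        "<http://dbpedia.org/property/hasPhotoCollection>"));

    // Returns the required predicates that occur in the given N-Triples lines.
    static Set<String> requiredPredicatesIn(Iterable<String> lines) {
        Set<String> found = new LinkedHashSet<>();
        for (String line : lines) {
            String trimmed = line.trim();
            // Skip blank lines and comment lines.
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue;
            }
            // Subject and predicate contain no whitespace, so the second
            // whitespace-separated token is the predicate.
            String[] parts = trimmed.split("\\s+", 3);
            if (parts.length == 3 && REQUIRED.contains(parts[1])) {
                found.add(parts[1]);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Made-up sample lines standing in for the contents of one dump file.
        List<String> sample = Arrays.asList(
            "<http://dbpedia.org/resource/Athens> <http://www.w3.org/2003/01/geo/wgs84_pos#lat> \"37.97\"^^<http://www.w3.org/2001/XMLSchema#float> .",
            "<http://dbpedia.org/resource/Athens> <http://xmlns.com/foaf/0.1/name> \"Athens\"@en .");
        System.out.println(requiredPredicatesIn(sample));
    }
}
```

For a real dump the lines would have to be streamed (for instance with java.nio.file.Files.lines) rather than held in a list, since the files are large.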
- What parts of Jena should I use for this one run?
I'm a bit confused after reading this post:
So, you can load the entire DBPedia data into a single TDB location on disk (i.e. a single directory). This way, you can run SPARQL queries over it.
How do I load DBpedia into a single TDB location, in Jena terms, if we have three .nt DBpedia files? How do we run the above query against those .nt files? (Any code would help.)
For example, is this wrong?
String tdbDirectory = "C:\\TDB";
String dbdump1 = "C:\\Users\\dump1_en.nt";
String dbdump2 = "C:\\Users\\dump2_en.nt";
String dbdump3 = "C:\\Users\\dump3_en.nt";
Dataset dataset = TDBFactory.createDataset(tdbDirectory);
Model tdb = dataset.getDefaultModel(); // <-- What is the default model? Should I care?
// Model tdb = TDBFactory.createModel(tdbDirectory); // <-- is this preferred?
FileManager.get().readModel( tdb, dbdump1, "N-TRIPLES" );
FileManager.get().readModel( tdb, dbdump2, "N-TRIPLES" );
FileManager.get().readModel( tdb, dbdump3, "N-TRIPLES" );
String q = "my big fat query";
Query query = QueryFactory.create(q);
QueryExecution qexec = QueryExecutionFactory.create(query, tdb);
ResultSet results = qexec.execSelect();
while (results.hasNext()) {
    QuerySolution row = results.next(); // advance the cursor, otherwise the loop never ends
    // do something significant with row
}
qexec.close();
tdb.close() ;
dataset.close();
- In the above code we used "dataset.getDefaultModel" (to get the default graph as a Jena Model). Is this statement valid? Do we need to create a dataset to perform the query, or should we go with TDBFactory.createModel(tdbDirectory)?