Load DBpedia locally using Jena TDB?

I need to perform a query against DBpedia:

SELECT DISTINCT ?poi ?lat ?long ?photos ?template ?type ?label WHERE {
  ?poi  <http://www.w3.org/2000/01/rdf-schema#label> ?label .
  ?poi <http://www.w3.org/2003/01/geo/wgs84_pos#lat> ?lat .
  ?poi <http://www.w3.org/2003/01/geo/wgs84_pos#long> ?long .
  ?poi <http://dbpedia.org/property/hasPhotoCollection> ?photos .                      
  OPTIONAL {?poi <http://dbpedia.org/property/wikiPageUsesTemplate> ?template } .
  OPTIONAL {?poi <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?type } .
  FILTER ( ?lat > x && ?lat < y &&
           ?long > z && ?long < ω && 
           langMatches( lang(?label), "EN" ))
}
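
The x, y, z and ω in the FILTER are placeholders for the bounding box. A minimal sketch of filling them in from Java follows; the class name PoiQuery and its build method are hypothetical helpers made up for illustration, not part of Jena, and the six-decimal formatting is just one reasonable choice.

```java
import java.util.Locale;

/** Hypothetical helper (not part of Jena): fills in the bounding-box placeholders of the query. */
final class PoiQuery {

    private PoiQuery() {}

    /** Returns the SPARQL text for resources strictly inside the given latitude/longitude box. */
    static String build(double minLat, double maxLat, double minLong, double maxLong) {
        // Written as !(a < b) so that NaN bounds are rejected as well.
        if (!(minLat < maxLat) || !(minLong < maxLong)) {
            throw new IllegalArgumentException("each lower bound must be below its upper bound");
        }
        // Locale.ROOT forces '.' as the decimal separator whatever the JVM default locale is.
        String filter = String.format(Locale.ROOT,
                "  FILTER ( ?lat > %f && ?lat < %f &&\n"
              + "           ?long > %f && ?long < %f &&\n"
              + "           langMatches( lang(?label), \"EN\" ))\n",
                minLat, maxLat, minLong, maxLong);
        return "SELECT DISTINCT ?poi ?lat ?long ?photos ?template ?type ?label WHERE {\n"
             + "  ?poi <http://www.w3.org/2000/01/rdf-schema#label> ?label .\n"
             + "  ?poi <http://www.w3.org/2003/01/geo/wgs84_pos#lat> ?lat .\n"
             + "  ?poi <http://www.w3.org/2003/01/geo/wgs84_pos#long> ?long .\n"
             + "  ?poi <http://dbpedia.org/property/hasPhotoCollection> ?photos .\n"
             + "  OPTIONAL { ?poi <http://dbpedia.org/property/wikiPageUsesTemplate> ?template } .\n"
             + "  OPTIONAL { ?poi <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?type } .\n"
             + filter
             + "}";
    }
}
```

The returned string could then stand in for the query text passed to QueryFactory.create. Formatting with a fixed locale matters because a locale that prints a decimal comma would produce invalid SPARQL.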

I'm guessing this information is scattered across several dump (.nt) files, and that the SPARQL endpoint combines them to serve a result set. I need to download only the relevant .nt files (not all of DBpedia), run my query once, and store the results locally (I don't want to use the SPARQL endpoint).
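
Since only a handful of predicates appear in the query, one possible way to keep the local store small is to pre-filter each dump before loading it. The sketch below is plain Java and does not use Jena at all; the class NTriplesPredicateFilter and its methods are hypothetical names. It assumes the dumps are well-formed N-Triples with one triple per line, readable as UTF-8.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Set;

/** Hypothetical pre-filter (plain Java, no Jena): keeps only triples with wanted predicates. */
final class NTriplesPredicateFilter {

    private NTriplesPredicateFilter() {}

    /** Returns the predicate IRI of one N-Triples line, or null for blank, comment or odd lines. */
    static String predicateOf(String line) {
        String t = line.trim();
        if (t.isEmpty() || t.charAt(0) == '#') {
            return null;
        }
        int i = 0;
        // The subject (an <IRI> or a _:label blank node) contains no whitespace: skip it.
        while (i < t.length() && !Character.isWhitespace(t.charAt(i))) {
            i++;
        }
        // Skip the whitespace that separates subject and predicate.
        while (i < t.length() && Character.isWhitespace(t.charAt(i))) {
            i++;
        }
        if (i >= t.length() || t.charAt(i) != '<') {
            return null;
        }
        int end = t.indexOf('>', i);
        return end < 0 ? null : t.substring(i + 1, end);
    }

    /** True if the line is a triple whose predicate is in the wanted set. */
    static boolean keep(String line, Set<String> wanted) {
        String predicate = predicateOf(line);
        return predicate != null && wanted.contains(predicate);
    }

    /** Streams the input file line by line and writes the kept triples; returns how many were kept. */
    static long filterFile(Path in, Path out, Set<String> wanted) throws IOException {
        long kept = 0;
        try (BufferedReader reader = Files.newBufferedReader(in, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (keep(line, wanted)) {
                    writer.write(line);
                    writer.newLine();
                    kept++;
                }
            }
        }
        return kept;
    }
}
```

For the query above, the wanted set would hold the six predicate IRIs it mentions (rdfs:label, the two wgs84_pos properties, hasPhotoCollection, wikiPageUsesTemplate and rdf:type). Whether this step is worth doing depends on how large the chosen dumps are; it does not replace loading into TDB.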

What parts of Jena should I use for this one-off run?

I'm a bit confused after reading this post:

So, you can load the entire DBPedia data into a single TDB location on disk (i.e. a single directory). This way, you can run SPARQL queries over it.

How do I load DBpedia into a single TDB location, in Jena terms, if we have three DBpedia .nt files? How do we run the above query against those .nt files? (Any code would help.)
For example, is this wrong?

    String tdbDirectory = "C:\\TDB";
    String dbdump1 = "C:\\Users\\dump1_en.nt";
    String dbdump2 = "C:\\Users\\dump2_en.nt";
    String dbdump3 = "C:\\Users\\dump3_en.nt";
    Dataset dataset = TDBFactory.createDataset(tdbDirectory);
    Model tdb = dataset.getDefaultModel(); // <-- What is the default model? Should I care?
    // Model tdb = TDBFactory.createModel(tdbDirectory); // <-- is this preferred?
    FileManager.get().readModel(tdb, dbdump1, "N-TRIPLES");
    FileManager.get().readModel(tdb, dbdump2, "N-TRIPLES");
    FileManager.get().readModel(tdb, dbdump3, "N-TRIPLES");
    String q = "my big fat query";
    Query query = QueryFactory.create(q);
    QueryExecution qexec = QueryExecutionFactory.create(query, tdb);
    ResultSet results = qexec.execSelect();
    while (results.hasNext()) {
        // do something significant with it
    }
    qexec.close();
    tdb.close();
    dataset.close();

The code above uses dataset.getDefaultModel() to get the default graph as a Jena Model. Is that valid? Do we need to create a Dataset to run the query, or should we go with TDBFactory.createModel(tdbDirectory)?

    /** The Constant tdbDirectory. */
    public static final String tdbDirectory = "C:\\TDBLoadGeoCoordinatesAndLabels";

    /** The Constant dbdump0. */
    public static final String dbdump0 = "C:\\Users\\Public\\Documents\\TDB\\dbpedia_3.8\\dbpedia_3.8.owl";

    /** The Constant dbdump1. */
    public static final String dbdump1 = "C:\\Users\\Public\\Documents\\TDB\\geo_coordinates_en\\geo_coordinates_en.nt";

    ...

    Model tdbModel = TDBFactory.createModel(tdbDirectory);

    /* Incrementally read data to the Model, once per run, RAM > 6 GB */
    FileManager.get().readModel(tdbModel, dbdump0);
    FileManager.get().readModel(tdbModel, dbdump1, "N-TRIPLES");
    FileManager.get().readModel(tdbModel, dbdump2, "N-TRIPLES");
    FileManager.get().readModel(tdbModel, dbdump3, "N-TRIPLES");
    FileManager.get().readModel(tdbModel, dbdump4, "N-TRIPLES");
    FileManager.get().readModel(tdbModel, dbdump5, "N-TRIPLES");
    FileManager.get().readModel(tdbModel, dbdump6, "N-TRIPLES");
    tdbModel.close();

    String queryStr = "dbpedia query ";
    Dataset dataset = TDBFactory.createDataset(tdbDirectory);
    Model tdb = dataset.getDefaultModel();
    Query query = QueryFactory.create(queryStr);
    QueryExecution qexec = QueryExecutionFactory.create(query, tdb);

    /* Execute the Query */
    ResultSet results = qexec.execSelect();
    while (results.hasNext()) {
        // Do something important
    }
    qexec.close();
    tdb.close();
