I am working on querying from a RDF dataset of 2.37 GB with approx 17 million triples in it and lucence index of the dataset is also maintained. I tried text queries of jena-text module which search's on the basis of stored lucene indexes. But its performance is quite slow, it is taking 4 or more seconds for a search query which is very slow.
However when I use luncene index viewer 'luke'. Indexes seems to have no problem and when I search for a particular term in from indexes it takes few miliseconds to search it.
So the problem is that I am unable to recognize that why is it taking so much time when it comes to 'jena-texr'.
Following is the sparql query:
SELECT ?subj ?status ?version ?label
WHERE {
?subj rdf:type ts:Valueset;
text:query 'cancer';
ts:entityStatus ?status;
OPTIONAL { ?subj ts:versionID ?version . } .
OPTIONAL { ?subj rdfs:label ?label . } .
}
LIMIT <limit>
OFFSET <offset>
Here is the jena code:
store.getDataset().begin(ReadWrite.READ) ;
Query query = QueryFactory.create(queryStr);
QueryExecution qexec = QueryExecutionFactory.create(query , store.getDataset()) ;
ResultSet results = qexec.execSelect();
while(results.hasNext()){
QuerySolution qs = results.next();
And Here is the code for creating indexed dataset.
Dataset baseDS = TDBFactory.createDataset(storePath.trim());
//define index mapping
EntityDefinition entityDef = new EntityDefinition("uri", "property", RDFS.label.asNode());
entityDef.set("property", TS.conceptCode.asNode());
entityDef.set("property", SKOS_XL.literalForm.asNode());
entityDef.set("property", SKOS.note.asNode());
entityDef.set("property", SKOS.definition.asNode());
//create in file lucene
File indexDir = new File(textIndexPath);
Directory luceneDir = null;
try {
luceneDir = FSDirectory.open(indexDir);
} catch (IOException e) {
e.printStackTrace();
}
// Join together into a dataset
Dataset indexedDS = TextDatasetFactory.createLucene(baseDS, luceneDir, entityDef) ;
Kindly can anyone identify if there is any problem with the code and the way indexed dataset is configured. Thanks