Populate an existing ontology from a CSV file using Jena

How can I read an ontology (OWL file) using Jena, populate this ontology (OntModel) from a CSV file, and then write the populated OntModel back out to an OWL file?

Disconnection answered 4/12, 2011 at 5:7

There are three parts to your question:

  • reading an OWL file into a Jena Model
  • converting a CSV file into RDF
  • writing the contents of a Jena Model out to a file

The first and third of these are easy with Jena: see the Model.read() and Model.write() methods, and the FileManager for some additional convenience support when reading from different locations.

The second part is the tricky one. Typically, when converting a CSV file to RDF, we assume that each row represents one RDF resource and its properties. You have three tasks to achieve:

  1. Determining the URI that represents the resource, based on some key in the row of data
  2. Determining the URI of the RDF property that represents the value of a given column
  3. Mapping each column value to an appropriate resource URI or literal value.
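As a rough illustration, the three tasks can each be written as a small helper. This is only a sketch using the Java standard library, not Jena's API: the class name RowMapping, the namespaces, the known-header table and the Kind enum are all assumptions made for the example, and the object terms are rendered in N-Triples syntax.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Map;

final class RowMapping {
    // Hypothetical namespaces, chosen only for illustration.
    static final String DATA_NS = "http://example.com/data/employee/";
    static final String VOCAB_NS = "http://example.com/vocab#";
    static final String FOAF_NS = "http://xmlns.com/foaf/0.1/";
    static final String XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer";

    // How the values of a column should be turned into RDF terms.
    enum Kind { STRING, INTEGER, RESOURCE }

    // Headers we choose to map onto an existing vocabulary (an assumption).
    static final Map<String, String> KNOWN = Map.of(
            "name", FOAF_NS + "name",
            "age", FOAF_NS + "age");

    // Task 1: mint the subject URI from the row's key value.
    static String subjectUri(String key) {
        // URLEncoder does form encoding, so spaces become "+"; swap them for "%20".
        String encoded = URLEncoder.encode(key.trim(), StandardCharsets.UTF_8)
                .replace("+", "%20");
        return DATA_NS + encoded;
    }

    // Task 2: pick a predicate URI for a column header. Unknown headers fall
    // back to our own vocabulary; this assumes headers are simple identifiers.
    static String predicateUri(String header) {
        return KNOWN.getOrDefault(header, VOCAB_NS + header);
    }

    // Task 3: render a cell value as an N-Triples term of the given kind.
    static String objectTerm(Kind kind, String value) {
        switch (kind) {
            case INTEGER:
                // Parse first so that malformed numbers fail loudly.
                return "\"" + Integer.parseInt(value.trim()) + "\"^^<" + XSD_INTEGER + ">";
            case RESOURCE:
                // The value is assumed to be a complete URI already.
                return "<" + value.trim() + ">";
            default:
                // Plain string literal with backslashes and quotes escaped.
                return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
        }
    }
}
```

Keeping the three decisions in separate functions makes it easier to change one of them, for example the key column, without touching the others.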

For example, consider the following CSV:

id,name,age,occupation
2718,fred,107,ninja

We can use the first row of the CSV (the header) to suggest RDF predicate names. foaf:name and foaf:age would be appropriate choices for the name and age columns, but we may need a new predicate in our own namespace for the occupation column, such as http://example.com/vocab#occupation. The resource URI will be based on whatever the key is for the data, in this case the id column, suggesting that the URI for the resource denoted by the first data row will be http://example.com/data/employee/2718. Finally, we have to map the data: the name is just a string, the age is an integer, and the occupation is a resource. Given those choices, we may end up with output like:

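@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix example_com: <http://example.com/vocab#> .

<http://example.com/data/employee/2718>
    a foaf:Person;
    foaf:name "fred";
    foaf:age "107"^^xsd:integer;
    example_com:occupation <http://dbpedia.org/resource/Ninja>.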

R2RML (a W3C working draft when this was written, now a W3C Recommendation) defines a standardised mapping language for performing these kinds of translations. It is aimed at relational data, so a CSV file is typically handled by treating it as a single table. Various implementations of R2RML are available. Of course, if your mapping is fairly stable, it would be perfectly straightforward to write some code that performs the translation from CSV for your particular input data.
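For the hand-written route, a translator for the example CSV might look something like the sketch below. It uses only the Java standard library and emits N-Triples lines rather than calling Jena directly. The class name EmployeeCsvToNTriples, the fixed column order, the naive comma splitting and the occupation lookup table are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class EmployeeCsvToNTriples {
    static final String RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";
    static final String FOAF = "http://xmlns.com/foaf/0.1/";
    static final String XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer";
    static final String VOCAB = "http://example.com/vocab#";
    static final String DATA = "http://example.com/data/employee/";

    // Hypothetical lookup from occupation labels to resource URIs.
    static final Map<String, String> OCCUPATIONS =
            Map.of("ninja", "http://dbpedia.org/resource/Ninja");

    // Converts CSV text with the header id,name,age,occupation to N-Triples lines.
    static List<String> convert(String csv) {
        List<String> triples = new ArrayList<>();
        String[] lines = csv.strip().split("\\R");
        // lines[0] is the header; the column order is hard-coded below.
        for (int i = 1; i < lines.length; i++) {
            // Naive split: assumes no quoted fields or embedded commas.
            String[] cell = lines[i].split(",", -1);
            if (cell.length != 4) {
                throw new IllegalArgumentException("Expected 4 columns on line " + (i + 1));
            }
            // Assumes the id needs no escaping to be used in a URI.
            String subject = "<" + DATA + cell[0].trim() + ">";
            triples.add(subject + " <" + RDF_TYPE + "> <" + FOAF + "Person> .");
            triples.add(subject + " <" + FOAF + "name> \"" + escape(cell[1]) + "\" .");
            triples.add(subject + " <" + FOAF + "age> \""
                    + Integer.parseInt(cell[2].trim()) + "\"^^<" + XSD_INTEGER + "> .");
            // Occupations missing from the lookup table are skipped.
            String occupation = OCCUPATIONS.get(cell[3].trim());
            if (occupation != null) {
                triples.add(subject + " <" + VOCAB + "occupation> <" + occupation + "> .");
            }
        }
        return triples;
    }

    // Escapes backslashes and double quotes for use inside an N-Triples literal.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }
}
```

The resulting text can then be loaded into the same model that holds the ontology, since Model.read() has overloads that accept an input stream or reader together with a base URI and a language name, before the combined model is saved with Model.write(). Creating the statements through Jena's own resource and property methods would work equally well; the string-based version is shown only because it has no dependencies.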

Rancid answered 5/12, 2011 at 14:21
