Parsing schema.org ttl/owl file using Jena

I'm writing a code generator that generates entities (POJOs in Java) from the schema defined at http://schema.rdfs.org/all.ttl. I'm using Jena to parse the ttl file and retrieve the metadata I need to generate them.

Jena parses the file successfully, however, for some reason it does not list all the attributes of a given entity, e.g., Person. I'm not sure whether I'm doing something wrong, using the wrong API, etc. Here's the code sample that recreates the scenario:

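    import java.net.URL;
    import java.util.Iterator;

    import com.hp.hpl.jena.ontology.OntClass;
    import com.hp.hpl.jena.ontology.OntModel;
    import com.hp.hpl.jena.ontology.OntProperty;
    import com.hp.hpl.jena.rdf.model.ModelFactory;

    public class PersonParser {

        public static void main(String[] args) {
            OntModel model = ModelFactory.createOntologyModel();
            // schema_org.ttl is a local copy of all.ttl on the classpath
            URL url = Thread.currentThread().getContextClassLoader().getResource("schema_org.ttl");
            model.read(url.toString(), "TURTLE");
            OntClass ontclass = model.getOntClass("http://schema.org/Person");
            // Print the local name of every property Jena associates with Person
            Iterator<OntProperty> props = ontclass.listDeclaredProperties();
            while (props.hasNext()) {
                OntProperty p = props.next();
                System.out.println("p:" + p.getLocalName());
            }
        }
    }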

Basically, I look up a single class, Person, and try to list all of its properties. This is what I get:

p:alternateName
p:deathDate
p:alumniOf
p:sameAs
p:url
p:additionalName
p:homeLocation
p:description
p:nationality
p:sibling
p:follows
p:siblings
p:colleagues
p:memberOf
p:knows
p:name
p:gender
p:birthDate
p:children
p:familyName
p:jobTitle
p:workLocation
p:parents
p:affiliation
p:givenName
p:honorificPrefix
p:parent
p:colleague
p:additionalType
p:honorificSuffix
p:image
p:worksFor
p:relatedTo
p:spouse
p:performerIn

But http://schema.org/Person lists a number of properties that are missing from this output (for example, address). The declaration of schema:address in http://schema.rdfs.org/all.ttl is:

schema:address a rdf:Property;
    rdfs:label "Address"@en;
    rdfs:comment "Physical address of the item."@en;
    rdfs:domain [ a owl:Class; owl:unionOf (schema:Person schema:Place schema:Organization) ];
    rdfs:range schema:PostalAddress;
    rdfs:isDefinedBy <http://schema.org/Person>;
    rdfs:isDefinedBy <http://schema.org/Place>;
    rdfs:isDefinedBy <http://schema.org/Organization>;
    .

Has anyone come across this? Should I be using a different Jena interface to parse the schema?

Nathan asked 1/4, 2014 at 21:40

Note that the documentation on listDeclaredProperties is (emphasis added):

listDeclaredProperties

com.hp.hpl.jena.util.iterator.ExtendedIterator<OntProperty> listDeclaredProperties(boolean direct)

Return an iterator over the properties associated with a frame-like view of this class. This captures an intuitive notion of the properties of a class. This can be useful in presenting an ontology class in a user interface, for example by automatically constructing a form to instantiate instances of the class. The properties in the frame-like view of the class are determined by comparing the domain of properties in this class's OntModel with the class itself. See: Presenting RDF as frames for more details.

Note that many cases of determining whether a property is associated with a class depends on RDFS or OWL reasoning. This method may therefore return complete results only in models that have an attached reasoner.

Parameters:

  • direct - If true, restrict the properties returned to those directly associated with this class. If false, the properties of super-classes of this class will not be listed among the declared properties of this class.

Returns:

An iteration of the properties that are associated with this class by their domain.

So, even before looking at the particular schema, it's important to note that unless you're using a reasoner, you might not get all the results you expect. Then, notice how the address property is declared:

schema:address a rdf:Property;
    rdfs:label "Address"@en;
    rdfs:comment "Physical address of the item."@en;
    rdfs:domain [ a owl:Class; owl:unionOf (schema:Person schema:Place schema:Organization) ];
    rdfs:range schema:PostalAddress;
    rdfs:isDefinedBy <http://schema.org/Person>;
    rdfs:isDefinedBy <http://schema.org/Place>;
    rdfs:isDefinedBy <http://schema.org/Organization>;

The domain of address is a union class: Person or Place or Organization. That's a superclass of Person, but it's a complex class expression, not just a simple named class, so you'll probably need a reasoner, as the documentation mentions, to get Jena to recognize that it's a superclass of Person.
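The union domain can also be inspected directly, without any reasoner. The following is a minimal sketch, not something the answer itself proposes: it uses Jena's core RDF API and assumes the Jena 2.x package names that appear in the quoted Javadoc (`com.hp.hpl.jena`; Jena 3 and later use `org.apache.jena`). `UnionDomainLister`, `parse`, and `propertiesFor` are hypothetical names introduced here for illustration.

```java
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.rdf.model.RDFList;
import com.hp.hpl.jena.rdf.model.RDFNode;
import com.hp.hpl.jena.rdf.model.Resource;
import com.hp.hpl.jena.rdf.model.Statement;
import com.hp.hpl.jena.rdf.model.StmtIterator;
import com.hp.hpl.jena.vocabulary.OWL;
import com.hp.hpl.jena.vocabulary.RDFS;

import java.io.StringReader;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of Jena.
final class UnionDomainLister {

    // Parses Turtle text into a plain in-memory model (no reasoner attached).
    static Model parse(String turtle) {
        Model model = ModelFactory.createDefaultModel();
        model.read(new StringReader(turtle), null, "TURTLE");
        return model;
    }

    // Returns the local names of properties whose rdfs:domain is either
    // cls itself or a class whose owl:unionOf list contains cls.
    static Set<String> propertiesFor(Model model, Resource cls) {
        Set<String> names = new TreeSet<String>();
        StmtIterator it = model.listStatements(null, RDFS.domain, (RDFNode) null);
        while (it.hasNext()) {
            Statement s = it.next();
            if (!s.getObject().isResource()) {
                continue;
            }
            Resource domain = s.getObject().asResource();
            boolean matches = domain.equals(cls);
            Statement union = domain.getProperty(OWL.unionOf);
            if (!matches && union != null && union.getObject().canAs(RDFList.class)) {
                // The union members are stored as an RDF collection (rdf:first/rdf:rest).
                matches = union.getObject().as(RDFList.class).contains(cls);
            }
            if (matches) {
                names.add(s.getSubject().getLocalName());
            }
        }
        return names;
    }
}
```

A call such as `UnionDomainLister.propertiesFor(model, model.getResource("http://schema.org/Person"))` would then report `address` alongside the properties whose domain is Person directly. This sketch does not follow rdfs:subClassOf, so properties inherited from superclasses such as Thing would need a separate walk up the hierarchy.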

Comparison with OWL semantics

I think that using a reasoner will allow Jena to recognize that the domain of address is a superclass of Person, and thus include it in the result of listDeclaredProperties. It's worth noting how this differs from OWL semantics, though.
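As a minimal sketch of that suggestion, the model can be created with an explicit OWL rule reasoner. Note that the no-argument `createOntologyModel()` used in the question defaults to `OntModelSpec.OWL_MEM_RDFS_INF`, an RDFS-level reasoner that does not interpret owl:unionOf; that would explain why inherited properties such as name are listed while address is not. The sketch assumes Jena 2.x package names (`com.hp.hpl.jena`; Jena 3 and later use `org.apache.jena`), and `DeclaredProps` is a hypothetical helper introduced here. Whether address then appears for the full schema.org file has not been verified here.

```java
import com.hp.hpl.jena.ontology.OntClass;
import com.hp.hpl.jena.ontology.OntModel;
import com.hp.hpl.jena.ontology.OntModelSpec;
import com.hp.hpl.jena.ontology.OntProperty;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.util.iterator.ExtendedIterator;

import java.io.StringReader;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of Jena.
final class DeclaredProps {

    // Loads Turtle text into an ontology model built with the given spec and
    // returns the local names reported by listDeclaredProperties() for classUri.
    static Set<String> declaredProperties(String turtle, String classUri, OntModelSpec spec) {
        OntModel model = ModelFactory.createOntologyModel(spec);
        model.read(new StringReader(turtle), null, "TURTLE");
        Set<String> names = new TreeSet<String>();
        OntClass cls = model.getOntClass(classUri);
        if (cls == null) {
            return names; // class not found in the model
        }
        ExtendedIterator<OntProperty> props = cls.listDeclaredProperties();
        while (props.hasNext()) {
            names.add(props.next().getLocalName());
        }
        return names;
    }
}
```

Passing `OntModelSpec.OWL_MEM_MICRO_RULE_INF` selects one of Jena's built-in OWL rule reasoners; `OWL_MEM` selects no reasoner at all, which makes it easy to compare the two result sets for the same class.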

In OWL, saying that a class D is the domain of a property P means that whenever we have a triple with the property P, we can infer that the subject is a D. This can be expressed by the rule

P rdfs:domain D     X P Y
-------------------------
    X rdf:type D

So, even though a Person might have an address, just because something has an address isn't enough to tell us that that something is a Person; it could still be a Place or Organization.
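The rule can be illustrated without Jena at all. The following is a toy sketch in plain Java, with triples held as string arrays and the union domain represented by a made-up class name, `PersonOrPlaceOrOrganization`; `DomainRule` and `inferTypes` are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy illustration of the rdfs:domain rule, independent of any RDF library.
final class DomainRule {

    // Applies: (P rdfs:domain D), (X P Y)  =>  (X rdf:type D)
    // triples are {subject, predicate, object}; domains maps property -> domain class.
    static Set<String> inferTypes(String subject, List<String[]> triples, Map<String, String> domains) {
        Set<String> types = new LinkedHashSet<String>();
        for (String[] t : triples) {
            if (t[0].equals(subject) && domains.containsKey(t[1])) {
                types.add(domains.get(t[1]));
            }
        }
        return types;
    }

    public static void main(String[] args) {
        Map<String, String> domains = new HashMap<String, String>();
        // Stand-in name for the anonymous union class in the schema.
        domains.put("schema:address", "PersonOrPlaceOrOrganization");

        List<String[]> triples = new ArrayList<String[]>();
        triples.add(new String[] {"ex:x", "schema:address", "ex:addr1"});

        // prints [PersonOrPlaceOrOrganization]
        System.out.println(inferTypes("ex:x", triples, domains));
    }
}
```

The only type inferred for ex:x is the union class. Nothing in the rule licenses the narrower conclusion that ex:x is a Person.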

Syntactics answered 2/4, 2014 at 14:22

Comment (1)

So, as I understand it, listDeclaredProperties for a class also returns the properties of its superclasses, which differs from the semantics familiar from OWL (a class is a domain for the properties of its subclasses). However, is there a way to get the properties in the OWL-semantics sense, other than iterating over all properties and checking each one's domain with a reasoner attached?
Finally
