OWL Property Restrictions vs. SHACL

Asked 26/6, 2017 at 19:23 Answered 17/3, 2020 at 11:45

Solved constraints rdf owl semantic-web shacl

Given a choice between OWL Property Restrictions and SHACL, is there any reason to choose the OWL approach any more?

Particularly with respect to cardinality constraints, I'm wondering whether SHACL is considered to supersede OWL. On casual inspection, the syntax appears similar.

I am probably missing the purpose of OWL cardinality constraints. As part of an ontology, they should facilitate inferencing (a separate concern from validation). But how do cardinality constraints facilitate inferencing?

Crazyweed answered 26/6, 2017 at 19:23 Comment(2)

If you want to use an OWL reasoner, yes. – Bookbinding 26/6, 2017 at 20:13

The last part is not clear. I mean, given something like A EquivalentTo p min 2 and an individual x with x p y1, x p y2 and y1 differentFrom y2, you could infer that x belongs to class A – Bookbinding 26/6, 2017 at 20:15

The differences between OWL and SHACL are presented in the table below.

OWL	SHACL
Based on open world assumption	Based on closed world assumption
Designed for inferencing	Designed for validation
Computationally cheap (typical problems are decidable)	?
A lot of inferences almost "out of the box"	One has to define a lot of constraints manually
Is useful as documentation for RDF

As for cardinality constraints in OWL, they allow one to close the world in some respects and in some cases, in order to get additional inferences.

The logic of cardinality constraints is opposite in OWL and in SHACL. Informally:

In SHACL,

ex:PersonShape
    a sh:NodeShape ;
    sh:targetClass ex:Person ;
    sh:property [
        sh:path ex:parent ;
        sh:minCount 1
    ] .

means that if somebody is a person, then he/she has to have at least one parent.

In OWL,

ex:Person owl:equivalentClass [ rdf:type owl:Restriction ;
                                 owl:onProperty ex:parent ;
                                 owl:minCardinality "1"^^xsd:nonNegativeInteger ] .

means that if somebody has at least one parent, then he/she is a person.
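To make the contrast concrete, here is a minimal sketch in plain Python. It is a toy model over a set of triples, not a real SHACL processor or OWL reasoner, and the function names and the ex:alice / ex:bob / ex:carol data are made up for illustration. The OWL-like function covers only the direction described above (has a parent, therefore is a person).

```python
# Toy model: triples are (subject, predicate, object) tuples.
# All names below are illustrative assumptions, not part of any library.
TYPE, PERSON, PARENT = "rdf:type", "ex:Person", "ex:parent"

data = {
    ("ex:alice", TYPE, PERSON),      # a person with no recorded parent
    ("ex:bob", PARENT, "ex:carol"),  # has a parent, but is not typed
}

def shacl_like_min_count(triples, target_class, path, min_count):
    """Closed world: report every instance of target_class with too few values."""
    violations = []
    for s, p, o in sorted(triples):
        if p == TYPE and o == target_class:
            values = {o2 for s2, p2, o2 in triples if s2 == s and p2 == path}
            if len(values) < min_count:
                violations.append(s)
    return violations

def owl_like_classify(triples, cls, prop):
    """Open world, one direction only: anything with a value for prop
    is inferred to be an instance of cls. Returns only the new triples."""
    inferred = {(s, TYPE, cls) for s, p, o in triples if p == prop}
    return inferred - triples

print(shacl_like_min_count(data, PERSON, PARENT, 1))  # ['ex:alice']
print(owl_like_classify(data, PERSON, PARENT))        # {('ex:bob', 'rdf:type', 'ex:Person')}
```

The validation-style check flags ex:alice because no parent is recorded, while the inference-style rule flags nothing and instead adds a type for ex:bob.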

From TopBraid marketing materials:

How is SHACL different from RDF Schema and OWL? RDFS and OWL were designed for an “Open World” in which data may be assembled from many places on the Semantic Web. This design goal has caused a lot of frustration over the years, because it made it impossible to check even the most obvious integrity constraints, such as whether a property has a certain number of values. In contrast, SHACL assumes a “Closed World”, aligning with the expectations of typical business users. Furthermore, OWL has been optimized for a certain type of classification problems, yet it could not be used to do routine operations necessary for data validation such as mathematical computations or text operations. SHACL is much more expressive. Further it seamlessly integrates with SPARQL to express almost arbitrary conditions. BTW it is perfectly fine to incrementally extend an RDFS or OWL model with SHACL statements, supporting both worlds.

Pubis answered 26/6, 2017 at 20:32 Comment(4)

I do not think the parent analogue is appropriate. Using owl:equivalentClass means both that a person must have at least one parent, and that anything with at least one parent is a person. If you used rdfs:subClassOf, the semantics would be better aligned with SHACL. However the effect is still different (SHACL rejects the data if a parent is missing, while OWL informs you that there must be a parent somewhere and rejects the data only if it can prove that a person has no parent). – Louque 23/1, 2021 at 14:0

Yes, this should be reformulated. – Pubis 23/1, 2021 at 14:12

You said SHACL is based on closed-world assumption. Can one deduce it is more appropriate to be used to model business logic? – Giannagianni 17/3, 2022 at 9:36

@k00ni, the "world" of a typical enterprise information system is usually "closed". However, that might not be the case when there are multiple systems. As for UNA, that is definitely not the case when dealing with customers' data :). – Pubis 17/3, 2022 at 9:54

In my experience, most users of OWL have not really understood or do not care about the actual semantics of OWL (open-world assumption etc). In many cases, OWL cardinality restrictions have been used because there was no other alternative. Yet, as pointed out elsewhere, the semantics of an owl:maxCardinality 1 is backwards from what most people expect: it means that if the property has two values then those values are assumed to be the same (owl:sameAs). In SHACL, a sh:maxCount 1 means that if the property has two values then one of them needs to be deleted.
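A small sketch of that difference, again as a toy model in plain Python rather than a real reasoner or validator. The property ex:birthMother and the individuals are hypothetical, and the OWL-like function ignores the case where the two values are known to be different (which would make the data inconsistent instead).

```python
from itertools import combinations

def values_of(triples, subject, prop):
    """All objects of (subject, prop, ?) in sorted order."""
    return sorted(o for s, p, o in triples if s == subject and p == prop)

def owl_like_max_one(triples, prop):
    """Open world: two values of a max-1 property are inferred to be
    the same individual, so owl:sameAs triples are produced."""
    subjects = {s for s, p, o in triples if p == prop}
    inferred = set()
    for s in subjects:
        for a, b in combinations(values_of(triples, s, prop), 2):
            inferred.add((a, "owl:sameAs", b))
    return inferred

def shacl_like_max_count(triples, prop, max_count):
    """Closed world: a subject with more than max_count values is a violation."""
    subjects = {s for s, p, o in triples if p == prop}
    return sorted(s for s in subjects
                  if len(values_of(triples, s, prop)) > max_count)

# Hypothetical data: one subject with two values for the same property.
data = {
    ("ex:alice", "ex:birthMother", "ex:mary"),
    ("ex:alice", "ex:birthMother", "ex:maria"),
}

print(owl_like_max_one(data, "ex:birthMother"))         # {('ex:maria', 'owl:sameAs', 'ex:mary')}
print(shacl_like_max_count(data, "ex:birthMother", 1))  # ['ex:alice']
```

The same two triples lead to a new equality statement in the open-world reading and to a reported violation in the closed-world reading.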

The main reasons for continuing to use OWL rather than SHACL are that OWL has a longer history (i.e. more tools, reusable ontologies and examples) and that you may want OWL (DL) inferencing. But if you need traditional closed-world semantics, use SHACL. Note that SHACL and OWL can be mixed: for example, define classes and properties in one file, OWL restrictions in another file, and SHACL constraints in yet another file.

Howlet answered 29/6, 2017 at 0:28 Comment(1)

Yes, and I earned my living developing OWL tools for 10 years before starting with SHACL. We still have customers who use OWL. – Howlet 28/9, 2018 at 23:4

In my experience OWL reasoning is very rarely used, and the complex OWL constructs (including Restriction and unionOf) are not very useful.

Even rdfs:domain/range cause reuse problems because they are monomorphic: use them with several values, and you're asking for trouble.
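The trouble with several domain values can be sketched with the standard RDFS domain rule: (p rdfs:domain C) together with (s p o) entails (s rdf:type C). The snippet below is a toy implementation in plain Python, not a real RDFS reasoner, and ex:acme and its name are made-up data.

```python
def rdfs_domain_inference(schema, data):
    """Apply the RDFS domain rule: every declared domain of a property
    becomes a type of every subject that uses that property."""
    domains = {}
    for p, pred, c in schema:
        if pred == "rdfs:domain":
            domains.setdefault(p, set()).add(c)
    return {(s, "rdf:type", c)
            for s, p, o in data
            for c in domains.get(p, ())}

# Two domain values on one property (illustrative schema).
schema = {
    ("ex:name", "rdfs:domain", "ex:Person"),
    ("ex:name", "rdfs:domain", "ex:Organization"),
}
data = {("ex:acme", "ex:name", "ACME Corp")}

inferred = rdfs_domain_inference(schema, data)
for triple in sorted(inferred):
    print(triple)
# ('ex:acme', 'rdf:type', 'ex:Organization')
# ('ex:acme', 'rdf:type', 'ex:Person')
```

Multiple domain values behave as an intersection, so ex:acme ends up typed as both classes rather than as one or the other.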

So we at Ontotext have lately been using example-based models, the non-committal schema:domainIncludes/rangeIncludes, and shapes to express how classes and props are used together.

Daven answered 14/3, 2019 at 10:59 Comment(4)

Thank you for that @Vladimir Alexiev. I've been wondering how best to use rdfs:domain & rdfs:range. I naturally tend to use them as the expected value type, but that's not how it's intended. Are you saying that, for example, multiple rdfs:range values for a single rdf:Property will get you in trouble? And this is better with schema:domainIncludes and schema:rangeIncludes because it allows for values to be any of these types rather than automatically getting all the types. Is that correct? Can you explain what you mean by non-committal? – Featured 11/4, 2021 at 0:45

@Flion: I think he refers to multiple values for rdfs:domain/range implying an intersection of types, whereas people often want to model a union instead. For example, let's say both cars and balloons "drive" (they do in German, don't know why) but they don't have a good superclass to use as rdfs:domain for the :drive property. However, I get around this problem by using two different properties with the same label and different rdfs:domain values; that is more clean in my opinion, and I never have problems with domain and range. – Rotorua 28/4, 2022 at 10:50

@Flion: yes, if you define :name rdfs:domain :Person, :Organization then you'll shoot yourself in the foot because everything with a name will become both Person and Organization. – Daven 17/8, 2022 at 15:46

@KonradHöffner If you find props like baloonName and carName "more clean" then you are a very patient person :-) I find over-specific props to be a major impediment to reuse and querying. – Daven 17/8, 2022 at 15:47

I think the fact that OWL is fully based on the Open World Assumption makes it quite unique. There are use cases where you bring together many datasets from many different sources that need this unique feature. For any given fact there may always be different opinions from different sources. Fundamental support in your "data fabric" (or Enterprise Knowledge Graph) for "Multiple versions of the Truth" is critical, or, to put it more strongly, it is the single most important enabler for enterprise-wide use cases. For the EKG we need OWL to be the core, forming the "unbiased" representation of all your data, not forcing any particular closed-world view of the world, and inferring all the right facts. Around that core, like the rings of an onion, sit lots of translation languages such as SHACL (strict context-specific closed-world shapes of objects), SPARQL (graph to tabular), R2RML (tabular to graph) and so forth.

Mudskipper answered 17/3, 2020 at 11:45 Comment(0)
