SPARQL: return all intersections fulfilled by specified or equivalent classes

If I have classes ABC and CDE defined as intersections of classes A,B,C,D,E as follows:

<Class rdf:about="&blah;ABC">
    <equivalentClass>
        <Class>
            <intersectionOf rdf:parseType="Collection">
                <Restriction>
                    <onProperty rdf:resource="&blah;hasMarker"/>
                    <someValuesFrom rdf:resource="&blah;A"/>
                </Restriction>
                <Restriction>
                    <onProperty rdf:resource="&blah;hasMarker"/>
                    <someValuesFrom rdf:resource="&blah;B"/>
                </Restriction>
                <Restriction>
                    <onProperty rdf:resource="&blah;hasMarker"/>
                    <someValuesFrom rdf:resource="&blah;C"/>
                </Restriction>
            </intersectionOf>
        </Class>
    </equivalentClass>
</Class>

<Class rdf:about="&blah;CDE">
    <equivalentClass>
        <Class>
            <intersectionOf rdf:parseType="Collection">
                <Restriction>
                    <onProperty rdf:resource="&blah;hasMarker"/>
                    <someValuesFrom rdf:resource="&blah;C"/>
                </Restriction>
                <Restriction>
                    <onProperty rdf:resource="&blah;hasMarker"/>
                    <someValuesFrom rdf:resource="&blah;D"/>
                </Restriction>
                <Restriction>
                    <onProperty rdf:resource="&blah;hasMarker"/>
                    <someValuesFrom rdf:resource="&blah;E"/>
                </Restriction>
            </intersectionOf>
        </Class>
    </equivalentClass>
</Class>

How would I query all the intersection classes whose restrictions are fulfilled by a given set of input classes in SPARQL? For instance, if I fed A,B,C,D,E,F,G into this query, I'd expect to get back

ABC A
    B
    C
CDE C
    D
    E

Two further wrinkles: if I query A,Z,C where Z is an equivalence class of B, then this should match and ideally return

ABC A
    Z
    C

Second, the result should only return maximal matches; so if there exists a class ABCD and I query across A,B,C,D, it would return ABCD and not ABC.

Thanks in advance!

UPDATE:

To clarify, I don't want to match against an intersection class unless ALL of the constituent classes are in the supplied input list. For instance, if I supply A,B to the query, I DON'T want to get ABC back. If I supply A,B,C,D, I do want to get ABC back.

My use-case is this: I have a set of datapoints, in each of which I identify some arbitrary set of basic concepts A,B,C,D... etc, each with a different likelihood. I want to ask the ontology "what higher-level concepts (i.e. intersections) does this list contain?"

Currently, my query looks like this (accounting for the restrictions and onProperty in the ontology I outlined above):

SELECT DISTINCT ?intclass ?inputclasses
WHERE
{
  ?intclass owl:equivalentClass /
    owl:intersectionOf /
    rdf:rest*/rdf:first /
    owl:someValuesFrom ?inputclasses
}
ORDER BY ?intclass
BINDINGS ?inputclasses { (:A) (:B) (:C) (:D) }

Unfortunately, this gives back every intersection in my ontology that contains ANY of the input classes. I presume this is because the rest/first path evaluates each of the intersection's constituent classes against the input list in turn, and matches if it finds any of them.

What I want to do is (a) match only if ALL the classes in the intersection are present in the input list, (b) infer matches from classes that are equivalent to the classes in the input list, and (c) return the intersection class along with the subset of classes from the input list which matched it. Maybe this just isn't feasible through SPARQL?

Shimkus answered 14/3, 2014 at 4:25 Comment(1)
what did you try so far? - Tidwell

First, I don't think you'll be able to do exactly what you want to do, but I think you'll be able to get fairly close. In particular, I think that the maximality constraint you mention will be particularly difficult to achieve. It's generally sort of difficult to work with sets of things in SPARQL in this way. Nonetheless, we can see what we can do.

Data to work with

It's much easier to answer these kinds of questions with some sample data that we can actually work with. I've also started by simplifying the problem, so that ABC is just the intersection of A, B, and C, and CDE of C, D, and E. There are no restriction classes yet (they won't add much complexity, actually). For testing purposes (being able to ensure that our queries won't return values that we don't want), I've also added classes F and DEF. It's also easier to look at the data in the Turtle serialization, since it's closer to the SPARQL pattern syntax. Here's the simplified ontology:

@prefix :      <https://mcmap.net/q/1325469/-sparql-return-all-intersections-fulfilled-by-specified-or-equivalent-classes/1281433/intersections#> .
@prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl:   <http://www.w3.org/2002/07/owl#> .
@prefix xsd:   <http://www.w3.org/2001/XMLSchema#> .
@prefix rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

<https://mcmap.net/q/1325469/-sparql-return-all-intersections-fulfilled-by-specified-or-equivalent-classes/1281433/intersections>
        a       owl:Ontology .

:A      a       owl:Class .
:B      a       owl:Class .
:C      a       owl:Class .
:D      a       owl:Class .
:E      a       owl:Class .
:F      a       owl:Class .

:ABC    a                    owl:Class ;
        owl:equivalentClass  [ a                   owl:Class ;
                               owl:intersectionOf  ( :A :B :C )
                             ] .

:CDE    a                    owl:Class ;
        owl:equivalentClass  [ a                   owl:Class ;
                               owl:intersectionOf  ( :C :D :E )
                             ] .

:DEF    a                    owl:Class ;
        owl:equivalentClass  [ a                   owl:Class ;
                               owl:intersectionOf  ( :D :E :F )
                             ] .

Finding intersections of classes

For each class that is equivalent to an intersection class, there's a path from the class to each of the intersected classes. We can exploit that fact to find any classes equivalent to intersections that include A, B, and C:

prefix :      <https://mcmap.net/q/1325469/-sparql-return-all-intersections-fulfilled-by-specified-or-equivalent-classes/1281433/intersections#>
prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#>
prefix owl:   <http://www.w3.org/2002/07/owl#>
prefix xsd:   <http://www.w3.org/2001/XMLSchema#>
prefix rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

select distinct ?class where {
  ?class owl:equivalentClass/
         owl:intersectionOf/
         rdf:rest*/rdf:first :A, :B, :C .
}
---------
| class |
=========
| :ABC  |
---------

This doesn't find CDE, though, because this query is asking for things that have all of the specified values. It sounds like what you want, though, is to ask for things that have at least one of some specified values, and no non-specified values. You might have to write your list of classes twice, but you can do that with this:

prefix :      <https://mcmap.net/q/1325469/-sparql-return-all-intersections-fulfilled-by-specified-or-equivalent-classes/1281433/intersections#>
prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#>
prefix owl:   <http://www.w3.org/2002/07/owl#>
prefix xsd:   <http://www.w3.org/2001/XMLSchema#>
prefix rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

select ?class ?i where {
  values ?i { :A :B :C :D :E }
  ?class owl:equivalentClass/
         owl:intersectionOf/
         rdf:rest*/rdf:first ?i .

  filter not exists { 
    ?class owl:equivalentClass/
           owl:intersectionOf/
           rdf:rest*/rdf:first ?j .
    filter( !(?j in (:A, :B, :C, :D, :E )) )
  }
}
order by ?class ?i
--------------
| class | i  |
==============
| :ABC  | :A |
| :ABC  | :B |
| :ABC  | :C |
| :CDE  | :C |
| :CDE  | :D |
| :CDE  | :E |
--------------

Note that DEF isn't in the results because while it does have D and E, it also has a value that isn't any of the specified classes, F.

Since we filter out each intersection class that has an element that's not in the input list, we're guaranteed that every intersection class that we keep has only elements that are in the input list. Given that phrasing, we can actually make the query a bit simpler:

prefix :      <https://mcmap.net/q/1325469/-sparql-return-all-intersections-fulfilled-by-specified-or-equivalent-classes/1281433/intersections#>
prefix owl:   <http://www.w3.org/2002/07/owl#>
prefix rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

select ?class ?i where {
  # find each ?class that's equivalent to an intersection
  ?class owl:equivalentClass/owl:intersectionOf ?list .

  # and grab the intersecting classes for the results
  ?list rdf:rest*/rdf:first ?i .

  # but filter out any ?class that has an intersecting
  # class that's not in the input list.
  filter not exists { 
    ?list rdf:rest*/rdf:first ?element .
    filter( !(?element in (:A, :B, :C, :D, :E )) )
  }
}
--------------
| class | i  |
==============
| :ABC  | :A |
| :ABC  | :B |
| :ABC  | :C |
| :CDE  | :C |
| :CDE  | :D |
| :CDE  | :E |
--------------

This might be less efficient though, since now you're finding every intersection class and filtering out ineligible ones, rather than only finding those that might be acceptable, and then filtering out some. How significant this is probably depends on your actual data.
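
The logic behind the filter is just a subset test: an intersection class is kept exactly when none of its members falls outside the input list. As a rough illustration in plain Python (the INTERSECTIONS dictionary and the fulfilled function are hypothetical stand-ins for the sample ontology and the query, not part of any library):

```python
# Hypothetical in-memory copy of the sample ontology:
# intersection class name -> set of intersected classes.
INTERSECTIONS = {
    "ABC": {"A", "B", "C"},
    "CDE": {"C", "D", "E"},
    "DEF": {"D", "E", "F"},
}


def fulfilled(intersections, inputs):
    """Keep each intersection whose members are all in the input list.

    `members <= inputs` is the set form of the query's
    "filter not exists { some member that is not in the input list }".
    """
    inputs = set(inputs)
    return {
        name: sorted(members)
        for name, members in intersections.items()
        if members <= inputs
    }


print(fulfilled(INTERSECTIONS, ["A", "B", "C", "D", "E"]))
# {'ABC': ['A', 'B', 'C'], 'CDE': ['C', 'D', 'E']}
print(fulfilled(INTERSECTIONS, ["A", "B"]))
# {}
```

The second call shows the behaviour asked for in the question's update: supplying only A and B does not return ABC, because C is missing from the input.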

I think that this answers the main part of your question. To work with intersections of restrictions, you just need to note that the path between the classes in question is a bit different. Rather than matching the element in the list, you want to match the value of the owl:someValuesFrom property of the list elements, so the path needs a final owl:someValuesFrom:

?class owl:equivalentClass/
       owl:intersectionOf/
       rdf:rest*/rdf:first/
       owl:someValuesFrom ?i .

Going beyond this

Handling other equivalences

if I query A,Z,C where Z is an equivalence class of B, then this should match and ideally return

ABC A
    Z
    C

The query here starts to get a bit more complex, but it's still manageable. The trick is that rather than selecting ?i as a simple member of the intersection list, you need to select ?i as the member of the input list that the element of the intersection list is equivalent to. Then, filtering out intersections is a bit more complicated, too. You need to make sure that the intersection has no element that lacks an equivalent in the input list, which is what the nested filter not exists expresses. Putting that all together, you get this query:

prefix :      <https://mcmap.net/q/1325469/-sparql-return-all-intersections-fulfilled-by-specified-or-equivalent-classes/1281433/intersections#>
prefix owl:   <http://www.w3.org/2002/07/owl#>
prefix rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

select ?class ?i where {
  ?class owl:equivalentClass/
         owl:intersectionOf ?list .
  ?list rdf:rest*/rdf:first/(owl:equivalentClass|^owl:equivalentClass)* ?i .
  filter( ?i in (:A, :B, :C, :D, :E ))

  filter not exists { 
    ?list rdf:rest*/rdf:first ?element .
    filter not exists {
      ?element (owl:equivalentClass|^owl:equivalentClass)* ?j
      filter( ?j in (:A, :B, :C, :D, :E ))
    }
  }
}

If you add the following data

:Z      a       owl:Class ;
        owl:equivalentClass :B .

:AZC    a                    owl:Class ;
        owl:equivalentClass  [ a                   owl:Class ;
                               owl:intersectionOf  ( :A :Z :C )
                             ] .

then you get these results:

--------------
| class | i  |
==============
| :ABC  | :A |
| :ABC  | :B |
| :ABC  | :C |
| :AZC  | :A |
| :AZC  | :B |
| :AZC  | :C |
| :CDE  | :C |
| :CDE  | :D |
| :CDE  | :E |
--------------

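The important part is that equivalent classes are related by the path (owl:equivalentClass|^owl:equivalentClass)*, which follows owl:equivalentClass links in either direction, zero or more times. Because zero steps are allowed, every class also counts as equivalent to itself, so :ABC still matches on :B directly. Note that ?i is bound to the member of the input list, so :AZC reports :B rather than :Z here; if the input list (in both filters) were :A, :Z, :C, then :ABC and :AZC would both report :Z.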
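
The same idea can be sketched outside SPARQL, which may help when checking what the query should return. The following plain Python is only an illustration under assumed data: the names equivalence_closure and match_with_equivalents are hypothetical, and the dictionaries are hand-copied from the sample ontology rather than read from it.

```python
from collections import defaultdict


def equivalence_closure(pairs):
    """Return a function giving every class reachable from a start class
    by following equivalence links in either direction, zero or more
    times (the analogue of (owl:equivalentClass|^owl:equivalentClass)*)."""
    neighbours = defaultdict(set)
    for a, b in pairs:
        neighbours[a].add(b)
        neighbours[b].add(a)

    def reachable(start):
        seen, stack = {start}, [start]
        while stack:
            for nxt in neighbours.get(stack.pop(), ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    return reachable


def match_with_equivalents(intersections, inputs, pairs):
    """Keep an intersection only if every member has an equivalent in the
    input list, and report the matching input classes (like ?i)."""
    inputs = set(inputs)
    reachable = equivalence_closure(pairs)
    results = {}
    for name, members in intersections.items():
        matched = []
        for member in members:
            hits = reachable(member) & inputs
            if not hits:
                break  # this member has no equivalent in the input list
            matched.extend(sorted(hits))
        else:
            results[name] = matched
    return results


# Hypothetical copy of the sample data, including :Z and :AZC.
INTERSECTIONS_EQ = {
    "ABC": ["A", "B", "C"],
    "AZC": ["A", "Z", "C"],
    "CDE": ["C", "D", "E"],
    "DEF": ["D", "E", "F"],
}
EQUIVALENT = [("Z", "B")]

print(match_with_equivalents(INTERSECTIONS_EQ, ["A", "Z", "C"], EQUIVALENT))
# {'ABC': ['A', 'Z', 'C'], 'AZC': ['A', 'Z', 'C']}
```

With the input A, Z, C this reports ABC as matched by A, Z, and C, which is the shape of result the question asks for.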

Maximal results

Second, the result should only return maximal matches; so if there exists a class ABCD and I query across A,B,C,D, it would return ABCD and not ABC.

This part will probably be rather hard, if it can be done in a single query at all. SPARQL isn't really designed for comparing sets. It's easy enough to count how many classes an intersection class intersects, but checking whether one intersection's members are a subset of another's is much more awkward to express.
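
One pragmatic alternative is to run one of the earlier queries and discard the non-maximal matches in client code. The sketch below assumes the query rows have already been grouped into a dictionary from intersection class to matched members; the function name maximal_matches and the ABCD class are hypothetical, and this is a post-processing step rather than anything SPARQL does for you.

```python
def maximal_matches(matches):
    """Drop every match whose member set is a proper subset of the
    member set of some other match."""
    sets = {name: set(members) for name, members in matches.items()}
    return {
        name: members
        for name, members in matches.items()
        if not any(
            sets[name] < other  # `<` on sets means "proper subset of"
            for other_name, other in sets.items()
            if other_name != name
        )
    }


# Hypothetical grouped query results for the input A, B, C, D, E,
# assuming the ontology also defines a class ABCD.
matches = {
    "ABC": ["A", "B", "C"],
    "ABCD": ["A", "B", "C", "D"],
    "CDE": ["C", "D", "E"],
}

print(maximal_matches(matches))
# {'ABCD': ['A', 'B', 'C', 'D'], 'CDE': ['C', 'D', 'E']}
```

ABC is dropped because its members are a proper subset of ABCD's, while CDE stays because no other match contains all of C, D, and E. Two intersection classes with identical member sets are both kept, since neither is a proper subset of the other.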

Mendelssohn answered 14/3, 2014 at 14:23 Comment(4)
Hi Joshua, thanks for your (very comprehensive) help! See my edit above re: matching all classes in the intersection. Regarding maximal results - that's OK, it just would've been convenient to express this as part of the query. If I can get the rest to work as intended, I can just discard what I don't need from the result set. - Shimkus
@Shimkus The second query I provided does what you're asking for. The filter ensures that none of the intersectands are not in the input list. If none of the intersectands are not in the input list, then all of the intersectands are in the input list. In fact, in light of that, you probably don't even need to check that at least one of the intersectands is in the input list; you can just search for all intersection classes that have no intersectands not in the input list. - Mendelssohn
@Shimkus I've edited the answer to include the simplification that I mentioned in the comments. I also thought about the problem with equivalent classes, and added a working answer for that. - Mendelssohn
Hey Joshua - many thanks for the assistance, I've marked this as the answer! - Shimkus
