obtain the matrix in protege

My work is on a recommendation system for library books, and as input I need a book classification ontology. My ontology classifies library books into 14 categories, alongside the sibling classes Author, Book, and Isbn. The individuals in the Book class are the books' subjects (about 600 subjects), the individuals in the Author class are the authors' names, and similarly for the Isbn class. I designed this ontology with Protégé 4.1.

I have also manually collected part of the book-to-category assignments. An object property named “hasSubject” relates individuals of the Book class to categories. For example, book “A” hasSubject categories “S” and “F” and... As a result, I want to get the matrix of book-category membership: an entry is 1 if the book belongs to the category and 0 otherwise. Like this:

     cat1   cat2   cat3   
book1   1      0      0   
book2   1      0      1   
book3   1      1      0  

In this example, book1 belongs to category 1 and does not belong to categories 2 and 3. How can I do this with SPARQL in Protégé?

Dissenter answered 29/7, 2013 at 13:4 Comment(0)

Handling a fixed number of categories

Given data like

@prefix : <http://example.org/books/> .

:book1 a :Book, :Cat1 .
:book2 a :Book, :Cat1, :Cat3 .
:book3 a :Book, :Cat1, :Cat2 .

you can use a query like

prefix : <http://example.org/books/>

select ?individual
       (if(bound(?cat1),1,0) as ?Cat1)
       (if(bound(?cat2),1,0) as ?Cat2)
       (if(bound(?cat3),1,0) as ?Cat3)
where {
  ?individual a :Book .
  OPTIONAL { ?individual a :Cat1 . bind( ?individual as ?cat1 ) } 
  OPTIONAL { ?individual a :Cat2 . bind( ?individual as ?cat2 ) }
  OPTIONAL { ?individual a :Cat3 . bind( ?individual as ?cat3 ) }
}
order by ?individual

in which certain variables are bound (the particular value to which they are bound doesn't really matter) depending on whether certain triples are present, to get results like these:

$ arq --data data.n3 --query matrix.sparql
-----------------------------------
| individual | Cat1 | Cat2 | Cat3 |
===================================
| :book1     | 1    | 0    | 0    |
| :book2     | 1    | 0    | 1    |
| :book3     | 1    | 1    | 0    |
-----------------------------------
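With more than a handful of categories, the repeated select expressions and OPTIONAL blocks get tedious to type, so one option is to generate the query text. Here's a minimal Python sketch of that idea. The function name build_matrix_query and the ?in_... helper variables are made up for this illustration, and it assumes each category's local name is also a legal SPARQL variable name (no hyphens, dots, and so on).

```python
def build_matrix_query(categories, prefix="http://example.org/books/"):
    """Return SPARQL text for a 0/1 matrix over the given category local names.

    Assumption: every name in `categories` is valid as a SPARQL variable name.
    """
    select_lines = []
    optional_lines = []
    for name in categories:
        # One projected 0/1 column per category, e.g. ?Cat1.
        select_lines.append(f"       (if(bound(?in_{name}),1,0) as ?{name})")
        # The helper variable ?in_<name> is bound only when the triple matches.
        optional_lines.append(
            f"  OPTIONAL {{ ?individual a :{name} . bind( ?individual as ?in_{name} ) }}"
        )
    return "\n".join(
        [f"prefix : <{prefix}>", "", "select ?individual"]
        + select_lines
        + ["where {", "  ?individual a :Book ."]
        + optional_lines
        + ["}", "order by ?individual"]
    )


query = build_matrix_query(["Cat1", "Cat2", "Cat3"])
print(query)
```

The printed text can be saved to a file and run the same way as the hand-written query. The list of category names still has to come from somewhere, for example from a separate query that selects the categories.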

Handling an arbitrary number of categories

Here's a solution that seems to work in Jena, though I'm not sure that the specific results are guaranteed. (Update: Based on this answers.semanticweb.com question and answer, it seems that this behavior is not guaranteed by the SPARQL specification.) If we have a little more data, namely about which things are categories and which are books, e.g.,

@prefix : <http://example.org/books/> .

:book1 a :Book, :Cat1 .
:book2 a :Book, :Cat1, :Cat3 .
:book3 a :Book, :Cat1, :Cat2 .

:Cat1 a :Category .
:Cat2 a :Category .
:Cat3 a :Category .

then we can run a subquery that selects all the categories in order, and then for each book computes a string indicating whether or not the book is in each category.

prefix : <http://example.org/books/>

select ?book (group_concat(?isCat) as ?matrix) where { 
  { 
    select ?category where { 
      ?category a :Category 
    }
    order by ?category 
  }
  ?book a :Book .
  OPTIONAL { bind( 1 as ?isCat )              ?book a ?category . }
  OPTIONAL { bind( 0 as ?isCat ) NOT EXISTS { ?book a ?category } }
}
group by ?book
order by ?book

This has the output:

$ arq --data data.n3 --query matrix2.query
--------------------
| book   | matrix  |
====================
| :book1 | "1 0 0" |
| :book2 | "1 0 1" |
| :book3 | "1 1 0" |
--------------------

which is much closer to the output in the question, and handles an arbitrary number of categories. However, it depends on the values of ?category being processed in the same order for each ?book, which, as noted in the update above, the SPARQL specification does not appear to guarantee.
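If relying on that ordering is a concern, one alternative is to have SPARQL return plain (book, category) pairs and build the rows in client code, where the column order is fixed explicitly. The following is only a sketch under that assumption: the pairs are hard-coded to match the sample data rather than fetched from a query, and membership_matrix is a name invented here.

```python
def membership_matrix(pairs, books, categories):
    """Pivot (book, category) pairs into 0/1 rows with one fixed column order."""
    columns = sorted(categories)   # a single explicit order shared by every row
    members = set(pairs)           # set lookup for fast membership checks
    rows = {
        book: [1 if (book, cat) in members else 0 for cat in columns]
        for book in sorted(books)
    }
    return columns, rows


# Hard-coded stand-in for the results of a query returning one row per
# (book, category) membership; this mirrors the sample data above.
pairs = [
    ("book1", "Cat1"),
    ("book2", "Cat1"), ("book2", "Cat3"),
    ("book3", "Cat1"), ("book3", "Cat2"),
]
columns, rows = membership_matrix(
    pairs,
    books=["book1", "book2", "book3"],
    categories=["Cat1", "Cat2", "Cat3"],
)

print("     ", *columns)
for book, row in rows.items():
    print(book, *row)
```

Because every row is built from the same columns list, the result does not depend on the order in which the query engine happens to return solutions.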

We can even use the group_concat query to generate a header row for the table. Again, this depends on the ?category values being processed in the same order for each ?book, which might not be guaranteed, but seems to work in Jena. To get a category header, all we need to do is create a row where ?book is unbound and the value of ?isCat is the category itself:

prefix : <http://example.org/books/>

select ?book (group_concat(?isCat) as ?matrix) where { 
  { 
    select ?category where { 
      ?category a :Category 
    }
    order by ?category 
  }

  # This generates the header row where ?isCat is just
  # the category, so the group_concat gives headers.
  { 
    bind(?category as ?isCat) 
  }
  UNION 
  # This is the table as before
  {
    ?book a :Book .
    OPTIONAL { bind( 1 as ?isCat )              ?book a ?category . }
    OPTIONAL { bind( 0 as ?isCat ) NOT EXISTS { ?book a ?category } }
  }
}
group by ?book
order by ?book

We get this output:

--------------------------------------------------------------------------------------------------------
| book   | matrix                                                                                      |
========================================================================================================
|        | "http://example.org/books/Cat1 http://example.org/books/Cat2 http://example.org/books/Cat3" |
| :book1 | "1 0 0"                                                                                     |
| :book2 | "1 0 1"                                                                                     |
| :book3 | "1 1 0"                                                                                     |
--------------------------------------------------------------------------------------------------------

Using some string manipulation, you could shorten the URIs used for the categories, or widen the array entries to get correct alignment. One possibility is this:

prefix : <http://example.org/books/>

select ?book (group_concat(?isCat) as ?categories) where { 
  { 
    select ?category
           (strafter(str(?category),"http://example.org/books/") as ?name)
     where { 
      ?category a :Category 
    }
    order by ?category 
  }

  { 
    bind(?name as ?isCat)
  }
  UNION 
  {
    ?book a :Book .
    # The string manipulation here takes the name of the category (which should
    # be at least two characters), trims off the first character (string indexing
    # in XPath functions starts at 1), and replaces each remaining character with
    # " ". The resulting spaces are concatenated with "1" or "0" depending on
    # whether the book is a member of the category. The resulting string has the
    # same width as the category name, and makes for a nice table.
    OPTIONAL { bind( concat(replace(substr(?name,2),"."," "),"1") as ?isCat )              ?book a ?category . }
    OPTIONAL { bind( concat(replace(substr(?name,2),"."," "),"0") as ?isCat ) NOT EXISTS { ?book a ?category } }
  }
}
group by ?book
order by ?book

which produces this output:

$ arq --data data.n3 --query matrix3.query
-----------------------------
| book   | categories       |
=============================
|        | "Cat1 Cat2 Cat3" |
| :book1 | "   1    0    0" |
| :book2 | "   1    0    1" |
| :book3 | "   1    1    0" |
-----------------------------

which is almost exactly what you had in the question.
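For readers who find the concat/replace/substr expression hard to follow, here is the same padding logic restated in Python. This is just an illustration of what the expression computes, not part of the query; padded_cell is a made-up helper name, and the membership set is hard-coded to book2's categories from the sample data.

```python
import re


def padded_cell(name, is_member):
    """Mimic concat(replace(substr(?name,2),"."," "), "1" or "0")."""
    # substr(?name, 2) is 1-based, so it corresponds to name[1:] in Python.
    rest = name[1:]
    # replace(..., ".", " ") is a regex replace: every character becomes a space.
    spaces = re.sub(".", " ", rest)
    # Appending one digit brings the cell back to the width of the name.
    return spaces + ("1" if is_member else "0")


names = ["Cat1", "Cat2", "Cat3"]
book2_categories = {"Cat1", "Cat3"}   # book2 in the sample data

header = " ".join(names)
row = " ".join(padded_cell(n, n in book2_categories) for n in names)
print(header)
print(row)
```

Each cell ends up exactly as wide as its category name, which is why the digits line up under the last character of each header entry.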

Fauman answered 29/7, 2013 at 16:6 Comment(6)
Thank you, Jashua Taylor. But if I have a large number of books and categories, do I have to do this for every book and category? - Dissenter
@Dissenter If you have a large number of books and categories, this kind of query will become tedious to write, and you might be better off doing something manually. E.g., you could first select all the categories, and then for each category, check whether the book is in the category, and then create the table based on that data. SPARQL can help you get the data, but you still probably need to handle some of the presentation aspects. - Fauman
@Dissenter I think I was wrong in my previous comment. While it still may be a good idea to do more advanced formatting outside of SPARQL, it's actually not all that hard to get some nicely formatted output for an arbitrary number of categories. The result is almost exactly what you had in the question. I've updated my answer accordingly. - Fauman
OK, thank you very much, Mr. Taylor. I want to take the matrix and then apply the cosine similarity formula to it. - Dissenter
@Dissenter I don't know what the cosine similarity formula is, and that's probably better posed as a separate question anyhow. Posted as a separate question, it will get more attention, and that helps keep each question self-contained. - Fauman
OK, thanks. I want to obtain two matrices and combine them in a later step. One is the 0/1 matrix above; the other I have posted as a separate question, as you suggested: stackoverflow.com/questions/17972155/apply-formula-in-protege . Thank you very much. - Dissenter
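As a side note on the cosine similarity mentioned in the comments: once the 0/1 rows are available outside SPARQL, the standard formula (dot product divided by the product of the vector lengths) is short to compute. The sketch below is only an assumption about what the asker has in mind. The rows are copied from the example output, and returning 0.0 for a book with no categories is a convention chosen here, not something from the thread.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # assumed convention for an all-zero row
    return dot / (norm_u * norm_v)


# Rows copied from the example matrix in the answer.
matrix = {
    "book1": [1, 0, 0],
    "book2": [1, 0, 1],
    "book3": [1, 1, 0],
}

print(round(cosine_similarity(matrix["book1"], matrix["book2"]), 4))
print(round(cosine_similarity(matrix["book2"], matrix["book3"]), 4))
```

For binary rows like these, the numerator is simply the number of categories the two books share.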
