How to get property labels from Wikidata using SPARQL

I am using SPARQLWrapper to send SPARQL queries to Wikidata. At the moment I am trying to find all properties for an entity, e.g. with a simple triple pattern such as wd:Q11663 ?a ?b. This in itself works, but I am trying to find human-readable labels for the returned properties and entities.

Although SERVICE wikibase:label works in Wikidata's query GUI, it does not seem to work with SPARQLWrapper, which insists on returning identical values for a variable and its 'label'.

Querying on the property rdfs:label works for the entity (?b), but this approach does not work with the property (?a).

It would appear the property is being returned as a full URI such as http://www.wikidata.org/prop/direct/P1536. Using the GUI I can successfully query wd:P1536 ?a ?b. This works with SPARQLWrapper if I send it as a second query, but not as part of the first query.

Here is my code:

from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("http://query.wikidata.org/sparql")

sparql.setQuery("""
  SELECT ?a ?aLabel ?propLabel ?b ?bLabel
  WHERE
  {
    wd:Q11663 ?a ?b.

    # Doesn't work with SPARQLWrapper
    #SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
    #?prop wikibase:directClaim ?p

    # but this does (and is more portable)
    ?b rdfs:label ?bLabel. filter(lang(?bLabel) = "en").

    # doesn't work
    #?a rdfs:label ?aLabel. 

    # property code can be extracted successfully
    BIND(  strafter(str(?a), "prop/direct/") AS ?propLabel).
    #BIND( CONCAT("wd:", strafter(str(?a), "prop/direct/") ) AS ?propLabel).

    # No matches, even if I concat 'wd:' to ?propLabel
    ?propLabel rdfs:label ?aLabel
    # generic search for any properties also fails
    #?propLabel ?zz ?aLabel.
   }
 """)

# However, this returns a label for P1536 - which is one of wd:Q11663's properties
sparql.setQuery("""SELECT ?b WHERE
   {
      wd:P1536 rdfs:label ?b.
   }
""")

So how can I get the labels for the properties in one query (which should be more efficient)?

[aside: yes I'm a bit rough & ready with the EN filter - often dropping it if I'm not getting anything back]

Audet answered 7/6, 2019 at 1:21 Comment(7)
Your query is a bit confusing. You said it doesn't work with the label service, but you used #?prop wikibase:directClaim ?p whereas the property is called ?a in the triple pattern above. That would indeed not work. You would also have to put something like ?b rdfs:label ?bLabel. filter(lang(?bLabel) = "en"). into an OPTIONAL clause, otherwise you won't get any literal values, which never have a label. The line BIND( strafter(str(?a), "prop/direct/") AS ?propLabel). makes ?propLabel a plain string literal, thus ?propLabel rdfs:label ?aLabel can't work. - Differentiation
My suggestion: SELECT ?a ?propLabel ?b ?bLabel WHERE { wd:Q11663 ?a ?b. SERVICE wikibase:label { bd:serviceParam wikibase:language "en". } ?prop wikibase:directClaim ?a . } - Differentiation
@AKSW: Yes, a bit confusing for all of us - I'm early on the learning curve with SPARQL :-) . I didn't try much at all with directClaim because the SERVICE line wasn't working with SPARQLWrapper. I've just tried your suggestion and that looks to work - thanks. I need to read up on directClaim etc. - Audet
Re. the string literal: I wondered if that was what was going on, but I couldn't find a way to convert a string literal into an element. How would I get the property label in a generic/standard SPARQL way without relying on Wikidata's extensions? - Audet
@AKSW: Okay, I've gone through your code comparing it with mine. I believe I now understand what is going on - thanks! It looks like the label service defaults to the entity value if it can't find anything, so it looked like the results were messed up and returning identical entity names and labels. In reality some were, some weren't. Adding the OPTIONAL to the rdfs:label clause highlighted that. I also now understand wikibase:directClaim - quite simple really! You're welcome to post it as an answer, or I can post an answer with my code and an explanation as I understand it. - Audet
I'm still trying to understand why you're saying "SERVICE line wasn't working with SPARQLWrapper" - the expression "not working" on its own is never meaningful in computer science. But yes, the label service of Wikidata has a fallback to the entity URI/literal itself. That's why we don't have to care about literals at all here: in RDF, literals can't be the subject of a triple, thus they can't have a label. - Differentiation
Yeah, feel free to post it as an answer. I guess it's always easier to understand if the OP describes the solution, given that he knows best which issues came up while writing the query. - Differentiation

I was having problems with two approaches - and the code above contains a mixture of both. Also, SPARQLWrapper isn't a problem here.

The first approach using the wikibase Label service should be like this:

SELECT ?a ?aLabel ?propLabel ?b ?bLabel
WHERE
{
  ?item rdfs:label "weather"@en.
  ?item ?a ?b.

  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". } 
  ?prop wikibase:directClaim ?a .
}

This code also includes a lookup from the label ('weather') to the query entity (?item).

The SERVICE was working, but if there isn't an rdfs:label definition then it just returns the entity. The GUI and SPARQLWrapper (to the SPARQL endpoint) were simply returning the results in a different order - so it looked like I was seeing lots of 'failed' output (ie. entities and failed labels both being reported as the same).
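The fallback is easy to spot mechanically once the results are in hand. Here is a minimal sketch, assuming the standard SPARQL JSON results layout (a list of bindings, each mapping a variable name to a dict with a "value" key) and assuming the fallback label is either the full value or its last path segment. The helper names is_fallback_label and unlabelled_values are made up for illustration.

```python
def is_fallback_label(value: str, label: str) -> bool:
    """Return True when `label` looks like the label service's fallback.

    Assumption: with no real label available, the service echoes either
    the full value or only its last path segment (e.g. "Q11663").
    """
    local_name = value.rstrip("/").rsplit("/", 1)[-1]
    return label in (value, local_name)


def unlabelled_values(bindings, var):
    """Collect the values of `var` whose `<var>Label` is only a fallback."""
    missing = []
    for row in bindings:
        if var not in row:
            continue  # variable unbound in this row
        value = row[var]["value"]
        # A missing label variable is treated like a fallback.
        label = row.get(var + "Label", {}).get("value", value)
        if is_fallback_label(value, label):
            missing.append(value)
    return missing
```

Calling unlabelled_values(bindings, "a") on the results of the query above should list every ?a for which ?aLabel carries no extra information. Plain literals are reported too, since they never have a label.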

This became clear when I started adding an OPTIONAL clause to the approach below.

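The ?prop wikibase:directClaim ?a . line turns out to be pretty simple. Wikibase uses directClaim to link a property entity (e.g. wd:P1536) to the direct predicate that appears in statements (e.g. http://www.wikidata.org/prop/direct/P1536). This allows triples about the property itself (e.g. a label) to be attached to the property entity. Many other ontologies just use the same identifier for both.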
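To make the two identifiers concrete, here is a small Python sketch that derives the property entity URI from a direct predicate URI and prints the triple that links them. It assumes Wikidata's usual URI layout (the two prefixes below). The function names are hypothetical, and inside a query the directClaim triple itself remains the more reliable route than string rewriting.

```python
# Assumed Wikidata namespaces: wdt: (direct predicates) and wd: (entities).
DIRECT_PREFIX = "http://www.wikidata.org/prop/direct/"
ENTITY_PREFIX = "http://www.wikidata.org/entity/"


def property_entity_uri(direct_uri: str) -> str:
    """Map a direct predicate URI to the URI of its property entity."""
    if not direct_uri.startswith(DIRECT_PREFIX):
        raise ValueError(f"not a direct property URI: {direct_uri}")
    return ENTITY_PREFIX + direct_uri[len(DIRECT_PREFIX):]


def direct_claim_triple(direct_uri: str) -> str:
    """Render the directClaim triple linking the entity to the predicate."""
    entity_uri = property_entity_uri(direct_uri)
    return f"<{entity_uri}> wikibase:directClaim <{direct_uri}> ."


if __name__ == "__main__":
    print(direct_claim_triple(DIRECT_PREFIX + "P1536"))
```

For http://www.wikidata.org/prop/direct/P1536 this gives http://www.wikidata.org/entity/P1536, which is the node that carries the rdfs:label.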

My second (more generic) approach is the one you find in many of the books and online tutorials. The problem here is that the predicates Wikidata returns are full URLs in the prop/direct/ namespace, and I needed to convert them into the corresponding property entity. I tried string manipulation but this produces a string literal - not an entity. The solution is to use directClaim again:

?prop wikibase:directClaim ?a .
?prop rdfs:label ?propLabel.  filter(lang(?propLabel) = "en").

Note that this only returns a result if rdfs:label is defined. Adding an OPTIONAL will return results even if there is no label defined.
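Putting the pieces together, this is a sketch of how the second approach might look from Python with both label lookups wrapped in OPTIONAL. The function names, the default agent string and the https endpoint URL are my own choices rather than anything from the code above, and run_query needs network access plus the SPARQLWrapper package.

```python
ENDPOINT = "https://query.wikidata.org/sparql"  # assumed endpoint URL


def build_property_label_query(item_id: str, lang: str = "en") -> str:
    """Build a query listing an item's direct properties with optional labels."""
    return f"""
SELECT ?a ?propLabel ?b ?bLabel
WHERE
{{
  wd:{item_id} ?a ?b.
  ?prop wikibase:directClaim ?a.
  OPTIONAL {{ ?prop rdfs:label ?propLabel. FILTER(LANG(?propLabel) = "{lang}") }}
  OPTIONAL {{ ?b rdfs:label ?bLabel. FILTER(LANG(?bLabel) = "{lang}") }}
}}
"""


def run_query(query: str, agent: str = "property-label-example/0.1"):
    """Send the query and return the list of result bindings."""
    # Imported here so the helpers above and below work without the package.
    from SPARQLWrapper import SPARQLWrapper, JSON

    sparql = SPARQLWrapper(ENDPOINT, agent=agent)
    sparql.setQuery(query)
    sparql.setReturnFormat(JSON)
    return sparql.query().convert()["results"]["bindings"]


def rows_to_pairs(bindings):
    """Turn bindings into (property, value) pairs, preferring labels."""
    pairs = []
    for row in bindings:
        prop = row.get("propLabel", row["a"])["value"]
        value = row.get("bLabel", row["b"])["value"]
        pairs.append((prop, value))
    return pairs
```

A typical call would be rows_to_pairs(run_query(build_property_label_query("Q11663"))). Rows without a label fall back to the raw URI or literal, which mirrors what the label service does.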

Audet answered 8/6, 2019 at 15:45 Comment(1)
Thank you! See example at Wikidata Query Service, how to get a name of property: w.wiki/6c7ARaskind
