SPARQL concat plus group_concat on multiple fields

I have the following RDF structure that I cannot change: [image: RDF structure diagram]

Multiple assignments can be associated with each employee (manager). The output I'd like is the following (including the words "in" and "&"):

Employee Name   | Assignment

Name 1          | Assignment1 in Location1 & Assignment2 in Location2 &....
Name 2          | Assignment1 in Location2 & Assignment3 in Location1 &....

Is there a way to do this in SPARQL?

This is what I have so far:

select ?name group_concat(DISTINCT ?description; separator("&"))
where
{
  ?employee :hasName ?name
  {
  select concat(?name, "In", ?location)
  ?employee ^:hasManager/:hasAsstName ?name
  ?employee ^:hasManager/:hasLocation ?location
  }

}

This gives me an empty employee name and lots of ?description values. It does not reflect what I was expecting.

Ungula answered 13/12, 2016 at 4:25 Comment(1)
Do you have a query that goes part of the way? If so, edit your question to include it. If not, why not? – Affirmative

Assuming the logic of the nested query is what you want, you should bind the CONCAT result to a variable there, aggregate that variable with GROUP_CONCAT, and then group the results by every variable that is not aggregated. The query should look something like this:

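select ?name (group_concat(DISTINCT ?description; separator = " & ") as ?descriptions)
where
{
  ?employee :hasName ?name .
  {
    # Project ?employee so the subquery joins with the outer pattern;
    # ?asstName avoids reusing ?name for the assignment name.
    select ?employee (concat(?asstName, " in ", ?location) AS ?description)
    where
    {
      # One ?assignment node keeps each name paired with its own location.
      ?assignment :hasManager ?employee ;
                  :hasAsstName ?asstName ;
                  :hasLocation ?location .
    }
  }
}
GROUP BY ?name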

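Note the syntax for GROUP_CONCAT: the separator is written as separator = " & ", and the aggregate has to be aliased with AS inside parentheses.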
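To try the CONCAT plus GROUP_CONCAT syntax without touching your data, here is a small sketch that uses inline VALUES rows. The three rows are made-up sample data shaped like the table in the question, and ?asstName is just an illustrative variable name:

```sparql
# Inline sample rows (hypothetical data) stand in for the real triples.
SELECT ?name (GROUP_CONCAT(DISTINCT ?description; separator = " & ") AS ?descriptions)
WHERE
{
  VALUES (?name ?asstName ?location) {
    ("Name 1" "Assignment1" "Location1")
    ("Name 1" "Assignment2" "Location2")
    ("Name 2" "Assignment1" "Location2")
  }
  # Build one "X in Y" string per row, then aggregate per ?name.
  BIND (CONCAT(?asstName, " in ", ?location) AS ?description)
}
GROUP BY ?name
```

This should return one row per name. SPARQL does not guarantee the order of the values inside a GROUP_CONCAT result, so the concatenated parts may appear in any order.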

If you remove the subquery, it will be much faster. As I don't have your data, here's a very similar query on DBpedia that does not use a subquery:

SELECT ?name (GROUP_CONCAT(DISTINCT ?SpouseInfo; separator = " & ") AS ?SpousesInfo)
{
    ?name a foaf:Person ;
          dbo:spouse ?spouse .
    ?spouse dbo:residence/rdfs:label ?residence ;
            rdfs:label ?spouse_name .

    BIND (CONCAT(?spouse_name, " lives in ", ?residence) AS ?SpouseInfo)
}
GROUP BY ?name
ORDER BY ?name
LIMIT 100

Here's the result.
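Applied to the vocabulary from the question, the same subquery-free shape might look like the sketch below. It assumes each assignment node points to its manager, name and location through :hasManager, :hasAsstName and :hasLocation, which is what the property paths in the question imply. The variables ?assignment and ?asstName are names I introduced, and the default prefix (:) has to be declared as in your own queries:

```sparql
# Sketch only; the data shape is assumed from the question's property paths.
SELECT ?name (GROUP_CONCAT(DISTINCT ?description; separator = " & ") AS ?descriptions)
WHERE
{
  ?employee :hasName ?name .
  # A single ?assignment node keeps each assignment name with its own location.
  ?assignment :hasManager ?employee ;
              :hasAsstName ?asstName ;
              :hasLocation ?location .
  BIND (CONCAT(?asstName, " in ", ?location) AS ?description)
}
# Grouping by ?employee as well keeps two employees with the same name apart.
GROUP BY ?employee ?name
```

If ?location or ?asstName turns out to be an IRI rather than a literal, wrap it in STR() before passing it to CONCAT.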

Creepie answered 13/12, 2016 at 9:8 Comment(8)
Thanks! I missed adding "as ?description". Is there a better way to do this? The query seems to take a long time to execute and sometimes just times out. – Ungula
Which triple store? How big is your data? group_concat is often expensive and, depending on the implementation, might block some optimizations. – Farthing
You can speed it up by removing the nested query and moving its content into the outer query. – Creepie
Virtuoso; close to a billion triples. Are you all sure that this is the right query logic, though? – Ungula
I have provided an example with a very similar query on DBpedia. Let me know if that helps. – Creepie
Yes, it helps and works for my data as well. Thank you! If anyone comes up with a way to do this with better performance, please let me know. – Ungula
What's wrong with the performance of the DBpedia query? Have you tried removing the subquery? – Creepie
The DBpedia query works well. For my triple store, I expect around a million returned triples on average. Because the external service that consumes the output of this query expects the full output, I cannot use LIMIT and OFFSET to break the output into smaller parts. So far so good, though. Things seem to be working. Thanks! – Ungula
