SPARQL concat plus group_concat on multiple fields


Asked 13/12, 2016 at 4:25 Answered 13/12, 2016 at 9:8

I have the following RDF structure that I cannot change:

Multiple Assignments can be associated with each employee (Manager). The output I'd like would be (including the word "in" and the "&" separator):

Employee Name   | Assignment

Name 1          | Assignment1 in Location1 & Assignment2 in Location2 &....
Name 2          | Assignment1 in Location2 & Assignment3 in Location1 &....

Is there a way to do this in SPARQL?

This is what I have so far:

select ?name group_concat(DISTINCT ?description; separator("&"))
where
{
  ?employee :hasName ?name
  {
  select concat(?name, "In", ?location)
  ?employee ^:hasManager/:hasAsstName ?name
  ?employee ^:hasManager/:hasLocation ?location
  }

}

This gives me an empty employee name and lots of ?description values. It is not what I was expecting.

Ungula answered 13/12, 2016 at 4:25 Comment(1)

Do you have a query that goes part of the way? If so, edit your question to include it. If not: why not? – Affirmative 13/12, 2016 at 7:13

Assuming the nested query is otherwise fine, you should bind the CONCAT result to a variable there (with AS) so that GROUP_CONCAT has something to aggregate, and then group the results by every selected variable that is not aggregated. The query should look something like this:

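select ?name (group_concat(DISTINCT ?description; separator = " & ") as ?descriptions)
where
{
  ?employee :hasName ?name .
  {
    # Project ?employee so the subquery joins with the outer pattern.
    select ?employee (concat(?asstName, " in ", ?location) AS ?description)
    where
    {
      # A shared ?assignment keeps each assignment name with its own location.
      ?assignment :hasManager ?employee ;
                  :hasAsstName ?asstName ;
                  :hasLocation ?location .
    }
  }
}
GROUP BY ?name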

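Note the syntax for GROUP_CONCAT: the separator is written as separator = " & " after a semicolon, and the aggregate together with its AS ?descriptions alias is wrapped in parentheses.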
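To try the GROUP_CONCAT syntax in isolation, here is a minimal sketch that needs no data set. The inline VALUES rows are made-up sample data mirroring the table in the question, and ?asst is a hypothetical variable name, not something from the original data.

```sparql
# Sample rows are supplied inline, so this should run on any SPARQL 1.1 engine.
SELECT ?name (GROUP_CONCAT(DISTINCT ?description; separator = " & ") AS ?descriptions)
WHERE {
  # Hypothetical data: one row per (employee name, assignment, location).
  VALUES (?name ?asst ?location) {
    ("Name 1" "Assignment1" "Location1")
    ("Name 1" "Assignment2" "Location2")
    ("Name 2" "Assignment1" "Location2")
  }
  # Build the per-assignment string before aggregating.
  BIND (CONCAT(?asst, " in ", ?location) AS ?description)
}
GROUP BY ?name
```

This should return one row per name. SPARQL does not guarantee the order of the items inside each concatenated string, so the assignments for "Name 1" may come out in either order.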

If you remove the subquery, it should be faster. As I don't have your data, here's a very similar query on DBpedia that does not use a subquery:

SELECT ?name (GROUP_CONCAT(DISTINCT ?SpouseInfo; separator = " & ") AS ?SpousesInfo)
WHERE
{
    ?name a foaf:Person ;
          dbo:spouse ?spouse .
    ?spouse dbo:residence/rdfs:label ?residence ;
            rdfs:label ?spouse_name .

    BIND (CONCAT(?spouse_name, " lives in ", ?residence) AS ?SpouseInfo)
}
GROUP BY ?name
ORDER BY ?name
LIMIT 100

Here's the result.
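Applied to the vocabulary in the question, the same flattened shape might look like the sketch below. It is untested against the real data: the prefix IRI is a placeholder, ?assignment and ?asstName are hypothetical variable names, and it assumes each assignment node carries :hasManager, :hasAsstName and :hasLocation directly, as the property paths in the question suggest.

```sparql
# Placeholder namespace (assumption); replace with the one used in your store.
PREFIX : <http://example.org/>

SELECT ?name (GROUP_CONCAT(DISTINCT ?description; separator = " & ") AS ?descriptions)
WHERE {
  ?employee :hasName ?name .
  # One ?assignment node per row pairs each assignment name with its own location.
  ?assignment :hasManager ?employee ;
              :hasAsstName ?asstName ;
              :hasLocation ?location .
  # BIND replaces the nested SELECT from the earlier query.
  BIND (CONCAT(?asstName, " in ", ?location) AS ?description)
}
GROUP BY ?name
```

Grouping by ?name alone merges employees who share a name; adding ?employee to the GROUP BY would keep them apart if that matters for your data.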

Creepie answered 13/12, 2016 at 9:8 Comment(8)

Thanks! I missed out adding "as ?description". Is there a better way to do this? The query seems to take a long time to execute and sometimes just times out. – Ungula 13/12, 2016 at 9:51

Which triple store? How big is your data? group_concat is often expensive and, depending on the implementation, might block some optimizations. – Farthing 13/12, 2016 at 9:58

You can speed it up by removing the nested query and moving its contents into the outer query. – Creepie 13/12, 2016 at 10:6

Virtuoso; close to a billion triples. Are you all sure that this is the right query logic, though? – Ungula 13/12, 2016 at 10:24

I have provided an example with a very similar query on DBpedia. Let me know if that helps. – Creepie 13/12, 2016 at 14:50

Yes, it helps and works for my data as well. Thank you! If anyone comes up with a way to do this with better performance, please let me know. – Ungula 13/12, 2016 at 15:14

What's wrong with the performance of the DBpedia query? Have you tried removing the subquery? – Creepie 13/12, 2016 at 15:36

The DBpedia query works well. For my triple store, I expect around a million returned triples on average. Because the external service that consumes the output of this query expects the full output, I cannot use LIMIT and OFFSET to break it into smaller parts. So far so good though. Things seem to be working. Thanks! – Ungula 14/12, 2016 at 3:27
