"Transposing" objects in jq

I'm unsure if "transpose" is the correct term here, but I'm looking to use jq to transpose a 2-dimensional object such as this:

[
    {
        "name": "A",
        "keys": ["k1", "k2", "k3"]
    },
    {
        "name": "B",
        "keys": ["k2", "k3", "k4"]
    }
]

I'd like to transform it to:

{
    "k1": ["A"],
    "k2": ["A", "B"],
    "k3": ["A", "B"],
    "k4": ["B"]
}

I can split out the object with .[] | {key: .keys[], name} to get a list of keys and names, or I could use .[] | {(.keys[]): [.name]} to get a collection of key–value pairs {"k1": ["A"]} and so on, but I'm unsure of the final concatenation step for either approach.

Is either of these approaches heading in the right direction? Is there a better way?

Calle answered 2/9, 2015 at 15:28 Comment(0)

This should work:

map({ name, key: .keys[] })
    | group_by(.key)
    | map({ key: .[0].key, value: map(.name) })
    | from_entries

The basic approach is to convert each object to name/key pairs, regroup them by key, then map them out to entries of an object.

This produces the following output:

{
  "k1": [ "A" ],
  "k2": [ "A", "B" ],
  "k3": [ "A", "B" ],
  "k4": [ "B" ]
}
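To see what each stage of the pipeline above contributes, the stages can be run one at a time. This is only a sketch: it assumes jq is on the PATH, and input.json is a hypothetical file name holding the question's sample data.

```shell
# input.json is a hypothetical file name; its contents are the question's sample input.
cat > input.json <<'EOF'
[
  {"name": "A", "keys": ["k1", "k2", "k3"]},
  {"name": "B", "keys": ["k2", "k3", "k4"]}
]
EOF

# Stage 1: emit one {name, key} object for every key of every input object.
pairs=$(jq -c 'map({ name, key: .keys[] })' input.json)

# Stage 2: collect the pairs that share a key into sub-arrays, sorted by key.
groups=$(printf '%s' "$pairs" | jq -c 'group_by(.key)')

# Stage 3: turn each group into a {key, value} entry; map(.name) is [.[] | .name].
entries=$(printf '%s' "$groups" | jq -c 'map({ key: .[0].key, value: map(.name) })')

# Stage 4: assemble the entries into a single object.
final=$(printf '%s' "$entries" | jq -c 'from_entries')

printf '%s\n' "$pairs" "$groups" "$entries" "$final"
```

The last line printed should be the compact form of the output shown above; the earlier lines show the six pairs, the four groups, and the four entries in turn.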
Jaquesdalcroze answered 2/9, 2015 at 17:22 Comment(7)
Thanks! I'd got as far as the group_by, but I have to admit the nested map following it is still throwing me a little. Is there a simpler example or documentation of that behaviour? - Calle
When you do a group_by, it places all items that have matching keys into an array. Consequently, every item in that inner array will have the same key value. So the goal at that point was to convert the array of arrays to an array of objects. We wanted the value property to be the names found in that array, thus the inner map. - Jaquesdalcroze
It's specifically this step that's confusing me, I get the rest; the key is clear but I can't get my head around map(.name). I'm thinking of it as a nested foreach, and I guess that's my problem :-) - Calle
To understand "specifically this step", break it down a bit, and look at: .[0] | {key: .[0].key, value: map(.name)} - Zeniazenith
That helped, and seems rather obvious now! Not sure why it wasn't sinking in yesterday. - Calle
Wow, I was looking for this for a long while. I still do not get the value: map(.name) part though :( - Ligamentous
Oh, I got it by applying the definition of map(), and looking at the second line instead: .[1] | {key: .[0].key, value: [.[].name]} - Ligamentous

Here is a simple solution that may also be easier to understand. It is based on the idea that a dictionary (a JSON object) can be extended by adding details about additional (key -> value) pairs:

# input: a dictionary to be extended by key -> value 
# for each key in keys
def extend_dictionary(keys; value):
  reduce keys[] as $key (.; .[$key] += [value]);

reduce .[] as $o ({}; extend_dictionary($o.keys; $o.name) )


$ jq -c -f transpose-object.jq input.json
{"k1":["A"],"k2":["A","B"],"k3":["A","B"],"k4":["B"]}
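The reduction relies on the fact that += on a key that does not exist yet starts from null, and null + [x] is [x] in jq, so no separate initialization step is needed. A minimal sketch of just that behaviour, assuming jq is on the PATH (the key k1 is reused from the question purely as an illustration):

```shell
# -n runs the filter without reading any input; -c prints compact JSON.
# The first += creates "k1" (null + ["A"] is ["A"]); the second appends to it.
result=$(jq -nc '{} | .k1 += ["A"] | .k1 += ["B"]')
echo "$result"   # {"k1":["A","B"]}
```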
Zeniazenith answered 2/9, 2015 at 21:26 Comment(3)
This example is certainly more understandable for an imperative programmer! However I'm keen to pick up some of the more advanced data-driven techniques, which feel more like "idiomatic jq" :-) - Calle
"map" and "reduce" are like "yin" and "yang" so I'm not sure why you think one is "more advanced" than the other. jq embraces the map/reduce paradigm very nicely, so I'm puzzled why you think one is more "idiomatic" than the other. Is it because the syntax for jq's reduce does not take the form of a function call? - Zeniazenith
I understand map/reduce as a paradigm and appreciate the difference between both these solutions, but what I meant was that this feels like a more procedural approach. Now that I see the nested map above is doing the same as the nested reduce here, I'm less inclined to refer to it as advanced, however I do like its brevity! - Calle

Here is a better solution for the case that all the values of "name" are distinct. It is better because it uses a completely generic filter, invertMapping; that is, invertMapping could be a built-in or library function. With the help of this function, the solution becomes a simple three-liner.

Furthermore, if the values of "name" are not all unique, then the solution below can easily be tweaked by modifying the initial reduction of the input (i.e. the line immediately above the invocation of invertMapping).

# input: a JSON object of (key, values) pairs, in which "values" is an array of strings; 
# output: a JSON object representing the inverse relation
def invertMapping: 
  reduce to_entries[] as $pair
    ({}; reduce $pair.value[] as $v (.; .[$v] += [$pair.key] ));


map( { (.name) : .keys} )
| add
| invertMapping
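As one possible version of the tweak mentioned above for repeated names, the add step (which lets a later object with the same name overwrite an earlier one) could be replaced by a reduction that concatenates the key lists. This is a sketch under assumptions: jq is on the PATH, dupes.json and its contents are hypothetical, and repeated names are assumed to mean that their key lists should be merged.

```shell
# dupes.json is hypothetical sample data in which the name "A" occurs twice.
cat > dupes.json <<'EOF'
[
  {"name": "A", "keys": ["k1"]},
  {"name": "B", "keys": ["k2"]},
  {"name": "A", "keys": ["k2"]}
]
EOF

merged=$(jq -c '
  # invertMapping exactly as defined in the answer above.
  def invertMapping:
    reduce to_entries[] as $pair
      ({}; reduce $pair.value[] as $v (.; .[$v] += [$pair.key]));

  # Instead of map({(.name): .keys}) | add, append each key list to any
  # list already stored under the same name, so no keys are lost.
  reduce .[] as $o ({}; .[$o.name] += $o.keys)
  | invertMapping
' dupes.json)
echo "$merged"
```

With this input, k1 should map to A alone and k2 to both A and B. If the same key can occur more than once under one name, the name would be repeated in the result, so some deduplication step would still be needed.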
Zeniazenith answered 3/9, 2015 at 15:43 Comment(0)
