How to recursively merge inherited json array elements?

I have the following JSON file, CMakePresets.json, which is a CMake presets file:

{
  "configurePresets": [
    {
      "name": "default",
      "hidden": true,
      "generator": "Ninja",
      "binaryDir": "${sourceDir}/_build/${presetName}",
      "cacheVariables": {
        "YIO_DEV": "1",
        "BUILD_TESTING": "1"
      }
    },
    {
      "name": "debug",
      "inherits": "default",
      "cacheVariables": {
        "CMAKE_BUILD_TYPE": "Debug"
      }
    },
    {
      "name": "release",
      "inherits": "default",
      "binaryDir": "${sourceDir}/_build/Debug",
      "cacheVariables": {
        "CMAKE_BUILD_TYPE": "Release"
      }
    },
    {
      "name": "arm",
      "inherits": "debug",
      "cacheVariables": {
        "CMAKE_TOOLCHAIN_FILE": "${sourceDir}/cmake/Toolchain/arm-none-eabi-gcc.cmake"
      }
    }
  ]
}

I want to recursively merge, using *, the configurePresets elements along the inheritance chain of a given entry name. For example, given the node named arm, I want the resulting JSON object with its inheritance resolved. Each element stores the name of its parent in .inherits: arm inherits from debug, which inherits from default.

I was able to write a Bash loop that I believe works, with the help of the question "Remove a key:value from an JSON object using jq" and this answer:

input=arm
# extract one element
g() { jq --arg name "$1" '.configurePresets[] | select(.name == $name)' CMakePresets.json; };
# get arm element
acc=$(g "$input");
# If .inherits field exists
while i=$(<<<"$acc" jq -r .inherits) && [[ -n "$i" && "$i" != "null" ]]; do
   # remove it from input
   a=$(<<<"$acc" jq 'del(.inherits)');
   # get parent element
   b=$(g "$i");
   # merge parent with current
   acc=$(printf "%s\n" "$b" "$a" | jq -s 'reduce .[] as $item ({}; . * $item)');
done;
echo "$acc"

This outputs the following, which I believe is the expected result for arm:

{
  "name": "arm",
  "hidden": true,
  "generator": "Ninja",
  "binaryDir": "${sourceDir}/_build/${presetName}",
  "cacheVariables": {
    "YIO_DEV": "1",
    "BUILD_TESTING": "1",
    "CMAKE_BUILD_TYPE": "Debug",
    "CMAKE_TOOLCHAIN_FILE": "${sourceDir}/cmake/Toolchain/arm-none-eabi-gcc.cmake"
  }
}

But I want to write it in jq. I tried, but the jq language is not intuitive for me. I can do it for a fixed number of elements, for example two:

< CMakePresets.json jq --arg name "arm" '
   def g(n): .configurePresets[] | select(.name == n);
   g($name) * (g($name) | .inherits) as $name2 | g($name2)
'

But I do not know how to write reduce .[] as $item ({}; . * $item) when $item is really g($name), which depends on the previous g($name) | .inherits. I read the jq manual and tried to learn about variables and loops, but jq has a very different syntax. I tried to use while, but that only gives a syntax error that I do not understand and do not know how to fix. I suspect while and until are not the right tools here, because they operate on the previous iteration's output, whereas the elements always have to be looked up from the root.

$ < CMakePresets.json jq --arg name "arm" 'def g(n): .configurePresets[] | select(.name == n);
while(g($name) | .inherits as $name; g($name))   
'
jq: error: syntax error, unexpected ';', expecting '|' (Unix shell quoting issues?) at <top-level>, line 2:
while(g($name) | .inherits as $name; g($name))                                      
jq: 1 compile error

How do I write such a loop in the jq language?

Vassili answered 24/3, 2021 at 17:32 Comment(0)

Assuming the inheritance hierarchy contains no loops, as is the case with the example, we can break the problem down into the pieces shown below:

# Use an inner function of arity 0 to take advantage of jq's TCO
def inherits_from($dict):
  def from:
    if .name == "default" then .
    else $dict[.inherits] as $next
    | ., ($next | from)
    end;
  from;

def chain($start):
  INDEX(.configurePresets[]; .name) as $dict
  | $dict[$start] | inherits_from($dict);

reduce chain("arm") as $x (null;
  ($x.cacheVariables + .cacheVariables) as $cv
  | $x + .
  | .cacheVariables = $cv)
| del(.inherits)

This produces the desired output efficiently.

One advantage of the above formulation of a solution is that it can easily be modified to handle circular dependencies.
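
As a rough illustration of that point, the sketch below threads a set of already-visited names through the recursion and raises an error when a name repeats. The data, the shell function name resolve, and the inner helper from/1 are hypothetical and are not part of the solution above; treat it as one possible approach under those assumptions.

```shell
# Hypothetical data: "a" inherits "b" and "b" inherits "a", forming a cycle.
cyclic='{"configurePresets":[
  {"name":"a","inherits":"b"},
  {"name":"b","inherits":"a"}]}'

# "resolve" is an illustrative name; it reads the presets document on stdin.
resolve() {
  jq -c --arg name "$1" '
    def inherits_from($dict):
      # $seen holds the names already visited on this chain
      def from($seen):
        if $seen[.name] then error("circular inheritance at " + .name)
        else .,
          (.name as $n
           | select(.inherits)
           | $dict[.inherits]
           | from($seen + {($n): true}))
        end;
      from({});

    INDEX(.configurePresets[]; .name) as $dict
    | $dict[$name]
    | reduce inherits_from($dict) as $x ({}; $x * .)
    | del(.inherits)
  '
}

# jq exits with a non-zero status when error/1 is raised.
if ! msg=$(printf '%s' "$cyclic" | resolve a 2>&1); then
  echo "rejected: $msg"
fi
```

Note that the inner function here takes an argument, so this sketch trades the arity-0 formulation for simplicity.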

Using recurse/1

inherits_from/1 could also be defined using the built-in function recurse/1:

def inherits_from($dict):
  recurse( select(.name != "default") | $dict[.inherits]) ;

or perhaps more interestingly:

def inherits_from($dict):
  recurse( select(.inherits) | $dict[.inherits]) ;
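
To see what the recurse-based definition emits, the following sketch collects only the names along the chain. The presets are trimmed down from the question for brevity, and the shell variable names are illustrative.

```shell
# Trimmed-down sample data (only the fields needed to follow the chain).
presets='{"configurePresets":[
  {"name":"default","hidden":true},
  {"name":"debug","inherits":"default"},
  {"name":"arm","inherits":"debug"}]}'

chain=$(printf '%s' "$presets" | jq -c '
  def inherits_from($dict):
    recurse( select(.inherits) | $dict[.inherits] );

  INDEX(.configurePresets[]; .name) as $dict
  | [ $dict["arm"] | inherits_from($dict) | .name ]
')
echo "$chain"   # prints ["arm","debug","default"]
```

The stream starts at the requested preset and ends at the first one without an .inherits key, so the root does not have to be named "default".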

Using *

Using * to combine objects has a high overhead because of its recursive semantics, which is often either not required or not wanted. However, if it is acceptable here to use * for combining the objects, the above can be simplified to:

def inherits_from($dict):
  recurse( select(.inherits) | $dict[.inherits]) ;

INDEX(.configurePresets[]; .name) as $dict
| $dict["arm"] 
| reduce inherits_from($dict) as $x ({};  $x * .)
| del(.inherits)
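
The difference between + and * matters for nested objects such as cacheVariables, which is why the first solution merges that key separately. A small sketch with made-up parent and child objects (the variable names are illustrative):

```shell
parent='{"generator":"Ninja","cacheVariables":{"YIO_DEV":"1"}}'
child='{"cacheVariables":{"CMAKE_BUILD_TYPE":"Debug"}}'

# + replaces the whole cacheVariables object with the one from the right-hand side
shallow=$(jq -n -c -S --argjson p "$parent" --argjson c "$child" '$p + $c')
# * merges nested objects key by key
deep=$(jq -n -c -S --argjson p "$parent" --argjson c "$child" '$p * $c')

echo "$shallow"
echo "$deep"
```

With +, YIO_DEV from the parent is lost; with *, both cache variables survive.
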
Dosia answered 24/3, 2021 at 18:31 Comment(0)

Writing a recursive function is actually simple, once you get the hang of it:

jq --arg name "$1" '
    def _get_in(input; n):
        (input[] | select(.name == n)) |
        (if .inherits then .inherits as $n | _get_in(input; $n) else {} end) * .;
    def get(name):
        .configurePresets as $input | _get_in($input; name);
    get($name)
' "$presetfile"

First I bind .configurePresets to $input. Then, inside the function, input[] | select(.name == n) picks out only the element I am interested in. If that element has .inherits, then .inherits as $n | _get_in(input; $n) takes the parent's name and calls the function again; otherwise else {} end returns an empty object. That result is then merged, via * ., with the element itself. So the function recursively builds up {} * (input[]|select()) * (input[]|select()) * (input[]|select()).

Vassili answered 26/3, 2021 at 12:47 Comment(2)
This solution is rather inefficient: (a) input[] | select(.name == n) vs having a lookup table; (b) in jq it is generally preferable to avoid having functions of arity greater than 0 calling themselves directly. (Dosia)
A stylistic point: since input/0 is a built-in function, it would be better not to use "input" as a named argument in a def. (Dosia)
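
One way the recursive function might be reworked along the lines of these comments is sketched below: it builds a lookup table with INDEX, recurses through an inner function of arity 0, and avoids the name "input". The names resolve/1 and r are illustrative, and the sample data is shortened from the question (the toolchain path is abbreviated to a placeholder).

```shell
# Sample data shortened from the question; "arm.cmake" is a placeholder path.
presets='{"configurePresets":[
  {"name":"default","hidden":true,"cacheVariables":{"YIO_DEV":"1"}},
  {"name":"debug","inherits":"default","cacheVariables":{"CMAKE_BUILD_TYPE":"Debug"}},
  {"name":"arm","inherits":"debug","cacheVariables":{"CMAKE_TOOLCHAIN_FILE":"arm.cmake"}}]}'

result=$(printf '%s' "$presets" | jq -c -S --arg name arm '
  # resolve/1 and r are illustrative names
  def resolve($dict):
    # r has arity 0: the current preset is its input
    def r: (if .inherits then $dict[.inherits] | r else {} end) * .;
    r | del(.inherits);

  INDEX(.configurePresets[]; .name) as $dict
  | $dict[$name] | resolve($dict)
')
echo "$result"
```

The merge order is the same as in the answer: the parent chain is resolved first and the current preset is applied on top of it with *.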
