Is there YAML syntax for sharing part of a list or map?
G

5

134

So, I know I can do something like this:

sitelist: &sites
  - www.foo.com
  - www.bar.com

anotherlist: *sites

And have sitelist and anotherlist both contain www.foo.com and www.bar.com. However, what I really want is for anotherlist to also contain www.baz.com, without having to repeat www.foo.com and www.bar.com.

Doing this gives me a syntax error in the YAML parser:

sitelist: &sites
  - www.foo.com
  - www.bar.com

anotherlist: *sites
  - www.baz.com

Just using anchors and aliases it doesn't seem possible to do what I want without adding another level of substructure, such as:

sitelist: &sites
  - www.foo.com
  - www.bar.com

anotherlist:
  - *sites
  - www.baz.com

Which means the consumer of this YAML file has to be aware of it.

Is there a pure YAML way of doing something like this? Or will I have to use some post-YAML processing, such as implementing variable substitution or auto-lifting of certain kinds of substructure? I'm already doing that kind of post-processing to handle a couple of other use-cases, so I'm not totally averse to it. But my YAML files are going to be written by humans, not machine generated, so I would like to minimise the number of rules that need to be memorised by my users on top of standard YAML syntax.

I'd also like to be able to do the analogous thing with maps:

namedsites: &sites
  Foo: www.foo.com
  Bar: www.bar.com

moresites: *sites
  Baz: www.baz.com

I've had a search through the YAML spec, and couldn't find anything, so I suspect the answer is just "no you can't do this". But if anyone has any ideas that would be great.


EDIT: Since there have been no answers, I'm presuming that no one has spotted anything I haven't in the YAML spec and that this can't be done at the YAML layer. So I'm opening up the question to ideas for post-processing the YAML to help with this, in case anyone finds this question in future.

Germinative answered 13/2, 2012 at 0:37 Comment(2)
Note: This issue may also be addressed with standard use of Anchors and Aliases in YAML. See also: How to merge YAML arrays?Yeung
See also: https://mcmap.net/q/109704/-what-do-the-amp-lt-lt-mean-in-this-database-yml-file/974555Ewen
M
69

The merge key type is probably what you want. It uses a special << mapping key to indicate merges, allowing an alias to a mapping (or a sequence of such aliases) to be used as an initializer to merge into a single mapping. Additionally, you can still explicitly override values, or add more that weren't present in the merge list.

It's important to note that it works with mappings, not sequences as in your first example. This makes sense when you think about it, and your example looks like it probably doesn't need to be sequential anyway. Simply changing your sequence values to mapping keys should do the trick, as in the following (untested) example:

sitelist: &sites
  ? www.foo.com  # "www.foo.com" is the key, the value is null
  ? www.bar.com

anotherlist:
  << : *sites    # merge *sites into this mapping
  ? www.baz.com  # add extra stuff

Some things to notice. Firstly, since << is a key, it can only be specified once per node. Secondly, when using a sequence of aliases as the value of <<, the order is significant: keys from mappings earlier in the sequence take precedence over the same keys in later ones. This doesn't matter in the example here, since there aren't associated values, but it's worth being aware of.
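
To make the precedence rules concrete, here is a minimal Python sketch that emulates a merge key on data that has already been parsed into plain dicts. It is only an illustration, not how any particular parser implements it; the function name apply_merge_key and the sample keys (host, port, tls) are made up for this example. A loader that supports merge keys does the equivalent work for you while loading.

```python
def apply_merge_key(node):
    """Return a copy of `node` with its '<<' entry expanded (illustrative only)."""
    if "<<" not in node:
        return dict(node)

    sources = node["<<"]
    # The value of '<<' may be a single mapping or a sequence of mappings.
    if isinstance(sources, dict):
        sources = [sources]

    merged = {}
    # Earlier mappings in the sequence win over later ones,
    # so apply them in reverse and let earlier ones overwrite.
    for source in reversed(sources):
        merged.update(source)

    # Keys written explicitly next to '<<' override anything merged in.
    merged.update({k: v for k, v in node.items() if k != "<<"})
    return merged


base = {"host": "www.foo.com", "port": 80}
override = {"port": 8080, "tls": True}

result = apply_merge_key({"<<": [override, base], "tls": False})
print(result)
```

Here port comes from override (it is listed first), host comes from base, and the explicit tls entry replaces the merged one.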

Miticide answered 1/3, 2012 at 23:34 Comment(7)
Ah, thank you! That's pretty helpful. It's a shame it doesn't work for sequences, though. You're right that order isn't important for this example; what I have is conceptually a set, but that maps much more closely to a sequence than to a mapping. And the structure of what I get out of this matters (which is why I didn't want to just add another layer of nesting to merge my structures), so having a mapping for which I need to ignore the (all null) values doesn't really work.Germinative
I don't see anything on it in the current official YAML spec: yaml.org/spec/1.2/spec.html. That page doesn't contain the word "merge", nor the text "<<", nor the phrase "key type". The << syntax does work in the Python yaml package though. Do you know where I can find out more about these sorts of extra features?Germinative
It's not directly in the spec, it's described in the tag repository. Other Schemas has a general description and link. Besides merge keys, there are also sets and ordered sets; however, YAML considers sets as a type of mapping (e.g., the above example could be implemented as a set). Does your language allow you to swap keys with values in the resulting mapping? Even if you have to implement that yourself, I think it would be cleaner; you would at least have all the data grouped together already, and your YAML would be standard.Miticide
Sets aren't mappings though; a mapping is a set of key-value associations. When I yaml.load(...) in Python, I get a dictionary as the representation of a YAML mapping. Yes, it's easy to post-process that into a set, but I have to know that that's happened (and the semantic complexity when reading/writing the config files is much higher if the rule is "sets are written as maps with null values"). Given that I need a post-processing between yaml.load(...) and using the resulting data whether I use << or MERGE, I'll probably stick with MERGE (which I've already implemented now).Germinative
You could try explicitly tagging it as a set, using the !!set notation. A Python implementation should support sets natively. PythonTagScheme looks informative.Miticide
Yeah, I did find that !!set works. Too much obscure boilerplate though. These files are made to be human readable/writeable, by people who aren't necessarily YAML experts. People are going to write down their lists of sites as YAML lists, then want to merge them and have to convert the whole thing to a set AND remember to explicitly tag it as a set... I have a couple of other standardised post-processing things along with MERGE anyway. Thanks for your help though!Germinative
How is a list with linear lookup time that allows duplicates remotely similar to a set (or hash with null values) that must have unique keys and has constant time lookup? :pIrrational
A
20

As the previous answers have pointed out, there is no built-in support for extending lists in YAML. I am offering yet another way to implement it yourself. Consider this:

defaults: &defaults
  sites:
    - www.foo.com
    - www.bar.com
     
setup1:
  <<: *defaults
  sites+:
    - www.baz.com

This will be processed into:

defaults:
  sites:
    - www.foo.com
    - www.bar.com

setup1:
  sites:
    - www.foo.com
    - www.bar.com
    - www.baz.com

The idea is to merge the contents of a key ending with a '+' to the corresponding key without a '+'. I implemented this in Python and published here.
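
For readers who want to try the idea, the following is a minimal sketch of such a post-processing step in Python. It is not the published implementation referred to above; the function name merge_plus_keys and its error handling are assumptions made for illustration. It works on the native dicts and lists returned by a YAML loader and assumes the loader has already resolved the << merge key, so that setup1 contains both sites and sites+.

```python
def merge_plus_keys(node):
    """Recursively fold every 'name+' key into the matching 'name' key."""
    if isinstance(node, list):
        return [merge_plus_keys(item) for item in node]
    if not isinstance(node, dict):
        return node

    def is_plus(key):
        return isinstance(key, str) and key.endswith("+")

    # First copy the ordinary keys, then fold the '+' keys into them.
    result = {k: merge_plus_keys(v) for k, v in node.items() if not is_plus(k)}
    for key, value in node.items():
        if not is_plus(key):
            continue
        base = key[:-1]
        extra = merge_plus_keys(value)
        if base not in result:
            result[base] = extra
        elif isinstance(result[base], list) and isinstance(extra, list):
            result[base] = result[base] + extra       # extend the list
        elif isinstance(result[base], dict) and isinstance(extra, dict):
            result[base] = {**result[base], **extra}  # extend the map
        else:
            raise TypeError(f"cannot merge {key!r} into {base!r}")
    return result


# What a loader would return for the YAML above once '<<' is resolved.
loaded = {
    "defaults": {"sites": ["www.foo.com", "www.bar.com"]},
    "setup1": {
        "sites": ["www.foo.com", "www.bar.com"],
        "sites+": ["www.baz.com"],
    },
}
processed = merge_plus_keys(loaded)
print(processed["setup1"])
```

The sketch builds new lists and dicts rather than modifying the input, so a list shared through an alias (such as defaults.sites) is not changed as a side effect.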

Archespore answered 18/1, 2017 at 20:11 Comment(3)
Note: This issue may also be addressed with standard use of Anchors and Aliases in YAML. See also: How to merge YAML arrays?Yeung
Does this mean this approach only works with a separate tool that merges sites and sites+? I mean a tool that has to be implemented by the user, as this is not default YAML behaviour?Beeeater
This is not default YAML behavior.Greenwood
G
10

(Answering my own question in case the solution I'm using is useful for anyone who searches for this in future)

With no pure-YAML way to do this, I'm going to implement this as a "syntax transformation" sitting between the YAML parser and the code that actually uses the configuration file. So my core application doesn't have to worry at all about any human-friendly redundancy-avoidance measures, and can just act directly on the resulting structures.

The structure I'm going to use looks like this:

foo:
  MERGE:
    - - a
      - b
      - c
    - - 1
      - 2
      - 3

Which would be transformed to the equivalent of:

foo:
  - a
  - b
  - c
  - 1
  - 2
  - 3

Or, with maps:

foo:
  MERGE:
    - fork: a
      spoon: b
      knife: c
    - cup: 1
      mug: 2
      glass: 3

Would be transformed to:

foo:
  fork: a
  spoon: b
  knife: c
  cup: 1
  mug: 2
  glass: 3

More formally, after calling the YAML parser to get native objects from a config file, but before passing the objects to the rest of the application, my application will walk the object graph looking for mappings containing the single key MERGE. The value associated with MERGE must be either a list of lists, or a list of maps; any other substructure is an error.

In the list-of-lists case, the entire map containing MERGE will be replaced by the child lists concatenated together in the order they appeared.

In the list-of-maps case, the entire map containing MERGE will be replaced by a single map containing all of the key/value pairs in the child maps. Where there is overlap in the keys, the value from the child map occurring last in the MERGE list will be used.

The examples given above are not that useful, since you could have just written the structure you wanted directly. It's more likely to appear as:

foo:
  MERGE:
    - *salt
    - *pepper

Allowing you to create a list or map containing everything in nodes salt and pepper being used elsewhere.

(I keep giving that foo: outer map to show that MERGE must be the only key in its mapping, which means that MERGE cannot appear as a top-level name unless there are no other top level names)
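
The transformation described above can be sketched in Python roughly as follows. This is a minimal illustration of the stated rules rather than the exact code in use; the function name expand_merge and the choice of ValueError for malformed input are assumptions. It operates on the native objects produced by the YAML parser, so aliases such as *salt and *pepper have already been replaced by the lists or maps they refer to.

```python
def expand_merge(node):
    """Replace every mapping whose only key is MERGE with its merged children."""
    if isinstance(node, list):
        return [expand_merge(item) for item in node]
    if not isinstance(node, dict):
        return node

    if set(node) == {"MERGE"}:
        parts = node["MERGE"]
        if not isinstance(parts, list):
            raise ValueError("MERGE must contain a list of lists or a list of maps")
        # Expand nested MERGE nodes first so they can be merged as well.
        parts = [expand_merge(part) for part in parts]

        if all(isinstance(part, list) for part in parts):
            # List of lists: concatenate in the order they appear.
            return [item for part in parts for item in part]
        if all(isinstance(part, dict) for part in parts):
            # List of maps: later maps win when keys overlap.
            merged = {}
            for part in parts:
                merged.update(part)
            return merged
        raise ValueError("MERGE must contain a list of lists or a list of maps")

    return {key: expand_merge(value) for key, value in node.items()}


lists = expand_merge({"foo": {"MERGE": [["a", "b", "c"], [1, 2, 3]]}})
maps = expand_merge({"foo": {"MERGE": [{"fork": "a", "cup": 1}, {"cup": 2}]}})
print(lists)
print(maps)
```

A mapping that contains MERGE alongside other keys is left untouched by this sketch; whether that should instead be reported as an error is a design decision for the application.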

Germinative answered 17/2, 2012 at 3:20 Comment(0)
D
8

To clarify something from the two answers here, this is not supported directly in YAML for lists (but it is supported for dictionaries, see kittemon's answer).

Drier answered 20/8, 2014 at 19:35 Comment(1)
Note: This issue may also be addressed with standard use of Anchors and Aliases in YAML. See also: How to merge YAML arrays?Yeung
H
5

To piggyback off of Kittemon's answer, note that you can create mappings with null values using the alternative syntax

foo:
    << : *myanchor
    bar:
    baz:

instead of the suggested syntax

foo:
    << : *myanchor
    ? bar
    ? baz

Like Kittemon's suggestion, this will allow you to use references to anchors within the mapping and avoid the sequence issue. I found myself needing to do this after discovering that the Symfony Yaml component v2.4.4 doesn't recognize the ? bar syntax.

Huckster answered 24/9, 2014 at 18:33 Comment(2)
what does myanchor look like ?Shiism
myanchor looks like &myanchor where it's declared and like *myanchor where it's used. Example in JSON-ish syntax because comments don't allow for code blocks: { original: &myanchor { foo: "Hi", bar: "World" }, later: { <<: *myanchor, foo: "Bye"} }Paddock
