How can I do string concatenation, or string replacement in YAML?
A

10

74

I have this:

user_dir: /home/user
user_pics: /home/user/pics

How could I use the user_dir for user_pics? If I have to specify other properties like this, it would not be very DRY.

Adlay answered 30/3, 2011 at 8:44 Comment(0)
C
65

You can use a repeated node, like this:

user_dir: &user_home /home/user
user_pics: *user_home

I don't think you can concatenate though, so this wouldn't work:

user_dir: &user_home /home/user
user_pics: *user_home/pics
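
To see both cases side by side, here is a small sketch assuming PyYAML is the parser (other parsers may report the failure differently). It loads the plain alias, then tries the alias with text appended and records the error:

```python
import yaml  # PyYAML, assumed to be installed

# A plain alias repeats the anchored value unchanged.
reused = yaml.safe_load("""
user_dir: &user_home /home/user
user_pics: *user_home
""")
print(reused)

# Appending text to the alias is not concatenation; the loader rejects it.
try:
    yaml.safe_load("""
user_dir: &user_home /home/user
user_pics: *user_home/pics
""")
    concat_error = None
except yaml.YAMLError as exc:
    concat_error = exc
print(type(concat_error).__name__)
```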
Carltoncarly answered 10/5, 2011 at 18:36 Comment(4)
Thanks for clearing that up. Silly me for thinking this was possible.Guntar
tbh - I don't understand what is the point of those pointers if you can't use them to concatenate.. :/Glennisglennon
This answer shows how to define !join so that you can concatenate.Diffraction
@BrunoAmbrozio Because you might want your variables to be part of multiple nested structures in your yaml file and you don't want to copy the same value everywhere and keep track of multiple values.Wallywalnut
P
49

It's surprising, since the purpose of YAML anchors & references is to factor duplication out of YAML data files, that there isn't a built-in way to concatenate strings using references. Your use case of building up a path name from parts is a good example. There must be many such uses.

Fortunately there's a simple way to add string concatenation to YAML via user-defined tags.

User-defined tags are a standard YAML capability - the YAML 1.2 spec says YAML schemas allow the "use of arbitrary explicit tags". Handlers for those custom tags have to be implemented separately in each language you're targeting. Doing that in Python looks like this:

## in your python code

import yaml

## define custom tag handler
def join(loader, node):
    seq = loader.construct_sequence(node)
    return ''.join([str(i) for i in seq])

## register the tag handler
yaml.add_constructor('!join', join)

## using your sample data
yaml.load("""
user_dir: &DIR /home/user
user_pics: !join [*DIR, /pics]
""")

Which results in:

{'user_dir': '/home/user', 'user_pics': '/home/user/pics'}

You can add more items to the array, like " " or "-", if the strings should be delimited.
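
As a rough sketch of the delimiter idea, assuming PyYAML: the same join handler is registered on yaml.SafeLoader here (an assumption of this example, made so that yaml.safe_load() recognizes the tag), and the backup_name key is hypothetical:

```python
import yaml  # PyYAML, assumed to be installed

def join(loader, node):
    # Build the list of items, then glue them together as strings.
    seq = loader.construct_sequence(node)
    return ''.join(str(i) for i in seq)

# Register on SafeLoader so that yaml.safe_load() understands !join.
yaml.add_constructor('!join', join, Loader=yaml.SafeLoader)

config = yaml.safe_load("""
user_dir: &DIR /home/user
user_pics: !join [*DIR, /pics]
backup_name: !join [user, "-", pics, "-", 2011]
""")
print(config)
```

Non-string items such as 2011 are converted with str() before joining, so mixed lists are fine.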

Something similar can be done in other languages, depending on what their parsers can do.

There are several comments along the lines of "this seems wrong since YAML is a standard, implementation-neutral language". Actually that's not what YAML is. YAML is a framework for mapping YAML schemas (consisting of tags) to implementation-specific data types, e.g. how an int maps to the corresponding type in Python, JavaScript, C++, etc. There are multiple standard YAML schemas, and which one(s) a parser supports is an implementation decision. When it's useful, you can create schemas with additional custom tags, and of course that requires additional parser implementation. Whether adding custom tags is a good idea or not depends on your use case. The capability exists in YAML; whether and how to apply it is up to you. Use good judgement :).
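
To illustrate the tag-to-type mapping with the standard tags, here is a minimal sketch assuming PyYAML; the key names are made up for the example. The same scalar text is constructed as different Python types depending on the tag:

```python
import yaml  # PyYAML, assumed to be installed

data = yaml.safe_load("""
implicit: 123
as_string: !!str 123
as_float: !!float 123
""")

# Show which Python type each tag was mapped to.
type_names = {key: type(value).__name__ for key, value in data.items()}
print(type_names)
```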

Pneuma answered 22/4, 2014 at 6:53 Comment(11)
Err… So, what does it have to do with YAML, a markup language independent of Python, Haskell, or whatever?Ephesus
The YAML spec says: "Explicit typing is denoted with a tag using the exclamation point (“!”) ... Application-specific local tags may also be used." I provided an implementation of an explicit type that meets the spec. The implementation has to be in some language. I used Python because the OP said he wanted a more "DRY" approach for his YAML, and DRY is a term most often used by Python people. The same custom tag could be implemented in other languages. In other words, what the OP asked isn't available in vanilla YAML, but is available via an extension mechanism defined by YAML.Pneuma
@Ephesus Whenever someone's using YAML they're going to be using some parser written in some specific language - a similar extension can be made in that language. This is just one example for the case of Python.Irons
@KenWilliams okay, I will bear it in mind, so the next time I encounter a proprietary app using YAML for configuration, should I be in need of string concatenation, I should ask authors to provide me with parser source code and all the necessary build system pieces.Ephesus
@Ephesus As Chris Johnson said, this is an explicit extension mechanism defined by the YAML specification itself. And there's no need to be snarky.Irons
@KenWilliams okay, maybe I was confused, I'm sorry if this is the case. If this is true, I think Chris Johnson may get much more upvotes and much less confused comments if he showed a complete YAML example that is using that extension mechanism. Because right now when I'm looking at the answer, I basically see a suggestion to modify the YAML parser code. Maybe a link to some other SO answer which says "even if you don't have source code for the app, you still can easily extend YAML parser in a language you like as follows…" — that would've greatly helped.Ephesus
The existing answer does not modify the YAML parser code. It does not presume you have access to the source code of the yaml package. It simply uses its public APIs to extend it in the way allowed by the YAML spec.Irons
Hi @Strictly - Thanks for the code. It's not working when my YAML is a file though. I get: "could not determine a constructor for the tag '!join'" Could you help, please? I've also tried adding double "!" but then I get "could not determine a constructor for the tag 'tag:yaml.org,2002:join'"Glennisglennon
@BrunoAmbrozio hi! It's Chris Johnson's answer, I just edited it and it was more than a year ago, I don't remember anything about that, sorry. You should ask Chris JohnsonStrictly
How can I use it in Azure DevOps pipeline?Frightened
"DRY is a term most often used by Python people". Absolutely not. DRY is an acronym common in software engineering. It has nothing to do with Python.Enjoyable
M
9

If you are using Python with PyYAML, joining strings is possible within the YAML file. Unfortunately this is only a Python solution, not a universal one:

with os.path.join:

user_dir: &home /home/user
user_pics: !!python/object/apply:os.path.join [*home, pics]

with string.join (for completeness' sake; this method is flexible enough to cover other forms of string joining, but the string.join function only exists in Python 2):

user_dir: &home /home/user
user_pics: !!python/object/apply:string.join [[*home, pics], /]
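
These tags only work with a loader that is allowed to call Python callables named in the YAML input. A small sketch of the difference, assuming PyYAML 5.1 or later (where yaml.UnsafeLoader exists) and input you fully trust:

```python
import yaml  # PyYAML 5.1 or later assumed (for yaml.UnsafeLoader)

doc = """
user_dir: &home /home/user
user_pics: !!python/object/apply:os.path.join [*home, pics]
"""

# safe_load refuses python/* tags instead of calling os.path.join.
try:
    yaml.safe_load(doc)
    safe_error = None
except yaml.YAMLError as exc:
    safe_error = exc

# The unsafe loader executes the call; use only with trusted input.
trusted = yaml.load(doc, Loader=yaml.UnsafeLoader)
print(trusted['user_pics'])
```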
Messick answered 6/3, 2014 at 21:58 Comment(3)
Be clear on what this approach means -- you are allowing the Pyyaml parser to execute arbitrary Python code from the yaml input file. This is an extremely dangerous security problem if you don't control the source of the input. Many programs will use the yaml.safe_load() function to avoid the security issue. In that case, your code won't work.Pneuma
do you have a python3 example for string joining? cannot find module 'str' (No module named 'str')Gesture
Does not work with yaml.safe_load() as pointed out above.Appellant
E
7

I would use an array, then join the parts in the consuming code with the current OS path separator, like this:

default: &default_path "you should not use paths in config"
pictures:
  - *default_path
  - pics
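
On the consuming side, the join could look like the sketch below. The dict stands in for what a YAML loader would return for the list above, and "/home/user" is a hypothetical value for the anchored default path:

```python
import os

# Stand-in for the parsed YAML; "/home/user" is a hypothetical value
# for the anchored default path.
config = {"pictures": ["/home/user", "pics"]}

# Join the parts with the separator of the OS the program runs on.
pictures_path = os.path.join(*config["pictures"])
print(pictures_path)
```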
Eyre answered 27/2, 2014 at 22:3 Comment(0)
N
5

It seems to me that YAML itself does not define a way to do this.

The good news is that the YAML consumer might be able to understand variables.
What will consume your YAML?
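
As one illustration of a consumer that understands variables, here is a sketch using only Python's standard library. The ${user_dir} placeholder syntax is a convention of this example, not part of YAML, and the dict stands in for an already parsed file:

```python
from string import Template

# Stand-in for a parsed YAML mapping; the ${user_dir} placeholder is a
# convention of this sketch, not YAML syntax.
raw = {"user_dir": "/home/user", "user_pics": "${user_dir}/pics"}

def expand(values):
    """Substitute ${key} placeholders using the other values in the mapping."""
    return {key: Template(value).substitute(values) for key, value in values.items()}

config = expand(raw)
print(config)
```

This does a single substitution pass over string values, so placeholders that refer to other placeholders would need extra handling.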

Namtar answered 30/3, 2011 at 8:51 Comment(2)
My yaml file serves as a configuration file. What I pasted above was just as an example, to illustrate the problem.Adlay
@ArnisL. for example this one to define a path to source code directories upon building a separate module.Ephesus
D
3

string.join() won't work in Python3, but you can define a !join like this:

import functools
import yaml

class StringConcatinator(yaml.YAMLObject):
    yaml_loader = yaml.SafeLoader
    yaml_tag = '!join'
    @classmethod
    def from_yaml(cls, loader, node):
        # Start from '' so the accumulator is always a str, even with 3+ items
        return functools.reduce(lambda acc, item: acc + item.value, node.value, '')

c=yaml.safe_load('''
user_dir: &user_dir /home/user
user_pics: !join [*user_dir, /pics]''')
print(c)
Diffraction answered 19/6, 2019 at 0:24 Comment(0)
H
3

As of August 2019:

To make Chris' solution work, you need to pass Loader=yaml.Loader to yaml.load(). The complete code then looks like this:

import yaml

## define custom tag handler
def join(loader, node):
    seq = loader.construct_sequence(node)
    return ''.join([str(i) for i in seq])

## register the tag handler
yaml.add_constructor('!join', join)

## using your sample data
yaml.load("""
user_dir: &DIR /home/user
user_pics: !join [*DIR, /pics]
""", Loader=yaml.Loader)

See this GitHub issue for further discussion.

Heisel answered 2/8, 2019 at 13:18 Comment(0)
I
0

A solution similar to @Chris but using Node.JS:

const yourYaml = `
user_dir: &user_home /home/user
user_pics: !join [*user_home, '/pics']
`;

const JoinYamlType = new jsyaml.Type('!join', {
    kind: 'sequence',
    construct: (data) => data.join(''),    
})

const schema = jsyaml.DEFAULT_SCHEMA.extend([JoinYamlType]);

console.log(jsyaml.load(yourYaml, { schema }));
<script src="https://cdnjs.cloudflare.com/ajax/libs/js-yaml/4.1.0/js-yaml.min.js"></script>

To use yaml in Javascript / NodeJS we can use js-yaml:

import jsyaml from 'js-yaml';
// or
const jsyaml = require('js-yaml');
Inellineloquent answered 17/9, 2022 at 0:3 Comment(0)
A
0

Here is an example of a join tag implementation in Python with ruamel.yaml:

from ruamel.yaml import YAML

class JoinTag:
    """a tag to join strings in a list"""

    yaml_tag = u'!join'

    @classmethod
    def from_yaml(cls, constructor, node):
        seq = constructor.construct_sequence(node)
        return ''.join([str(i) for i in seq])

    @classmethod
    def to_yaml(cls, dumper, data):
        # represent the data as a sequence tagged with !join
        return dumper.represent_sequence(cls.yaml_tag, data)

    @classmethod
    def register(cls, yaml: YAML):
        yaml.register_class(cls)


if __name__ == '__main__':
    import io
    f = io.StringIO('''\
base_dir: &base_dir /this/is/a/very/very/long/path/
data_file: !join [*base_dir, data.csv]
    ''')
    yaml = YAML(typ='safe')
    JoinTag.register(yaml)
    print(yaml.load(f))

And the output will be

{'base_dir': '/this/is/a/very/very/long/path/', 'data_file': '/this/is/a/very/very/long/path/data.csv'}
Abadan answered 26/6, 2023 at 7:12 Comment(0)
M
0

YAML itself does not define variable substitution, but the OmegaConf library for Python supports it in YAML files through ${...} interpolation, which is resolved lazily by default.

The syntax for variable interpolation in such a YAML file is:

# this is test.yaml file and its contents.
server:
  host: localhost
  port: 80

client:
  url: http://${server.host}:${server.port}/
  server_port: ${server.port}
  # relative interpolation
  description: Client of ${.url}

if we use this default lazy approach:

from omegaconf import OmegaConf

conf = OmegaConf.load("test.yaml")

print(f"type: {type(conf).__name__}, value: {repr(conf)}")
print(f"url: {conf.client.url}\n")
print(f"server_port: {conf.client.server_port}\n")
print(f"description: {conf.client.description}\n")

output:

type: DictConfig, value: {'server': {'host': 'localhost', 'port': 80}, 'client': {'url': 'http://${server.host}:${server.port}/', 'server_port': '${server.port}', 'description': 'Client of ${.url}'}}

url: http://localhost:80/
server_port: 80
description: Client of http://localhost:80/

Notice that the variables are only substituted when the values are accessed and printed.

If you want to pass the entire dict as a parameter, resolve the interpolations up front instead:

from omegaconf import OmegaConf

conf = OmegaConf.load("test.yaml")
conf = OmegaConf.to_container(conf, resolve=True)
print(f"type: {type(conf).__name__}, value: {repr(conf)}")

# output
type: dict, value: {'server': {'host': 'localhost', 'port': 80}, 'client': {'url': 'http://localhost:80/', 'server_port': 80, 'description': 'Client of http://localhost:80/'}}
Matthieu answered 22/7, 2023 at 14:17 Comment(0)