Search & Replace String Value in YAML with format using Shell [duplicate]
I need to find and replace a value in a YAML file while keeping the YAML formatting correct (a space after the colon and quotes around the value). With the sample YAML below, I can replace the jdbcUrl value using the sed command below. However, the result loses the space after the colon (required by YAML) and the value is not quoted. How can I make sed keep the space and add the quotes?

Script for find and replace URL:

DB_URL='jdbc:mysql://localhost:3306/sd?autoReconnect=true'
sed -i -e 's, MYDATABASE,'$DB_URL',g' input.yaml

Sample Input Yaml:

- name: AP_DB
      description: "datasource"
      jndiConfig:
        name: jdbc/AP_DB
      definition:
        type: RDBMS
        configuration:
          jdbcUrl: MYDATABASE
          username: username
          password: password
          driverClassName: com.mysql.jdbc.Driver

Required Output Yaml:

- name: AP_DB
      description: "datasource"
      jndiConfig:
        name: jdbc/AP_DB
      definition:
        type: RDBMS
        configuration:
          jdbcUrl: 'jdbc:mysql://localhost:3306/sd?autoReconnect=true'
          username: username
          password: password
          driverClassName: com.mysql.jdbc.Driver
Swansdown answered 9/2, 2019 at 15:44 Comment(2)
sed is the wrong tool for the job if you want a guarantee that output will always be valid YAML, or that all possible formulations of the same input document will be parsed (YAML has lots of different ways to represent the same content, so this is a major concern in practice). Consider yq, a jq wrapper that parses and generates YAML. - Protozoon
@CharlesDuffy, Re "guarantee": an over-semi-formal and perhaps pedagogically counter-productive nitpick -- sed is Turing complete, therefore it necessarily must be possible to somehow write sed code that would guarantee the desired output, presumably by using conditional branching to transcend those limitations of regex that make it unsuitable for parsing YAML. (Aside from that, sed would be a poor tool for this job.) - Reposit

You seem to have a few misconceptions that are keeping you from solving this:

  • Your input file is invalid YAML and the replacement of MYDATABASE is not going to fix that. You cannot have a scalar value for name and a mapping (starting with key description) at the same time. I assume that your file would need to look like:

    - name: AP_DB
      description: "datasource"
      jndiConfig:
        name: jdbc/AP_DB
      definition:
        type: RDBMS
        configuration:
          jdbcUrl: MYDATABASE
          username: username
          password: password
          driverClassName: com.mysql.jdbc.Driver
    
  • Adding quotes to the value assigned to DB_URL in the shell doesn't make a difference: the shell removes them, so they never become part of the value.

  • You are not using the shell, but sed, to make the changes. Your shell is just used to invoke sed.
  • You invoke sed with -i, which overwrites your input.yaml. That makes it difficult to see whether the output is correct, and you need to roll back your changes after every attempt.
  • The space is not a prefix; it is the whitespace that normally needs to follow YAML's value indicator (:).
  • You match that space in your matching pattern, but you don't have it in your substitution pattern, nor are there any quotes in the substitution pattern. You probably think the quotes surrounding $DB_URL end up in the output, but they don't: they only end and restart the shell's single-quoted string.
  • The quotes around the URL in your output are superfluous.
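
The point about the quotes can be checked without running sed. The sketch below uses Python's shlex module, which performs POSIX-style quote removal, to show the script argument sed would receive from the command in the question. It assumes $DB_URL has already been expanded by hand, since shlex does not expand variables; the variable names are illustrative only.

```python
import shlex

db_url = "jdbc:mysql://localhost:3306/sd?autoReconnect=true"

# The command line from the question, with $DB_URL substituted manually
# (shlex removes quotes like a POSIX shell but does not expand variables).
command = "sed -i -e 's, MYDATABASE,'" + db_url + "',g' input.yaml"

argv = shlex.split(command)
sed_script = argv[3]  # the argument that follows -e

# The single quotes are gone: they only delimited shell strings.
print(sed_script)
```

The printed script contains neither quotes nor a space before the URL, which is why neither shows up in the YAML.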

If you really want the output as you indicate, there are several options. First of all, you could just change the relevant line in your YAML to include the quotes:

      jdbcUrl: 'MYDATABASE'

and slightly change your sed command:

sed -e 's,MYDATABASE,'"$DB_URL"',g' < input.yaml

If you cannot change the input.yaml, you can just add the quotes (and the space) to the sed substitution:

sed -e 's, MYDATABASE, "'"$DB_URL"'",g' < input.yaml

Or drop the shell's single quotes around the sed script and use double quotes instead, which do allow $DB_URL to be expanded. The space and the single quotes inside the script are then literal text placed before and after $DB_URL:

sed -e "s, MYDATABASE, '$DB_URL',g" < input.yaml

Once you have verified that one of these solutions works, you can re-add the in-place replacement option -i to sed.
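
To preview what these variants produce without touching input.yaml, the same literal substitution can be mimicked in Python. This is only a rough stand-in: str.replace is a plain-text replace, whereas sed treats the pattern as a regular expression and gives & a special meaning in the replacement. For this particular line and URL the results are the same.

```python
db_url = "jdbc:mysql://localhost:3306/sd?autoReconnect=true"
line = "          jdbcUrl: MYDATABASE"

# Stand-in for: sed -e "s, MYDATABASE, '$DB_URL',g"
# The matched space is put back and the quotes are part of the replacement.
quoted = line.replace(" MYDATABASE", " '" + db_url + "'")

# Stand-in for the command from the question: the matched space is dropped
# and no quotes are added.
unquoted = line.replace(" MYDATABASE", db_url)

print(quoted)
print(unquoted)
```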


sed is not the right kind of tool for this, especially since you seem unfamiliar with it and with YAML. Use a proper YAML parser for this kind of thing. Parsers tend to keep working when simple pattern matching no longer gets the job done. And the parser's dumping mechanism knows when quotes are required, instead of blindly inserting them when they are not needed. A parser would also indicate that your input is invalid YAML right from the start.

It is a bit more code to do this in, for example, Python, but at least it only matches mapping values that are exactly your substitution string, and doesn't try to do substitutions on keys, sequence items, within YAML comments, or on mapping values like ORIG_MYDATABASE, if these happen to be in the file. Preventing that from happening using sed can be quite a challenge.
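
A small illustration of that risk, using a made-up sample (the comment line and the MYDATABASE_NOTE and backupUrl keys are hypothetical, not from the question). A plain text replace, which is what a sed substitution without anchoring amounts to, changes every line here, not only the jdbcUrl value:

```python
# Hypothetical file content; only the jdbcUrl line is meant to change.
sample = (
    "# MYDATABASE is filled in at deploy time\n"
    "MYDATABASE_NOTE: see wiki\n"
    "jdbcUrl: MYDATABASE\n"
    "backupUrl: ORIG_MYDATABASE\n"
)
replacement = "jdbc:mysql://localhost:3306/sd"

# Naive text substitution, comparable to: sed 's,MYDATABASE,...,g'
naive = sample.replace("MYDATABASE", replacement)

# Collect the lines that differ from the original.
changed = [
    new for old, new in zip(sample.splitlines(), naive.splitlines()) if old != new
]
print(len(changed), "of", len(sample.splitlines()), "lines changed")
```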

Such a Python program could look like subst.py:

import sys
from pathlib import Path
from ruamel.yaml import YAML

val = sys.argv[1]
subst = sys.argv[2]
file_name = Path(sys.argv[3])

def update(d, val, sub):
    if isinstance(d, dict):
        for k in d:
            v = d[k]
            if v == val:
                d[k] = sub
            else:
                update(v, val, sub)
    elif isinstance(d, list):
        for item in d:
            update(item, val, sub)

yaml = YAML()
yaml.preserve_quotes = True  # to preserve superfluous quotes in the input
data = yaml.load(file_name)
update(data, val, subst)
yaml.dump(data, file_name)

and be invoked from the shell, just like sed would have to be, by using:

python subst.py MYDATABASE "$DB_URL" input.yaml

Of course there will be no quotes in the output around the URL, as they are superfluous and not in the input file, but the superfluous quotes around datasource are preserved.

Flagellate answered 10/2, 2019 at 6:48 Comment(0)

Here's a more automated way to set a new value for a given key in your YAML file, matching on the key instead of on the old value:

yaml_file="input.yaml"
key="jdbcUrl"
new_value="'jdbc:mysql://localhost:3306/sd?autoReconnect=true'"

# '|' is used as the s delimiter because the new value itself contains '/'
sed -r "s|^(\s*${key}\s*:\s*).*|\1${new_value}|" -i "$yaml_file"
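
For comparison, here is a minimal Python sketch of the same key-based, line-oriented replacement. The function name set_yaml_value and the sample text are made up for illustration. Like the sed version, it is still text matching and not YAML parsing, so it replaces the value of every line with that key, at any nesting level.

```python
import re

def set_yaml_value(text, key, new_value):
    """Replace everything after 'key:' on each matching line with new_value."""
    pattern = re.compile(rf"^([ \t]*{re.escape(key)}[ \t]*:[ \t]*).*$", re.MULTILINE)
    # A function replacement keeps new_value literal, so characters such as
    # '/' or '\' in the value need no escaping and no delimiter is involved.
    return pattern.sub(lambda m: m.group(1) + new_value, text)

sample = "configuration:\n  jdbcUrl: MYDATABASE\n  username: username\n"
result = set_yaml_value(
    sample, "jdbcUrl", "'jdbc:mysql://localhost:3306/sd?autoReconnect=true'"
)
print(result)
```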
Bluestone answered 15/9, 2020 at 15:50 Comment(2)
Painless solution! - Lupitalupo
For the love of god: on macOS, use gnu-sed. - Checked