Edit an existing YAML file while keeping the original comments

I am trying to write a Python script that converts our iptables config to firewall_multi entries within a YAML file. I originally used PyYAML, but later found that it removes all comments, which I need to keep. I found that ruamel.yaml can preserve comments, but I am struggling to get this to work.

import sys
import io
import string
from collections import defaultdict
import ruamel.yaml 


#loading the yaml file 

try:
      config = ruamel.yaml.round_trip_load(open('test.yaml'))
except ruamel.yaml.YAMLError as exc:
      print(exc)

print (config)

# Output class
#this = defaultdict(list)
this = {}
rule_number = 200
iptables_key_name = "ha247::firewall_addrule::firewall_multi"


# Do stuff here
for key, value in config.items():
 # Manipulate iptables rules only
   if key == 'iptables::rules':

# Set dict within iptables_key_name
     this[iptables_key_name] = {}
     for rule, rule_value in value.items():

# prefix rule with ID
         new_rule =("%s %s" % (rule_number,rule))
         rule_number = rule_number + 1



# Set dic within [iptables_key_name][rule]
         this[iptables_key_name][new_rule] = {}
# Ensure we have action
         this[iptables_key_name][new_rule]['action'] = 'accept'
         for b_key, b_value in rule_value.items():
# Change target to action as rule identifier
             b_key = b_key.replace('target','action')
# Save each rule and ensure we are lowercase
             this[iptables_key_name][new_rule][b_key] = str(b_value).lower()

  elif key == 'ha247::security::enable': 
      this['ha247::security_firewall::enable'] = value

  elif key == 'iptables::safe_ssh':
      this['ha247::security_firewall::safe_ssh'] = value

  else:
# Print to yaml
     this[key] = value


# Write YAML file
  with io.open('result.yaml', 'w', encoding='utf8') as outfile:
       ruamel.yaml.round_trip_dump(this, outfile, default_flow_style=False, allow_unicode=True)

The input file (test.yaml)

---

# Enable default set of security rules


# Configure firewall
iptables::rules:
 ACCEPT_HTTP:
    port: '80'
 HTTPS:
    port: '443'

# Configure the website
simple_nginx::vhosts:
    <doamin>:
     backend: php-fpm
     template: php-magento-template
     server_name: 
     server_alias: www.
     document_root: /var/www/
     ssl_enabled: true
     ssl_managed_enabled: true
     ssl_managed_name: www.
     force_www: true

The output of result.yaml

ha247::firewall_addrule::firewall_multi:
  200 ACCEPT_HTTP:
    action: accept
    port: '80'
  201 HTTPS:
    action: accept
    port: '443'

ha247::security_firewall::enable: true
ha247::security_firewall::safe_ssh: false
simple_nginx::ssl_ciphers:     
simple_nginx::vhosts:
 <domain>:
    backend: php-fpm
    document_root: /var/www/
    force_www: true
    server_alias: www.
    server_name: .com
    ssl_enabled: true
    ssl_managed_enabled: true
    ssl_managed_name: www.
    template: php-magento-template

This is where the problem lies: as you can see, it has changed all the formatting and deleted the comments, which we need to keep. Another issue is that it has removed the three hyphens at the top, which will leave the configuration manager unable to read the file.

Irrationality answered 17/10, 2017 at 9:28 Comment(2)
Please add information about the input, the expected output and the actual output. - Duwalt
I have updated my question. Any ideas what is causing this? - Irrationality

You cannot get exactly what you want because you indent mappings inconsistently: the indents of your mappings are 1, 2, 3, and 4 positions. As documented, ruamel.yaml has only one indent setting, applied to all mappings (it defaults to 2).

Currently document start (and end) markers are not analysed on input, so you'll have to do some minimal extra work.

The biggest problem however is your misconception of what it means to use the round-trip loader and dumper. It is meant to load a YAML document into a Python data structure, change that data structure and then write out that same data structure. You create a new data structure out of the blue (this), assign some values from a YAML loaded data-structure (config) and then write out that new data structure (this). From your call to print(), you see you are loading a CommentedMap as the root data structure, and your normal Python dict of course doesn't know about any comments you might have loaded and that are attached to config.

So first look at what you get with a minimal program that loads and dumps your input file without (explicitly) changing anything. I will be using the new API, and recommend you do so too, although you can probably get this done with the old API as well. In the new API, allow_unicode defaults to True.

import sys
from ruamel.yaml import YAML

yaml = YAML()
yaml.explicit_start = True
yaml.indent(mapping=3)
yaml.preserve_quotes = True  # not necessary for your current input

with open('test.yaml') as fp:
    data = yaml.load(fp)
yaml.dump(data, sys.stdout)

Which gives:

---

# Enable default set of security rules


# Configure firewall
iptables::rules:
   ACCEPT_HTTP:
      port: '80'
   HTTPS:
      port: '443'

# Configure the website
simple_nginx::vhosts:
   <doamin>:
      backend: php-fpm
      template: php-magento-template
      server_name:
      server_alias: www.
      document_root: /var/www/
      ssl_enabled: true
      ssl_managed_enabled: true
      ssl_managed_name: www.
      force_www: true

And that only differs from your input test.yaml in having consistent indentation (i.e. diff -b gives no differences).


Your code doesn't actually work (syntax error because of indentation) and if it did, it is not clear where the

ha247::security_firewall::enable: true
ha247::security_firewall::safe_ssh: false
simple_nginx::ssl_ciphers:   

in the output come from, nor how <doamin> gets changed into <domain> (you are doing something fishy there for real, as otherwise the keys in the value for <domain> would not magically get sorted).

Assuming as input test.yaml:

---

# Enable default set of security rules


# Configure firewall
iptables::rules:
 ACCEPT_HTTP:
    port: '80'
 HTTPS:
    port: '443'

ha247::security::enable: true         # EOL Comment
iptables::safe_ssh: false
simple_nginx::ssl_ciphers:
# Configure the website
simple_nginx::vhosts:
    <doamin>:
     backend: php-fpm
     template: php-magento-template
     server_name:
     server_alias: www.
     document_root: /var/www/
     ssl_enabled: true
     ssl_managed_enabled: true
     ssl_managed_name: www.
     force_www: true

and the following program:

import sys
from ruamel.yaml import YAML

yaml = YAML()
yaml.explicit_start = True
yaml.indent(mapping=3)
yaml.preserve_quotes = True  # not necessary for your current input

with open('test.yaml') as fp:
    data = yaml.load(fp)


key_map = {
    'iptables::rules': ['ha247::firewall_addrule::firewall_multi', None, 200],
    'ha247::security::enable': ['ha247::security_firewall::enable', None],
    'iptables::safe_ssh': ['ha247::security_firewall::safe_ssh', None],
}

for idx, key in enumerate(data):
    if key in key_map:
        key_map[key][1] = idx

rule_number = 200

for key in key_map:
    km_val = key_map[key]
    if km_val[1] is None:  # this is the index in data, if found
        continue
    # pop the value and reinsert it in the right place with the new name
    value = data.pop(key)
    data.insert(km_val[1], km_val[0], value)
    # and move the key related comments
    data.ca._items[km_val[0]] = data.ca._items.pop(key, None)
    if key == 'iptables::rules':
        data[km_val[0]] = xd = {}  # normal dict, so no comments are preserved
        for rule, rule_value in value.items():
            new_rule = "{} {}".format(rule_number, rule)
            rule_number += 1
            xd[new_rule] = nr = {}
            nr['action'] = 'accept'
            for b_key, b_value in rule_value.items():
                b_key = b_key.replace('target', 'action')
                nr[b_key] = b_value.lower() if isinstance(b_value, str) else b_value


yaml.dump(data, sys.stdout)

you get:

---

# Enable default set of security rules


# Configure firewall
ha247::firewall_addrule::firewall_multi:
   200 ACCEPT_HTTP:
      action: accept
      port: '80'
   201 HTTPS:
      action: accept
      port: '443'

ha247::security_firewall::enable: true # EOL Comment
ha247::security_firewall::safe_ssh: false
simple_nginx::ssl_ciphers:
# Configure the website
simple_nginx::vhosts:
   <doamin>:
      backend: php-fpm
      template: php-magento-template
      server_name:
      server_alias: www.
      document_root: /var/www/
      ssl_enabled: true
      ssl_managed_enabled: true
      ssl_managed_name: www.
      force_www: true

Which should be a good basis to start from.

Please note that I used .format() instead of the old-fashioned % formatting. I also lowercase b_value only if it is a string; your code would, for example, convert an integer to a string, and that would lead to quotes in your output where there were none to start with.
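The type change is easy to see in plain Python, without any YAML library. The helper name lower_if_str below is made up for illustration; it does the same thing as the conditional expression in the program above.

```python
def lower_if_str(value):
    """Lowercase strings; return any other type unchanged."""
    return value.lower() if isinstance(value, str) else value


# str(...).lower() turns every value into a string
print(repr(str(80).lower()))         # '80'
print(repr(str(True).lower()))       # 'true'

# The conditional version keeps non-string types intact
print(repr(lower_if_str('ACCEPT')))  # 'accept'
print(repr(lower_if_str(80)))        # 80
print(repr(lower_if_str(True)))      # True
```

An integer or boolean that has become a string is what the dumper then writes with quotes.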

Fireplace answered 18/10, 2017 at 19:53 Comment(4)
Thank you for the help. In regard to <domain>, this is just an Nginx template for Puppet that I use; when this is used live, the domain info will already be there. Again, thanks for the help, much appreciated. - Irrationality
@Irrationality But in your input you have <doamin>; is that a typo, or is it correct that way? - Fireplace
I think you're looking into this a little too much; it's just a typo. I removed any references to the site this document was referring to, I just didn't do a very good job. - Irrationality
As well as anything that was referring to my company. - Irrationality
