How to insert a comment line to YAML in Python using ruamel.yaml?

I have a structure like this to which I want to add comment lines using ruamel.yaml:

xyz:
  a: 1    # comment 1
  b: 2

test1:
  test2:
    test3: 3

Now, I want to insert comment-lines (not eol_comments) to make it look like this:

xyz:
  a: 1    # comment 1
  b: 2

# before test1 (top level)
test1:
  # before test2
  test2:
    # after test2
    test3: 3

I know that I can add end-of-line comments (eol_comments) using ruamel.yaml, but I couldn't find a way to add whole comment lines.

Sickly answered 20/11, 2016 at 14:1 Comment(0)

ruamel.yaml<=0.12.18 indeed has no function to insert a comment on the line before a key, but it does have one to set a comment at the beginning of a structure: .yaml_set_start_comment. With that you can already set two of the three comments you want to add:

import sys
import ruamel.yaml

yaml_str = """\
xyz:
  a: 1    # comment 1
  b: 2

test1:
  test2:
    test3: 3
"""

data = ruamel.yaml.round_trip_load(yaml_str)
data['test1'].yaml_set_start_comment('before test2', indent=2)
data['test1']['test2'].yaml_set_start_comment('after test2', indent=4)
ruamel.yaml.round_trip_dump(data, sys.stdout)

gives:

xyz:
  a: 1    # comment 1
  b: 2

test1:
  # before test2
  test2:
    # after test2
    test3: 3

There is actually a "comment" consisting of the empty line between the value for xyz and test1, but if you append your comment to that structure and then insert a new key before test1, things don't show up the way you want. The thing to do, therefore, is to insert the comment explicitly before the key test1. You can round-trip load your expected output to see what the internal Comment should look like:

yaml_str_out = """\
xyz:
  a: 1    # comment 1
  b: 2

# before test1 (top level)
test1:
  # before test2
  test2:
    # before test3
    test3: 3
"""
test = ruamel.yaml.round_trip_load(yaml_str_out)
print(test.ca)

gives (wrapped for easier viewing):

Comment(comment=None,
        items={'test1': [None, 
                        [CommentToken(value='# before test1 (top level)\n')], 
                        None, 
                        [CommentToken(value='# before test2\n')]]})

As you can see, # before test2 is considered to be a comment after the key. Calling test['test1'].yaml_set_start_comment('xxxxx', indent=2) will have no effect, because the comment associated with test1 overrules it, and # xxxxx will not show up in a dump.

With that information and some background knowledge, I adapted some of the code from yaml_set_start_comment() (assuming the original imports and yaml_str):

def yscbak(self, key, before=None, indent=0, after=None, after_indent=None):
    """
    expects the comment text (before/after) without a leading `#`;
    it may span multiple lines
    """
    from ruamel.yaml.error import Mark
    from ruamel.yaml.tokens import CommentToken

    def comment_token(s, mark):
        # handle empty lines as having no comment
        return CommentToken(('# ' if s else '') + s + '\n', mark, None)

    if after_indent is None:
        after_indent = indent + 2
    if before and before[-1] == '\n':
        before = before[:-1]  # strip final newline if there
    if after and after[-1] == '\n':
        after = after[:-1]  # strip final newline if there
    start_mark = Mark(None, None, None, indent, None, None)
    c = self.ca.items.setdefault(key, [None, [], None, None])
    if before:
        for com in before.split('\n'):
            c[1].append(comment_token(com, start_mark))
    if after:
        start_mark = Mark(None, None, None, after_indent, None, None)
        if c[3] is None:
            c[3] = []
        for com in after.split('\n'):
            c[3].append(comment_token(com, start_mark))

if not hasattr(ruamel.yaml.comments.CommentedMap, 
               'yaml_set_comment_before_after_key'):
    ruamel.yaml.comments.CommentedMap.yaml_set_comment_before_after_key = yscbak


data = ruamel.yaml.round_trip_load(yaml_str)
data.yaml_set_comment_before_after_key('test1', 'before test1 (top level)',
                                       after='before test2', after_indent=2)
data['test1']['test2'].yaml_set_start_comment('after test2', indent=4)
ruamel.yaml.round_trip_dump(data, sys.stdout)

and get:

xyz:
  a: 1    # comment 1
  b: 2

# before test1 (top level)
test1:
  # before test2
  test2:
    # after test2
    test3: 3

The hasattr test makes sure you don't overwrite such a method once it gets added to ruamel.yaml.

BTW: all comments in YAML are end-of-line comments; there might just be valid YAML before some of those comments.

Kropp answered 20/11, 2016 at 15:13 Comment(7)
The method yaml_set_comment_before_after_key() is available in ruamel.yaml>=0.13.0. - Kropp
The method yaml_set_comment_before_after_key() works for what I need. Thanks a lot. One thing I found out, though: when I have a multi-line list (key: followed by the entries - Entry1 and - Entry2), the after-comment gets inserted after the key, not after the list. - Sickly
If you have a multi-line list, you insert it on the list/sequence element, using the index in the list as key (i.e. integer 0, 1, etc.). See here. Once more: just make sure yaml_add_eol_comment is called on the list. - Kropp
The examples referenced by @Kropp appear to be viewable here now, although I'm not sure whether line 142 is still the right line. - Johore
Hm. test_map_set_comment_before_and_after_non_first_key_00 explicitly references this answer. - Johore
Is there a way to do this without ruamel.yaml, using just py.yaml? - Compendious
@Compendious I don't know what py.yaml is; it is not available on PyPI. AFAIK there is no other parser that can do this, which is why I developed it. It is also the only parser that supports YAML 1.2, the spec for which has been out for 14 years now, so I recommend you upgrade. - Kropp
