ruamel.yaml equivalent of sort_keys?

I'm trying to dump a Python dict to a YAML file using ruamel.yaml. I'm familiar with the json module's interface, where pretty-printing a dict is as simple as

import json
with open('outfile.json', 'w') as f:
    json.dump(mydict, f, indent=4, sort_keys=True)

With ruamel.yaml, I've gotten as far as

import ruamel.yaml
with open('outfile.yaml', 'w') as f:
    ruamel.yaml.round_trip_dump(mydict, f, indent=2)

but it doesn't seem to support the sort_keys option. ruamel.yaml also doesn't seem to have any exhaustive docs, and searching Google for "ruamel.yaml sort" or "ruamel.yaml alphabetize" didn't turn up anything at the level of simplicity I'd expect.

Is there a one-or-two-liner for pretty-printing a YAML file with sorted keys?

(Note that I need the keys to be alphabetized down through the whole container, recursively; just alphabetizing the top level is not good enough.)


Notice that if I use round_trip_dump, the keys are not sorted; and if I use safe_dump, the output is not "YAML-style" (or more importantly "Kubernetes-style") YAML. I don't want [] or {} in my output.

$ pip freeze | grep yaml
ruamel.yaml==0.12.5

$ python
>>> import ruamel.yaml
>>> mydict = {'a':1, 'b':[2,3,4], 'c':{'a':1,'b':2}}
>>> print ruamel.yaml.round_trip_dump(mydict)  # right format, wrong sorting
a: 1
c:
  a: 1
  b: 2
b:
- 2
- 3
- 4

>>> print ruamel.yaml.safe_dump(mydict)  # wrong format, right sorting
a: 1
b: [2, 3, 4]
c: {a: 1, b: 2}
Oliverolivera answered 24/10, 2016 at 20:4 Comment(0)
L
7

You need a recursive function that handles both mappings (dicts) and sequences (lists):

import sys
import ruamel.yaml

CM = ruamel.yaml.comments.CommentedMap

yaml = ruamel.yaml.YAML()

data = dict(a=1, c=dict(b=2, a=1), b=[2, dict(e=6, d=5), 4])
yaml.dump(data, sys.stdout)

def rec_sort(d):
    try:
        if isinstance(d, CM):
            return d.sort()
    except AttributeError:
        pass
    if isinstance(d, dict):
        # could use dict in newer python versions
        res = ruamel.yaml.CommentedMap()
        for k in sorted(d.keys()):
            res[k] = rec_sort(d[k])
        return res
    if isinstance(d, list):
        for idx, elem in enumerate(d):
            d[idx] = rec_sort(elem)
    return d

print('---')

yaml.dump(rec_sort(data), sys.stdout)

which gives:

a: 1
c:
  b: 2
  a: 1
b:
- 2
- e: 6
  d: 5
- 4
---
a: 1
b:
- 2
- d: 5
  e: 6
- 4
c:
  a: 1
  b: 2

The commented map (CommentedMap) is the structure ruamel.yaml uses when doing a round-trip (load + dump), and round-tripping is designed to keep the keys in the order they had during loading.

The above should do a reasonable job of preserving comments on mappings and sequences when you load data from a commented YAML file.
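
Note that rec_sort above rewrites list elements in place, so the caller's data is modified. If that matters, a non-mutating variant could look like the sketch below. It assumes plain dict/list input on Python 3.7+ (comments from a round-trip load would not be carried over), and the name sorted_copy is hypothetical, not part of ruamel.yaml:

```python
def sorted_copy(node):
    """Return a copy of node with every mapping's keys sorted, recursively."""
    if isinstance(node, dict):
        # Plain dicts keep insertion order on Python 3.7+.
        return {key: sorted_copy(node[key]) for key in sorted(node)}
    if isinstance(node, list):
        # Build a new list; element order is kept, only mapping keys are sorted.
        return [sorted_copy(item) for item in node]
    return node  # scalars are returned unchanged


data = dict(a=1, c=dict(b=2, a=1), b=[2, dict(e=6, d=5), 4])
result = sorted_copy(data)

print(list(result))     # ['a', 'b', 'c']
print(list(data["c"]))  # ['b', 'a'] (the original is untouched)
```

The result can be passed to yaml.dump in the same way as rec_sort(data).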

Logroll answered 24/10, 2016 at 21:5 Comment(7)
Oho, is default_flow_style=False the appropriate way to enable "Kubernetes-style" YAML output? I may give that a try. - Oliverolivera
This is kind of a separate question, and I can ask it separately if you like, but now I'm wondering: is there a nice way to specify "sort keys by default, but if there's a key named name, put it first", again applying recursively to the whole structure? If there were such a way, I would use it. - Oliverolivera
@Oliverolivera Just put the (name, value) pair in there first, or use .insert() as I indicated. You can also write a specific serializer that knows about name, but because of the way PyYAML was implemented (and ruamel.yaml still follows), it is difficult to parametrize this for any key (and that answer would require a separate question). - Logroll
@Logroll how would you recurse a nested CommentedMap to sort all keys at all levels? - Hough
@Hough I would use a recursive function that handles the three node types (map, sequence and scalar) and then probably sort the map in place, so the end-of-line comments are preserved without extra work. Post another question if you need more detail (tag it [ruamel.yaml] and I'll get notified). - Logroll
@Logroll as that recursive part was already asked in the question (and asking it again would probably lead to it being closed as a duplicate of this very question), it would be great if you could include that here as well. - Heartbroken
I updated the answer to handle nested dicts (even when nested in a list) and updated it for the new ruamel.yaml API. - Logroll
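
For the "sorted, but name first" ordering asked about in the comments, one possible approach is a custom sort key rather than a custom serializer. This is only a sketch under assumptions: keys are all strings, the input is plain dicts/lists on Python 3.7+, and the function name sort_name_first and its first parameter are made up for illustration:

```python
def sort_name_first(node, first="name"):
    """Recursively sort mapping keys, placing the key `first` ahead of the rest."""
    if isinstance(node, dict):
        # (False, key) sorts before (True, key), so `first` leads and the
        # remaining keys follow in alphabetical order.
        ordered = sorted(node, key=lambda k: (k != first, k))
        return {k: sort_name_first(node[k], first) for k in ordered}
    if isinstance(node, list):
        return [sort_name_first(item, first) for item in node]
    return node


pod = {"spec": {"image": "nginx", "name": "web"}, "name": "demo", "kind": "Pod"}
result = sort_name_first(pod)

print(list(result))          # ['name', 'kind', 'spec']
print(list(result["spec"]))  # ['name', 'image']
```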
G
0

As pointed out in @Anthon's example, if you are using Python 3.7 or newer (and do not need to support lower versions), you just need:

import sys
from ruamel.yaml import YAML

yaml = YAML()

data = dict(a=1, c=dict(b=2, a=1), b=[2, dict(e=6, d=5), 4])

def rec_sort(d):
    if isinstance(d, dict):
        res = dict()
        for k in sorted(d.keys()):
            res[k] = rec_sort(d[k])
        return res
    if isinstance(d, list):
        for idx, elem in enumerate(d):
            d[idx] = rec_sort(elem)
    return d

yaml.dump(rec_sort(data), sys.stdout)

This works because dict preserves insertion order as of Python 3.7.
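
If the data is JSON-compatible (string keys, and only dicts, lists, strings, numbers, booleans and None), the one-liner the question asks for can probably be had by borrowing sort_keys from the json module. This is a workaround sketch, not a ruamel.yaml feature: it relies on json.dumps sorting keys at every level and on dict insertion order in Python 3.7+, and it will turn non-string keys into strings and tuples into lists:

```python
import json

data = dict(a=1, c=dict(b=2, a=1), b=[2, dict(e=6, d=5), 4])

# Serialize with recursively sorted keys, then parse back into plain dicts,
# which remember the (now sorted) order in which keys were inserted.
sorted_data = json.loads(json.dumps(data, sort_keys=True))

print(sorted_data)
# {'a': 1, 'b': [2, {'d': 5, 'e': 6}, 4], 'c': {'a': 1, 'b': 2}}
```

sorted_data can then be handed to yaml.dump exactly like the result of rec_sort.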

Glaring answered 4/10, 2021 at 9:33 Comment(0)
V
-1

There is an undocumented sort() in ruamel.yaml that will work on a variation of this problem:

import sys
import ruamel.yaml

yaml = ruamel.yaml.YAML()

test = """- name: a11
  value: 11
- name: a2
  value: 2
- name: a21
  value: 21
- name: a3
  value: 3
- name: a1
  value: 1"""
test_yml = yaml.load(test)

yaml.dump(test_yml, sys.stdout)

Unsorted output:

  - name: a11
    value: 11
  - name: a2
    value: 2
  - name: a21
    value: 21
  - name: a3
    value: 3
  - name: a1
    value: 1

Sort by name:

test_yml.sort(key=lambda x: x['name'])
yaml.dump(test_yml, sys.stdout)

Sorted output:

  - name: a1
    value: 1
  - name: a11
    value: 11
  - name: a2
    value: 2
  - name: a21
    value: 21
  - name: a3
    value: 3
Varuna answered 14/10, 2020 at 18:55 Comment(3)
is this recursive? - Heartbroken
@Heartbroken No, and it doesn't work for the OP's original question. - Logroll
The sort() the answer is using is a property of dict and should be generally avoided as implemented; it does not act recursively. - Elyssa
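
To illustrate the point made in these comments, here is a small sketch that uses a plain Python list as a stand-in for the loaded sequence (an assumption made so that the example runs without ruamel.yaml). Sorting the sequence reorders its items but leaves the key order inside each mapping as it was:

```python
items = [
    {"value": 11, "name": "a11"},
    {"value": 2, "name": "a2"},
    {"value": 1, "name": "a1"},
]

# Reorders the list items by their "name" value (string comparison).
items.sort(key=lambda item: item["name"])

names = [item["name"] for item in items]
first_keys = list(items[0])

print(names)       # ['a1', 'a11', 'a2']
print(first_keys)  # ['value', 'name'] (keys inside each mapping are not sorted)
```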
