ruamel.yaml always dumps YAML end marker ("...") even if yaml.explicit_end=False
I am wondering if this is a bug or intentional, but anyway.

Why does dumping a single value with ruamel.yaml always include an explicit YAML end marker?

import sys
from ruamel.yaml import YAML
yaml=YAML()
yaml.explicit_end=False
yaml.dump(1, sys.stdout)

Produces

1
...

Can this be easily skipped somehow?

Luba answered 9/7, 2019 at 10:19 Comment(0)

The document-end marker (...) is added because the number is dumped as a plain scalar at the root level of the document. The same happens if you dump a string, provided that string can be dumped without quotes and not be misinterpreted (a string consisting only of digits, for example, has to be quoted so that it is not loaded as an integer).

Without a document-end marker, a parser loading from a stream would not know whether the document is complete or the stream is just waiting to be filled. The document-end marker removes this ambiguity, so this is intentional. When parsing a file (instead of a generic stream), the same can be, and is, determined by checking for end-of-file.

There are several ways around this, one is to transform the output:

import sys
import ruamel.yaml

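def strip_document_end_marker(s):
    # the marker must be on a line of its own, so 'a: wait...' is left untouched
    if s.endswith('\n...\n'):
        return s[:-4]
    # return output without a marker unchanged instead of None
    return s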

yaml = ruamel.yaml.YAML()
yaml.dump("abc", sys.stdout, transform=strip_document_end_marker)

which gives:

abc

The above should also work with dump_all for multiple documents (and the last one being a root level plain scalar).
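If you want the result as a string rather than written to stdout, the same idea can be wrapped in a helper. This is a minimal sketch: strip_trailing_end_marker and dump_to_str are hypothetical names, and the yaml argument is assumed to be a ruamel.yaml.YAML() instance (anything with a dump(data, stream) method will do). It tests for the marker on a line of its own, since a bare check for a trailing "..." would also match a document whose last value merely ends in three dots.

```python
import io

def strip_trailing_end_marker(text):
    """Remove a final '...' line; return any other text unchanged."""
    if text == '...\n' or text.endswith('\n...\n'):
        return text[:-4]
    return text

def dump_to_str(yaml, data):
    """Dump data with the given YAML instance and return marker-free text."""
    buf = io.StringIO()
    yaml.dump(data, buf)  # assumed: yaml is a ruamel.yaml.YAML() instance
    return strip_trailing_end_marker(buf.getvalue())
```

With ruamel.yaml installed, dump_to_str(ruamel.yaml.YAML(), "abc") would be expected to give the same abc line as above, just as a return value.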

Another way to achieve this is to reset the open_ended attribute after writing a plain value:

import sys
import ruamel.yaml

yaml = ruamel.yaml.YAML()

def wp(self, *args, **kw):
    self.write_plain_org(*args, **kw)
    self.open_ended = False

yaml.Emitter.write_plain_org = yaml.Emitter.write_plain
yaml.Emitter.write_plain = wp
yaml.dump("abc", sys.stdout)

which also gives:

abc
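Note that this patch is applied to the class that yaml.Emitter refers to, so it presumably stays in effect for later dumps in the same process. If that is not wanted, the patch can be scoped with a context manager. The following is a sketch under that assumption; plain_scalars_closed is a hypothetical name, not part of ruamel.yaml.

```python
from contextlib import contextmanager

@contextmanager
def plain_scalars_closed(emitter_cls):
    """Temporarily make write_plain reset open_ended on emitter_cls."""
    original = emitter_cls.write_plain

    def write_plain(self, *args, **kw):
        original(self, *args, **kw)
        self.open_ended = False  # same reset as in the patch above

    emitter_cls.write_plain = write_plain
    try:
        yield
    finally:
        emitter_cls.write_plain = original  # undo the patch, even on errors
```

It would be used as: with plain_scalars_closed(yaml.Emitter): yaml.dump("abc", sys.stdout). Dumps outside the with block keep the default behavior.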
Apogee answered 11/7, 2019 at 11:19 Comment(1)
"Without document-end-marker, on loading from a stream, the parser would not know if the document is complete, or the stream just waiting to filled." That logic cannot be the reason: if you dump [1, 2, 3] … you don't get the end marker. Yet a parser sitting at the end of that output similarly cannot determine whether this is the end of the document, or not. – Colorcast

I'm not sure of the reason but yaml.dump("1", sys.stdout) does not print the document's ending marker.

It seems that the serializer appends the three dots (the document-end marker) when it gets a non-iterable value:

dump(1, stream=sys.stdout)
dump([1], stream=sys.stdout)
dump(datetime.datetime.now(), stream=sys.stdout)
dump("1", stream=sys.stdout)

Prints:

1
...
[1]
2019-07-09 12:45:27.013202
...
'1'

So an easy workaround would be to convert your values to strings before dumping them (if possible)...

Heathheathberry answered 9/7, 2019 at 10:46 Comment(4)
The reason "1" doesn't print with the document-end-marker is because that string cannot be printed without quotes (as it would be loaded as an integer instead of a string). If you dump "a", you won't get quotes, and so you will get the ... – Apogee
Ah yes, your answer made it clear, thanks! What do you think, should I delete my own answer? – Heathheathberry
Just leave it, there is always a chance the OP will like your answer better ;-) – Apogee
No, yours is the good one. But I'll leave it for the explanation you gave in your comment :) – Heathheathberry
