Why are PyYAML and ruamel.yaml escaping special characters when single quoted?

I have a YAML file and would like to constrain a certain field to contain no whitespace.

Here's a script that demonstrates my attempt:

test.py

#!/usr/bin/env python3

import os
from ruamel import yaml

def read_conf(path_to_config):
    if os.path.exists(path_to_config):
        with open(path_to_config) as f:  # close the file once parsing is done
            return yaml.safe_load(f)
    return None

if __name__ == "__main__":
    settings = read_conf("hello.yaml")
    print("type of name: {0}, repr of name: {1}".format(type(
             settings['foo']['name']), repr(settings['foo']['name'])))
    if any(c.isspace() for c in settings['foo']['name']):
        raise Exception("No whitespace allowed in name!")

Here is my first cut of the YAML file:

hello.yaml

foo:
    name: "hello\t"

In the above YAML file, an exception is correctly raised:

type of name: <class 'str'>, repr of name: 'hello\t'
Traceback (most recent call last):
  File "./test.py", line 16, in <module>
    raise Exception("No whitespace allowed in name!")
Exception: No whitespace allowed in name!

However, if I change the double quotes to single quotes, no exception is raised:

$ ./test.py
type of name: <class 'str'>, repr of name: 'hello\\t'

This behavior occurs with both ruamel.yaml==0.11.11 and PyYAML==3.11.

Why is there a difference between single and double quotes in these Python YAML parsers when, as I understand it, there is no functional difference between them in the YAML specification? How can I prevent special characters from being escaped?

Otto answered 5/7, 2016 at 12:26 Comment(2)
What YAML module would be native to Python 3? Neither ruamel.yaml nor PyYAML is part of the standard Python library. - Incommunicative
@Incommunicative Oops. I had PyYAML installed globally and didn't realize it. :) Will edit. - Otto

There is a vast difference in the YAML specification between single and double quoted strings. Within single quoted scalars you can only escape the single quote:

The single-quoted style is specified by surrounding “'” indicators. Therefore, within a single-quoted scalar, such characters need to be repeated. This is the only form of escaping performed in single-quoted scalars.

Therefore the \ in 'hello\t' has no special function, and that scalar consists of the seven characters h, e, l (2x), o, \ and t.

Backslash escaping is only supported in double quoted YAML scalars.
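
To make the difference concrete, here is a minimal sketch that parses the same text in both quoting styles. It assumes PyYAML is installed and uses `yaml.safe_load`; the `name` key and the sample values are only illustrative.

```python
import yaml  # PyYAML; assumed to be installed

# Double-quoted scalar: the two characters \t are an escape sequence for a tab.
double_quoted = yaml.safe_load('name: "hello\\t"')["name"]

# Single-quoted scalar: the backslash is an ordinary character.
single_quoted = yaml.safe_load("name: 'hello\\t'")["name"]

# The only escape in single quotes is a doubled single quote.
doubled_quote = yaml.safe_load("name: 'it''s'")["name"]

print(repr(double_quoted))  # 'hello\t'   (6 characters, ends in a tab)
print(repr(single_quoted))  # 'hello\\t'  (7 characters, backslash then t)
print(repr(doubled_quote))  # "it's"
```

So the parser is not adding an escape in the single-quoted case. The extra backslash in the repr is just Python showing a literal backslash that was already in the YAML text.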

Incommunicative answered 5/7, 2016 at 12:42 Comment(2)
Ah, my understanding was buggy. :) I guess I'll need to broaden my constraint to preventing all special characters. - Otto
You can use tabs in single quotes (and in literal and folded block scalars), but you have to use the tab character itself, not the usual escape sequence (\t). Those characters are then, in many circumstances, difficult to distinguish from spaces, but that is correct YAML. - Incommunicative
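
One possible way to broaden the constraint as discussed in the comments is an allow-list check on the loaded value, rather than looking only for whitespace. This is only a sketch: `validate_name` and the permitted character set are hypothetical choices, not something prescribed by either YAML library.

```python
import re

# Hypothetical allow-list: ASCII letters, digits, underscore, dot and hyphen.
_NAME_RE = re.compile(r"[A-Za-z0-9_.-]+")

def validate_name(name):
    """Return name unchanged, or raise ValueError if it contains other characters."""
    if not isinstance(name, str) or not _NAME_RE.fullmatch(name):
        raise ValueError("invalid name: {!r}".format(name))
    return name

validate_name("hello")  # accepted
for bad in ("hello\t", "hello\\t", "hello world"):
    try:
        validate_name(bad)
    except ValueError as exc:
        print(exc)
```

An allow-list rejects a real tab (from a double-quoted "hello\t" or a literal tab in single quotes) as well as the backslash-and-t pair that a single-quoted 'hello\t' produces.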
