ruamel.yaml: clarification on typ and pure=True

I am trying to understand what typ and pure=True mean in the ruamel.yaml Python library.
I've read the documentation here.
So far, I've understood that typ='safe' uses a safe loader, which skips the parsing of YAML tags (these can lead to arbitrary code execution).
I haven't found any explanation of the round-trip parser (typ='rt') in the docs.

Also, I think the explanation of pure=True is confusing:

Provide pure=True to enforce using the pure Python implementation (faster C libraries will be used when possible/available)

Are the faster C libraries used with pure=True or not? If they are, why do you need to specify this flag in the first place?

Bedder answered 13/7, 2018 at 1:24

There are four standard typ parameters:

  • rt: (for round-trip) the document is loaded into special types that preserve comments and other formatting details, which are then used when dumping. This is what ruamel.yaml was created for, and it is the default (i.e. what you get if you don't specify typ). It is a subclass of the safe loader/dumper.
  • safe: loads/dumps tagged objects only when these are explicitly registered with the loader/dumper.
  • unsafe: tries to load/dump everything. Classes are automatically resolved to a tag of the form !!python/object:<module>.<class>
  • base: the loader/dumper from which everything is derived. All scalars are loaded as strings (even types such as integer, float, and Boolean, which are handled specially as described in the YAML specification or the type descriptions).

For safe, unsafe, and base there is a faster C loader available. If you install from the .tar.gz file, the C extensions only get compiled during installation when an appropriate compiler is available. If they could not be compiled, they cannot be used.
There is no C version of the rt code, so it is not possible to use the C libraries with typ='rt'.

The word pure refers to using Python-only modules. The opposite would be "tainted": Python tainted with C extension modules. There is no tainted=True parameter. That behaviour is implicit (when possible/available, see the previous paragraph) when pure=True is not specified, as the default for pure is False. So with pure=True the C libraries are never used.


In order to further confuse you: the above are the four basic (built-in) values for typ. If you use plug-ins you can, for example, do

yaml = YAML(typ='jinja2')

as shown in this answer


Some of the above information is available from the YAML() docstring. Little of that, however, made it into the package documentation, primarily as a result of the laziness of ruamel.yaml's author.

Fidel answered 13/7, 2018 at 5:46 Comment(2)
Thank you. Do you know how to check whether the C loader is present on my machine? Do I understand correctly that for the best performance I need to have the C library installed and use YAML(typ="safe")? - Bedder
Run the following using the Python version that you want to test (i.e. the one in your virtualenv): /path/to/python -c "import ruamel.yaml; print(ruamel.yaml.__with_libyaml__)" (I assume you don't necessarily want to go looking for the shared object yourself). Yes, for best performance use YAML(typ='safe'), which is the same as YAML(typ='safe', pure=False). - Fidel
