Concise way of defining your own YAML syntax?

For XML, there are Document Type Definitions (DTD) which define all elements, but is there something similar for YAML?

I found a post on "Validating YAML with an XML DTD" which suggests using DTDs anyway and/or a simple XML, but I doubt that is feasible in my case. My project decided to use a custom YAML format, from which a rather intricate XML is generated algorithmically. The YAML contains much less information than the XML, but everything significant that a human editor must know.

At the moment, my YAML is defined mainly in prose (as fairly abstract requirement text) and by the actual source code that does the parsing and conversion to XML. Neither is suitable for the end users who are supposed to maintain the YAML file. Is there a clean and concise way to define my custom YAML syntax?

Chirm answered 30/12, 2019 at 12:4 Comment(1)
I found yamllint helpful, but that is not what I am after. - Chirm

Firstly, YAML is the syntax. What you want to describe is not syntax, but structure.

YAML is a serialization format. Therefore, the type of the data you serialize from and deserialize into is the structure description of a YAML file. Unless you're using YAML for data interchange, you typically have one application that implements loading the YAML file.

By default, a lot of YAML implementations deserialize to a heterogeneous structure of lists, dictionaries and simple values (string, int, …). However, if you assume a certain structure, you can write down types that define that structure and then load your YAML into an object of that type. Simple example (Java in this case):

public class Book {
    public static class Person {
        public String name;
        public int age;
    }

    public Person author;
    public String title;
}

This type describes the structure of this YAML document:

author:
  name: John Doe
  age: 23
title: Very interesting title

Any YAML implementation that is able to deserialize to types is able to inspect those types; either at runtime via reflection or at compile-time via macros or other means of compile-time evaluation. Therefore, you can inspect that structure as well and autogenerate documentation for the user with it (possibly employing JavaDoc comments for extended documentation).
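As a rough illustration of that idea, here is a minimal sketch that walks the Book type from above with plain java.lang.reflect and prints an outline of the expected keys. The StructureDoc class and its describe method are hypothetical names made up for this example, not part of any YAML library. The sketch assumes Java 11+ (for String.repeat), treats only primitives and String as scalars, and does not guard against self-referential types.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

public class StructureDoc {
    // Same example type as above, repeated so the sketch compiles on its own.
    public static class Book {
        public static class Person {
            public String name;
            public int age;
        }

        public Person author;
        public String title;
    }

    /** Returns one line per key; keys of nested mappings are indented by two spaces. */
    public static String describe(Class<?> type, int depth) {
        StringBuilder out = new StringBuilder();
        for (Field field : type.getDeclaredFields()) {
            // Skip constants and compiler-generated fields; only instance fields map to keys.
            if (Modifier.isStatic(field.getModifiers()) || field.isSynthetic()) {
                continue;
            }
            Class<?> fieldType = field.getType();
            out.append("  ".repeat(depth)).append(field.getName()).append(':');
            if (fieldType.isPrimitive() || fieldType == String.class) {
                // Scalar value: show the expected type as a placeholder.
                out.append(" <").append(fieldType.getSimpleName()).append(">\n");
            } else {
                // Anything else is assumed to be a nested mapping.
                out.append('\n').append(describe(fieldType, depth + 1));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe(Book.class, 0));
    }
}
```

Note that getDeclaredFields does not guarantee any particular order, so the keys may not appear in declaration order. A real documentation generator would also want to pull in descriptions, for example from annotations or doc comments.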

Now you might use a dynamically typed language. If that language is Python, you can still define classes to define your structure, and you can use type hints to define the types of scalar values. This gives you user documentation; however, you still need to implement validation manually, since type hints are not enforced (PyYAML's add_path_resolver is the important hook here to resolve parts of the document graph to specific types without having to use YAML tags).

For other languages, different solutions may exist. Generally, it's a good idea to maintain a single source of truth (SSOT) that describes the YAML structure and then use that as the basis for both user documentation and validation. And since YAML is a serialization format, the target type is a natural choice for the SSOT if the language and YAML implementation allow you to define it.
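To make the SSOT idea a bit more concrete, the following sketch uses the same Book type as the basis for validation. It assumes the YAML has already been loaded into nested java.util.Map objects (the generic structure many loaders produce by default) and checks that structure against the type via reflection. StructureCheck and validate are hypothetical names for this illustration; the sketch only knows String, int and nested mappings, and it treats every field as required.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class StructureCheck {
    // Same example type as above, repeated so the sketch compiles on its own.
    public static class Book {
        public static class Person {
            public String name;
            public int age;
        }

        public Person author;
        public String title;
    }

    /** Collects human-readable problems instead of failing on the first one. */
    public static List<String> validate(Map<?, ?> data, Class<?> type, String path) {
        List<String> problems = new ArrayList<>();
        Set<String> known = new HashSet<>();
        for (Field field : type.getDeclaredFields()) {
            if (Modifier.isStatic(field.getModifiers()) || field.isSynthetic()) {
                continue;
            }
            String where = path + field.getName();
            known.add(field.getName());
            Object value = data.get(field.getName());
            Class<?> expected = field.getType();
            if (value == null) {
                problems.add(where + ": missing");
            } else if (expected == String.class || expected == int.class) {
                // Loaded integers arrive boxed, so compare against Integer for int fields.
                Class<?> boxed = expected == int.class ? Integer.class : String.class;
                if (!boxed.isInstance(value)) {
                    problems.add(where + ": expected " + expected.getSimpleName());
                }
            } else if (value instanceof Map) {
                problems.addAll(validate((Map<?, ?>) value, expected, where + "."));
            } else {
                problems.add(where + ": expected a mapping");
            }
        }
        // Keys the type does not declare are reported as well, which catches typos.
        for (Object key : data.keySet()) {
            if (!known.contains(key)) {
                problems.add(path + key + ": unknown key");
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        Map<String, Object> bad = Map.of(
                "author", Map.of("name", "John Doe", "age", "old"),
                "titel", "Very interesting title");
        validate(bad, Book.class, "").forEach(System.out::println);
    }
}
```

Because the documentation outline and the validation would both be derived from the same class, changing the class updates both, which is the point of keeping a single source of truth.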

Janusfaced answered 3/1, 2020 at 12:44 Comment(6)
Thanks so much for your helpful explanations. For a start, the YAML should only be exported from and imported into a React Native (mobile) app. Then it should also be parsed to XML on a server, e.g. by awk, perl, or python. - Chirm
To be more specific, it is about BPMN, and related to stackoverflow.com/a/59349583 (2nd part of my answer). In other words, there is already a rudimentary format and structure I want to expand. The pain point behind it is explained in the linked question. - Chirm
Be aware that YAML has its own problems with plaintext diffing. A problem that is asked about frequently here is the order of keys in mappings, which according to the spec is arbitrary and must not convey content information. This results in a lot of implementations not being able to set the order, since mappings are serialized from hashmaps that don't preserve insertion order (PyYAML recently implemented keeping the order). - Janusfaced
Great hint! Since you are so knowledgeable about these things: would JSON be an alternative to YAML? - Chirm
JSON Schema is definitely able to define and validate a structure. If you're not using any YAML structures that are not expressible in JSON (anchors, aliases, tags, multiple documents), you can use JSON or even allow the user to write YAML, as there's a tool that validates YAML against a JSON Schema. - Janusfaced
Whether you run into any diffing problems with JSON or YAML depends on the implementation you use to write the JSON/YAML content. Problems occur primarily if you want to alter a hand-written file programmatically. If you always generate the files programmatically, you should be fine. - Janusfaced
