YAML: Do I need quotes for strings in YAML?
W

8

789

I am trying to write a YAML dictionary for internationalisation of a Rails project. I am a little confused though, as in some files I see strings in double-quotes and in some without. A few points to consider:

  • example 1 - all strings use double quotes;
  • example 2 - no strings (except the last two) use quotes;
  • the YAML cookbook says: Enclosing strings in double quotes allows you to use escaping to represent ASCII and Unicode characters. Does this mean I need to use double quotes only when I want to escape some characters? If yes - why do they use double quotes everywhere in the first example - only for the sake of unity / stylistic reasons?
  • the last two lines of example 2 use ! - the non-specific tag, while the last two lines of the first example don't - and they both work.

My question is: what are the rules for using the different types of quotes in YAML?

Could it be said that:

  • in general, you don't need quotes;
  • if you want to escape characters use double quotes;
  • use ! with single quotes, when... ?!?
Whang answered 1/10, 2013 at 7:0 Comment(1)
Second link is not working anymore, I suggest to put your examples into the question.Obligatory
H
1037

After a brief review of the YAML cookbook cited in the question and some testing, here's my interpretation:

  • In general, you don't need quotes.
  • Use quotes to force a string, e.g. if your key or value is 10 but you want it to return a String and not a Fixnum, write '10' or "10".
  • Use quotes if your value includes special characters, (e.g. :, {, }, [, ], ,, &, *, #, ?, |, -, <, >, =, !, %, @, \).
  • Single quotes let you put almost any character in your string, and won't try to parse escape codes. '\n' would be returned as the string \n.
  • Double quotes parse escape codes. "\n" would be returned as a line feed character.
  • The exclamation mark introduces a tag, e.g. !ruby/sym to return a Ruby symbol.

Seems to me that the best approach would be to not use quotes unless you have to, and then to use single quotes unless you specifically want to process escape codes.
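
To make the first rules concrete, here is a minimal sketch using Ruby's bundled YAML library (Psych). The key names are made up for illustration, and on current Ruby versions the unquoted number comes back as an Integer rather than a Fixnum:

```ruby
require 'yaml'

# Hypothetical keys, chosen only to illustrate the rules above.
doc = <<~'YAML'
  plain_number: 10
  quoted_number: '10'
  single: 'a\nb'
  double: "a\nb"
YAML

data = YAML.safe_load(doc)

p data['plain_number'].class   # numeric type, because the value is unquoted
p data['quoted_number'].class  # String, because the quotes force it
p data['single']               # backslash and "n" are kept literally
p data['double']               # the escape is turned into a real line feed
```

Other parsers may resolve unquoted values differently, so treat this as a check of one implementation rather than of the spec.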

Update

"Yes" and "No" should be enclosed in quotes (single or double) or else they will be interpreted as TrueClass and FalseClass values:

en:
  yesno:
    'yes': 'Yes'
    'no': 'No'
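
A quick way to see the difference, sketched with Ruby's bundled Psych parser (which, as far as I can tell, still resolves these words the YAML 1.1 way):

```ruby
require 'yaml'

# Same mapping, once without quotes and once with them.
unquoted = YAML.safe_load("yes: Yes\nno: No\n")
quoted   = YAML.safe_load("'yes': 'Yes'\n'no': 'No'\n")

p unquoted  # keys and values come back as booleans
p quoted    # keys and values stay strings
```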
Hein answered 6/3, 2014 at 20:19 Comment(10)
That's not quite the full picture. For example, @ and ` can be used anywhere in a plain string except at the beginning, because they are reserved indicators.Chemoprophylaxis
I wasn't trying to provide the full picture, just some rules of thumb. Yes, it looks like sometimes, some special characters (reserved indicators) can be used without quotes (as long as a reserved indicator doesn't start a plain scalar), but it's not wrong to use quotes whenever you see a special character.Hein
Just some additional info: you can start a scalar with % if your document includes a directives end marker line (---) in yaml 1.2Prelature
The rules for strings in YAML are insanely complicated, because there are so many different types of strings. I wrote up a table here: #3790954Melly
Given all these caveats, I'd rather just use quotes everywhere :-/Passant
Also, here's a quite complete reference I wrote: blogs.perl.org/users/tinita/2018/03/…Subdued
"Use quotes if your value includes special characters" -- This is a gross oversimplification, see this excellent blog post for exactly when quotes are needed: blogs.perl.org/users/tinita/2018/03/…Handcraft
Also quote Norway hitchdev.com/strictyaml/why/implicit-typing-removedGerstner
I just noticed one of the hidden answers wanted to add, "Use quotes for 'on' or 'ON' if you don't want it to be converted to true." I haven't checked that so I'll just add it here as a comment. If it's true, I suspect that it would also apply to 'off' / 'OFF'.Hein
yes/no/on/off no longer need to be quoted if you're using YAML 1.2.Moy
P
105

While Mark's answer nicely summarizes when the quotes are needed according to the YAML language rules, I think what many of the developers/administrators are asking themselves, when working with strings in YAML, is "what should be my rule of thumb for handling the strings?"

It may sound subjective, but the number of rules you have to remember, if you want to use quotes only when the language spec really requires them, is somewhat excessive for something as simple as specifying one of the most common datatypes. Don't get me wrong, you will eventually remember them if you work with YAML regularly, but what if you use it only occasionally and haven't developed the habit of writing YAML? Do you really want to spend time remembering all the rules just to specify a string correctly?

The whole point of a "rule of thumb" is to save cognitive effort and to handle a common task without thinking about it. Our "CPU" time can arguably be used for something more useful than handling strings correctly.

From this - pure practical - perspective, I think the best rule of thumb is to single quote the strings. The rationale behind it:

  • Single quoted strings work for all scenarios, except when you need to use escape sequences.
  • The only special character you have to handle within a single-quoted string is the single quote itself.

These are just 2 rules to remember for some occasional YAML user, minimizing the cognitive effort.
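
A small sketch of those two rules, checked with Ruby's bundled Psych parser (the keys and values are made-up examples):

```ruby
require 'yaml'

doc = <<~'YAML'
  greeting: 'it''s fine'
  path: 'C:\temp\new'
YAML

data = YAML.safe_load(doc)

puts data['greeting']  # the doubled '' is read as one single quote
puts data['path']      # backslashes are kept as-is, no escape processing
```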

Pentimento answered 14/5, 2021 at 14:44 Comment(3)
I like this answer. I thought the whole point of YAML to keep it simple. And yet here I am looking for answers why the int value of sizeInBytes: 12345678 had to be "quoted" in my latest YAML b/c something apparently wanted to have a string configuration property (probably?)--but I actually still don't know the answer.Rao
A simpler one is: use double quotes for strings.Flageolet
@Flageolet I am not sure if double quotes is the best default option, since their automatic interpretation of escape sequences may not be what you expect. Though, if you always remember about it, and if you specifically need it, then yeah, why not.Pentimento
G
76

There have been some great answers to this question. However, I would like to extend them and provide some context from the new official YAML v1.2.2 specification (released October 1st 2021), which is the authoritative source for all things concerning YAML.

There are three different styles that can be used to represent strings, each of them with their own (dis-)advantages:

YAML provides three flow scalar styles: double-quoted, single-quoted and plain (unquoted). Each provides a different trade-off between readability and expressive power.

Double-quoted style:

  • The double-quoted style is specified by surrounding " indicators. This is the only style capable of expressing arbitrary strings, by using \ escape sequences. This comes at the cost of having to escape the \ and " characters.

Single-quoted style:

  • The single-quoted style is specified by surrounding ' indicators. Therefore, within a single-quoted scalar, such characters need to be repeated. This is the only form of escaping performed in single-quoted scalars. In particular, the \ and " characters may be freely used. This restricts single-quoted scalars to printable characters. In addition, it is only possible to break a long single-quoted line where a space character is surrounded by non-spaces.

Plain (unquoted) style:

  • The plain (unquoted) style has no identifying indicators and provides no form of escaping. It is therefore the most readable, most limited and most context sensitive style. In addition to a restricted character set, a plain scalar must not be empty or contain leading or trailing white space characters. It is only possible to break a long plain line where a space character is surrounded by non-spaces. Plain scalars must not begin with most indicators, as this would cause ambiguity with other YAML constructs. However, the :, ? and - indicators may be used as the first character if followed by a non-space “safe” character, as this causes no ambiguity.

TL;DR

With that being said, according to the official YAML specification one should:

  • Whenever applicable use the unquoted style since it is the most readable.
  • Use the single-quoted style (') if characters such as " and \ are being used inside the string to avoid escaping them and therefore improve readability.
  • Use the double-quoted style (") when the first two options aren't sufficient, i.e. in scenarios where more complex line breaks are required or non-printable characters are needed.
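
As a rough illustration of the three styles side by side, here is a sketch run through Ruby's bundled Psych parser. The keys are invented for the example, and Psych is only one implementation, so other parsers may differ in edge cases:

```ruby
require 'yaml'

doc = <<~'YAML'
  plain: hello world # trailing comment
  single: 'say "hi" \ bye'
  double: "tab\there \u00e9"
YAML

data = YAML.safe_load(doc)

p data['plain']   # plain style: no escaping, and " #" starts a comment
p data['single']  # single-quoted: " and \ are taken literally
p data['double']  # double-quoted: \t and \u00e9 are decoded
```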
Guinness answered 5/11, 2021 at 8:51 Comment(4)
Thanks for the summary. It gets into how to delineate white space, which I hadn't considered in my answer. But it omits one of the main deciding factors about quotes: whether I want to force the data type to be a string when the default would be something else. This is covered briefly in section 2.4: "In YAML, untagged nodes are given a type depending on the application." The simplest example 2.21 shows string: '012345'. That section also covers more complex, explicit typing that I had no idea existed!Hein
Some additional caveats: Plain scalars must never contain the `: ` and ` #` character combinations. In addition, inside flow collections, or when used as implicit keys, plain scalars must not contain the [, ], {, } and , characters.Unlearn
What i don't get is: Why is & allowed at the start of plain-text according to the spec – wouldn't that make it a reference?Unlearn
Is the recommendation in the final paragraph of this answer actually in the spec? It doesn't appear to be a quote.Cupboard
T
25

Strings in yaml only need quotation if (the beginning of) the value can be misinterpreted as a data type or the value contains a ":" (because it could get misinterpreted as key).

For example

foo: '{{ bar }}'

needs quotes, because it can be misinterpreted as datatype dict, but

foo: barbaz{{ bam }}

does not, since it does not begin with a critical char. Next,

foo: '123'

needs quotes, because it can be misinterpreted as datatype int, but

foo: bar1baz234
bar: 123baz

does not, because it cannot be misinterpreted as int.

foo: 'yes'

needs quotes, because it can be misinterpreted as datatype bool

foo: "bar:baz:bam"

needs quotes, because the value can be misinterpreted as key.
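
The examples above can be checked mechanically. This is only a sketch against Ruby's bundled Psych parser, and yaml_class_of is a hypothetical helper written for this illustration. With Psych, at least, the unquoted bar:baz:bam also comes back as a string, because its colons are not followed by a space, so the quotes in that last example look like a precaution rather than a requirement:

```ruby
require 'yaml'

# Hypothetical helper: parse "foo: <value>" and report the class of the
# loaded value, or the error class if the document does not parse.
def yaml_class_of(raw_value)
  YAML.safe_load("foo: #{raw_value}")['foo'].class
rescue StandardError => e
  e.class
end

samples = [
  '{{ bar }}', "'{{ bar }}'", 'barbaz{{ bam }}',
  '123', "'123'", 'bar1baz234', '123baz',
  'yes', "'yes'", 'bar:baz:bam', '"bar:baz:bam"'
]

samples.each do |value|
  puts format('%-18s -> %s', value, yaml_class_of(value))
end
```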

These are just examples. Using yamllint helps you avoid starting values with an invalid token

foo@bar:/tmp$ yamllint test.yaml 
test.yaml
  3:4       error    syntax error: found character '@' that cannot start any token (syntax)

and is a must, if working productively with yaml.

Quoting all strings, as some suggest, is like using brackets in Python. It is bad practice, harms readability and throws away the beautiful feature of not having to quote strings.

Torrens answered 19/3, 2021 at 9:19 Comment(3)
Thanks for the examples. It seems we agree; as I said in my answer: "the best approach would be to not use quotes unless you have to." A question on your helpful datatype rule: are you referring specifically to YAML in Ruby on Rails, as in the OP's question? It seems the datatype interpretation could vary by programming language.Hein
@MarkBerry Thanks for the input. Yes, the general rule for me would also be: Do not quote until you have to. And yes, you correctly observed that I used examples from Python instead of Ruby. I did this on purpose, to highlight the abstract messages: 1) Use a linter 2) YAML is not bound to a language, but IS a language. That's why I am using 'key:value' terminology.Torrens
"the best approach would be to not use quotes unless you have to." would basically mean "always quote strings" in YAML... unless you're a godlike being capable of memorizing every YAML corner case of each and every version, subversion and parser. This is basically why people had to re-invent static typing in dynamically-typed languages... because it turns to unmaintainable mess as soon as you stop doing Hello World-type apps and being a smartass about it, and start working enterprise or corporate. Having automatically-quoted strings from a generator or YAML-aware text editor costs you nothing.Phlogopite
P
4

I had this concern when working on a Rails application with Docker.

My most preferred approach is to generally not use quotes. This includes not using quotes for:

  • variables like ${RAILS_ENV}
  • values separated by a colon (:) like postgres-log:/var/log/postgresql
  • other string values

I, however, use double-quotes for integer values that need to be converted to strings like:

  • docker-compose version like version: "3.8"
  • port numbers like "8080:8080"
  • image "traefik:v2.2.1"

However, for values such as booleans, floats, and integers that should keep their native type, please do not use double quotes, because quoting them would turn them into strings.

Here's a sample docker-compose.yml file to explain this concept:

version: "3"

services:
  traefik:
    image: "traefik:v2.2.1"
    command:
      - --api.insecure=true # Don't do that in production
      - --providers.docker=true
      - --providers.docker.exposedbydefault=false
      - --entrypoints.web.address=:80
    ports:
      - "80:80"
      - "8080:8080"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
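
To show why the version is worth quoting, here is a sketch using Ruby's bundled Psych parser. It is not how Docker Compose itself reads the file, and the key names are invented for the illustration:

```ruby
require 'yaml'

doc = <<~'YAML'
  unquoted_version: 3.10
  quoted_version: "3.10"
  port: "8080:8080"
  image: traefik:v2.2.1
YAML

data = YAML.safe_load(doc)

p data['unquoted_version']  # parsed as a float, so the trailing zero is lost
p data['quoted_version']    # stays the string "3.10"
p data['port']              # string, thanks to the quotes
p data['image']             # string even without quotes
```

Quoting port mappings is a common precaution because some YAML 1.1 parsers read certain digits-and-colons values as base-60 numbers.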

That's all.

I hope this helps

Partain answered 26/5, 2020 at 14:29 Comment(1)
violates - Use quotes if your value includes ':' in the other answerLothaire
I
1

Here's a small function (not optimized for performance) that quotes your strings with single quotes if needed and tests if the result could be unmarshalled into the original value: https://go.dev/play/p/AKBzDpVz9hk. Instead of testing for the rules, it simply uses the unmarshaller itself and checks whether the unmarshalled value matches the original.

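import (
    "fmt"
    "strings"

    "gopkg.in/yaml.v3" // assumed import; the playground link may use a different YAML package
)

// yamlQuote returns value unchanged if it survives being parsed as a plain
// YAML scalar, and otherwise returns it wrapped in single quotes.
func yamlQuote(value string) string {
    input := fmt.Sprintf("key: %s", value)

    var res struct {
        Value string `yaml:"key"`
    }

    // Quote when parsing fails or the parsed string differs from the input.
    if err := yaml.Unmarshal([]byte(input), &res); err != nil || value != res.Value {
        // Inside single quotes, a literal ' is written as ''.
        quoted := strings.ReplaceAll(value, `'`, `''`)
        return fmt.Sprintf("'%s'", quoted)
    }

    return value
}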
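
For the Rails readers of this question, the same round-trip idea could look roughly like this in Ruby. This is a sketch, not a port of the playground code: yaml_quote is a hypothetical name, and it relies on Psych's safe_load, so anything Psych would turn into a non-string (or refuse to load) gets quoted:

```ruby
require 'yaml'

# Hypothetical helper: return the value unchanged if Psych reads it back as
# the same string when written unquoted, otherwise wrap it in single quotes.
def yaml_quote(value)
  parsed = YAML.safe_load("key: #{value}")
  return value if parsed.is_a?(Hash) && parsed['key'] == value

  "'#{value.gsub("'", "''")}'"
rescue StandardError
  # Syntax errors and disallowed types (dates, symbols) also mean "quote it".
  "'#{value.gsub("'", "''")}'"
end

['hello', '10', 'yes', "it's: here", 'a #b'].each do |s|
  puts "key: #{yaml_quote(s)}"
end
```

Like the Go version, this does not handle multi-line values, since a single-quoted scalar cannot represent a line break literally.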
Incompressible answered 26/11, 2021 at 12:10 Comment(0)
C
0

If you are trying to escape a string in pytest tavern, !raw could be helpful to avoid parsing of strings to yaml:

some: !raw "{test: 123}"

Check for more info: https://tavern.readthedocs.io/en/latest/basics.html#type-conversions

Capuche answered 15/9, 2021 at 22:1 Comment(0)
R
-3
version: "3.9"

services:
  seunggabi:
    image: seunggabi:v1.0.0
    command:
      api:
        insecure: true
    ports:
      - 80:80
      - 8080:8080
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
docker compose -f docker-compose.yaml up

If you use docker compose v2, you don't need quotation marks for booleans.
Only the version needs quotation marks.

Riffle answered 27/6, 2022 at 17:4 Comment(0)
