Why must the delimiters of raw string literals be under 16 chars?

2

15

The following program does not compile:

#include <iostream>

int main() {
    std::cout << R"RAW_STRING_LITERAL(
        hello
        world
        )RAW_STRING_LITERAL";
}

error: raw string delimiter longer than 16 characters.
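For comparison, here is a sketch of the same literal with the delimiter trimmed to exactly 16 characters, which should be accepted. The function name `greeting` and the shortened delimiter `RAW_STRING_LITER` are made up for this illustration and are not part of the original program.

```cpp
#include <string>

// Same literal as in the question, but the delimiter is cut down to
// "RAW_STRING_LITER" (16 characters), the longest length the rule allows.
// `greeting` is a hypothetical name used only for this example.
std::string greeting() {
    return R"RAW_STRING_LITER(
        hello
        world
        )RAW_STRING_LITER";
}
```

Streaming the result, for example with `std::cout << greeting();`, prints the text with its line breaks and indentation kept verbatim.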

Why is there a length limit imposed on raw string delimiters?

Retaretable answered 4/8, 2015 at 20:4 Comment(5)
Do you think there should not be? - Kartis
The choice seems arbitrary from the relevant proposal papers but I'll leave answering to someone else. - Kartis
This could be implementation specific. Which compiler are you using? What OS? - Amazonas
If raw string delimiters could have arbitrary length, they'd probably form yet another Turing complete language. - Sorcerer
@LightnessRacesinOrbit, I'm generating some C++ code based on some data, and that data is being put into raw string literals during the generation. Parts of that data have the potential to be more than 16 chars. It doesn't really affect me; I can ensure a unique delimiter, but it was just something I wasn't expecting. For practical purposes, I don't see anything wrong with having 16 as a limit. I thought I would ask though to see if it was indeed arbitrary, had something to do with parsing, or was decided in order to allow for faster compiles. - Retaretable
13

The earliest proposal I can find for raw string literals is N2146 by Beman Dawes. It contains the text:

The maximum length of d-char-sequence shall be 16 characters.

This seems to be an arbitrary limit imposed by the author, who probably decided 16 characters were sufficient for creating an unambiguous delimiter sequence in all cases.

The proposal also states:

The terminating d-char-sequence of a raw string literal shall be the same sequence of characters as the initial d-char-sequence

So a conforming implementation must buffer and process the d-char-sequence to ensure the two sequences match. The absence of any limit on the d-char-sequence would unnecessarily add to the complexity of implementing the feature.
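To illustrate why a fixed bound simplifies this, here is a rough sketch of a scanner that buffers the opening delimiter in fixed-size storage and then looks for the matching close. It assumes C++17; `scan_raw_string` and `kMaxDelim` are hypothetical names, not taken from any real compiler, and the sketch skips validation of which characters may appear in a d-char-sequence.

```cpp
#include <array>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical constant mirroring the 16-character limit in the proposal.
constexpr std::size_t kMaxDelim = 16;

// Hypothetical helper: `src` must start with R" and contain one raw string
// literal. Returns the literal's contents, or std::nullopt if the literal is
// malformed or its delimiter is longer than kMaxDelim characters.
std::optional<std::string> scan_raw_string(std::string_view src) {
    if (src.size() < 2 || src[0] != 'R' || src[1] != '"') return std::nullopt;
    std::size_t pos = 2;

    // Buffer the opening d-char-sequence. Because its length is bounded,
    // a fixed-size array is enough and no dynamic allocation is needed.
    std::array<char, kMaxDelim> delim{};
    std::size_t delim_len = 0;
    while (pos < src.size() && src[pos] != '(') {
        if (delim_len == kMaxDelim) return std::nullopt;  // delimiter too long
        delim[delim_len++] = src[pos++];
    }
    if (pos == src.size()) return std::nullopt;  // no opening parenthesis
    ++pos;                                       // skip '('
    const std::size_t body_begin = pos;
    const std::string_view opening(delim.data(), delim_len);

    // The literal ends at the first ')' followed by the same delimiter and '"'.
    for (; pos < src.size(); ++pos) {
        if (src[pos] != ')') continue;
        const std::size_t quote = pos + 1 + delim_len;
        if (quote >= src.size() || src[quote] != '"') continue;
        if (src.substr(pos + 1, delim_len) == opening) {
            return std::string(src.substr(body_begin, pos - body_begin));
        }
    }
    return std::nullopt;  // unterminated literal
}
```

With an unbounded delimiter, the `delim` buffer would have to grow dynamically, which is the extra complexity the answer refers to. A real lexer would also reject characters that are not allowed in a delimiter, which this sketch does not attempt.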

Kennie answered 4/8, 2015 at 20:35 Comment(0)
1

The standard specifies that:

A string-literal that has an R in the prefix is a raw string literal. The d-char-sequence serves as a delimiter. The terminating d-char-sequence of a raw-string is the same sequence of characters as the initial d-char-sequence. A d-char-sequence shall consist of at most 16 characters.

http://open-std.org/JTC1/SC22/WG21/docs/papers/2015/n4527.pdf § 2.13.5 page 28

No reason is given in the standard, but to me this appears to be a completely arbitrary limit, since it should make no difference what the delimiter is.

Amazonas answered 4/8, 2015 at 20:22 Comment(4)
Is a reason given, or can we just assume that the length was arbitrarily decided upon? (I don't mean arbitrary in a negative way, but in a literal sense.) - Piddle
I'd quite like to know this too. - Kartis
In one of the appendices they specify lower bounds on things like the number of template args and class nesting depth. An upper bound like this appearing in the general standard text is very unusual. It seems the maximum delimiter length of 16 should instead be a minimum, and the delimiter should be allowed to be an ID. - Expendable
OTOH, maybe they were worried about portability. - Expendable
