When CPP line splicing is undone within C++0x raw strings, is a conforming implementation required to preserve the original newline sequence?

The latest draft of C++0x, n3126, says:

Each instance of a backslash character (\) immediately followed by a new-line character is deleted, splicing physical source lines to form logical source lines.

...

Within the r-char-sequence of a raw string literal, any transformations performed in phases 1 and 2 (trigraphs, universal-character-names, and line splicing) are reverted.

Technically this means that the C++ preprocessor only recognizes a backslash followed by the new-line character, but I know that some C++ implementations also accept Windows-style (CR LF) or classic Mac-style (CR) line endings.

Will conforming implementations of C++0x be required to preserve the newline sequence that immediately followed a backslash character \ within the r-char-sequence of a raw string? Maybe a better question is: would it be expected of a Windows C++0x compiler to undo each line splice with "\\\r\n" instead of "\\\n"?

Flocky answered 27/12, 2010 at 17:27 Comment(3)
What exactly are you doing that you need to know about this?Drum
@Karl: it would be significant for example in "portable" code that uses line splicing in string literals, then the linebreaks somehow get changed to native style when the code is moved to a new platform (e.g. because some text editor doesn't handle foreign linebreaks well). In that specific case, the question becomes, "does the standard make a guarantee of the value of the string literal, and if so what?". Of course that's not the same as "needing to know" - one could avoid using the feature in preference to seeking clarification of the standard.Greenroom
@Karl: In addition to Steve's point about possibly yielding a different string literal, I am working on a project that involves scanning raw string literals. It was something that I was wondering about.Flocky

Translation phase 1 starts with

Physical source file characters are mapped, in an implementation-defined manner, to the basic source character set (introducing newline characters for end-of-line indicators) if necessary. Trigraph sequences (2.3) are replaced [...]

I'd read the parenthesized list in "any transformations performed in phases 1 and 2 (trigraphs, universal-character-names, and line splicing)" as exhaustive, so the phase 1 mapping from physical source file characters to the basic source character set is explicitly not reverted. The end-of-line indicator stays a single new-line character; the characters of the literal are later converted to the execution character set, and you get new-line characters there.
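To make that reading concrete, here is a minimal sketch of a raw string that contains a backslash directly followed by a line break. The helper names `spliced_raw` and `newline_is_plain_lf` are made up for illustration, and the check assumes the compiler maps the file's end-of-line indicator to a single new-line in phase 1, which is my understanding of the wording above rather than something the quoted text spells out for every implementation.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: a raw string literal whose r-char-sequence contains
// a backslash immediately followed by a line break in the source file.
// The splice from phase 2 is reverted, so the backslash and the new-line
// both end up in the string.
inline std::string spliced_raw() {
    return R"(a\
b)";
}

// Hypothetical check: the literal is 'a', backslash, '\n', 'b',
// and contains no carriage return at all.
inline bool newline_is_plain_lf(const std::string& s) {
    return s.size() == 4 && s[1] == '\\' && s[2] == '\n'
        && s.find('\r') == std::string::npos;
}
```

If the interpretation holds, `newline_is_plain_lf(spliced_raw())` should be true whether the file was saved with LF or CR LF line endings, provided the compiler accepts the latter as an end-of-line indicator.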

Bistoury answered 27/12, 2010 at 17:35 Comment(0)

If you need a specific line ending sequence, you can insert it explicitly, and use string literal concatenation:

const char* nitpicky = "I must have a \\r\\n line ending!\r\n"
    "Otherwise, some other piece of code will misinterpret this line!";
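If the text comes from a raw string literal, where escape sequences like \r are not available, one alternative is to convert the line endings at run time. This is only a sketch: `to_crlf` is a hypothetical helper, not something from the standard library, and it assumes the input uses '\n' (possibly already preceded by '\r') as its line terminator.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: rewrite every lone '\n' as "\r\n".
// A '\n' that already follows a '\r' is copied unchanged, so applying
// the function twice gives the same result as applying it once.
inline std::string to_crlf(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] == '\n' && (i == 0 || in[i - 1] != '\r')) {
            out += "\r\n";
        } else {
            out += in[i];
        }
    }
    return out;
}
```

This keeps the literal itself independent of how the source file's line breaks were stored, and makes the required sequence explicit at the point where it matters.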
Drum answered 27/12, 2010 at 18:38 Comment(0)
