What is a raw string?

I came across this code snippet in C++17 draft n4713:

#define R "x"
const char* s = R"y"; // ill-formed raw string, not "x" "y"

What is a "raw string"? What does it do?

Withhold answered 21/6, 2019 at 20:26 Comment(0)
83

Raw string literals are string literals designed to make it easier to include characters such as quotation marks and backslashes, which normally act as delimiters or begin escape sequences. They're useful for, say, embedding text like HTML. For example, contrast

"<a href=\"file\">C:\\Program Files\\</a>"

which is a regular string literal, with

R"(<a href="file">C:\Program Files\</a>)"

which is a raw string literal. Here, the use of parentheses in addition to quotes allows C++ to distinguish a nested quotation mark from the quotation marks delimiting the string itself.
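
As a quick check of the comparison above, here is a minimal sketch showing that both spellings produce exactly the same characters. The function names escaped_html and raw_html are made up for illustration and do not come from any library.

```cpp
#include <string>

// Regular literal: every quotation mark and backslash must be escaped.
inline std::string escaped_html() {
    return "<a href=\"file\">C:\\Program Files\\</a>";
}

// Raw literal: everything between R"( and )" is taken verbatim.
inline std::string raw_html() {
    return R"(<a href="file">C:\Program Files\</a>)";
}
```

Comparing escaped_html() == raw_html() should yield true: only the spelling in the source differs, not the resulting string.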

Helpless answered 21/6, 2019 at 20:31 Comment(2)
I did not know why you needed the parentheses. First thing I've seen on it. So obvious now. Note, you can still prefix it like uR and u8R, and this also all works in C, but only with GNU -std=gnu99 onwards. - Gassing
@LewisKelsey the parentheses are part of the specification: C++11 standard (ISO/IEC 14882:2011), 2.14.5 String literals [lex.string]. - Delao
59

Basically, a raw string literal is a string literal in which C++ escape sequences (like \n, \t, or \") are not processed. A raw string literal starts with R"( and ends with )". The feature was introduced in C++11.

prefix(optional) R"delimiter(raw_characters)delimiter"

prefix - One of L, u8, u, U

Thanks to @Remy Lebeau: the delimiter is optional and is typically omitted, but there are corner cases where it is actually needed, in particular if the string content contains the character sequence )". For example, R"(...)"...)" ends at the first )" and the rest becomes an error, so you would need a delimiter to avoid that, e.g. R"x(...)"...)x".
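
To make the syntax above concrete, here is a small sketch combining the encoding prefixes with R, plus one literal that needs a delimiter. The function names are hypothetical, and the delimiter x is an arbitrary choice.

```cpp
#include <string>

// Each encoding prefix combines with R; the backslash stays a literal character.
inline std::wstring   wide_raw()  { return LR"(a\tb)"; }
inline std::u16string utf16_raw() { return uR"(a\tb)"; }
inline std::u32string utf32_raw() { return UR"(a\tb)"; }
// u8R"(...)" is valid as well; it is left out here because its character
// type changed from char to char8_t in C++20.

// The content contains )" so a plain R"( ... )" would end too early.
// With the delimiter x, the literal only ends at )x"
inline std::string delimited_raw() {
    return R"x(say ")" here)x";
}
```

Here wide_raw() holds the four characters a, \, t, b rather than a tab, and delimited_raw() holds say ")" here.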

See an example:

#include <iostream>
#include <string> 

int main()
{
    std::string normal_str = "First line.\nSecond line.\nEnd of message.\n";
    std::string raw_str = R"(First line.\nSecond line.\nEnd of message.\n)";
    std::string raw_str_delim = R"x("(First line.\nSecond line...)")x";
    std::cout << normal_str << std::endl;
    std::cout << raw_str << std::endl;
    std::cout << raw_str_delim << std::endl;
    return 0;
}

output:

First line.
Second line.
End of message.

First line.\nSecond line.\nEnd of message.\n
"(First line.\nSecond line...)"

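
Since \n is not processed inside a raw string, one way to get real line breaks is to put them directly in the source. A minimal sketch follows; the function name multi_line is made up for illustration.

```cpp
#include <string>

// A raw string literal may span several source lines; each line break
// in the source becomes a newline character in the string.
inline std::string multi_line() {
    return R"(First line.
Second line.
End of message.
)";
}
```

This should compare equal to the normal_str from the example, "First line.\nSecond line.\nEnd of message.\n". Note that any indentation placed before the continuation lines would become part of the string as well.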

Inainability answered 21/6, 2019 at 20:32 Comment(3)
But here in the code the R is defined as "x" and after expansion of the #define the code is const char* s = "x""y"; and there isn't any R"(. - Tiphanie
Can you please add an example of using a delimiter to the example? With output. - Incorporator
@Incorporator updated the example. - Inainability
3

I will make an addition about a concern in one of the comments:

But here in the code the R is defined as "x" and after expansion of the #define the code is const char* s = "x""y"; and there isn't any R"(.

The code fragment in the question shows invalid uses of raw strings. Here are the actual three lines of code from the draft:

#define R "x"
const char* s = R"y"; // ill-formed raw string literal, not "x" "y"
const char* s2 = R"(a)" "b)"; // a raw string literal followed by a normal string literal
  • The first line defines a macro named R to show that it does not interfere. Macros are expanded by the preprocessor, which replaces tokens in the source. A raw string literal, on the other hand, is recognized when the source is split into tokens, before any macro is expanded, so R"y" is read as the start of a raw string literal and not as the macro R followed by "y".
  • The second line shows an incorrect use. The correct form would be R"(y)", because a raw string needs parentheses inside the quotes.
  • The last line shows how it can be a pain if not written carefully. The text inside the parentheses CANNOT include the closing sequence of the raw string. If the intent was a single string, a correction might be R"_(a)" "b)_". The _ can be replaced by any sequence of up to 16 characters (excluding parentheses, backslashes and spaces), as long as the resulting closing sequence does not appear inside the text: R"___(a)" "b)___" or R"anything(a)" "b)anything".

So if we wrap these corrections in a simple C++ program:

#include <iostream>
using namespace std;

#define R "x" // This is just a macro, not Raw String nor definition of it
const char* s = R"(y)"; // R is part of language, not a macro
const char* s2 = R"_(a)" "b)_"; // Raw String shall not include closing sequence of characters; )_"

int main(){ cout << s <<endl << s2 <<endl << R <<endl; }

then the output will be

y
a)" "b
x
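
For comparison, here is a small sketch of how the original third line from the draft is read, next to the delimited version. The function names concatenated and delimited are hypothetical.

```cpp
#include <string>

// Original form: the raw string R"(a)" ends at the first )" and is then
// joined with the ordinary literal "b)" by adjacent-literal concatenation.
inline std::string concatenated() {
    return R"(a)" "b)";
}

// Delimited form: the literal only ends at )_" so the inner quotes,
// the space and the parenthesis all stay part of the text.
inline std::string delimited() {
    return R"_(a)" "b)_";
}
```

So concatenated() gives ab) while delimited() gives a)" "b, which matches the second line of the output above.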
Dedededen answered 7/6, 2022 at 23:27 Comment(1)
What a kludgey feature! - Paraprofessional
1

Raw string literal. Used to avoid escaping of any character. Anything between the delimiters becomes part of the string. prefix, if present, has the same meaning as described above.

C++Reference: string literal

A raw string is defined like this:

string raw_str=R"(First line.\nSecond line.\nEnd of message.\n)";

The difference is that a raw string does not process escape sequences such as \n and \t and treats them as normal text.

So the above string would be a single line containing three literal \n sequences, instead of three separate lines.
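
One way to verify this is to count the actual newline characters in each version. This is only a sketch; count_newlines, normal_text and raw_text are names invented for the illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Counts real newline characters (not the two-character sequence \n).
inline std::ptrdiff_t count_newlines(const std::string& s) {
    return std::count(s.begin(), s.end(), '\n');
}

// Ordinary literal: each \n escape becomes one newline character.
inline std::string normal_text() {
    return "First line.\nSecond line.\nEnd of message.\n";
}

// Raw literal: each \n stays as a backslash followed by the letter n.
inline std::string raw_text() {
    return R"(First line.\nSecond line.\nEnd of message.\n)";
}
```

With these definitions, count_newlines(normal_text()) is 3 and count_newlines(raw_text()) is 0, and the raw version is three characters longer because each \n occupies two characters instead of one.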

You need to remove the #define line and add parentheses around your string for it to be treated as a raw string.

Midwife answered 21/6, 2019 at 20:39 Comment(1)
Are you certain you'd need to remove the define? I would think were that the case, then the example in its current state would not be an ill-defined raw string, but rather a well-defined string literal. - Visakhapatnam
