C++11 introduced raw string literals, which are useful for representing quoted strings and literals with many special characters, such as Windows file paths, regular expressions, etc.
std::string path = R"(C:\teamwork\new_project\project1)"; // no tab nor newline!
std::string quoted = R"("quoted string")";
std::string expression = R"([\w]+[ ]+)";
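To make the benefit concrete, here is a minimal sketch that places each raw literal next to the escaped literal it should be equivalent to. The variable names, the `check_equivalence` helper, and the custom delimiter `x` are only illustrative assumptions, not part of the code above.

```cpp
#include <cassert>
#include <string>

// Each raw literal next to the ordinary literal with the same contents.
const std::string raw_path       = R"(C:\teamwork\new_project\project1)";
const std::string escaped_path   = "C:\\teamwork\\new_project\\project1";

const std::string raw_quoted     = R"("quoted string")";
const std::string escaped_quoted = "\"quoted string\"";

// A custom delimiter (here `x`, chosen arbitrarily) is needed
// when the text itself contains the sequence )".
const std::string raw_delim      = R"x(say "(hi)" twice)x";
const std::string escaped_delim  = "say \"(hi)\" twice";

// Hypothetical helper: asserts that every pair holds the same characters.
void check_equivalence()
{
    assert(raw_path == escaped_path);
    assert(raw_quoted == escaped_quoted);
    assert(raw_delim == escaped_delim);
}
```

Calling `check_equivalence()` from `main` should pass silently if the raw and escaped forms really match.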
These raw string literals can also be combined with encoding prefixes (u8, u, U, or L). But when no encoding prefix is specified, does the file encoding matter? Suppose I have this code:
auto message = R"(Pick up a card)"; // raw string 1
auto cards = R"(🂡🂢🂣🂤🂥🂦🂧🂨🂩🂪🂫🂬🂭🂮)"; // raw string 2
If I can write and store the code above, it's obvious that my source code is encoded as Unicode, so I'm wondering:
- Would raw string 1 be a Unicode literal (even though it only uses ASCII characters)? In other words, does a raw string inherit the encoding of the file in which it is written, or does the compiler detect that Unicode isn't needed regardless of the file encoding?
- Is the encoding prefix U necessary on raw string 2 for it to be treated as a Unicode literal, or is it Unicode automatically because of its contents and/or the source file encoding?
Thanks for your attention.
EDIT:
Testing the code above on ideone.com and printing the demangled types of the message and cards variables, it outputs char const*:
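#include <cxxabi.h>
#include <cstdlib>
#include <iostream>
#include <string>
#include <typeinfo>

// Returns the demangled name of T (GCC/Clang only, via the Itanium ABI helper).
template<typename T> std::string demangle(T)
{
    int status = 0;
    char *const name = abi::__cxa_demangle(typeid(T).name(), 0, 0, &status);
    if (status != 0 || name == nullptr)
        return typeid(T).name(); // fall back to the mangled name
    std::string result(name);
    std::free(name); // the buffer was allocated with malloc
    return result;
}

int main()
{
    auto message = R"(Pick up a card)";
    auto cards = R"(🂡🂢🂣🂤🂥🂦🂧🂨🂩🂪🂫🂬🂭🂮)";
    std::cout
        << "message type: " << demangle(message) << '\n'
        << "cards type: " << demangle(cards) << '\n';
    return 0;
}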
Output:
message type: char const*
cards type: char const*
which is even weirder than I expected: I was convinced that the type would be wchar_t (even without the L prefix).
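For comparison, here is a minimal sketch of the types I would expect `auto` to deduce for each prefix, assuming a compiler that reads the source as UTF-8. The helper name `check_literal_types` is hypothetical. The u8 prefix is left out on purpose, because its character type is char in C++11 through C++17 and char8_t from C++20 onwards.

```cpp
#include <cassert>
#include <type_traits>

// `auto` decays each literal to a pointer to its character type.
// The prefix selects that type; the characters inside the literal do not.
void check_literal_types()
{
    auto narrow = R"(🂡🂢🂣)";   // no prefix
    auto wide   = LR"(🂡🂢🂣)";  // L prefix
    auto utf16  = uR"(🂡🂢🂣)";  // u prefix
    auto utf32  = UR"(🂡🂢🂣)";  // U prefix

    static_assert(std::is_same<decltype(narrow), const char*>::value,
                  "no prefix gives char");
    static_assert(std::is_same<decltype(wide), const wchar_t*>::value,
                  "L gives wchar_t");
    static_assert(std::is_same<decltype(utf16), const char16_t*>::value,
                  "u gives char16_t");
    static_assert(std::is_same<decltype(utf32), const char32_t*>::value,
                  "U gives char32_t");

    // Silence unused-variable warnings; the checks above are compile-time only.
    (void)narrow; (void)wide; (void)utf16; (void)utf32;
}
```

If this compiles, the unprefixed literal is a plain char string even though it contains non-ASCII characters, which would match the char const* output above. Which bytes end up inside it still depends on the compiler's source and execution character sets, and this sketch does not check that.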