If you feed a wchar_t, char16_t, or char32_t value to a narrow ostream, it will print the numeric value of the code point.
    #include <iostream>
    using std::cout;

    int main()
    {
        cout << 'x' << L'x' << u'x' << U'x' << '\n';
    }
prints x120120120. This is because there is an operator<< for the specific combination of basic_ostream with its charT, but there aren't analogous operators for the other character types, so they get silently converted to int and printed that way. Similarly, non-narrow string literals (L"x", u"x", U"x") will be silently converted to const void* and printed as the pointer value, and non-narrow string objects (wstring, u16string, u32string) won't even compile.
So, the question: What is the least awful way to print a wchar_t, char16_t, or char32_t value on a narrow ostream, as the character, rather than as the numeric value of the code point? It should correctly convert all code points that are representable in the encoding of the ostream to that encoding, and should report an error when the code point is not representable. (For instance, given u'…' and a UTF-8 ostream, the three-byte sequence 0xE2 0x80 0xA6 should be written to the stream; but given u'â' and a KOI8-R ostream, an error should be reported.)
Similarly, how can one print a non-narrow C-string or string object on a narrow ostream, converting to the output encoding?
If this can't be done within ISO C++11, I'll take platform-specific answers.
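For the wchar_t case only, one candidate that stays within the standard library is std::wcrtomb. The following is a rough sketch, not a complete answer: the helper name narrow_from_wide is made up for illustration, it converts to the encoding of the global C locale (LC_CTYPE) rather than any locale imbued on the stream, and it does not emit a closing shift sequence for stateful encodings.

```cpp
#include <climits>
#include <cwchar>
#include <stdexcept>
#include <string>

// Hypothetical helper (not from the question): convert one wchar_t to the
// multibyte encoding of the current global C locale.
// Throws std::range_error if the character is not representable there.
inline std::string narrow_from_wide(wchar_t wc)
{
    char buf[MB_LEN_MAX];                      // large enough for any one character
    std::mbstate_t state = std::mbstate_t();   // start in the initial shift state
    std::size_t n = std::wcrtomb(buf, wc, &state);
    if (n == static_cast<std::size_t>(-1))
        throw std::range_error("character not representable in the current locale");
    return std::string(buf, n);
}
```

With this in place, cout << narrow_from_wide(L'x') would write the character itself instead of 120. For non-ASCII characters, the program would first have to select a suitable locale, for example with std::setlocale(LC_ALL, "").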
(Inspired by this question.)
std::wstring_convert, or use a library like ICONV or ICU. – Metabolize
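A minimal sketch of the std::wstring_convert route from the comment above, assuming the narrow stream is meant to carry UTF-8. The helper names to_utf8 and u16_to_utf8 are made up for illustration. It does not cover other output encodings such as KOI8-R, and the <codecvt> facets it relies on are deprecated as of C++17, so treat it as a C++11-era option rather than a recommendation.

```cpp
#include <codecvt>
#include <locale>
#include <string>

// Hypothetical helper: encode a single code point as UTF-8.
// wstring_convert throws std::range_error when the conversion fails
// and no error string was supplied to its constructor.
inline std::string to_utf8(char32_t c)
{
    std::wstring_convert<std::codecvt_utf8<char32_t>, char32_t> conv;
    return conv.to_bytes(c);
}

// Hypothetical helper: convert a UTF-16 string to UTF-8.
// codecvt_utf8_utf16 is used so that surrogate pairs are combined
// instead of being encoded one code unit at a time.
inline std::string u16_to_utf8(const std::u16string& s)
{
    std::wstring_convert<std::codecvt_utf8_utf16<char16_t>, char16_t> conv;
    return conv.to_bytes(s);
}
```

With these, cout << to_utf8(U'\u2026') writes the three bytes 0xE2 0x80 0xA6 instead of a number, and cout << u16_to_utf8(u"x") writes x.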