I'm trying to print a Chinese character using the types wchar_t, char16_t and char32_t, to no avail.


Asked 22/7, 2015 at 18:40 Answered 23/7, 2015 at 2:28

Solved c++ c++14 cout


I'm trying to print the Chinese character 中 using the types wchar_t, char16_t and char32_t, without success (live example)

#include <iostream>
int main()
{
    char x[] = "中";            // Chinese character with unicode point U+4E2D
    char y[] = u8"中";
    wchar_t z = L'中';
    char16_t b = u'\u4e2d';
    char32_t a = U'\U00004e2d';

    std::cout << x << '\n';     // Ok
    std::cout << y << '\n';     // Ok
    std::wcout << z << '\n';    // ?? 
    std::cout << a << '\n';     // prints the decimal number (20013) corresponding to the unicode point U+4E2D
    std::cout << b << '\n';     //             "                    "                   "
}

Repellent answered 22/7, 2015 at 18:40 Comment(9)

std::wcout doesn't work if you are trying to write text that cannot be represented in your default locale. – Beitnes 22/7, 2015 at 18:44

C++ does not have usable Unicode support. If you need (non-trivial) Unicode handling, use a dedicated library like ICU. (Yes, you can get something done with std::string on non-Windows and wstring on Windows, but meh). – Blida 22/7, 2015 at 18:46

@BaummitAugen It seems to be working with UTF-8 – Accoutre 22/7, 2015 at 18:49

Console output issues are very system dependent. Are you working with Windows in console mode? – Langley 22/7, 2015 at 18:49

very relevant: #8169497 – Beitnes 22/7, 2015 at 18:51

@François-MarieArouet Yes, on non-Windows systems you can usually save and print UTF-8 in a normal std::string. But try something like making an existing string like "Fußball" uppercase and you will know what I mean. – Blida 22/7, 2015 at 18:51

Relevant: https://mcmap.net/q/16239/-std-wstring-vs-std-string – Blida 22/7, 2015 at 18:52

@BaummitAugen Thanks for the link, but why doesn't std::wcout << z << '\n'; work in my snippet? – Accoutre 22/7, 2015 at 18:58

@François-MarieArouet Maybe the console Coliru uses only supports UTF-8, as it is a Linux system? Maybe something else. Doing non-ASCII stuff portably yourself in C++ is hard. – Blida 22/7, 2015 at 19:0


Since you're running your test on a Linux system, the source code is UTF-8, which is why x and y are the same thing. Those bytes are shunted, unmodified, into the standard output by std::cout << x and std::cout << y, and when you view the web page (or when you look at the Linux terminal), you see the character as you expected.

std::wcout << z will print if you do two things:

std::ios::sync_with_stdio(false);
std::wcout.imbue(std::locale("en_US.utf8"));

Without unsyncing from C stdio, GNU libstdc++ goes through the C I/O streams, which can never print a wide char after printing a narrow char on the same stream (a C stream's orientation is fixed by the first operation performed on it). LLVM libc++ appears to work even when synced, but of course still needs the imbue to tell the stream how to convert the wide chars to the bytes it sends into the standard output.

To print b and a, you will have to convert them to wide or narrow; even with wbuffer_convert, setting up a char32_t stream is a lot of work. The conversion to narrow (UTF-8) would look like this:

// needs #include <locale> (std::wstring_convert) and #include <codecvt> (std::codecvt_utf8)
std::wstring_convert<std::codecvt_utf8<char32_t>, char32_t> conv32;
std::cout << conv32.to_bytes(a) << '\n';

Putting it all together: http://coliru.stacked-crooked.com/a/a809c38e21cc1743

Vilipend answered 23/7, 2015 at 2:28 Comment(0)
