char16_t printing

Recently I had a problem porting a Windows application to Linux because of the wchar_t size difference between these platforms. I tried compiler switches, but there were problems with printing those characters (I presume that GCC's wcout assumes every wchar_t is 32 bits).

So, my question: is there a nice way to (w)cout char16_t? I ask because it doesn't work directly; I'm forced to cast it to wchar_t:

wcout << (wchar_t) c;

It doesn't seem like a big problem, but it bugs me.

Oskar answered 10/4, 2011 at 12:39 Comment(15)
What exactly is causing the size problem? There shouldn't be one if wchar_t is used correctly. Casting to wchar_t won't work. - Educator
wchar is 32 bits in GCC and 16 bits on Windows, and the asm lib that I used was written presuming a 16-bit wchar. So I decided to escape the portability problems by using a type that was guaranteed to be 16 bits. And it works, but wcout won't print char16_t. - Oskar
@NoSenseEtAI Yes, Windows is breaking the standard with a 16-bit wchar_t and UTF-16 encoding. But that has nothing to do with size assumptions. 64-bit systems have different sized types than 32-bit systems, but that doesn't mean that your code will break on them. Does the lib you are mentioning work on Linux? - Educator
What exactly are you trying to do? Does your output (terminal?) even expect 2- or 4-byte characters? If it's text processing and your terminal expects UTF-8, maybe better to convert your data stream into UTF-8 and just emit ordinary chars. - Saskatoon
The lib works on Linux; it is asm code. The point is that the functions return a pointer to a wchar array, and they presume that wchar is 16 bits. To be honest, I prefer it that way. BTW, my question is about cout-ing char16_t, not about making my code work on Linux; I solved that by using char16_t. My question is how to cout char16_t, because wcout doesn't work. - Oskar
@Let_Me_Be - Windows (like Java) isn't breaking any standards, as 16 bits was the standard when those systems were designed. You can't blame them for Unicode standards changing afterwards! - Pegasus
@Bo Java can't logically break the C++ standard, since it's Java. A Windows implementation of C++ can. And btw, old versions of Windows didn't break the standard, since they used 16 bits with UCS-2 encoding (which is perfectly OK). - Educator
@Let_Me_Be - I assumed that was about the Unicode standard, as you cannot easily "break" a C++ standard that doesn't say anything about the size or encoding of a wchar_t. - Pegasus
@Bo The C++ standard requires one character to be represented by one wchar_t. Microsoft ignores the entire C part of the C++ standard and also redefines the meaning of "string length" to be the number of wchar_t, not the number of characters. This has already been discussed many times here on Stack Overflow. - Educator
OK, that MS part is all cool and interesting and infuriating, but does anybody know how to (w)cout char16_t? :D BTW, I blame the standard, not MS. It's the same as with long... there should be a 64-bit integer, there should be a 16-bit char... - Oskar
@Oskar - There is a limitation here, as you have discovered. We get new char types char16_t and char32_t, plus std::strings with those characters. However, we still only have cout and wcout, which don't work directly for those character types. Nobody proposed enough extensions to iostreams and locales to make that happen. - Pegasus
@Bo Persson Thanks for the answer. That is awful... I mean it's really bad... not bad, but really ugly. - Oskar
I agree. It's utterly ludicrous. - Rufus
The inability to print char16_t and char32_t is really embarrassing for C++11. u16cout and u32cout are badly needed. - Corona
Couldn't agree more... this is just awful... using pointers: void print_char16_t_array(const char16_t * str) { size_t len=char_traits<char16_t>::length(str); assert(len<=1024); for(size_t i=0;i<len;++i) wcout<<(wchar_t)str[i]; wcout<<endl; } - Oskar

Give this a try:

#include <locale>
#include <codecvt>
#include <string>
#include <iostream>

int main()
{
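    // Treats each wchar_t as a UTF-16 code unit and converts to UTF-8 bytes.
    std::wstring_convert<std::codecvt_utf8_utf16<wchar_t> > myconv;
    std::wstring ws(L"Your UTF-16 text");
    std::string bs = myconv.to_bytes(ws);
    // Print the UTF-8 bytes through the ordinary narrow stream.
    std::cout << bs << '\n';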
}
Paxton answered 11/4, 2011 at 2:31 Comment(5)
So you are saying that there is no std function that can print char16_t? I mean without conversions... It's strange, especially if you consider that it took them half the age of the universe to finalize the C++0x revision. - Oskar
We were waiting for you to design, test, implement, get field experience, propose, and then shepherd the proposal through the standardization process. We were busy doing other stuff. - Paxton
Oh, the flame wars... like I said, it's not a core language feature, and it is sad and funny at the same time that you can't print out one of the built-in types. Does Boost have a built-in function for printing char16_t? - Oskar
Yeah, if only #include <codecvt> would actually work with GCC, that would be great ;) - Ronna
std::wstring_convert is deprecated in C++17. - Organogenesis
