The proposal "Minimal Unicode support for the standard library (revision 2)" indicates that the Library Working Group supported the new character types only in strings and codecvt facets. Apparently the majority was opposed to supporting iostream, fstream, facets other than codecvt, and regex.
According to minutes from the Portland meeting in 2006, "the LWG is committed to full support of Unicode, but does not intend to duplicate the library with Unicode character variants of existing library facilities." I haven't found any details, but I would guess that the committee feels the current library interface is inappropriate for Unicode. One possible complaint is that it was designed with fixed-size characters in mind, an assumption that Unicode obsoletes: Unicode data can use fixed-size code points, but a character is not limited to a single code point.
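To make the last point concrete, here is a minimal sketch showing that even a fixed-width UTF-32 string can hold more code points than user-perceived characters. The function names are hypothetical, and the check for combining marks covers only the Combining Diacritical Marks block (U+0300 to U+036F); real grapheme segmentation needs far more rules than this.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: counts code points that are NOT in the Combining
// Diacritical Marks block (U+0300..U+036F). This is only a rough stand-in
// for real grapheme cluster segmentation.
std::size_t approx_character_count(const std::u32string& text) {
    std::size_t count = 0;
    for (char32_t cp : text) {
        const bool combining = cp >= 0x0300 && cp <= 0x036F;
        if (!combining) {
            ++count;
        }
    }
    return count;
}

// Small self-check that can be called from main().
void check_decomposed_example() {
    // 'e' followed by U+0301 COMBINING ACUTE ACCENT renders as one "é".
    const std::u32string s = U"e\u0301";
    assert(s.size() == 2);                    // two fixed-size code points
    assert(approx_character_count(s) == 1);   // one perceived character
}
```

So `size()` on a `std::u32string` reports code points, not characters, which is the kind of mismatch an interface built around one-unit-per-character cannot express.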
Personally, I think there's no reason not to standardize the minimal support that's already provided on various platforms (Windows uses UTF-16 for wchar_t, most Unix platforms use UTF-32). More advanced Unicode support will require new library facilities, but supporting char16_t and char32_t in iostreams and facets wouldn't get in the way and would enable basic Unicode I/O.
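Until something like that is standardized, one possible workaround is to keep text in a std::u32string and convert it to UTF-8 by hand before writing it to an ordinary narrow stream. The sketch below assumes exactly that; `append_utf8`, `to_utf8`, and `write_utf8` are hypothetical names, not standard library functions, and invalid code points are replaced with U+FFFD as an arbitrary policy choice.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: encodes one code point as UTF-8 and appends it.
// Surrogates and values above U+10FFFF are replaced with U+FFFD.
void append_utf8(std::string& out, char32_t cp) {
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
        cp = 0xFFFD;
    }
    if (cp < 0x80) {                       // 1 byte: 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {               // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {             // 3 bytes
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                               // 4 bytes
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Converts a whole UTF-32 string to a UTF-8 encoded std::string.
std::string to_utf8(const std::u32string& text) {
    std::string out;
    for (char32_t cp : text) {
        append_utf8(out, cp);
    }
    return out;
}

// Writes UTF-32 text to any narrow stream (std::cout, std::ofstream, ...).
void write_utf8(std::ostream& os, const std::u32string& text) {
    os << to_utf8(text);
}

// Small self-check that can be called from main().
void check_write_utf8() {
    std::ostringstream os;
    write_utf8(os, U"\u20AC");             // EURO SIGN
    assert(os.str() == "\xE2\x82\xAC");
}
```

Calling `write_utf8(std::cout, text)` then gives basic Unicode output on terminals that expect UTF-8. Input would need the reverse conversion, which this sketch does not cover.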
std::basic_iostream<char32_t>? Just because there's no predefined types (like std::iostream for char) doesn't mean there is no support. – Sentimentalism
basic_istringstream<char16_t> in GCC version 4.7.0. It compiles, but crashes during execution. This, of course, does not prove that support could be present in another environment, but I still find it strange that the standardization committee did not include support on an equal footing with wchar_t. – Triarchy
char and wchar_t -- all other character types are strictly implementation-defined, so not supporting them isn't necessarily a "bug". – Scandent
char_traits or containers. – Scandent
basic_istringstream (and similar) all default the second argument to std::char_traits<char>. You'll have to give it both template arguments. – Siana
char, wchar_t, and any other implementation-defined character types... – Siana
char and wchar_t is implementation-defined. Also, I'm not sure that streams could be expected to work directly with char16_t in particular, because that data type implies the possibility of multi-byte character sequences (surrogate pairs in this case), and I'm not aware that streams can use multi-byte sequences without a non-default facet. That said, std iostreams are certainly not my area of expertise. – Scandent