On Windows, `wchar_t` is a UTF-16 (LE) code unit, which is, for the most part, equivalent to `char16_t`. However, these two character types are still distinct types in the C++ type system, which makes me uncertain whether converting between sequences of these two character types is legal as per the C++ standard.
My question is this: In C++17, is it legal to perform the following casts, and to read from the converted pointers:
- `reinterpret_cast<const wchar_t*>(char16_ptr)` where `decltype(char16_ptr)` is `const char16_t*`, and
- `reinterpret_cast<const char16_t*>(wchar_ptr)` where `decltype(wchar_ptr)` is `const wchar_t*`
For the purposes of this question, assume the following:
- `sizeof(wchar_t) == sizeof(char16_t)`, and
- `wchar_t` is formatted the same as `char16_t` (as is the case on Windows)
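The assumptions above can be checked at compile time. The following is a minimal sketch; `wchar_matches_char16` is a hypothetical helper name, not something from the standard library, and it only tests size, alignment, and signedness, not the encoding:

```cpp
#include <type_traits>

// Hypothetical helper: true when wchar_t and char16_t agree in size,
// alignment, and signedness on the current platform. It says nothing
// about encoding (UTF-16 vs. UTF-32), which cannot be checked this way.
constexpr bool wchar_matches_char16() {
    return sizeof(wchar_t) == sizeof(char16_t)
        && alignof(wchar_t) == alignof(char16_t)
        && std::is_unsigned_v<wchar_t> == std::is_unsigned_v<char16_t>;
}

// Even when the representations match, the two remain distinct types,
// which is what raises the aliasing question in the first place.
static_assert(!std::is_same_v<wchar_t, char16_t>,
              "wchar_t and char16_t are distinct types");
```

On a platform where the helper returns `false` (for example, one with a 4-byte `wchar_t`), the premise of the question does not hold at all.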
Basically, is this a strict-aliasing violation?
My understanding is that the cast itself is valid thanks to [expr.reinterpret.cast]/7, but that the result of the cast cannot safely be used since the type is being aliased by something that isn't `char`, `unsigned char`, or `std::byte`. Is this interpretation correct?
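For comparison, here is a sketch of a conversion that sidesteps the aliasing question entirely: each code unit is read through its own type and converted by value, so no object is ever accessed through a pointer to a different type. The names `to_u16` and `to_wide` are hypothetical helpers, and the code assumes, as stated above, that `wchar_t` holds UTF-16 code units; it costs a copy, which the `reinterpret_cast` approach was presumably meant to avoid.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: copy `len` wide code units into a std::u16string.
// Each element is read as a wchar_t and converted by value, so no pointer
// is reinterpreted. If wchar_t is wider than char16_t (not the case on
// Windows), values above 0xFFFF would be truncated by the static_cast.
inline std::u16string to_u16(const wchar_t* src, std::size_t len) {
    std::u16string out;
    out.reserve(len);
    for (std::size_t i = 0; i < len; ++i) {
        out.push_back(static_cast<char16_t>(src[i]));
    }
    return out;
}

// Hypothetical helper for the opposite direction: char16_t to wchar_t.
inline std::wstring to_wide(const char16_t* src, std::size_t len) {
    std::wstring out;
    out.reserve(len);
    for (std::size_t i = 0; i < len; ++i) {
        out.push_back(static_cast<wchar_t>(src[i]));
    }
    return out;
}
```

Usage would look like `std::u16string s = to_u16(wchar_ptr, length);`, where `length` is the number of code units excluding the null terminator.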
Note: Other questions have been asked regarding `wchar_t` and `char16_t` being the same, but this question is not a duplicate of those as far as I can tell. Notably, the question "Are wchar_t and char16_t the same on Windows?" actually performs a `reinterpret_cast` between pointers, but none of the answers actually address whether this cast was ever legal in the first place.
`wchar_t` is the same as `char32_t` and you'll end up thinking you've hit a null terminator before the end of the string. – Coeducation
`wchar_t` on Windows is unsigned. – Chromogen