C++ What does the size of char16_t depend on?

This also applies to char32_t and the intXX_t types. The specification says:

2.14.3.2:

The value of a char16_t literal containing a single c-char is equal to its ISO 10646 code point value, provided that the code point is representable with a single 16-bit code unit.

5.3.3.1:

[..] in particular [..] sizeof(char16_t), sizeof(char32_t), and sizeof(wchar_t) are implementation-defined

I cannot see anything similar for the intXX_t types, apart from the note that they are "optional" (18.4.1).

If a char16_t isn't guaranteed to be 2 bytes, is it at least guaranteed to be 16 bits wide (even on architectures where 1 byte != 8 bits)?
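
To make this concrete, here is a sketch of my own (assuming a C++11 compiler) of the check I would like to rely on. Is it guaranteed to pass on every conforming implementation?

    #include <climits>  // CHAR_BIT

    // Guaranteed everywhere, even where CHAR_BIT != 8?
    static_assert(sizeof(char16_t) * CHAR_BIT >= 16,
                  "char16_t is narrower than 16 bits");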

Erased answered 22/6, 2011 at 13:40 Comment(2)
I read that as 'char16_t must be at least 16 bits', but I'm not a standards lawyer. – Othella
Why do you even care about sizeof(char16_t)? What matters is that it can hold 16-bit values. – Candi

3.9.1 Fundamental types [basic.fundamental]

Types char16_t and char32_t denote distinct types with the same size, signedness, and alignment as uint_least16_t and uint_least32_t, respectively, in <cstdint>, called the underlying types.

This means char16_t is at least 16 bits wide (but may be larger).

But I also believe:

The value of a char16_t literal containing a single c-char is equal to its ISO 10646 code point value, provided that the code point is representable with a single 16-bit code unit.

provides the same guarantee, though less explicitly: you have to know that ISO 10646 is UCS (and note that UCS is compatible with, but not exactly the same as, Unicode).
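
Both guarantees can be spelled out as compile-time checks. A minimal sketch (my own illustration, assuming a C++11 compiler), which must pass on every conforming implementation:

    #include <climits>   // CHAR_BIT
    #include <cstdint>   // std::uint_least16_t

    // char16_t has the same size as its underlying type uint_least16_t,
    // and uint_least16_t is at least 16 bits wide.
    static_assert(sizeof(char16_t) == sizeof(std::uint_least16_t),
                  "char16_t matches its underlying type");
    static_assert(sizeof(char16_t) * CHAR_BIT >= 16,
                  "char16_t is at least 16 bits wide");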

Merchant answered 22/6, 2011 at 13:50 Comment(1)
Upvoted. I was about to quote the same paragraph. You beat me to it. :) – Candi

The value of a char16_t literal containing a single c-char is equal to its ISO 10646 code point value, provided that the code point is representable with a single 16-bit code unit.

This is impossible to satisfy if char16_t isn't at least 16 bits wide, so by contradiction, it's guaranteed to be at least that wide.
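
As an illustration (my own sketch, not text from the standard), the literal rule is directly checkable at compile time, because a char16_t literal is a constant expression:

    // U+00E9 (LATIN SMALL LETTER E WITH ACUTE) fits in one 16-bit code
    // unit, so the literal's value must equal its code point.
    constexpr char16_t e_acute = u'\u00E9';
    static_assert(e_acute == 0x00E9,
                  "a char16_t literal equals its ISO 10646 code point");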

Embrey answered 22/6, 2011 at 13:48 Comment(4)
One may think so, but then what would be the difference between (e.g.) std::int16_t and std::int_least16_t? – Erased
@FrEEzE2046: int16_t is an optional typedef while int_least16_t is a mandatory one. If <cstdint> provides int16_t, it is guaranteed to be an exact-16-bit type with two's-complement encoding for negative numbers. int_least16_t may be larger than 16 bits; consider a 32-bit machine that does not support 8-bit or 16-bit arithmetic. – Candi
@FrEEzE2046: int16_t is for dense packing of 16-bit values. int_least16_t is for fast packing. char16_t is for UTF-16 strings. – Embrey
No, int_fast16_t would be for "fast". int_least16_t is the smallest integer type that is at least 16 bits wide (so, 16 bits if a 16-bit type exists, otherwise the next biggest). – Poignant
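
To make the distinction drawn in this thread concrete, a sketch (my own illustration; note that int16_t is optional, so this only compiles where <cstdint> provides it):

    #include <climits>   // CHAR_BIT
    #include <cstdint>

    // std::int16_t       - optional; exactly 16 bits, no padding bits
    // std::int_least16_t - mandatory; smallest type with at least 16 bits
    // std::int_fast16_t  - mandatory; "fastest" type with at least 16 bits
    static_assert(sizeof(std::int16_t) * CHAR_BIT == 16,
                  "int16_t is exact-width by definition");
    static_assert(sizeof(std::int_least16_t) * CHAR_BIT >= 16,
                  "int_least16_t may be wider than 16 bits");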

It can't be guaranteed to be exactly 16 bits, since there are platforms which don't support types that small (for example, DSPs often can't address anything smaller than their word size, which may be 24, 32 or 64 bits). Your first quote guarantees that it will be at least 16 bits.
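
A small runtime sketch (my own illustration) that reports what the current platform actually provides:

    #include <climits>  // CHAR_BIT
    #include <cstdio>

    int main() {
        // A typical desktop prints "2 byte(s), 16 bits"; a DSP with
        // 32-bit bytes could print "1 byte(s), 32 bits".
        std::printf("char16_t: %zu byte(s), %zu bits\n",
                    sizeof(char16_t), sizeof(char16_t) * CHAR_BIT);
    }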

Hiccup answered 22/6, 2011 at 13:54 Comment(0)
