I have always understood it this way: the purpose of the `iostream` classes is to read and/or write a stream of characters, which, if you think about it, are abstract entities that are only represented by the computer using a character encoding. The C++ standard takes great pains to avoid pinning down the character encoding, saying only that "Objects declared as characters (`char`) shall be large enough to store any member of the implementation's basic character set," because it doesn't need to pin down the implementation's basic character set in order to define the C++ language; the standard can leave the decision of which character encoding is used to the implementation (the compiler together with an STL implementation), and simply note that `char` objects represent single characters in some encoding.
An implementation writer could choose a single-octet encoding such as ISO-8859-1 or even a double-octet encoding such as UCS-2. It doesn't matter. As long as a `char` object is "large enough to store any member of the implementation's basic character set" (note that this explicitly forbids variable-length encodings), the implementation may even choose an encoding that represents basic Latin in a way that is incompatible with any common encoding!
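As a small illustration of this freedom (a minimal sketch; the values in the comment assume typical implementations, not anything the standard guarantees), the number behind a character literal is whatever the execution character set says it is:

```cpp
#include <iostream>

int main() {
    // The value printed depends entirely on the execution character set:
    // 65 on an ASCII/ISO-8859-1 implementation, 193 on an EBCDIC one.
    std::cout << static_cast<int>('A') << '\n';
}
```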
It is confusing that the `char`, `signed char`, and `unsigned char` types share "char" in their names, but it is important to keep in mind that `char` does not belong to the same family of fundamental types as `signed char` and `unsigned char`. `signed char` is in the family of signed integer types:
> There are four signed integer types: "signed char", "short int", "int", and "long int."
and `unsigned char` is in the family of unsigned integer types:
> For each of the signed integer types, there exists a corresponding (but different) unsigned integer type: "unsigned char", "unsigned short int", "unsigned int", and "unsigned long int," ...
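Here is a minimal sketch (assuming a C++11 compiler, for `static_assert` and `<type_traits>`) that makes the three-distinct-types point concrete:

```cpp
#include <type_traits>

// char is a distinct fundamental type: it is neither signed char nor
// unsigned char, even though its representation matches one of them.
static_assert(!std::is_same<char, signed char>::value,
              "char is not signed char");
static_assert(!std::is_same<char, unsigned char>::value,
              "char is not unsigned char");

int main() {}
```

One practical consequence is that `void f(char)`, `void f(signed char)`, and `void f(unsigned char)` are three distinct overloads.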
The one similarity between the `char`, `signed char`, and `unsigned char` types is that "[they] occupy the same amount of storage and have the same alignment requirements". Thus, you can `reinterpret_cast` from `char *` to `unsigned char *` in order to determine the numeric value of a character in the execution character set.
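For example (a minimal sketch; the value printed is implementation-specific):

```cpp
#include <iostream>

int main() {
    char c = 'A';
    // View the same storage as an unsigned char to read the character's
    // numeric value in the execution character set (65 under ASCII).
    unsigned char *p = reinterpret_cast<unsigned char *>(&c);
    std::cout << static_cast<int>(*p) << '\n';
}
```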
To answer your question, the reason why the STL uses `char` as the default type is that the standard streams are meant for reading and/or writing streams of characters, represented by `char` objects, not integers (`signed char` and `unsigned char`). The use of `char` versus the numeric value is a way of separating concerns.
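A small sketch of that separation of concerns, using a `std::istringstream` (the numeric code 55 in the comment assumes an ASCII-based execution character set): extracting into a `char` reads a character, while extracting into an `int` parses a number.

```cpp
#include <iostream>
#include <sstream>

int main() {
    std::istringstream chars("7"), numbers("7");

    char c;
    chars >> c;                                // reads the character '7'
    std::cout << c << '\n';                    // prints the glyph: 7
    std::cout << static_cast<int>(c) << '\n';  // prints its code: 55 on ASCII

    int n;
    numbers >> n;                              // parses the number 7
    std::cout << n << '\n';                    // prints the value: 7
}
```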