My view is that a C implementation cannot satisfy the specification of certain stdio functions (particularly fputc/fgetc) if sizeof(int)==1, since the int needs to be able to hold any possible value of unsigned char or EOF (-1). Is this reasoning correct?
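To make the concern concrete, here is a minimal sketch of a read loop. The helper name count_chars is hypothetical and not part of stdio. Comparing the result of fgetc against EOF alone assumes int is wider than unsigned char; additionally checking feof and ferror is the usual way to stay correct if a valid character could ever convert to the same value as EOF.

```c
#include <stdio.h>

/* Hypothetical helper: counts the characters remaining on a stream.
 * A result equal to EOF is treated as end-of-input only when the stream's
 * end-of-file or error indicator is actually set, so the loop would also
 * cope with an implementation where some character converts to EOF. */
static long count_chars(FILE *stream)
{
    long n = 0;
    for (;;) {
        int c = fgetc(stream);
        if (c == EOF && (feof(stream) || ferror(stream)))
            break;      /* genuine end-of-file or read error */
        n++;            /* c is data, even if it happens to equal EOF */
    }
    return n;
}
```

On an ordinary implementation with sizeof(int) > 1 the extra check is redundant but harmless, since no unsigned char value converts to a negative int there.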
(Obviously sizeof(int) cannot be 1 if CHAR_BIT is 8, due to the minimum required range for int, so we're implicitly only talking about implementations with CHAR_BIT>=16, for instance DSPs, where typical implementations would be a freestanding implementation rather than a hosted implementation, and thus not required to provide stdio.)
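Whether a given implementation is affected can be checked from limits.h alone. The following is only a sketch, and the function name int_holds_all_uchar is made up for illustration; it reports whether every unsigned char value is representable as an int, which is the property fgetc and fputc rely on.

```c
#include <limits.h>

/* Hypothetical helper: returns 1 if every unsigned char value fits in int.
 * UCHAR_MAX and INT_MAX are usable in #if, so this is decided at
 * preprocessing time. */
static int int_holds_all_uchar(void)
{
#if UCHAR_MAX <= INT_MAX
    return 1;
#else
    return 0;   /* e.g. CHAR_BIT == 16 together with a 16-bit int */
#endif
}
```

With CHAR_BIT equal to 8, UCHAR_MAX is 255 and INT_MAX is at least 32767, so the function returns 1 on any such implementation.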
Edit: After reading the answers and some linked references, some thoughts on ways it might be valid for a hosted implementation to have sizeof(int)==1:
First, some citations:
7.19.7.1(2-3):
If the end-of-file indicator for the input stream pointed to by stream is not set and a next character is present, the fgetc function obtains that character as an unsigned char converted to an int and advances the associated file position indicator for the stream (if defined).
If the end-of-file indicator for the stream is set, or if the stream is at end-of-file, the end-of-file indicator for the stream is set and the fgetc function returns EOF. Otherwise, the fgetc function returns the next character from the input stream pointed to by stream. If a read error occurs, the error indicator for the stream is set and the fgetc function returns EOF.
7.19.8.1(2):
The fread function reads, into the array pointed to by ptr, up to nmemb elements whose size is specified by size, from the stream pointed to by stream. For each object, size calls are made to the fgetc function and the results stored, in the order read, in an array of unsigned char exactly overlaying the object. The file position indicator for the stream (if defined) is advanced by the number of characters successfully read.
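The wording of 7.19.8.1 can be illustrated with a rough sketch of fread written in terms of fgetc. This is not how any particular library implements fread; fread_via_fgetc is a hypothetical name, and the code only mirrors the "size calls to fgetc per object, stored into an array of unsigned char" description quoted above.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical model of fread: reads up to nmemb elements of `size`
 * characters each, one fgetc call per character, and returns the number
 * of complete elements read. */
static size_t fread_via_fgetc(void *ptr, size_t size, size_t nmemb, FILE *stream)
{
    unsigned char *out = ptr;
    size_t done = 0;

    if (size == 0)
        return 0;
    while (done < nmemb) {
        for (size_t i = 0; i < size; i++) {
            int c = fgetc(stream);
            if (c == EOF && (feof(stream) || ferror(stream)))
                return done;    /* a partial element is not counted */
            /* This conversion back to unsigned char round-trips exactly
             * when int can represent every unsigned char value; otherwise
             * the result depends on the conversion fgetc performed. */
            out[done * size + i] = (unsigned char)c;
        }
        done++;
    }
    return done;
}
```

The conversion in the inner loop is the step at issue: the stored value passes through an int on its way from the stream to the object.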
Thoughts:
Reading back unsigned char values outside the range of int could simply have implementation-defined behavior in the implementation. (I originally wrote "undefined" here and found that particularly unsettling, as it would mean that using fwrite and fread to store binary structures, which results in nonportable files but is supposed to be an operation you can perform portably on any single implementation, could appear to work but silently fail, or essentially always result in undefined behavior. I accept that an implementation might not have a usable filesystem, but it's a lot harder to accept that an implementation could have a filesystem that automatically invokes nasal demons as soon as you try to use it, and no way to determine that it's unusable.) Now that I realize the behavior is implementation-defined and not undefined, it's not quite so unsettling, and I think this might be a valid (although undesirable) implementation.
An implementation with sizeof(int)==1 could simply define the filesystem to be empty and read-only. Then there would be no way an application could read any data written by itself, only from an input device on stdin, which could be implemented so as to only give positive char values which fit in int.
Edit (again): From the C99 Rationale, 7.4:
EOF is traditionally -1, but may be any negative integer, and hence distinguishable from any valid character code.
This seems to indicate that sizeof(int) may not be 1, or at least that such was the intention of the committee.
sizeof() is in terms of unsigned char units, which are the fundamental representation of any type. See "Representation of Types" (6.2.6) in the C standard. The other direction is possible, though; some bits of int could be padding bits, trap bits, etc. – Quadrangular
[...] sizeof(int) is 1, int cannot have any padding bits/trap bits due to the integer conversion rank and promotion rules in 6.3.1.1. Specifically, paragraph 3 says "The integer promotions preserve value including sign." This also means that if sizeof(int) is 1 and signed char is twos complement, int must also be twos complement (or SCHAR_MIN could not be preserved by promotion). – Quadrangular
[...] char do not exist as far as the formal language is concerned. That doesn't mean they're not there in the hardware. It means they're unobservable and therefore irrelevant. – Quadrangular
[...] asm keyword or other way of writing machine code, which is outside the scope of the C language. There would be absolutely no way, using just C code, to access such padding bits in char, so from a formal standpoint, they don't exist. – Quadrangular
[...] limits.h. Behavior on overflow is undefined, so it doesn't matter if larger values somehow get generated. – Quadrangular