Why must the variable used to hold getchar's return value be declared as int?
U

4

17

I am a beginner in the C programming language. I recently studied the getchar function, which reads a character from the console or from a file, echoes it as it is typed, and requires pressing the Enter key before proceeding.

It returns the unsigned char that was read. If end-of-file or an error is encountered, getchar() returns EOF.

My question is: if it returns an unsigned char, why is its return value stored in an int variable?

Please help me.

Urissa answered 2/8, 2013 at 9:18 Comment(3)
A good link: Definition of EOF and how to use it effectively - Heavyfooted
See also while ((c = getc(file)) != EOF) loop won't stop executing. - Torero
Possible duplicate of Difference between int and char in getchar/fgetc and putchar/fputc? - Awn
L
19

Precisely because of that EOF value. Because a char in a file may be any possible char value, including the null character that C strings use for termination, getchar() must use a larger integer type to make room for an additional EOF value.

It simply happens to use int for that purpose, but it could use any type with at least 9 bits (assuming an 8-bit char).

Lockwood answered 2/8, 2013 at 9:22 Comment(1)
Detail: "a char in a file may be any possible char value" --> a character may have many a possible value, but the value returned from getchar() is a character's unsigned char value, not its char value. - Spangler
W
4

The return type is int to accommodate the special value EOF.

EOF is a macro that expands to an integer constant expression of type int with an implementation-defined negative value, very commonly -1.

Weatherbeaten answered 2/8, 2013 at 9:22 Comment(0)
U
2

Read this link: link

It states:

Do not convert the value returned by a character I/O function to char if that value will be compared to EOF. Once the return value of these functions has been converted to a char type, character values may be indistinguishable from EOF. Also, if sizeof(int) == sizeof(char), then the int used to capture the return value may be indistinguishable from EOF. See "FIO35-C. Use feof() and ferror() to detect end-of-file and file errors when sizeof(int) == sizeof(char)" for more details about when sizeof(int) == sizeof(char). See "STR00-C. Represent characters using an appropriate type" for more information on the proper use of character types.

This rule applies to the use of all character I/O functions.

Urissa answered 2/8, 2013 at 9:22 Comment(0)
A
1

The function getchar returns an int which

  • if the function is successful, represents the character code of the next character on the stream in the range of an unsigned char (i.e. a non-negative value), or

  • if the function fails, represents the special value EOF (which is a negative value) to indicate failure.

The reason why getchar always returns

  • a non-negative number for a valid character, and
  • a negative number to indicate failure,

is that it must be possible to distinguish between a valid character and the special value EOF (which indicates failure and is not a valid character).

If you store the int return value of getchar in an unsigned char, then you will lose information and will no longer be able to distinguish between a valid character and the special value EOF.

On most platforms, EOF is defined as the value -1 and an unsigned char can represent the range 0 to 255. On these platforms, the following applies:

The function getchar can return an int value in the range -1 to 255, which is 257 possible values. The range 0 to 255 (which is 256 possible values) is used for valid character codes and the value -1 is used to indicate failure (EOF).

If you store the int return value of getchar in a variable of type unsigned char, then you will only have 256 instead of 257 possible values. The value -1 will be converted to the value 255. This means that the unsigned char variable is unable to represent the value EOF and you will no longer be able to tell whether getchar returned the value EOF or whether it returned the valid character code 255. The value 255 could mean both.

You will get a similar problem if you store the return value of getchar in a variable of type signed char, because a signed char is also only able to represent 256 different values, but you need to be able to represent 257 different values. Even if a signed char has the advantage that it is able to represent the value EOF, you will still have the problem that you cannot distinguish EOF from a valid character, because the value -1 could mean both. It could mean EOF or it could mean a valid character with the character code 255.

For this reason, you should always first store the return value of getchar in a variable of type int. Only after determining that getchar did not return EOF is it safe to store the return value in a variable of type unsigned char or signed char, because you no longer need to distinguish valid characters from the special value EOF.

The same also applies to storing the return value of getchar in a char. On some platforms, char is equivalent to signed char, and on some other platforms, char is equivalent to unsigned char. The ISO C standard allows both.

Atmo answered 29/3, 2023 at 4:39 Comment(0)
