End of File (EOF) in C

3

85

I am currently reading the book The C Programming Language by Kernighan & Ritchie, and I am pretty confused about the usage of EOF with the getchar() function.

First, I want to know why the value of EOF is -1 and why the value of getchar()!=EOF is 0. Pardon me for my question but I really don't understand. I really tried but I can't.

Then I tried to run the example in the book that counts the number of characters, using the code below, but it seems that I never get out of the loop even if I press Enter. So I am wondering: when would I reach EOF?

#include <stdio.h>

/* count characters in input */
int main(void)
{
    long nc;

    nc = 0;
    while (getchar() != EOF)
        ++nc;
    printf("%ld\n", nc);
    return 0;
}

Then I read about the same problem in "Problem with EOF in C". Most people there advised testing for the newline \n or the null terminator '\0' instead of EOF, which makes a lot of sense.

Does it mean that the example in the book serves another purpose?

Tarriance answered 5/12, 2010 at 12:18 Comment(1)
You do understand that the book you mention is by the original authors of the C language, right? - Bosomy
126

EOF indicates "end of file". A newline (which is what happens when you press enter) isn't the end of a file, it's the end of a line, so a newline doesn't terminate this loop.

The code isn't wrong[*], it just doesn't do what you seem to expect. It reads to the end of the input, but you seem to want to read only to the end of a line.

The value of EOF is negative (-1 on typical implementations) because it has to be different from any return value of getchar that is an actual character. getchar returns each character value as an unsigned char converted to int, which is therefore non-negative. This is also why the result of getchar should be stored in an int rather than a char.

If you're typing at the terminal and you want to provoke an end-of-file, use CTRL-D (Unix-style systems, at the start of a line) or CTRL-Z followed by Enter (Windows). Then, after all the input has been read, getchar() will return EOF, so getchar() != EOF will be false and the loop will terminate.

[*] well, it has undefined behavior if the input is more than LONG_MAX characters due to integer overflow, but we can probably forgive that in a simple example.

Nucleate answered 5/12, 2010 at 12:26 Comment(5)
I know the problem now. That's why I couldn't see the result: I'm using Dev-C++ and it has no system("pause"), so I need to add that at the end of the code. - Tarriance
Actually, CTRL+D does not provoke an EOF. It just terminates your terminal, so the kernel knows that no more bytes can be read and no data is available on the standard input file descriptor. - Kindle
@Tarriance The system function starts a new shell and runs the command passed to it. The command is executed by the system shell and is not related to the compiler whatsoever. - Sideway
If you hit the Enter key, getchar() sees it as just another character and your nc variable is incremented. - Cabbage
@KorayTugay Bash's behavior is to exit the shell when it receives the EOF character from control-D: unix.stackexchange.com/questions/110240/… - Septima
27

EOF is -1 because that's how it's defined. The name is provided by the standard library headers that you #include (here, <stdio.h>). They make it negative (-1 on typical implementations) because it has to be something that can't be mistaken for an actual byte read by getchar(). getchar() reports the values of actual bytes as non-negative numbers (0 up to 255 inclusive with 8-bit bytes), so -1 works fine for this.

The != operator means "not equal". 0 stands for false, and anything else stands for true. So what happens is, we call the getchar() function, and compare the result to -1 (EOF). If the result was not equal to EOF, then the result is true, because things that are not equal are not equal. If the result was equal to EOF, then the result is false, because things that are equal are not (not equal).

The call to getchar() returns EOF when you reach the "end of file". As far as C is concerned, the 'standard input' (the data you are giving to your program by typing in the command window) is just like a file. Of course, you can always type more, so you need an explicit way to say "I'm done". On Windows systems, this is control-Z. On Unix systems, this is control-D.

The example in the book is not "wrong". It depends on what you actually want to do. Reading until EOF means that you read everything, until the user says "I'm done", and then you can't read any more. Reading until '\n' means that you read a line of input. Reading until '\0' is a bad idea if you expect the user to type the input, because it is either hard or impossible to produce this byte with a keyboard at the command prompt :)

Eccles answered 5/12, 2010 at 12:26 Comment(0)
11

That's a lot of questions.

  1. Why EOF is -1: POSIX system calls usually return -1 on error, so I guess the idea is "EOF is kind of an error".

  2. Any boolean operation (including !=) yields 1 when it's TRUE and 0 when it's FALSE, so getchar() != EOF is 0 when it's FALSE, meaning getchar() returned EOF.

  3. To emulate EOF when reading from stdin on a Unix-like system, press Ctrl+D.

Bourges answered 5/12, 2010 at 12:24 Comment(2)
Boolean operations return non-zero for true and zero for false. There's a difference. - Bronchial
No, the operators are defined to return 1. Any non-zero value is "true" in a boolean context (for example, an if() or while() condition). - Bosomy

© 2022 - 2024 — McMap. All rights reserved.