integer variable used with fgetc()

I was trying to understand some basic code and got slightly confused by the following code:

#include <stdio.h>

int main ()
{
   FILE *fp;
   int c;
   int n = 0;

   fp = fopen("file.txt","r");
   if (fp == NULL)
   {
      perror("Error in opening file");
      return(-1);
   }
   do
   {
      c = fgetc(fp);
      if ( feof(fp) )
      {
         break ;
      }
      printf("%c", c);
   } while(1);

   fclose(fp);
   return(0);
}

Can someone explain to me why c is of type integer, even though it is assigned from fgetc(fp), which, to my knowledge, just gets the next character?

Bookmobile answered 17/9, 2015 at 22:20 Comment(3)
Look again at fgetc: it returns an int (with good reason), so assigning that result to an int is also a good idea. - Kissie
c is of type int, not "integer". The word "integer" covers all the integer types, from char up to long long (and their unsigned variants). - Seabrook
Have a look: https://mcmap.net/q/40716/-why-is-while-feof-file-always-wrong/3185968 - Disannul

Given the precise way this particular code has been written, c wouldn't need to be of type int--it would still work if c were of type char.

Most code that reads characters from a file should (at least initially) read those characters into an int, though. Part of the basic design of C-style I/O is that functions like getc and fgetc can return EOF in addition to any value that could have come from the file. Any value of type char could be read from the file, so getc, fgetc, etc. signal the end of file by returning a value that can't have come from the file: each character is returned as an unsigned char converted to int, while EOF is negative. EOF is usually -1, but that's not guaranteed. When char is signed (which it usually is, nowadays), a byte from the file (0xFF on a typical system with 8-bit chars) can show up as -1 once it's been assigned to a char, so if you're depending on the return value to detect EOF, this can lead to mis-detection.
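
To make the mis-detection concrete, here is a minimal sketch. It assumes 8-bit chars and the common EOF value of -1; the helper names is_eof_as_int and is_eof_as_char are hypothetical and exist only for illustration. Each one takes a value as fgetc would have returned it and applies the EOF test, once on the int and once after narrowing to a plain char.

```c
#include <stdio.h>

/* Correct test: the value stays in an int, exactly as fgetc returned it.
   Bytes arrive as 0..UCHAR_MAX and EOF is negative, so on ordinary
   platforms the two cannot collide. */
int is_eof_as_int(int from_fgetc)
{
    return from_fgetc == EOF;
}

/* Risky test: the value is narrowed to plain char before the comparison.
   If char is signed, the byte 0xFF typically becomes -1 and compares equal
   to EOF. If char is unsigned, EOF itself becomes 255 and is never detected. */
int is_eof_as_char(int from_fgetc)
{
    char c = (char)from_fgetc;
    return c == EOF;
}
```

Under those assumptions, with a signed char is_eof_as_char(0xFF) typically reports end of file even though 0xFF was a real byte, while is_eof_as_int(0xFF) does not. With an unsigned char the failure goes the other way, and a loop built on the char version would never see EOF at all.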

The code you've included in the question simply copies the contents of the file to standard output, one character at a time. Making use of the EOF return value, we can simplify the code a little bit, so the main loop looks like this:

int c;

while (EOF != (c = fgetc(fp)))
    printf("%c", c); // or: putchar(c);

I suppose some might find this excessively terse. Others may object to the assignment inside the condition of the loop. I think those are reasonable objections, but this form fits the design of the library well enough that I still consider it preferable.

Infelicity answered 17/9, 2015 at 22:32 Comment(5)
OP's c = fgetc(fp); if ( feof(fp) ) correctly handles the rare situation of unsigned char and int having the same number of unique values. while (EOF != (c = fgetc(fp))) is a common idiom, but it is not superior to OP's. - Panjabi
@chux: That situation is not merely rare--it's completely unheard of (i.e., a purely theoretical possibility). At the same time, the code in question is significantly more difficult to read (especially for somebody accustomed to C, for whom the idiomatic code is immediately recognizable). In other words, it exchanges a purely theoretical advantage for a completely real disadvantage. - Infelicity
Analog Devices 32-bit SHARC DSP. As somebody well accustomed to C, I do not find c = fgetc(fp); if ( feof(fp) ) significantly more difficult to read. I doubt any recent/new machine will have INT_MAX-INT_MIN == UCHAR_MAX-UCHAR_MIN. I also doubt any machine will not use 2's complement integers, yet C still maintains UB on int overflow. So I code defensively and watch for overflow. I do not denigrate code like c = fgetc(fp); if ( feof(fp) ) as needing simplification when it is in fact a reasonable alternative, nor do I object to terse Yoda-like code such as EOF != (c = fgetc(fp)). - Panjabi
Out of curiosity, why is it possible to use %c with an int in printf? You have to specify a type when pulling arguments out of variadic functions, and I thought that printf expects a char upon encountering %c. Wouldn't passing an int where it expects a char lead to undefined behaviour? Or does it just cast it? - Levee
@JoshuaPerrett printf's %c expects an int, not a char; that is why it is OBVIOUSLY possible. - Hilar

The signature of fgetc is:

int fgetc ( FILE * stream );

And about the return value of fgetc:

On success, the character read is returned (promoted to an int value). The return type is int to accommodate for the special value EOF, which indicates failure.

So we declare the variable as an int. A character type may be unsigned, in which case it can't take a negative value, but EOF is always negative (very commonly -1) and is used to signal failure or end of file. An int can hold every character value as well as EOF, so it is safe to use, like this:

int c;
while ((c = fgetc(fp))!=EOF){
   //Code goes here
}
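
Because fgetc returns EOF both at end of file and on a read error, it can help to check which one happened once the loop ends. The following is a minimal sketch, not a definitive implementation; copy_stream is a hypothetical helper name, and returning -1 on error is an assumption made for this example.

```c
#include <stdio.h>

/* Copy every byte from in to out.
   Returns the number of bytes copied, or -1 on a read or write error. */
long copy_stream(FILE *in, FILE *out)
{
    long copied = 0;
    int c; /* int, not char: must hold every byte value plus EOF */

    while ((c = fgetc(in)) != EOF) {
        if (fputc(c, out) == EOF)
            return -1; /* write error */
        copied++;
    }

    /* fgetc returned EOF: ferror tells a read error apart from end of file */
    if (ferror(in))
        return -1;
    return copied;
}
```

A call such as copy_stream(fp, stdout) would then replace the body of the loop above.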
Leavings answered 17/9, 2015 at 22:28 Comment(6)
Whether char is signed or unsigned is implementation-defined (C11 §6.2.5/15). EOF is always negative (C11 §7.21.1/3). - Kissie
Yes, so it is safe to use integer. - Leavings
"As character may be unsigned which can't take negative value." is true, but an unsigned char converted to an int can have a negative value. Consider unsigned char and int both being 32 bits wide. Legal in C, but rare. OP's c = fgetc(fp); if ( feof(fp) ) is more robust than int c; while ((c = fgetc(fp))!=EOF){ - Panjabi
FWIW, I've got a different quote. My manpage says: "fgetc(), getc() and getchar() return the character read as an unsigned char cast to an int or EOF on end of file or error" and I use getchar in that way. Values from 0 to 255 indicate valid input, the special value EOF indicates the end of the file. EOF is negative, so it is distinct from valid input. - Howerton
@chux: Your comment is being discussed here: https://mcmap.net/q/1293608/-what-is-the-better-way-to-check-eof-and-error-of-fgetc-duplicate/908515 - Scoot
@M Oehm unsigned char is not limited to 0 to 255, but to UCHAR_MAX, although UCHAR_MAX is certainly commonly 255. It is not even limited to 0 to INT_MAX. In that rare case, a conversion of unsigned char to int may result in a negative value equal to EOF. Man pages reflect a large segment of C, but they are not the C spec. - Panjabi
