Problem with EOF in C
R

7

6

I'm writing a program which is supposed to read two strings that can contain line breaks and various other characters. Therefore, I'm using EOF (Ctrl-Z or Ctrl-D) to end the string.

This works fine for the first variable, but not for the second: apparently something is stuck in the input buffer, and the user doesn't get to type anything.

I tried to clean the buffer with while (getchar() != '\n'); and several similar variations but nothing seems to help. All cleaning attempts have resulted in an infinite loop, and without cleaning, adding the second variable is impossible.

The characters for both of the variables are read in a loop like this: while((c = getchar()) != EOF), which would suggest that it is EOF that is stuck in my buffer. Or does it affect the behavior of the program in some other way? Is there something wrong with the logic I'm using?

I'm starting to get a bit desperate after struggling with this for hours.

code:

#include <stdio.h>
#include <string.h>

int main(void)
{
    int x = 0;
    int c;
    char a[100];
    char b[100];

    printf("Enter a: ");
    while((c = getchar()) != EOF)
    {
        a[x] = c;
        x++;
    }
    a[x] = '\0';
    x = 0;

    /*while (getchar() != '\n'); - the non-working loop*/

    printf("\nEnter b: ");
    while((c = getchar()) != EOF)
    {
        b[x] = c;
        x++;
    }
    b[x] = '\0';

    printf("\n\nResults:\na: %s\n", a);
    printf("b: %s\n", b);

    return(0);
}
Rusticus answered 25/10, 2009 at 21:16 Comment(4)
On my Linux box, it works well. – Hobbism
It allows you to enter values for both variables? – Rusticus
Dynamic memory: make a function to read the input. I'd make that function something like int read_large_input(char **buf, size_t *len);. I'm a big follower of the motto "the function which malloc()s is responsible for free()ing". – Sizzler
Well, a buffer is one more step towards complexity and not exactly what I'm looking for, since even with a function and a buffer the data needs to be stored somewhere, and that storage needs to grow when needed... Anyway, I tried some things on my own again and noticed that realloc seems to solve this problem. It might be a bit bad for performance, but at least it doesn't fall short on memory or use too much of it. I just tested the program with about 1000 characters on several lines with no problems at all, so it seems to work. Thank you for the help! – Rusticus
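For readers following the realloc idea from the comments above, here is a minimal sketch of a growing buffer. It is not the asker's actual code: the name read_all and the extra FILE * parameter are assumptions made for illustration, and unlike the motto quoted above, this version hands the buffer to the caller, who must free() it.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not from the thread): reads from `in` until EOF
 * into a malloc'd buffer that doubles in size whenever it fills up.
 * On success returns 0, stores the NUL-terminated buffer in *buf and the
 * byte count in *len; the caller frees *buf. Returns -1 if allocation fails. */
static int read_all(FILE *in, char **buf, size_t *len)
{
    size_t cap = 64, used = 0;
    char *data;
    int c;

    assert(in != NULL && buf != NULL && len != NULL);
    data = malloc(cap);
    if (data == NULL)
        return -1;
    while ((c = getc(in)) != EOF) {
        if (used + 1 >= cap) {              /* keep one byte for '\0' */
            char *bigger = realloc(data, cap * 2);
            if (bigger == NULL) {
                free(data);
                return -1;
            }
            data = bigger;
            cap *= 2;
        }
        data[used++] = (char)c;
    }
    data[used] = '\0';
    *buf = data;
    *len = used;
    return 0;
}
```

Doubling the capacity keeps the number of realloc calls small, which addresses the performance worry to some extent. A call such as read_all(stdin, &text, &n) would read everything up to the first EOF.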
H
13

After you received an EOF from the terminal, you will not receive any additional data. There is no way of un-EOF-ing the input - the end of the file is, well, the end.

So you should define that each variable is input on a separate line, and have users press Enter instead of EOF. You still need to check whether you have received EOF, because that means that the user actually typed EOF, and you won't see anything else. In this case, you need to break out of the loop and print an error message.
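A minimal sketch of that one-line-per-variable approach, using fgets. The helper name read_one_line is an assumption for illustration, and the sketch assumes each line fits in the buffer; a longer line would be read in pieces.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the thread): reads one line from `in`
 * into buf (capacity cap) and strips the trailing newline.
 * Returns 1 if a line was read, 0 if EOF or an error came first. */
static int read_one_line(FILE *in, char *buf, size_t cap)
{
    assert(in != NULL && buf != NULL && cap > 0);
    if (fgets(buf, (int)cap, in) == NULL) {
        buf[0] = '\0';
        return 0;                       /* EOF (or error) before any character */
    }
    buf[strcspn(buf, "\n")] = '\0';     /* drop the newline, if present */
    return 1;
}
```

Reading the two variables would then be two calls, for example read_one_line(stdin, a, sizeof a) and the same for b. A return value of 0 is the "user typed EOF" case, where the program should stop and report an error.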

Haslet answered 25/10, 2009 at 21:20 Comment(3)
Okay, so EOF can't be used like I intended. Thanks. Being able to add several lines into the same variable is quite important here; is there any sensible way to do that? – Rusticus
There are several conventions: a) an empty line (double Enter) will terminate the input; this should work fine unless your multi-line input should also allow for empty lines. b) some stop character (often ".", e.g. in SMTP) will end the input; the assumption is that this is unlikely to occur in real text. – Assignment
You could do something ... tricky ... and prevent normal EOF behavior (an EOF is still an EOF, but a <kbd>Control-D</kbd> need not send it, for instance). This is beyond the scope of C though. – Argive
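Convention b) from the comments above can be sketched roughly as follows: lines are collected until one that consists of a single ".". The helper name read_until_stop_line is hypothetical, and the sketch assumes input lines shorter than 256 bytes; lines that no longer fit in the destination are simply dropped.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the thread): appends lines from `in` to
 * buf until a line consisting only of "." is seen, or until EOF.
 * Returns 1 if the stop line was seen, 0 if the input ended first.
 * Assumes lines shorter than 256 bytes. */
static int read_until_stop_line(FILE *in, char *buf, size_t cap)
{
    char line[256];
    size_t used = 0;

    assert(in != NULL && buf != NULL && cap > 0);
    buf[0] = '\0';
    while (fgets(line, sizeof line, in) != NULL) {
        if (strcmp(line, ".\n") == 0 || strcmp(line, ".") == 0)
            return 1;                   /* terminator line: this field is complete */
        size_t n = strlen(line);
        if (used + n < cap) {           /* keep room for the '\0' */
            memcpy(buf + used, line, n + 1);
            used += n;
        }
    }
    return 0;                           /* EOF before the stop line */
}
```

With this, both variables can contain line breaks and even empty lines; the user ends each one by typing "." on a line of its own, and the stream never reaches end of file in between.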
S
3

EOF isn't a character - it's a special value that the input functions return to indicate a condition, that the "end of file" on that input stream has been reached. As Martin v. Löwis says, once that "end of file" condition occurs, it means that no more input will be available on that stream.

The confusion arises because:

  • Many terminal types recognize a special keystroke to signal "end of file" when the "file" is an interactive terminal (eg. Ctrl-Z or Ctrl-D); and
  • The EOF value is one of the values that can be returned by the getchar() family of functions.

You will need to use an actual character value to separate the inputs - the ASCII nul character '\0' might be a good choice, if that can't appear as a valid value within the inputs themselves.
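As a rough sketch of that idea, the loop from the question can stop at either the separator byte or end of file. The helper name read_nul_field is an assumption, and bytes that do not fit in the buffer are discarded here.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not from the thread): copies bytes from `in` into
 * buf until a NUL byte or EOF. Returns the number of bytes stored (not
 * counting the terminating '\0'), or -1 if EOF arrived before any byte
 * or separator was read. */
static long read_nul_field(FILE *in, char *buf, size_t cap)
{
    size_t k = 0;
    int c;

    assert(in != NULL && buf != NULL && cap > 0);
    while ((c = getc(in)) != EOF && c != '\0') {
        if (k + 1 < cap)                /* leave room for the terminator */
            buf[k++] = (char)c;
    }
    buf[k] = '\0';
    if (c == EOF && k == 0)
        return -1;
    return (long)k;
}
```

Because the separator is an ordinary byte, the stream stays open after the first field, so a second call on the same stream reads the next one.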

Sthilaire answered 25/10, 2009 at 21:32 Comment(2)
Yep, I used it without actually knowing the details about it. Thank you, the nul character would be handy indeed as it could be used to terminate the string as well, but when I tried several alternatives, I don't think I was able to produce it. How do you get it? – Rusticus
Arcthae: That depends on your system, but Ctrl-@ works on many terminals. – Sthilaire
H
1

I ran the code on my Linux box; here is the result:

Enter a: qwer
asdf<Ctrl-D><Ctrl-D>
Enter b: 123
456<Ctrl-D><Ctrl-D>

Results:
a: qwer
asdf
b: 123
456

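Two Ctrl-Ds were needed because the terminal's input buffer was not empty: the first Ctrl-D only hands the pending partial line to the program, and the second one, typed on an empty buffer, is what the program sees as end of file.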

Hobbism answered 25/10, 2009 at 22:26 Comment(3)
Oh... This is odd. I tried it now on Linux myself and it works (with one Ctrl-D per variable). Yesterday I was compiling on Windows and it worked completely differently. – Rusticus
If you end the last line with a <Return>, you need only one <Ctrl-D>. – Hobbism
This is the only accurate answer here. At the C level, the EOF condition is nothing more than a read() returning 0 bytes. You can still read() again. – Royceroyd
D
0

What you are trying is fundamentally impossible with EOF.

Although it behaves like one in some ways, EOF is not a character in the stream but an environment-defined macro representing the end of the stream. I haven't seen your code, but I gather that what you're doing is something like this:

while ((c=getchar()) != EOF) {
    // do something
}
while ((c=getchar()) != EOF) {
    // do something else
}

When you type the EOF character the first time, to end the first string, the stream is irrevocably closed. That is, the status of the stream is that it is closed.

Thus, the contents of the second while loop are never run.

Dioxide answered 25/10, 2009 at 21:30 Comment(2)
Added some code now. I saw some programs using it so I tried to use it here without knowing its true nature. – Rusticus
Yes, so your code is pretty much identical to what I expected. If this program need only be run on the command line, my suggestion is to separate the strings with the null character. On my system (OS X) this is invoked with Ctrl-@ (i.e. Ctrl-Shift-2 on this keyboard). – Dioxide
P
0

You could use the null character ('\0') to separate the variables. Various UNIX tools (e.g. find) are capable of separating their output items in this way, which would suggest that it's a fairly standard method.

Another advantage of this is that you can read the stream into a single buffer and then create an array of char*s to point to the individual strings, and each string will be correctly '\0'-terminated without you having to change anything in the buffer manually. This means less memory allocation overhead, which may make your program run noticeably faster depending on how many variables you're reading. Of course, this is only necessary if you need to hold all the variables in memory at the same time — if you're dealing with them one-at-a-time, you don't get this particular advantage.
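A minimal sketch of the single-buffer idea, assuming the whole stream has already been read into memory. The helper name split_nul_fields is hypothetical; it only records where each field starts and does not copy or modify any bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the thread): buf holds len bytes of
 * NUL-separated fields. Stores a pointer to the start of each field in
 * out[] (at most max entries) and returns the number of fields found.
 * A final field with no NUL inside the first len bytes is ignored,
 * because it would not be a terminated string. */
static size_t split_nul_fields(char *buf, size_t len, char **out, size_t max)
{
    size_t count = 0;
    char *p = buf;
    char *end = buf + len;

    assert(buf != NULL && out != NULL);
    while (p < end && count < max) {
        char *nul = memchr(p, '\0', (size_t)(end - p));
        if (nul == NULL)
            break;                      /* unterminated tail */
        out[count++] = p;               /* already a valid C string */
        p = nul + 1;
    }
    return count;
}
```

Each out[i] can be passed straight to printf("%s") or strlen, since the separator that ends a field doubles as its string terminator.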

Pasteur answered 25/10, 2009 at 21:30 Comment(2)
Thanks for pointing it out. How do you enter null in the program? – Rusticus
The null character is just a byte with the value 0, but it's usually entered using the escape code \0, which you can use inside string literals and character literals (although note that C will automatically put a null character at the end of any string that you define using a literal, because it uses this to signal the end of the string). – Pasteur
S
0

Rather than stopping reading input at EOF -- which isn't a character -- stop at ENTER.

while((c = getchar()) != '\n')
{
    if (c == EOF) break; /* oops, something wrong, input terminated too soon: stop reading */
    a[x] = c;
    x++;
}

EOF is a signal that the input terminated. You're almost guaranteed that all inputs from the user end with '\n': that's the last key the user types!!!


Edit: you can still use Ctrl-D and clearerr() to reset the input stream.

#include <stdio.h>

int main(void) {
  char a[100], b[100];
  int c, k;

  printf("Enter a: "); fflush(stdout);
  k = 0;
  while ((k < 100) && ((c = getchar()) != EOF)) {
    a[k++] = c;
  }
  a[k] = 0;

  clearerr(stdin);

  printf("Enter b: "); fflush(stdout);
  k = 0;
  while ((k < 100) && ((c = getchar()) != EOF)) {
    b[k++] = c;
  }
  b[k] = 0;

  printf("a is [%s]; b is [%s]\n", a, b);
  return 0;
}
$ ./a.out
Enter a: two
lines (Ctrl+D right after the next ENTER)
Enter b: three
lines
now (ENTER + Ctrl+D)
a is [two
lines (Ctrl+D right after the next ENTER)
]; b is [three
lines
now (ENTER + Ctrl+D)
]
$
Sizzler answered 25/10, 2009 at 21:50 Comment(5)
The point is to be able to enter input with line breaks, though. As was pointed out, I can try to find another character to use the way I intended to use EOF here, or maybe I should come up with a new approach to the problem. Your example, stopping at Enter, could work too, but you'd need another loop and still a way to determine when the user is switching over to the second variable, which allows line breaks as well. – Rusticus
Thanks! So there is a way to do this the way I thought after all. :D Is resetting the whole stream kind of like shooting flies, though? – Rusticus
getchar() returns EOF when it tries to read after the end of file or if there was an error. clearerr() tells the program to ignore the end of file or error. Usually there's no point ignoring the signal: if the file ended, it won't magically have new data after ignoring EOF and if there was an error (network breaks, media removed, ...) clearerr() won't magically correct the error. For your specific problem, this should work -- but users cannot redirect input. – Sizzler
Okay. Thanks, I think this is indeed perfect for my program. Btw, I see you test if k is below 100 in your code. Originally I intended to solve that by dynamic memory allocation, but when I had problems with that infinite loop and memory-related crashes I left it out and switched over to char[100]. I tried to enter example code of what I tried with the memory allocation here, but kind of failed, so I edited my question once again. (Or should I make a new question? I'm kind of new to the site and its general policy.) – Rusticus
I think it's better to make a new question. And my snippet should really test for k < 99 instead to allow for the following terminating zero. – Sizzler
S
0

How do you enter null in the program?

You can implement the -print0 function using:

putchar(0);

This will print an ASCII nul character '\0' to stdout.
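For the writing side, a small sketch of NUL-separated output in the style of find's -print0 might look like this. The helper name write_nul_fields is an assumption for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not from the thread): writes each string followed
 * by a NUL byte, the same framing that find -print0 uses.
 * Returns 0 on success, -1 on a write error. */
static int write_nul_fields(FILE *out, const char *const *fields, size_t n)
{
    assert(out != NULL && fields != NULL);
    for (size_t i = 0; i < n; i++) {
        if (fputs(fields[i], out) == EOF || putc('\0', out) == EOF)
            return -1;
    }
    return 0;
}
```

Calling it with stdout as the first argument would let another program read the fields back one at a time, even when they contain line breaks.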

Subbase answered 26/10, 2009 at 7:1 Comment(1)
I formatted the question a bit badly. I'm reading input from the user's keyboard, and I'm trying to get two strings that can contain line breaks into two variables, and I need something that the user can end each string with. I originally used EOF, and some people told me to switch to using NUL, but I wasn't able to produce that character with my keyboard, probably because of the environment (Windows). I tried Ctrl-@ and pretty much everything else but didn't seem to get \0. – Rusticus
