Theory Behind getchar() and putchar() Functions

3

10

I'm working through "The C Programming Language" by K&R and example 1.5 has stumped me:

#include <stdio.h>

/* copy input to output; 1st version */
int main(int argc, char *argv[])
{
    int c;

    while ((c = getchar()) != EOF)
        putchar(c);

    return 0;
}

I understand that 'getchar()' reads a character for 'putchar()' to display. However, when I run the program in a terminal, why can I type an entire line of characters and have 'putchar()' display all of it?

Hercule answered 9/7, 2013 at 15:39 Comment(1)
It's simple and should be a duplicate; anyway, look at the answer :)Beginning
C
24

Because your terminal is line-buffered. getchar() and putchar() still only work on single characters, but the terminal waits to submit the characters to the program until you've entered a whole line. Then getchar() takes the characters from that buffer one by one and putchar() displays them one by one.

Addition: that the terminal is line-buffered means that it submits input to the program when a newline character is encountered. It is usually more efficient to submit blocks of data instead of one character at a time. It also offers the user a chance to edit the line before pressing enter.
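The one-character-at-a-time behaviour described above can be made visible with a small sketch. The helper name copy_chars is hypothetical (it is not from K&R or the answer); it runs the same loop as the question's program on any stream and counts how many single-character reads it took.

```c
#include <stdio.h>

/* Hypothetical helper: copy `in` to `out` one character per call and
 * return how many characters were copied. */
long copy_chars(FILE *in, FILE *out)
{
    int c;      /* int, not char, so that EOF stays distinguishable */
    long n = 0;

    while ((c = getc(in)) != EOF) {
        putc(c, out);   /* one character out for each character in */
        n++;
    }
    return n;
}
```

Calling copy_chars(stdin, stdout) behaves like the K&R loop, since getchar() is equivalent to getc(stdin). Typing "hello" and pressing Enter should make the loop body run six times (five letters plus the newline), even though the terminal handed the line over in one piece.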

Note: Line buffering can be turned off by disabling canonical mode for the terminal and calling setbuf with NULL on stdin.
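A minimal sketch of what disabling canonical mode might look like, assuming a POSIX system with <termios.h>. The function names without_canonical and disable_canonical are made up for illustration; only tcgetattr, tcsetattr, ICANON, VMIN and VTIME come from the POSIX terminal interface.

```c
#include <termios.h>

/* Return a copy of `t` with canonical (line-at-a-time) input disabled. */
struct termios without_canonical(struct termios t)
{
    t.c_lflag &= ~ICANON;   /* do not wait for Enter before delivering input */
    t.c_cc[VMIN] = 1;       /* a read returns as soon as 1 byte is available */
    t.c_cc[VTIME] = 0;      /* no read timeout */
    return t;
}

/* Put terminal `fd` into non-canonical mode. The previous settings are
 * stored in *saved so the caller can restore them later.
 * Returns 0 on success, -1 if `fd` is not a terminal or a call fails. */
int disable_canonical(int fd, struct termios *saved)
{
    struct termios raw;

    if (tcgetattr(fd, saved) == -1)
        return -1;
    raw = without_canonical(*saved);
    if (tcsetattr(fd, TCSANOW, &raw) == -1)
        return -1;
    return 0;
}
```

A caller would typically run disable_canonical(STDIN_FILENO, &saved) together with setbuf(stdin, NULL) before the first read, and restore the terminal with tcsetattr(STDIN_FILENO, TCSANOW, &saved) before exiting. Echo is left on here, so typed characters still appear on screen.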

Caddoan answered 9/7, 2013 at 15:46 Comment(3)
+1 for the explanation. But can you please elaborate on this line: "Because your terminal is line-buffered"?Molly
"Line buffering can be turned off by disabling canonical mode for the terminal." -- That's not adequate. You also have to turn off line buffering for stdin via setbuf/setvbufAmp
Actually, I think I may have been wrong, at least for Linux. I don't currently have a Linux system to test it on, but my understanding now is that getchar with an empty buffer does a read into the buffer and if that read terminates early, it doesn't try to continue reading to find a newline, it immediately returns the first char of the buffer, so disabling canonical mode is adequate.Amp
C
2

Yeah, you can actually write whatever you want as long as it's not an EOF char. The keyboard is a special I/O device: it works directly through the BIOS, and the characters typed on the keyboard are inserted directly into a buffer. In your case, this buffer is read by the primitive getchar(). When typing a sentence you are pushing data into the buffer, and getchar() is called in a loop; this is why it works.

You can ask me more questions if you want more details about how the IO device work.

Cheers.

Cerebrovascular answered 9/7, 2013 at 15:55 Comment(7)
There's no such thing as the "EOF character", at least since the days of CP/M. EOF in C is just a special value returned by getchar() when the input stream is terminated; think about it as an error code, not as a character.Bonis
@MatteoItalia Yes, there is ... on POSIX it's ctrl-D and on Windows it's ctrl-Z. Most of the rest of this answer is wrong, however ... one doesn't read from the keyboard, which produces keycodes, one reads from a terminal, which is a virtual device usually associated with a window; the BIOS has nothing to do with it unless reading from the console ... and the BIOS certainly does not buffer that data. "You can ask me more questions if you want more details about how the IO device work." -- I shudder at the thought.Amp
@JimBalter: they are just magic sequences recognized by terminal emulators (and only if they are put on their own line); the terminal emulator sees the sequence and closes its ends of the pipes. On the other side, the C runtime sees that the stream has been closed (read returns zero, ReadFile does something similar and the stream can be checked for end-of-file; nowhere here is an EOF character involved) and marks the end of its own stream. When the application reaches the end of the C stream, the CRT returns EOF as an error code.Bonis
@MatteoItalia I know all that; sheesh. But they are "EOF characters" -- that's what they're called. EOF stands for "end of file" ... ctrl-D is a character that indicates end-of-file. getchar returns the EOF (end-of-file) value (usually -1) ... they are totally different values, but they are both called "EOF".Amp
My point is that EOF is not a real character, but an error code, and Ctrl-D/Ctrl-Z are just keyboard shortcuts. In fact, if you store Ctrl-D in a file and do a cat on it (even piping it into another cat, just to be sure that it doesn't work differently when reading files) it doesn't stop at Ctrl-D but at the "real" end of file, so it's not like the character itself that is magic, it's the closing of the stream that does its thing. Otherwise, Ctrl-W would be the "close window character", Ctrl-S the "save file character" and so on.Bonis
@MatteoItalia Your point is wrong. Ctrl-D and Ctrl-Z are real characters ... they are designated as the "EOF character" on their respective systems. "if you store Ctrl-D" ... of course; on POSIX systems, the EOF character only applies to terminals. But in Windows text files, ctrl-Z is still an EOF character. "so it's not like the character itself that is magic" -- the "character itself" is just a sequence of bits; the "magic" is in how it is interpreted. "Otherwise, Ctrl-W would be the 'close window character'" -- some people call it that, and they aren't wrong. Oh, and EOF isn't an error code.Amp
P.S. It's odd that you think things were any different in "the days of CP/M". If you say there's no such thing as the "EOF character", you could just as well say there's no such thing as the "A" character, there's just a byte value that is displayed as an A. But this is grossly confused about language and how it is used. In POSIX, "the EOF character" is the character that signals an end-of-file condition to the terminal driver, and that character defaults to EOT (ctrl-D).Amp
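One point from the exchange above can be checked directly: the EOF value returned by getc()/getchar() is distinct from every byte a file can hold, including a stored Ctrl-D (0x04). The sketch below is only an illustration; bytes_read_back is a hypothetical helper that writes some bytes to a temporary file (tmpfile() opens it in binary mode) and counts how many come back before EOF is reported.

```c
#include <stdio.h>

/* Hypothetical helper: write `len` bytes to a temporary file, read them
 * back with getc(), and return how many bytes arrived before EOF.
 * Returns -1 if the temporary file cannot be created. */
long bytes_read_back(const unsigned char *data, size_t len)
{
    FILE *f = tmpfile();
    int c;      /* must be int: EOF is a negative value, not a byte */
    long n = 0;

    if (f == NULL)
        return -1;
    fwrite(data, 1, len, f);
    rewind(f);
    while ((c = getc(f)) != EOF)
        n++;    /* 0x04 and 0xFF are counted like any other byte */
    fclose(f);
    return n;
}
```

With the four bytes 'a', 0x04, 0xFF, 'b' the function returns 4: reading stops at the real end of the file, not at the stored Ctrl-D. It says nothing about what a terminal driver does with Ctrl-D when it is typed, which is the other half of the disagreement.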
S
0

Consider the following program, which reads input with getchar() until it meets a '$' character and prints every character before it.

#include <stdio.h>

int main(void)
{
    int c;

    /* echo characters until the first '$'; also stop if the input ends */
    while ((c = getchar()) != '$' && c != EOF)
        putchar(c);

    return 0;
}

If we type abcd$$$$$$ and press Enter, the terminal holds the whole line in its buffer, and the program receives nothing until Enter is pressed. Only then does the while loop start to run: getchar() returns one character at a time and putchar() prints it, from 'a' up to, but not including, the first '$'. The program does not end until a '$' appears in the input or the input itself ends.
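What happens to the rest of the line can be shown with a sketch that runs the same loop on an arbitrary stream. The name copy_until and its parameters are made up for this illustration.

```c
#include <stdio.h>

/* Hypothetical helper: copy characters from `in` to `out` until the
 * character `stop` is read or the input ends.
 * Returns the number of characters copied. */
long copy_until(FILE *in, FILE *out, int stop)
{
    int c;
    long n = 0;

    while ((c = getc(in)) != stop && c != EOF) {
        putc(c, out);
        n++;
    }
    return n;
}
```

For the input abcd$$$$$$ followed by a newline, copy_until(in, out, '$') copies four characters. The first '$' is consumed by the loop test; the remaining five '$' characters and the newline stay unread in the stream, so the next getc(in) returns '$'.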

Seafarer answered 26/3, 2022 at 4:17 Comment(0)
