Contrary to DeadMG's answer, I believe the problem is with the contents of your input file, not with your expectation about the behavior of the newline character.
UPDATE: Now that I've had a chance to play with gedit, I think I see what caused the problem. gedit apparently is designed to make it difficult to create a file without a newline on the last line (which is sensible behavior). If you open gedit and type three lines of input, typing Enter at the end of each line, then save the file, it will actually create a 4-line file, with the 4th line empty. The complete contents of the file, using your example, would then be "ABCD\nEFGH\nIJKL\n\n". To avoid creating that extra empty line, just don't type Enter at the end of the last line; gedit will provide the required newline character for you.
(As a special case, if you don't enter anything at all, gedit will create an empty file.)
Note this important distinction: In gedit, typing Enter creates a new line. In a text file stored on disk, a newline character (LF, '\n') denotes the end of the current line.
Text file representations vary from system to system. The most common representations for an end-of-line marker are a single ASCII LF (newline) character (Unix, Linux, and similar systems), and a sequence of two characters, CR and LF (MS Windows). I'll assume the Unix-like representation here. (UPDATE: In a comment, you said you're using Ubuntu 12.04 and gcc 4.6.3, so text files should definitely be in the Unix-style format.)
I just wrote the following program based on the code in your question:
#include <iostream>
#include <string>

int main() {
    std::string input;
    int line_number = 0;
    // std::getline returns the stream, which tests false once no more lines can be read
    while (std::getline(std::cin, input))
    {
        line_number++;
        std::cout << "line " << line_number
                  << ", input = \"" << input << "\"\n";
    }
}
and I created a 3-line text file in.txt:
ABCD
EFGH
IJHL
In the file in.txt, each line is terminated by a single newline character.
Here's the output I get:
$ cat in.txt
ABCD
EFGH
IJHL
$ g++ c.cpp -o c
$ ./c < in.txt
line 1, input = "ABCD"
line 2, input = "EFGH"
line 3, input = "IJHL"
$
The final newline at the very end of the file does not start a new line; it merely marks the end of the current line. (A text file that doesn't end with a newline character might not even be valid, depending on the system.)
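To make the "terminator, not separator" point concrete without touching any files, here is a minimal sketch that counts how many times std::getline succeeds on an in-memory string. The helper name count_lines is my own invention for illustration, not something from your code.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: counts how many lines std::getline extracts from text.
std::size_t count_lines(const std::string& text) {
    std::istringstream in(text);
    std::string line;
    std::size_t n = 0;
    // Each successful call consumes one line and its terminating '\n' (if present).
    while (std::getline(in, line)) {
        ++n;
    }
    return n;
}
```

With this helper, "ABCD\nEFGH\nIJHL\n" counts as 3 lines, and so does "ABCD\nEFGH\nIJHL" with no final newline, because std::getline still returns the characters it read before hitting end-of-file. Only a second trailing newline, as in "ABCD\nEFGH\nIJHL\n\n", produces a 4th (empty) line.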
I can get the behavior you describe if I add a second newline character to the end of in.txt:
$ echo '' >> in.txt
$ cat in.txt
ABCD
EFGH
IJHL
$ ./c < in.txt
line 1, input = "ABCD"
line 2, input = "EFGH"
line 3, input = "IJHL"
line 4, input = ""
$
The program sees an empty line at the end of the input file because there's an empty line at the end of the input file.
If you examine the contents of in.txt, you'll find two newline (LF) characters at the very end, one to mark the end of the third line, and one to mark the end of the (empty) fourth line. (Or if it's a Windows-format text file, you'll find a CR-LF-CR-LF sequence at the very end of the file.)
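One way to examine the file is a dump tool such as od -c in.txt. If you'd rather check from C++, the following is a rough sketch that reads the raw bytes and counts the newlines at the very end. The names read_bytes and trailing_newlines are hypothetical helpers I made up for this illustration.

```cpp
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: reads a whole file as raw bytes.
// Binary mode avoids any end-of-line translation by the library.
// Returns an empty string if the file cannot be opened.
std::string read_bytes(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}

// Hypothetical helper: counts consecutive line terminators at the end of data.
// A CR immediately before an LF is treated as part of the same terminator,
// so Windows-style CR-LF endings are counted the same way as plain LF.
std::size_t trailing_newlines(const std::string& data) {
    std::size_t count = 0;
    std::size_t pos = data.size();
    while (pos > 0 && data[pos - 1] == '\n') {
        ++count;
        --pos;
        if (pos > 0 && data[pos - 1] == '\r') {
            --pos;  // skip the CR of a CR-LF pair
        }
    }
    return count;
}
```

Calling trailing_newlines(read_bytes("in.txt")) should give 1 for a file whose last line is properly terminated, and 2 for a file with an extra empty line at the end, which is the case that makes the program report an empty 4th line.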
If your code doesn't deal properly with empty lines, then you should either ensure that it doesn't receive any empty lines on its input, or, better, modify it so it handles empty lines correctly. How should it handle empty lines? That depends on what the program is required to do, and it's probably entirely up to you. You can silently skip empty lines:
if (input != "") {
    // process line
}
or you can treat an empty line as an error:
if (input == "") {
    // error handling code
}
or you can treat empty lines as valid data.
In any case, you should decide exactly how you want to handle empty lines.
//some processing with input ... – Invitatory
getline and/or >> (the "and" case being use of '>>' on a std::istringstream created from the line), or even regexps in C++11 or Spirit in Boost, is all sane once understood - they work quite well together. Your suggestion to use while(std::cin>>a>>b) is only good if you don't need to verify the number of arguments per line (e.g. to report errors in input data). – Bylaw