Read file line by line using ifstream in C++
P

8

792

The contents of file.txt are:

5 3
6 4
7 1
10 5
11 6
12 3
12 4

Where 5 3 is a coordinate pair. How do I process this data line by line in C++?

I am able to get the first line, but how do I get the next line of the file?

ifstream myfile;
myfile.open ("file.txt");
Poppied answered 23/10, 2011 at 20:24 Comment(1)
Related: How do I read an entire file into a std::string in C++?Nace
E
1156

First, make an ifstream:

#include <fstream>
std::ifstream infile("thefile.txt");

The two standard methods are:

  1. Assume that every line consists of two numbers and read token by token:

    int a, b;
    while (infile >> a >> b)
    {
        // process pair (a,b)
    }
    
  2. Line-based parsing, using string streams:

    #include <sstream>
    #include <string>
    
    std::string line;
    while (std::getline(infile, line))
    {
        std::istringstream iss(line);
        int a, b;
        if (!(iss >> a >> b)) { break; } // error
    
        // process pair (a,b)
    }
    

You shouldn't mix (1) and (2), since the token-based parsing doesn't gobble up newlines, so you may end up with spurious empty lines if you use getline() after token-based extraction got you to the end of a line already.

Ergener answered 23/10, 2011 at 20:34 Comment(22)
@EdwardKarak: I don't understand what "commas as the token" means. Commas don't represent integers.Ergener
the OP used a space to delimit the two integers. I wanted to know if while (infile >> a >> b) would work if the OP used a comma as a delimiter, because that is the scenario in my own programOlaolaf
@EdwardKarak: Ah, so when you said "token" you meant "delimiter". Right. With a comma, you'd say: int a, b; char c; while ((infile >> a >> c >> b) && (c == ','))Ergener
@KerrekSB: That would only work if the comma was surrounded by spaces, i.e., "1 , 2". If the line contained "1,2", then your code would try to convert "1,2" into an integer (storing it in a) while c and b would get the tokens/delimiters on the next line. With anything besides whitespace delimiters, you really need to use std::getline() and parse the line.Falstaffian
@KerrekSB: Huh. I was wrong. I didn't know it could do that. I might have some code of my own to rewrite.Falstaffian
For an explanation of the while(getline(f, line)) { } construct and regarding error handling please have a look at this (my) article: gehrcke.de/2011/06/… (I think I do not need to have bad conscience posting this here, it even slightly pre-dates this answer).Crossfertilize
@galois: thatched :-)Ergener
enjoy your cold oneCrappie
What's the best way to skip "#" commented lines use the first or second approach? Thanks.Quoth
@elgnoh: You can't do it in the first approach, which assumes you're parsing tokens and doesn't know what a "line" is. It's trivial in the second approach, where you just check the first character of the line string (potentially skipping whitespace).Ergener
@KerrekSB: One clarification please, in the first approach, as i understand, >> returns the reference to the stream object. So the question is when the stream reaches eof what will be returned that makes the while loop break.Counts
@VivekMaran: the stream reaches EOF while reading digits to form the last element. It is in the next round that there are no digits left, and the attempt to read past the end of the stream makes the stream "fail", which is the exit condition for the loop. Here's a demo.Ergener
@VivekMaran: If you're reading in a way that doesn't "read ahead", like just getting individual characters out, then you never trigger EOF until you actually step over the end of the stream: wandbox.org/permlink/oFaYFTFtnEfucaMvErgener
@KerrekSB: Thanks got the EOF part, just figured out that directly using ifstream for a condition check will trigger the bool operator that will return false if eof is hit, and cause the loop to break.Counts
@VivekMaran: No, the boolean conversion checks for !fail(), not good(), which differs in the treatment of EOF (see here).Ergener
@KerrekSB, what if I need to read input one by one after reading the whole line using getline(). I mean in python language I can read the input from a file line by line in a list(a.k.a array) and then I can iterate over this list to pick the item one at a time and do whatever I want!. how should I handle such condition in c++? any suggestions?Merri
@anu: Sure, you can store each line in a container (e.g. a std::vector<string>) and then process that container after the loop has finished. This means of course that you need to be able to consume the entire file before proceeding, e.g. you can't be reading lines interactively. (But that's the same in Python I expect.) If performance is a concern, it might be better to read the entire file into memory in one step and then just store the location of the line breaks, e.g. as a std::vector<std::string_view>, but I'd only do that if this is a performance bottleneck.Ergener
@anu: Maybe ask a new question?Ergener
@KerrekSB, here is the question, I am trying to solve? Using the above post, but didn't get a clue, how to do it? Any suggestions?Merri
@KokHowTeh: can you be more precise? Do you want to parse out a string and an integer? That's indeed beyond iostreams' formatted input (at least in any nice and maintainable way); better to use a regular expression.Ergener
@KerrekSB, I was referring to this sample string: "HelloWorld, 123"Huskey
@KokHowTeh: Well, to parse that, the first variable needs to be a std::string: wandbox.org/permlink/LFdI2eyYF9klT9HNErgener
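
Following up on the comment about skipping "#" lines: this is a rough sketch of the second (line-based) approach with that check added. The function name and the decision to silently skip malformed lines are assumptions for illustration, not part of the answer above.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: parses "a b" pairs line by line, skipping blank lines
// and lines whose first non-whitespace character is '#'.
std::vector<std::pair<int, int>> read_pairs_skipping_comments(std::istream& in)
{
    std::vector<std::pair<int, int>> pairs;
    std::string line;
    while (std::getline(in, line)) {
        const std::string::size_type first = line.find_first_not_of(" \t\r");
        if (first == std::string::npos || line[first] == '#') {
            continue; // blank or comment line
        }
        std::istringstream iss(line);
        int a, b;
        if (iss >> a >> b) {
            pairs.emplace_back(a, b);
        }
        // Malformed lines are silently skipped here; that policy is an assumption.
    }
    return pairs;
}
```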
M
223

Use ifstream to read data from a file:

std::ifstream input( "filename.ext" );

If you really need to read line by line, then do this:

for( std::string line; getline( input, line ); )
{
    // ...for each line in input...
}

But you probably just need to extract coordinate pairs:

int x, y;
input >> x >> y;

Update:

In your code you use ofstream myfile;, however the o in ofstream stands for output. If you want to read from the file (input) use ifstream. If you want to both read and write use fstream.
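
To make the difference concrete, here is a minimal sketch that writes a file with ofstream and reads it back with ifstream. The function name, the sample data and the -1 error value are all assumptions for illustration.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: writes two coordinate pairs with ofstream, then reads
// them back with ifstream and returns the sum of all numbers read.
// Returns -1 if the file cannot be opened for writing or reading.
int write_then_read_sum(const std::string& path)
{
    {
        std::ofstream out(path); // "o" = output: creates or truncates the file
        if (!out) {
            return -1;
        }
        out << "5 3\n6 4\n";
    } // out is flushed and closed here

    std::ifstream in(path); // "i" = input
    if (!in) {
        return -1;
    }
    int sum = 0;
    int x, y;
    while (in >> x >> y) {
        sum += x + y;
    }
    return sum;
}
```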

Mirk answered 23/10, 2011 at 20:32 Comment(2)
Your solution is a slight improvement: your line variable is not visible after the file has been read, in contrast to Kerrek SB's second solution, which is also a good and simple solution.Uropod
getline is declared in <string>, so don't forget the #include <string>Dashtilut
D
160

Reading a file line by line in C++ can be done in several different ways.

[Fast] Loop with std::getline()

The simplest approach is to open an std::ifstream and loop using std::getline() calls. The code is clean and easy to understand.

#include <cstdio>
#include <fstream>
#include <string>

std::ifstream file(FILENAME);
if (file.is_open()) {
    std::string line;
    while (std::getline(file, line)) {
        // using printf() in all tests for consistency
        printf("%s", line.c_str());
    }
    file.close();
}

[Fast] Use Boost's file_descriptor_source

Another possibility is to use the Boost library, but the code gets a bit more verbose. The performance is quite similar to the code above (Loop with std::getline()).

#include <boost/iostreams/device/file_descriptor.hpp>
#include <boost/iostreams/stream.hpp>
#include <cstdio>
#include <fcntl.h>
#include <string>

namespace io = boost::iostreams;

void readLineByLineBoost() {
    int fdr = open(FILENAME, O_RDONLY);
    if (fdr >= 0) {
        io::file_descriptor_source fdDevice(fdr, io::file_descriptor_flags::close_handle);
        io::stream <io::file_descriptor_source> in(fdDevice);
        if (fdDevice.is_open()) {
            std::string line;
            while (std::getline(in, line)) {
                // using printf() in all tests for consistency
                printf("%s", line.c_str());
            }
            fdDevice.close();
        }
    }
}

[Fastest] Use C code

If performance is critical for your software, you may consider using the C language. This code can be 4-5 times faster than the C++ versions above; see the benchmark below.

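#include <stdio.h>
#include <stdlib.h>

// getline() below is the POSIX function, not std::getline()
FILE* fp = fopen(FILENAME, "r");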
if (fp == NULL)
    exit(EXIT_FAILURE);

char* line = NULL;
size_t len = 0;
while ((getline(&line, &len, fp)) != -1) {
    // using printf() in all tests for consistency
    printf("%s", line);
}
fclose(fp);
if (line)
    free(line);

Benchmark -- Which one is faster?

I have done some performance benchmarks with the code above and the results are interesting. I have tested the code with ASCII files that contain 100,000 lines, 1,000,000 lines and 10,000,000 lines of text. Each line of text contains 10 words on average. The program is compiled with -O3 optimization and its output is forwarded to /dev/null in order to remove the logging time variable from the measurement. Last, but not least, each piece of code logs each line with the printf() function for consistency.

The results show the time (in ms) that each piece of code took to read the files.

The performance difference between the two C++ approaches is minimal and shouldn't make any difference in practice. The performance of the C code is what makes the benchmark impressive and can be a game changer in terms of speed.

                             10K lines     100K lines     1000K lines
Loop with std::getline()         105ms          894ms          9773ms
Boost code                       106ms          968ms          9561ms
C code                            23ms          243ms          2397ms


Durante answered 28/7, 2018 at 14:35 Comment(16)
What happens if you remove C++'s synchronization with C on the console outputs? You might be measuring a known disadvantage of the default behavior of std::cout vs printf.Raver
Thanks for bringing this concern. I've redone the tests and the performance is still the same. I have edited the code to use the printf() function in all cases for consistency. I have also tried using std::cout in all cases and this made absolutely no difference. As I have just described in the text, the output of the program goes to /dev/null so the time to print the lines is not measured.Durante
Groovy. Thanks. Wonder where the slowdown is.Raver
Hi @Durante I know this is an old thread, I tried to replicate your results and could not see any significant difference between c and c++ github.com/simonsso/readfile_benchmarksDisendow
@Fareanor That's not correct. It only affects the standard C++ streams, std::ifstream file is not one of them. en.cppreference.com/w/cpp/io/ios_base/sync_with_stdioSpiculate
@Durante Please, may you tell me what function/method used to calculate and time this. And what graphing plot system used? Any advice to re-create this graph and timing method would be helpful.Resendez
If reading from the file is not the bottleneck, then the C++ version will do fine. In my code the bottleneck is updating statistics.Interne
Note that your use of getline in C is a gnu extension (now added to POSIX). It's not a standard C function.Stereopticon
I'm trying to replicate the plain-c version of this solution but get an error on getline "myClass.cpp:29:13: error: no matching function for call to 'getline'"Mohl
@Simson, looks like there's a large performance difference in your repo. If I'm reading it correctly, the bible text processing took quite a bit under a minute for c and quite a bit over for c++Ashly
(fp == NULL) can be replaced just by (!fp).Tetroxide
I guess @Raver was referring to usage of the below code to disable synchronization while conducting the experiment: ios_base::sync_with_stdio(false);Meisel
Perhaps the C++ iostreams-based std::getline (and boost version?) might be resizing/completely reallocating the string line each time by clearing then appending text, which is unfortunate. (Just based on cppreference documentation.) You can write a wrapper around the C getline if you want a more C++-like interface.Loud
If you were creating a new std::string each time and storing them (rather than re-using one std::string buffer), it would be similar in C (allocating a new char* for each line and storing them).Loud
Here's a quick bench benchmark, interested in any comments: quick-bench.com/q/-t-HNEvSRp1AJhnxk6yG06FWW7wLoud
(Also storing in a std::vector rather than calling realloc() for each new line is of course faster: quick-bench.com/q/xOle10cV7wk5KMQnWskcVrfJ094)Loud
S
20

Since your coordinates belong together as pairs, why not write a struct for them?

struct CoordinatePair
{
    int x;
    int y;
};

Then you can write an overloaded extraction operator for istreams:

std::istream& operator>>(std::istream& is, CoordinatePair& coordinates)
{
    is >> coordinates.x >> coordinates.y;

    return is;
}

And then you can read a file of coordinates straight into a vector like this:

#include <fstream>
#include <iostream>
#include <iterator>
#include <vector>

int main()
{
    char filename[] = "coordinates.txt";
    std::vector<CoordinatePair> v;
    std::ifstream ifs(filename);
    if (ifs) {
        std::copy(std::istream_iterator<CoordinatePair>(ifs), 
                std::istream_iterator<CoordinatePair>(),
                std::back_inserter(v));
    }
    else {
        std::cerr << "Couldn't open " << filename << " for reading\n";
    }
    // Now you can work with the contents of v
}
Saloma answered 20/8, 2016 at 16:58 Comment(3)
What happens when it's not possible to read two int tokens from the stream in operator>>? How can one make it work with a backtracking parser (i.e. when operator>> fails, roll back the stream to previous position and return false or something like that)?Jadda
If it's not possible to read two int tokens, then the is stream will evaluate to false and the reading loop will terminate at that point. You can detect this within operator>> by checking the return value of the individual reads. If you want to roll back the stream, you would call is.clear().Saloma
in the operator>> it is more correct to say is >> std::ws >> coordinates.x >> std::ws >> coordinates.y >> std::ws; since otherwise you are assuming that your input stream is in the whitespace-skipping mode.Pentimento
A
10

Expanding on the accepted answer, if the input is:

1,NYC
2,ABQ
...

you will still be able to apply the same logic, like this:

#include <fstream>
#include <iostream>
#include <string>

std::ifstream infile("thefile.txt");
if (infile.is_open()) {
    int number;
    std::string str;
    char c;
    while (infile >> number >> c >> str && c == ',')
        std::cout << number << " " << str << "\n";
}
infile.close();
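
Note that >> str stops at the first whitespace, so a value such as "New York" would be cut short. A possible variation, sketched here with a made-up function name and an assumed policy of skipping lines that do not match, reads each line first and takes everything after the comma as the text:

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: parses lines of the form "<number>,<text>".
// Unlike `>> str`, the text part may contain spaces.
std::vector<std::pair<int, std::string>> read_numbered_names(std::istream& in)
{
    std::vector<std::pair<int, std::string>> rows;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream iss(line);
        int number;
        char comma;
        std::string name;
        if (iss >> number >> comma && comma == ',' && std::getline(iss, name)) {
            rows.emplace_back(number, name);
        }
        // Lines that do not match are skipped; that policy is an assumption.
    }
    return rows;
}
```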
Antoninus answered 18/5, 2017 at 9:35 Comment(0)
W
6

This answer is for Visual Studio 2017, for the case where you want to read from a text file whose location is relative to your compiled console application.

First, put your text file (test.txt in this case) into your solution folder. After compiling, keep the text file in the same folder as applicationName.exe.

C:\Users\"username"\source\repos\"solutionName"\"solutionName"

#include <cstdlib>
#include <fstream>
#include <iostream>
#include <string>

using namespace std;
int main()
{
    ifstream inFile;
    // open the file stream
    inFile.open(".\\test.txt");
    // check if opening a file failed
    if (inFile.fail()) {
        cerr << "Error opening a file" << endl;
        inFile.close();
        exit(1);
    }
    string line;
    while (getline(inFile, line))
    {
        cout << line << endl;
    }
    // close the file stream
    inFile.close();
}
Wirer answered 6/3, 2019 at 17:45 Comment(0)
T
5

Although there is no need to close the file manually, it is a good idea to do so if the scope of the file variable is larger:

    ifstream infile(szFilePath);

    for (string line = ""; getline(infile, line); )
    {
        //do something with the line
    }

    if(infile.is_open())
        infile.close();
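
The alternative is to keep the stream's scope small so that its destructor does the closing. A minimal sketch, with count_lines as a hypothetical name:

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: counts the lines of a file. No explicit close() is
// needed because `infile` is a local object.
// A file that cannot be opened is reported as 0 lines.
std::size_t count_lines(const std::string& path)
{
    std::ifstream infile(path);
    std::size_t count = 0;
    for (std::string line; std::getline(infile, line); ) {
        ++count;
    }
    return count;
} // infile's destructor closes the file here
```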
Tortola answered 1/5, 2018 at 20:11 Comment(2)
Not sure this deserved a down vote. OP asked for a way to get each line. This answer does that and gives a great tip of making sure the file closes. For a simple program it may not be needed but at minimum a GREAT habit to form. It could maybe be improved by adding in a few lines of code to process the individual lines it pulls but overall is the simplest answer to the OPs question.Wheresoever
Isn't the file closed when ifstream infile(szFilePath); goes out of scope?Rothman
B
3

This is a general solution to loading data into a C++ program, and uses the getline function. This could be modified for CSV files, but the delimiter is a space here.

// Needs <cctype>, <fstream> and <string>, plus using namespace std;
const int n = 5, p = 2; // number of rows and columns expected in the file

double X[n][p];

ifstream myfile;

myfile.open("data.txt");

string line;
string temp = "";
int a = 0; // row index

while (a < n && getline(myfile, line)) { // while there is a line and a row left to fill
     int b = 0; // column index
     for (size_t i = 0; i <= line.size(); i++) { // i == line.size() flushes the last number
          if (i < line.size() && !isblank(static_cast<unsigned char>(line[i]))) {
              temp += line[i]; // not blank: append the character to the current number
          } else if (!temp.empty()) { // blank or end of line: a number is complete
              if (b < p) {
                  X[a][b] = stod(temp); // convert string to double
                  b++; // increment b because we have a new number
              }
              temp = ""; // reset the capture
          }
     }
     a++; // on to the next row
}
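
If the number of rows and columns is not known in advance, a string stream per line avoids the manual character handling. This is only a sketch under that assumption. read_matrix is a hypothetical name, and empty lines are skipped by choice.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: one inner vector per non-empty line, one double per token.
std::vector<std::vector<double>> read_matrix(std::istream& in)
{
    std::vector<std::vector<double>> rows;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream iss(line);
        std::vector<double> row;
        double value;
        while (iss >> value) {
            row.push_back(value);
        }
        if (!row.empty()) {
            rows.push_back(row);
        }
    }
    return rows;
}
```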
Beetner answered 16/3, 2019 at 17:49 Comment(0)
