How to read CSV file and assign to Eigen Matrix?

I am trying to read a large CSV file into an Eigen matrix. The code below has a problem: it does not detect the '\n' at the end of each line of the CSV file, so instead of creating multiple rows in the matrix it reads the entire file as a single row. I am not sure what is wrong with the code. Can anyone suggest a fix? I am also looking for an efficient way to read a CSV file with 10k rows and 1k columns, and I am not sure the code below is the most efficient approach. Any comments are much appreciated.

#include <stdio.h>
#include <stdlib.h>
#include <iostream>
#include <fstream>
#include <istream> //DataFile.fail()  function
#include <vector>
#include <set>
#include <string>
using namespace std;


#include <Eigen/Core>
#include <Eigen/Dense>
using namespace Eigen;

 void readCSV(istream &input, vector< vector<string> > &output)
{
    int a = 0;
    int b = 0;

    string csvLine;
    // read every line from the stream
    while( std::getline(input, csvLine) )
    {

        istringstream csvStream(csvLine);
        vector<string> csvColumn;
        MatrixXd mv;
        string csvElement;
        // read every element from the line that is seperated by commas
        // and put it into the vector or strings
        while( getline(csvStream, csvElement, ' ') )
        {
            csvColumn.push_back(csvElement);
            //mv.push_back(csvElement);
            b++;
        }       
        output.push_back(csvColumn);
        a++;
    }
    cout << "a : " << a << " b : " << b << endl;   //a doen't detect '\n'
}

int main(int argc, char* argv[])
{

    cout<< "ELM" << endl;
    //Testing to load dataset from file.
    fstream file("Sample3.csv", ios::in);
    if(!file.is_open())
    {
        cout << "File not found!\n";
        return 1;
    }
    MatrixXd m(3,1000);
    // typedef to save typing for the following object
    typedef vector< vector<string> > csvVector;
    csvVector csvData;

    readCSV(file, csvData);
    // print out read data to prove reading worked
    for(csvVector::iterator i = csvData.begin(); i != csvData.end(); ++i)
    {
        for(vector<string>::iterator j = i->begin(); j != i->end(); ++j)
        {
           m(i,j) = *j; 
           cout << *j << ", ";
        }
        cout << "\n";
    }
}

I have also attached a sample CSV file: https://onedrive.live.com/redir?resid=F1507EBE7BF1C5B!117&authkey=!AMzCnpBqxUyF1BA&ithint=file%2ccsv

Audubon answered 13/12, 2015 at 1:32 Comment(5)
As of right now your delimiter is a space: ' '. Did you mean ','? - Balcony
Hi Lucas, yes, I tried ',' in my first version, but it did not detect the end of each row in my CSV file, so I tried ' ' instead. It still fails. - Audubon
m(i,j) = *j; That cannot be correct. - Posse
getline should take care of the newlines (unless the CSV's fields contain newline characters, which they shouldn't). - Balcony
5gon12eder: yes, you are right, it is not correct. But how do I insert the values into the matrix then? - Audubon
B
-4

This will read from a csv file correctly:

std::ifstream indata;

indata.open(filename);

std::string                line;
while (getline(indata, line))
{
    std::stringstream          lineStream(line);
    std::string                cell;

    while (std::getline(lineStream, cell, ','))
    {
        //Process cell
    }
}

Edit: Also, since your csv is full of numbers, make sure to use std::stod or the equivalent conversion once you expect to treat them as such.
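
To make the conversion step concrete, here is a minimal sketch of the "Process cell" part using std::stod. The helper name parse_csv_line is hypothetical and not part of the answer above. It assumes every cell is a plain decimal number; std::stod throws std::invalid_argument on a cell it cannot parse.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not from the answer): split one CSV line on ','
// and convert each cell to double with std::stod.
// std::stod throws std::invalid_argument if a cell is not numeric.
std::vector<double> parse_csv_line(const std::string& line) {
    std::vector<double> row;
    std::stringstream lineStream(line);
    std::string cell;
    while (std::getline(lineStream, cell, ',')) {
        row.push_back(std::stod(cell));
    }
    return row;
}
```

Calling parse_csv_line(line) inside the outer getline loop yields one vector of doubles per row of the file.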

Balcony answered 13/12, 2015 at 2:0 Comment(5)
I tried your code, and I understand your comment. It still does not seem to work with the CSV file, and I am not sure what is going wrong. I just want it to count 1001 columns per line and read 3 rows. - Audubon
Maybe I should double check again, because the second while loop actually loops over all 3 rows of data in the CSV file and I can't even tell the rows apart. That still confuses me. - Audubon
Try combining my code with Avi Ginsburg's, and you'll have a working answer. Don't mess with the delimiters (changing ',' to ' '), because that won't do what you want. We've already given you the answer, all you need to do is put a bit of work into implementing it. Good luck! - Balcony
Maybe my knowledge here is limited. Yes, I followed both of your comments, but the code still gives me an error that stops me from proceeding. That's why I would appreciate it if Avi could provide a few more hints. The error is: mat = Map<VectorXd> (csvData.data(), rows, cols): no matching constructor for initialization of Map<Eigen::VectorXd>. - Audubon
@Audubon Not VectorXd, but MatrixXd. The Eigen object is a 2D matrix. - Scheck
G
25

Here's something you can actually copy-paste

Writing your own "parser"

Pros: lightweight and customizable

Cons: customizable

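#include <Eigen/Dense>
#include <fstream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

using namespace Eigen;

template<typename M>
M load_csv (const std::string & path) {
    std::ifstream indata(path);
    if (!indata.is_open()) {
        throw std::runtime_error("load_csv: cannot open " + path);
    }
    std::string line;
    // Store values in the matrix's own scalar type so the Map below matches.
    std::vector<typename M::Scalar> values;
    std::size_t rows = 0;
    while (std::getline(indata, line)) {
        std::stringstream lineStream(line);
        std::string cell;
        while (std::getline(lineStream, cell, ',')) {
            values.push_back(static_cast<typename M::Scalar>(std::stod(cell)));
        }
        ++rows;
    }
    if (rows == 0) {
        // Without this guard, values.size()/rows below divides by zero.
        throw std::runtime_error("load_csv: no rows read from " + path);
    }
    // CSV data is row-major, so view the buffer as RowMajor; returning it as M copies it into M's layout.
    return Map<const Matrix<typename M::Scalar, M::RowsAtCompileTime, M::ColsAtCompileTime, RowMajor>>(values.data(), rows, values.size()/rows);
}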

Usage:

MatrixXd A = load_csv<MatrixXd>("C:/Users/.../A.csv");
Matrix3d B = load_csv<Matrix3d>("C:/Users/.../B.csv");
VectorXd v = load_csv<VectorXd>("C:/Users/.../v.csv");
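
The RowMajor option in the Map matters because a CSV file lists values row by row, while Eigen's default storage order is column-major. As a rough illustration of the difference, here is a sketch in plain standard C++ (no Eigen) that reorders a row-major buffer into column-major order by hand. The helper name to_column_major is hypothetical; with Eigen, the RowMajor Map does the equivalent work for you.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: copy a row-major buffer (CSV order) into column-major order.
std::vector<double> to_column_major(const std::vector<double>& rowMajor,
                                    std::size_t rows, std::size_t cols) {
    std::vector<double> colMajor(rows * cols);
    for (std::size_t i = 0; i < rows; ++i) {
        for (std::size_t j = 0; j < cols; ++j) {
            // Element (i, j) sits at i*cols + j in row-major order
            // and at j*rows + i in column-major order.
            colMajor[j * rows + i] = rowMajor[i * cols + j];
        }
    }
    return colMajor;
}
```

For a 2x3 file containing 1,2,3 and 4,5,6, the row-major buffer {1,2,3,4,5,6} becomes {1,4,2,5,3,6} in column-major order.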

Using the armadillo library's parser

Pros: supports other formats as well, not just csv

Cons: extra dependency

#include <armadillo>

template <typename M>
M load_csv_arma (const std::string & path) {
    arma::mat X;
    X.load(path, arma::csv_ascii);
    return Eigen::Map<const M>(X.memptr(), X.n_rows, X.n_cols);
}
Guardianship answered 25/8, 2016 at 12:59 Comment(2)
Any idea what to do if we know the size of the matrix? - Salahi
Any idea why I would be getting a SIGFPE error? - Arhat
S
0

Read the CSV file into your nested vector as you please (e.g. Lucas's answer). Instead of the vector< vector<string> > construct, use a vector< vector<double> > or, even better, a simple vector< double >. To assign the vector of vectors to an Eigen matrix efficiently using vector< vector< double > >, use the following:

Eigen::MatrixXcd mat(rows, cols);
for(int i = 0; i < rows; i++)
    mat.row(i) = Eigen::Map<Eigen::VectorXd> (csvData[i].data(), cols).cast<complex<double> >();

If you opted to use the vector< double > option, it becomes:

Eigen::MatrixXcd mat(rows, cols);
// The buffer is row-major, so map it as a (cols x rows) column-major matrix and transpose.
mat = Eigen::Map<Eigen::MatrixXd> (csvData.data(), cols, rows).cast<complex<double> >().transpose();
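
The vector< double > option assumes the values were appended row by row while reading, and that rows and cols are known afterwards. Here is a minimal sketch of that reading step using only the standard library. CsvData and read_csv_flat are hypothetical names, not part of this answer or of Eigen. The sketch assumes every non-empty line holds the same number of numeric, comma-separated cells.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical container: values stored row by row, plus the detected shape.
struct CsvData {
    std::vector<double> values;
    std::size_t rows = 0;
    std::size_t cols = 0;
};

// Hypothetical reader: fills a flat row-major buffer and records rows/cols.
CsvData read_csv_flat(std::istream& input) {
    CsvData data;
    std::string line;
    while (std::getline(input, line)) {
        if (line.empty()) continue;  // skip completely empty lines
        std::size_t count = 0;
        std::stringstream lineStream(line);
        std::string cell;
        while (std::getline(lineStream, cell, ',')) {
            data.values.push_back(std::stod(cell));  // throws on non-numeric cells
            ++count;
        }
        if (data.rows == 0) {
            data.cols = count;  // the first row defines the column count
        } else if (count != data.cols) {
            throw std::runtime_error("read_csv_flat: rows have different lengths");
        }
        ++data.rows;
    }
    return data;
}
```

After CsvData data = read_csv_flat(file), the values data.values.data(), data.rows and data.cols play the roles of csvData.data(), rows and cols in the snippet above.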
Scheck answered 13/12, 2015 at 4:35 Comment(6)
Why would you construct that std::vector at all and not directly store the values in the Eigen::Matrix? - Posse
@Posse If the size of the matrix cannot be determined until after reading the entire file, it's easier to use push_back and let the container deal with resizing. - Scheck
Okay, I thought you were suggesting to create a std::vector<std::string>, then a std::vector<double> and finally an Eigen::MatrixXd. A two-step process seems alright. - Posse
@Avi Ginsburg, thanks for the great suggestion. However, I tried your idea and hit an error in the readCSV function: vector< vector<double> > doesn't support push_back. Could you post more code showing how I should implement it? Thanks - Audubon
@AviGinsburg: doesn't your last line read a row-major format (CSV) into a column-major one (Eigen)? - Guardianship
@Guardianship Fair enough. Fixed. - Scheck