How to read a CSV file into a .NET Datatable
Asked Answered
B

23

206

How can I load a CSV file into a System.Data.DataTable, creating the datatable based on the CSV file?

Does the regular ADO.net functionality allow this?

Burr answered 26/6, 2009 at 16:49 Comment(2)
How is this possibly 'off topic'? It's a specific question and 100 people find it usefulPassepartout
@Ryan: Verily I say unto you... StackOverflow moderators are a brood of vipers. Get behind me, StackOverflow moderators!Burr
M
96

Here's an excellent class that will copy CSV data into a datatable using the structure of the data to create the DataTable:

A portable and efficient generic parser for flat files

It's easy to configure and easy to use. I urge you to take a look.

Mcnully answered 26/6, 2009 at 16:57 Comment(4)
Excellent indeed. It worked perfectly for me out of the box, without even reading the documentation.Alcus
Will this work on CSV files where each row may be of a different structure? I have a log file with different types of logged event that would need to be separated out into multiple tables.Disjuncture
@Disjuncture - Probably not; the basic assumption of a CSV file is a rectangular data structure based on a single set of column headers specified in the first line. What you have appears to be more generic comma-delimited, discriminated data, requiring more sophisticated "ETL" to parse from the file into object instances of varying types (which could include DataRows of different DataTables).Chevrotain
Wow, this worked straight out of the box for me as well. I tried Microsoft.Office.Interop.Excel, IronXL, and DocumentFormat.OpenXml which all ran incredibly slow. About 10+ seconds for 3k rows. This parser completes the operation by the time I mouseUp! I used the nuget since that was easiest. Amazed!Timid
S
108

I have been using OleDb provider. However, it has problems if you are reading in rows that have numeric values but you want them treated as text. However, you can get around that issue by creating a schema.ini file. Here is my method I used:

// using System.Data;
// using System.Data.OleDb;
// using System.Globalization;
// using System.IO;

static DataTable GetDataTableFromCsv(string path, bool isFirstRowHeader)
{
    string header = isFirstRowHeader ? "Yes" : "No";

    string pathOnly = Path.GetDirectoryName(path);
    string fileName = Path.GetFileName(path);

    string sql = @"SELECT * FROM [" + fileName + "]";

    using(OleDbConnection connection = new OleDbConnection(
              @"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + pathOnly + 
              ";Extended Properties=\"Text;HDR=" + header + "\""))
    using(OleDbCommand command = new OleDbCommand(sql, connection))
    using(OleDbDataAdapter adapter = new OleDbDataAdapter(command))
    {
        DataTable dataTable = new DataTable();
        dataTable.Locale = CultureInfo.CurrentCulture;
        adapter.Fill(dataTable);
        return dataTable;
    }
}
Sural answered 26/6, 2009 at 17:30 Comment(8)
Thanks buddy. That helped for me. I had a CSV file in which commas weren't only separators, they were everywhere inside many columns values, so coming up with a regex that would split the line was kinda challenging. The OleDbProvider inferred the schema correctly.Abroad
The implementation makes sense but how do we deal with cells containing mixed data types. For example, 40C and etc.?Redeeming
GKED, if the data you are reading in always has an expected set of columns and types you can place in the same folder a shema.ini file that tells OleDb provider information about the columns. Here is a link to a Microsoft article that provides details of how to structure the file. msdn.microsoft.com/en-us/library/…Sural
While this answer will work, I would strongly advise against it. You introduce an external dependency which may conflict with other installations of office on the same machine (use Excel on your local dev environment?), dependent on versions installed. There are NuGet packages out there (ExcelDataReader, CsvHelper) that do this in more efficient, more portable ways.Lepidosiren
@A.Murray - What exactly do you mean? This uses the built in OleDb provider in System.Data.dll. You don't need to install any additional "drivers". And I'd be shocked in this day and age if any windows installation didn't have the basic Jet driver installed. This is 1990's CSV....Broadloom
how can I saw the DT dynamically to sql db, if its not already thereTaxicab
+1 from my side I am using 64 bit target so I have to replace provider with "Provider=Microsoft.ACE.OLEDB.12.0;Data" in place of "Microsoft.Jet.OLEDB.4.0" and I have added all csv in one Ds in a loop rest code worked for meSexual
2021. OleDb is not cross-platform, so this will only work on windows.Narah
M
96

Here's an excellent class that will copy CSV data into a datatable using the structure of the data to create the DataTable:

A portable and efficient generic parser for flat files

It's easy to configure and easy to use. I urge you to take a look.

Mcnully answered 26/6, 2009 at 16:57 Comment(4)
Excellent indeed. It worked perfectly for me out of the box, without even reading the documentation.Alcus
Will this work on CSV files where each row may be of a different structure? I have a log file with different types of logged event that would need to be separated out into multiple tables.Disjuncture
@Disjuncture - Probably not; the basic assumption of a CSV file is a rectangular data structure based on a single set of column headers specified in the first line. What you have appears to be more generic comma-delimited, discriminated data, requiring more sophisticated "ETL" to parse from the file into object instances of varying types (which could include DataRows of different DataTables).Chevrotain
Wow, this worked straight out of the box for me as well. I tried Microsoft.Office.Interop.Excel, IronXL, and DocumentFormat.OpenXml which all ran incredibly slow. About 10+ seconds for 3k rows. This parser completes the operation by the time I mouseUp! I used the nuget since that was easiest. Amazed!Timid
Z
59

Hey its working 100%

public static DataTable ConvertCSVtoDataTable(string strFilePath)
{
    DataTable dt = new DataTable();
    using (StreamReader sr = new StreamReader(strFilePath))
    {
        string[] headers = sr.ReadLine().Split(',');
        foreach (string header in headers)
        {
            dt.Columns.Add(header);
        }
        while (!sr.EndOfStream)
        {
            string[] rows = sr.ReadLine().Split(',');
            DataRow dr = dt.NewRow();
            for (int i = 0; i < headers.Length; i++)
            {
                dr[i] = rows[i];
            }
            dt.Rows.Add(dr);
        }
    }
    return dt;
}

CSV Image enter image description here

Data table Imported enter image description here

Zane answered 30/12, 2014 at 12:27 Comment(8)
Only when 100% of the inputs are the simplest of CSV files (which may be true in your case).Burr
You are correct. you should use codeproject.com/Articles/9258/A-Fast-CSV-Reader( Lorion dll ) I tried its working well.Zane
See my answer from 2009.Burr
@ShivamSrivastava I'm getting the error in the last row are you there then give you other contact infoSchrock
Although I did not use this version exactly, it was based on it that I solved my issue. Thank you. Works very well.Metapsychology
This works for me , thanks a lot .. good part .. no third party dll's .. :)Greet
perfect solution very efficient and works extra fast.Ilka
Practical solution for the happy path but unfortunately this will fail even with this simple CSV line: 1,"Jon Doe,Phd" having fields Id,Name. Using a real and configurable CSV parser is the way to go for most cases.Upstage
B
41

I have decided to use Sebastien Lorion's Csv Reader.

Jay Riggs suggestion is a great solution also, but I just didn't need all of the features that that Andrew Rissing's Generic Parser provides.

UPDATE 10/25/2010

After using Sebastien Lorion's Csv Reader in my project for nearly a year and a half, I have found that it throws exceptions when parsing some csv files that I believe to be well formed.

So, I did switch to Andrew Rissing's Generic Parser and it seems to be doing much better.

UPDATE 9/22/2014

These days, I mostly use this extension method to read delimited text:

https://github.com/Core-Techs/Common/blob/master/CoreTechs.Common/Text/DelimitedTextExtensions.cs#L22

https://www.nuget.org/packages/CoreTechs.Common/

UPDATE 2/20/2015

Example:

var csv = @"Name, Age
Ronnie, 30
Mark, 40
Ace, 50";

TextReader reader = new StringReader(csv);
var table = new DataTable();
using(var it = reader.ReadCsvWithHeader().GetEnumerator())
{

    if (!it.MoveNext()) return;

    foreach (var k in it.Current.Keys)
        table.Columns.Add(k);

    do
    {
        var row = table.NewRow();
        foreach (var k in it.Current.Keys)
            row[k] = it.Current[k];
    
        table.Rows.Add(row);
    
    } while (it.MoveNext());
}
Burr answered 26/6, 2009 at 17:33 Comment(7)
I agree that Sebastien Lorien's CSV reader is great. I use it for heavy CSV processing, but I've also used Andrew's Rissing's for small jobs and it's served me well. Have fun!Mcnully
How can i use these classes to load CSV into DATATABLE ?Enfilade
I tried this but the it.Current.Keys collection returns with "System.Linq.Enumerable+WhereSelectListIterator`2[System.Int32,System.Char]" rather than the name of the column. Any thoughts as to why?Pulchritude
Can you use multi character delimiters?Jonson
No, but I thought about enabling that.Burr
I'm confused about how to retrieve the headers from this. The first row of values (row 2 in the file) ends up as my headers.Ordinate
@Ordinate Couldn't say why without seeing the actual file.Burr
I
14

We always used to use the Jet.OLEDB driver, until we started going to 64 bit applications. Microsoft has not and will not release a 64 bit Jet driver. Here's a simple solution we came up with that uses File.ReadAllLines and String.Split to read and parse the CSV file and manually load a DataTable. As noted above, it DOES NOT handle the situation where one of the column values contains a comma. We use this mostly for reading custom configuration files - the nice part about using CSV files is that we can edit them in Excel.

string CSVFilePathName = @"C:\test.csv";
string[] Lines = File.ReadAllLines(CSVFilePathName);
string[] Fields;
Fields = Lines[0].Split(new char[] { ',' });
int Cols = Fields.GetLength(0);
DataTable dt = new DataTable();
//1st row must be column names; force lower case to ensure matching later on.
for (int i = 0; i < Cols; i++)
    dt.Columns.Add(Fields[i].ToLower(), typeof(string));
DataRow Row;
for (int i = 1; i < Lines.GetLength(0); i++)
{
    Fields = Lines[i].Split(new char[] { ',' });
    Row = dt.NewRow();
    for (int f = 0; f < Cols; f++)
        Row[f] = Fields[f];
    dt.Rows.Add(Row);
}
Inhabitancy answered 17/8, 2010 at 22:15 Comment(0)
M
11

You can achieve it by using Microsoft.VisualBasic.FileIO.TextFieldParser dll in C#

static void Main()
        {
            string csv_file_path=@"C:\Users\Administrator\Desktop\test.csv";

            DataTable csvData = GetDataTabletFromCSVFile(csv_file_path);

            Console.WriteLine("Rows count:" + csvData.Rows.Count);

            Console.ReadLine();
        }


private static DataTable GetDataTabletFromCSVFile(string csv_file_path)
        {
            DataTable csvData = new DataTable();

            try
            {

            using(TextFieldParser csvReader = new TextFieldParser(csv_file_path))
                {
                    csvReader.SetDelimiters(new string[] { "," });
                    csvReader.HasFieldsEnclosedInQuotes = true;
                    string[] colFields = csvReader.ReadFields();
                    foreach (string column in colFields)
                    {
                        DataColumn datecolumn = new DataColumn(column);
                        datecolumn.AllowDBNull = true;
                        csvData.Columns.Add(datecolumn);
                    }

                    while (!csvReader.EndOfData)
                    {
                        string[] fieldData = csvReader.ReadFields();
                        //Making empty value as null
                        for (int i = 0; i < fieldData.Length; i++)
                        {
                            if (fieldData[i] == "")
                            {
                                fieldData[i] = null;
                            }
                        }
                        csvData.Rows.Add(fieldData);
                    }
                }
            }
            catch (Exception ex)
            {
            }
            return csvData;
        }
Mustache answered 28/8, 2013 at 4:20 Comment(3)
Please do not attempt to re-invent the wheel with CSV processing. There's so many great open source alternatives out there that are very robust.Hanway
Thanks Brad, useful tip in relation to the TextFieldParser for handling embedded quotes.Berkeley
this is more proper answer, where there is spaces in between and kind of description in between. +1Leisured
V
8

this is the code i use it but your apps must run with net version 3.5

private void txtRead_Click(object sender, EventArgs e)
        {
           // var filename = @"d:\shiptest.txt";

            openFileDialog1.InitialDirectory = "d:\\";
            openFileDialog1.Filter = "txt files (*.txt)|*.txt|All files (*.*)|*.*";
            DialogResult result = openFileDialog1.ShowDialog();
            if (result == DialogResult.OK)
            {
                if (openFileDialog1.FileName != "")
                {
                    var reader = ReadAsLines(openFileDialog1.FileName);

                    var data = new DataTable();

                    //this assume the first record is filled with the column names
                    var headers = reader.First().Split(',');
                    foreach (var header in headers)
                    {
                        data.Columns.Add(header);
                    }

                    var records = reader.Skip(1);
                    foreach (var record in records)
                    {
                        data.Rows.Add(record.Split(','));
                    }

                    dgList.DataSource = data;
                }
            }
        }

        static IEnumerable<string> ReadAsLines(string filename)
        {
            using (StreamReader reader = new StreamReader(filename))
                while (!reader.EndOfStream)
                    yield return reader.ReadLine();
        }
Visigoth answered 6/2, 2013 at 11:45 Comment(1)
This is pretty much what I wanted to present.Finery
M
4

The best option I have found, and it resolves issues where you may have different versions of Office installed, and also 32/64-bit issues like Chuck Bevitt mentioned, is FileHelpers.

It can be added to your project references using NuGet and it provides a one-liner solution:

CommonEngine.CsvToDataTable(path, "ImportRecord", ',', true);
Mayweed answered 12/2, 2013 at 18:21 Comment(3)
can u tell what is CommonEngine? Is NuGet same with NuGet.Core. I found only NuGet.Core in referencesImponderable
It is FileHelpers you need. If you have NuGet, add it with NuGet. Otherwise, just add it as an assembly in your project. CommonEngine is part of FileHelpers.Mayweed
This is by far the best and easiest option I've come across. Thanks a lot!Rigamarole
F
4

Modified from Mr ChuckBevitt

Working solution:

string CSVFilePathName = APP_PATH + "Facilities.csv";
string[] Lines = File.ReadAllLines(CSVFilePathName);
string[] Fields;
Fields = Lines[0].Split(new char[] { ',' });
int Cols = Fields.GetLength(0);
DataTable dt = new DataTable();
//1st row must be column names; force lower case to ensure matching later on.
for (int i = 0; i < Cols-1; i++)
        dt.Columns.Add(Fields[i].ToLower(), typeof(string));
DataRow Row;
for (int i = 0; i < Lines.GetLength(0)-1; i++)
{
        Fields = Lines[i].Split(new char[] { ',' });
        Row = dt.NewRow();
        for (int f = 0; f < Cols-1; f++)
                Row[f] = Fields[f];
        dt.Rows.Add(Row);
}
Footsie answered 12/7, 2013 at 14:5 Comment(1)
So this solves a memory issue right? This is line by line processing and not persisting in memory so there should be no exceptions? I like the way that this is processed but doesn't File.ReadAllLines() save all into memory? I think you're supposed to use File.ReadLines() to avoid huge memory buffer? This is a good answer for the question at hand I just want to know about memory concerns.Rhotacism
W
4
    private static DataTable LoadCsvData(string refPath)
    {
        var cfg = new Configuration() { Delimiter = ",", HasHeaderRecord = true };
        var result = new DataTable();
        using (var sr = new StreamReader(refPath, Encoding.UTF8, false, 16384 * 2))
        {
            using (var rdr = new CsvReader(sr, cfg))
            using (var dataRdr = new CsvDataReader(rdr))
            {
                result.Load(dataRdr);
            }
        }
        return result;
    }

using: https://joshclose.github.io/CsvHelper/

Willenewillet answered 13/1, 2020 at 17:23 Comment(1)
Note that in release 13 Configuration was renamed to CsvConfiguration to avoid namespace conflicts. Demo of this answer working: dotnetfiddle.net/sdwc6iPalumbo
F
3
public class Csv
{
    public static DataTable DataSetGet(string filename, string separatorChar, out List<string> errors)
    {
        errors = new List<string>();
        var table = new DataTable("StringLocalization");
        using (var sr = new StreamReader(filename, Encoding.Default))
        {
            string line;
            var i = 0;
            while (sr.Peek() >= 0)
            {
                try
                {
                    line = sr.ReadLine();
                    if (string.IsNullOrEmpty(line)) continue;
                    var values = line.Split(new[] {separatorChar}, StringSplitOptions.None);
                    var row = table.NewRow();
                    for (var colNum = 0; colNum < values.Length; colNum++)
                    {
                        var value = values[colNum];
                        if (i == 0)
                        {
                            table.Columns.Add(value, typeof (String));
                        }
                        else
                        {
                            row[table.Columns[colNum]] = value;
                        }
                    }
                    if (i != 0) table.Rows.Add(row);
                }
                catch(Exception ex)
                {
                    errors.Add(ex.Message);
                }
                i++;
            }
        }
        return table;
    }
}
Fari answered 3/6, 2011 at 15:9 Comment(0)
C
3

I came across this piece of code that uses Linq and regex to parse a CSV file. The refering article is now over a year and a half old, but have not come across a neater way to parse a CSV using Linq (and regex) than this. The caveat is the regex applied here is for comma delimited files (will detect commas inside quotes!) and that it may not take well to headers, but there is a way to overcome these). Take a peak:

Dim lines As String() = System.IO.File.ReadAllLines(strCustomerFile)
Dim pattern As String = ",(?=(?:[^""]*""[^""]*"")*(?![^""]*""))"
Dim r As System.Text.RegularExpressions.Regex = New System.Text.RegularExpressions.Regex(pattern)
Dim custs = From line In lines _
            Let data = r.Split(line) _
                Select New With {.custnmbr = data(0), _
                                 .custname = data(1)}
For Each cust In custs
    strCUSTNMBR = Replace(cust.custnmbr, Chr(34), "")
    strCUSTNAME = Replace(cust.custname, Chr(34), "")
Next
Credent answered 2/9, 2011 at 18:24 Comment(0)
V
3

For those of you wishing not to use an external library, and prefer not to use OleDB, see the example below. Everything I found was either OleDB, external library, or simply splitting based on a comma! For my case OleDB was not working so I wanted something different.

I found an article by MarkJ that referenced the Microsoft.VisualBasic.FileIO.TextFieldParser method as seen here. The article is written in VB and doesn't return a datatable, so see my example below.

public static DataTable LoadCSV(string path, bool hasHeader)
    {
        DataTable dt = new DataTable();

        using (var MyReader = new Microsoft.VisualBasic.FileIO.TextFieldParser(path))
        {
            MyReader.TextFieldType = Microsoft.VisualBasic.FileIO.FieldType.Delimited;
            MyReader.Delimiters = new String[] { "," };

            string[] currentRow;

            //'Loop through all of the fields in the file.  
            //'If any lines are corrupt, report an error and continue parsing.  
            bool firstRow = true;
            while (!MyReader.EndOfData)
            {
                try
                {
                    currentRow = MyReader.ReadFields();

                    //Add the header columns
                    if (hasHeader && firstRow)
                    {
                        foreach (string c in currentRow)
                        {
                            dt.Columns.Add(c, typeof(string));
                        }

                        firstRow = false;
                        continue;
                    }

                    //Create a new row
                    DataRow dr = dt.NewRow();
                    dt.Rows.Add(dr);

                    //Loop thru the current line and fill the data out
                    for(int c = 0; c < currentRow.Count(); c++)
                    {
                        dr[c] = currentRow[c];
                    }
                }
                catch (Microsoft.VisualBasic.FileIO.MalformedLineException ex)
                {
                    //Handle the exception here
                }
            }
        }

        return dt;
    }
Vermiculate answered 9/5, 2013 at 20:53 Comment(0)
B
3

Very basic answer: if you don't have a complex csv that can use a simple split function this will work well for importing (note this imports as strings, i do datatype conversions later if i need to)

 private DataTable csvToDataTable(string fileName, char splitCharacter)
    {                
        StreamReader sr = new StreamReader(fileName);
        string myStringRow = sr.ReadLine();
        var rows = myStringRow.Split(splitCharacter);
        DataTable CsvData = new DataTable();
        foreach (string column in rows)
        {
            //creates the columns of new datatable based on first row of csv
            CsvData.Columns.Add(column);
        }
        myStringRow = sr.ReadLine();
        while (myStringRow != null)
        {
            //runs until string reader returns null and adds rows to dt 
            rows = myStringRow.Split(splitCharacter);
            CsvData.Rows.Add(rows);
            myStringRow = sr.ReadLine();
        }
        sr.Close();
        sr.Dispose();
        return CsvData;
    }

My method if I am importing a table with a string[] separater and handles the issue where the current line i am reading may have went to the next line in the csv or text file <- IN which case i want to loop until I get to the total number of lines in the first row (columns)

public static DataTable ImportCSV(string fullPath, string[] sepString)
    {
        DataTable dt = new DataTable();
        using (StreamReader sr = new StreamReader(fullPath))
        {
           //stream uses using statement because it implements iDisposable
            string firstLine = sr.ReadLine();
            var headers = firstLine.Split(sepString, StringSplitOptions.None);
            foreach (var header in headers)
            {
               //create column headers
                dt.Columns.Add(header);
            }
            int columnInterval = headers.Count();
            string newLine = sr.ReadLine();
            while (newLine != null)
            {
                //loop adds each row to the datatable
                var fields = newLine.Split(sepString, StringSplitOptions.None); // csv delimiter    
                var currentLength = fields.Count();
                if (currentLength < columnInterval)
                {
                    while (currentLength < columnInterval)
                    {
                       //if the count of items in the row is less than the column row go to next line until count matches column number total
                        newLine += sr.ReadLine();
                        currentLength = newLine.Split(sepString, StringSplitOptions.None).Count();
                    }
                    fields = newLine.Split(sepString, StringSplitOptions.None);
                }
                if (currentLength > columnInterval)
                {  
                    //ideally never executes - but if csv row has too many separators, line is skipped
                    newLine = sr.ReadLine();
                    continue;
                }
                dt.Rows.Add(fields);
                newLine = sr.ReadLine();
            }
            sr.Close();
        }

        return dt;
    }
Brinkley answered 13/4, 2015 at 20:46 Comment(2)
Nice you just didn't declare rows as string[] yet.Abaft
@AnimalStyle you're right - updated with more robust method and declared rowsBrinkley
R
3

With Cinchoo ETL - an open source library, you can easily convert CSV file to DataTable with few lines of code.

using (var p = new ChoCSVReader(** YOUR CSV FILE **)
     .WithFirstLineHeader()
    )
{
    var dt = p.AsDataTable();
}

For more information, please visit codeproject article.

Sample fiddle: https://dotnetfiddle.net/jxhUVX

Rattoon answered 1/7, 2018 at 16:40 Comment(0)
L
3

I've recently written a CSV parser for .NET that I'm claiming is currently the fastest available as a nuget package: Sylvan.Data.Csv.

Using this library to load a DataTable is extremely easy.

using var dr = CsvDataReader.Create("data.csv");
var dt = new DataTable();
dt.Load(dr);

Assuming your file is a standard comma separated files with headers, that's all you need. There are also options to allow reading files without headers, and using alternate delimiters etc.

It is also possible to provide a custom schema for the CSV file so that columns can be treated as something other than string values. This will allow the DataTable columns to be loaded with values that can be easier to work with, as you won't have to coerce them when you access them.

This can be accomplished by providing an ICsvSchemaProvider implementation, which exposes a single method DbColumn? GetColumn(string? name, int ordinal). The DbColumn type is an abstract type defined in System.Data.Common, which means that you would have to provide an implementation of that too if you implement your own schema provider. The DbColumn type exposes a variety of metadata about a column, and you can choose to expose as much of the metadata as needed. The most important metadata is the DataType and AllowDBNull.

A very simple implementation that would expose type information could look like the following:

class TypedCsvColumn : DbColumn
{
    public TypedCsvColumn(Type type, bool allowNull)
    {
        // if you assign ColumnName here, it will override whatever is in the csv header
        this.DataType = type;
        this.AllowDBNull = allowNull;
    }
}
    
class TypedCsvSchema : ICsvSchemaProvider
{
    List<TypedCsvColumn> columns;

    public TypedCsvSchema()
    {
        this.columns = new List<TypedCsvColumn>();
    }

    public TypedCsvSchema Add(Type type, bool allowNull = false)
    {
        this.columns.Add(new TypedCsvColumn(type, allowNull));
        return this;
    }

    DbColumn? ICsvSchemaProvider.GetColumn(string? name, int ordinal)
    {
        return ordinal < columns.Count ? columns[ordinal] : null;
    }
}

To consume this implementation you would do the following:


var schema = new TypedCsvSchema()
    .Add(typeof(int))
    .Add(typeof(string))
    .Add(typeof(double), true)
    .Add(typeof(DateTime))
    .Add(typeof(DateTime), true);
var options = new CsvDataReaderOptions
{
    Schema = schema
};


using var dr = CsvDataReader.Create("data.csv", options);
...
Leathaleather answered 16/4, 2020 at 17:30 Comment(2)
I cannot use any of your code, I guess it's outdated. Could you perhaps update your post?Rigamarole
@speyck Indeed, it was out of date. I've updated the samples here. If you have any questions or issues feel free to open an issue at github.com/MarkPflug/SylvanLeathaleather
H
2

Here's a solution that uses ADO.Net's ODBC text driver:

Dim csvFileFolder As String = "C:\YourFileFolder"
Dim csvFileName As String = "YourFile.csv"

'Note that the folder is specified in the connection string,
'not the file. That's specified in the SELECT query, later.
Dim connString As String = "Driver={Microsoft Text Driver (*.txt; *.csv)};Dbq=" _
    & csvFileFolder & ";Extended Properties=""Text;HDR=No;FMT=Delimited"""
Dim conn As New Odbc.OdbcConnection(connString)

'Open a data adapter, specifying the file name to load
Dim da As New Odbc.OdbcDataAdapter("SELECT * FROM [" & csvFileName & "]", conn)
'Then fill a data table, which can be bound to a grid
Dim dt As New DataTableda.Fill(dt)

grdCSVData.DataSource = dt

Once filled, you can value properties of the datatable, like ColumnName, to make utilize all the powers of the ADO.Net data objects.

In VS2008 you can use Linq to achieve the same effect.

NOTE: This may be a duplicate of this SO question.

Husky answered 26/6, 2009 at 17:24 Comment(0)
I
2

Can't resist adding my own spin to this. This is so much better and more compact than what I've used in the past.

This solution:

  • Does not depend on a database driver or 3rd party library.
  • Will not fail on duplicate column names
  • Handles commas in the data
  • Handles any delimiter, not just commas (although that is the default)

Here's what I came up with:

  Public Function ToDataTable(FileName As String, Optional Delimiter As String = ",") As DataTable
    ToDataTable = New DataTable
    Using TextFieldParser As New Microsoft.VisualBasic.FileIO.TextFieldParser(FileName) With
      {.HasFieldsEnclosedInQuotes = True, .TextFieldType = Microsoft.VisualBasic.FileIO.FieldType.Delimited, .TrimWhiteSpace = True}
      With TextFieldParser
        .SetDelimiters({Delimiter})
        .ReadFields.ToList.Unique.ForEach(Sub(x) ToDataTable.Columns.Add(x))
        ToDataTable.Columns.Cast(Of DataColumn).ToList.ForEach(Sub(x) x.AllowDBNull = True)
        Do Until .EndOfData
          ToDataTable.Rows.Add(.ReadFields.Select(Function(x) Text.BlankToNothing(x)).ToArray)
        Loop
      End With
    End Using
  End Function

It depends on an extension method (Unique) to handle duplicate column names to be found as my answer in How to append unique numbers to a list of strings

And here's the BlankToNothing helper function:

  Public Function BlankToNothing(ByVal Value As String) As Object 
    If String.IsNullOrEmpty(Value) Then Return Nothing
    Return Value
  End Function
Indopacific answered 2/11, 2014 at 19:37 Comment(0)
B
2

I use a library called ExcelDataReader, you can find it on NuGet. Be sure to install both ExcelDataReader and the ExcelDataReader.DataSet extension (the latter provides the required AsDataSet method referenced below).

I encapsulated everything in one function, you can copy it in your code directly. Give it a path to CSV file, it gets you a dataset with one table.

public static DataSet GetDataSet(string filepath)
{
   var stream = File.OpenRead(filepath);

   try
   {
       var reader = ExcelReaderFactory.CreateCsvReader(stream, new ExcelReaderConfiguration()
       {
           LeaveOpen = false
       });

       var result = reader.AsDataSet(new ExcelDataSetConfiguration()
       {
           // Gets or sets a value indicating whether to set the DataColumn.DataType 
           // property in a second pass.
           UseColumnDataType = true,

           // Gets or sets a callback to determine whether to include the current sheet
           // in the DataSet. Called once per sheet before ConfigureDataTable.
           FilterSheet = (tableReader, sheetIndex) => true,

           // Gets or sets a callback to obtain configuration options for a DataTable. 
           ConfigureDataTable = (tableReader) => new ExcelDataTableConfiguration()
           {
               // Gets or sets a value indicating the prefix of generated column names.
               EmptyColumnNamePrefix = "Column",

               // Gets or sets a value indicating whether to use a row from the 
               // data as column names.
               UseHeaderRow = true,

               // Gets or sets a callback to determine which row is the header row. 
               // Only called when UseHeaderRow = true.
               ReadHeaderRow = (rowReader) =>
               {
                   // F.ex skip the first row and use the 2nd row as column headers:
                   //rowReader.Read();
               },

               // Gets or sets a callback to determine whether to include the 
               // current row in the DataTable.
               FilterRow = (rowReader) =>
               {
                   return true;
               },

               // Gets or sets a callback to determine whether to include the specific
               // column in the DataTable. Called once per column after reading the 
               // headers.
               FilterColumn = (rowReader, columnIndex) =>
               {
                   return true;
               }
           }
       });

       return result;
   }
   catch (Exception ex)
   {
       return null;
   }
   finally
   {
       stream.Close();
       stream.Dispose();
   }
}
Bottleneck answered 14/5, 2019 at 14:13 Comment(2)
It's 2020 and this is a great solution compared to some of the older answers on here. It's nicely packaged and uses a popular and lightweight library from NuGet. And it's flexible--if your CSV is in memory, simply pass it in as a MemoryStream instead of a file path. The DataTable that OP asked for is easily extracted from the DataSet like this: result.Tables[0]Kiva
Before calling this method ensure to have pasted System.Text.Encoding.RegisterProvider(System.Text.CodePagesEncodingProvider.Instance); as explained at https://mcmap.net/q/11390/-system-notsupportedexception-no-data-is-available-for-encoding-1252Psalms
T
0

Use this, one function solve all problems of comma and quote:

public static DataTable CsvToDataTable(string strFilePath)
    {

        if (File.Exists(strFilePath))
        {

            string[] Lines;
            string CSVFilePathName = strFilePath;

            Lines = File.ReadAllLines(CSVFilePathName);
            while (Lines[0].EndsWith(","))
            {
                Lines[0] = Lines[0].Remove(Lines[0].Length - 1);
            }
            string[] Fields;
            Fields = Lines[0].Split(new char[] { ',' });
            int Cols = Fields.GetLength(0);
            DataTable dt = new DataTable();
            //1st row must be column names; force lower case to ensure matching later on.
            for (int i = 0; i < Cols; i++)
                dt.Columns.Add(Fields[i], typeof(string));
            DataRow Row;
            int rowcount = 0;
            try
            {
                string[] ToBeContinued = new string[]{};
                bool lineToBeContinued = false;
                for (int i = 1; i < Lines.GetLength(0); i++)
                {
                    if (!Lines[i].Equals(""))
                    {
                        Fields = Lines[i].Split(new char[] { ',' });
                        string temp0 = string.Join("", Fields).Replace("\"\"", "");
                        int quaotCount0 = temp0.Count(c => c == '"');
                        if (Fields.GetLength(0) < Cols || lineToBeContinued || quaotCount0 % 2 != 0)
                        {
                            if (ToBeContinued.GetLength(0) > 0)
                            {
                                ToBeContinued[ToBeContinued.Length - 1] += "\n" + Fields[0];
                                Fields = Fields.Skip(1).ToArray();
                            }
                            string[] newArray = new string[ToBeContinued.Length + Fields.Length];
                            Array.Copy(ToBeContinued, newArray, ToBeContinued.Length);
                            Array.Copy(Fields, 0, newArray, ToBeContinued.Length, Fields.Length);
                            ToBeContinued = newArray;
                            string temp = string.Join("", ToBeContinued).Replace("\"\"", "");
                            int quaotCount = temp.Count(c => c == '"');
                            if (ToBeContinued.GetLength(0) >= Cols && quaotCount % 2 == 0 )
                            {
                                Fields = ToBeContinued;
                                ToBeContinued = new string[] { };
                                lineToBeContinued = false;
                            }
                            else
                            {
                                lineToBeContinued = true;
                                continue;
                            }
                        }

                        //modified by Teemo @2016 09 13
                        //handle ',' and '"'
                        //Deserialize CSV following Excel's rule:
                        // 1: If there is commas in a field, quote the field.
                        // 2: Two consecutive quotes indicate a user's quote.

                        List<int> singleLeftquota = new List<int>();
                        List<int> singleRightquota = new List<int>();

                        //combine fileds if number of commas match
                        if (Fields.GetLength(0) > Cols) 
                        {
                            bool lastSingleQuoteIsLeft = true;
                            for (int j = 0; j < Fields.GetLength(0); j++)
                            {
                                bool leftOddquota = false;
                                bool rightOddquota = false;
                                if (Fields[j].StartsWith("\"")) 
                                {
                                    int numberOfConsecutiveQuotes = 0;
                                    foreach (char c in Fields[j]) //start with how many "
                                    {
                                        if (c == '"')
                                        {
                                            numberOfConsecutiveQuotes++;
                                        }
                                        else
                                        {
                                            break;
                                        }
                                    }
                                    if (numberOfConsecutiveQuotes % 2 == 1)//start with odd number of quotes indicate system quote
                                    {
                                        leftOddquota = true;
                                    }
                                }

                                if (Fields[j].EndsWith("\""))
                                {
                                    int numberOfConsecutiveQuotes = 0;
                                    for (int jj = Fields[j].Length - 1; jj >= 0; jj--)
                                    {
                                        if (Fields[j].Substring(jj,1) == "\"") // end with how many "
                                        {
                                            numberOfConsecutiveQuotes++;
                                        }
                                        else
                                        {
                                            break;
                                        }
                                    }

                                    if (numberOfConsecutiveQuotes % 2 == 1)//end with odd number of quotes indicate system quote
                                    {
                                        rightOddquota = true;
                                    }
                                }
                                if (leftOddquota && !rightOddquota)
                                {
                                    singleLeftquota.Add(j);
                                    lastSingleQuoteIsLeft = true;
                                }
                                else if (!leftOddquota && rightOddquota)
                                {
                                    singleRightquota.Add(j);
                                    lastSingleQuoteIsLeft = false;
                                }
                                else if (Fields[j] == "\"") //only one quota in a field
                                {
                                    if (lastSingleQuoteIsLeft)
                                    {
                                        singleRightquota.Add(j);
                                    }
                                    else
                                    {
                                        singleLeftquota.Add(j);
                                    }
                                }
                            }
                            if (singleLeftquota.Count == singleRightquota.Count)
                            {
                                int insideCommas = 0;
                                for (int indexN = 0; indexN < singleLeftquota.Count; indexN++)
                                {
                                    insideCommas += singleRightquota[indexN] - singleLeftquota[indexN];
                                }
                                if (Fields.GetLength(0) - Cols >= insideCommas) //probabaly matched
                                {
                                    int validFildsCount = insideCommas + Cols; //(Fields.GetLength(0) - insideCommas) may be exceed the Cols
                                    String[] temp = new String[validFildsCount];
                                    int totalOffSet = 0;
                                    for (int iii = 0; iii < validFildsCount - totalOffSet; iii++)
                                    {
                                        bool combine = false;
                                        int storedIndex = 0;
                                        for (int iInLeft = 0; iInLeft < singleLeftquota.Count; iInLeft++)
                                        {
                                            if (iii + totalOffSet == singleLeftquota[iInLeft])
                                            {
                                                combine = true;
                                                storedIndex = iInLeft;
                                                break;
                                            }
                                        }
                                        if (combine)
                                        {
                                            int offset = singleRightquota[storedIndex] - singleLeftquota[storedIndex];
                                            for (int combineI = 0; combineI <= offset; combineI++)
                                            {
                                                temp[iii] += Fields[iii + totalOffSet + combineI] + ",";
                                            }
                                            temp[iii] = temp[iii].Remove(temp[iii].Length - 1, 1);
                                            totalOffSet += offset;
                                        }
                                        else
                                        {
                                            temp[iii] = Fields[iii + totalOffSet];
                                        }
                                    }
                                    Fields = temp;
                                }
                            }
                        }
                        Row = dt.NewRow();
                        for (int f = 0; f < Cols; f++)
                        {
                            Fields[f] = Fields[f].Replace("\"\"", "\""); //Two consecutive quotes indicate a user's quote
                            if (Fields[f].StartsWith("\""))
                            {
                                if (Fields[f].EndsWith("\""))
                                {
                                    Fields[f] = Fields[f].Remove(0, 1);
                                    if (Fields[f].Length > 0)
                                    {
                                        Fields[f] = Fields[f].Remove(Fields[f].Length - 1, 1);
                                    }
                                }
                            }
                            Row[f] = Fields[f];
                        }
                        dt.Rows.Add(Row);
                        rowcount++;
                    }
                }
            }
            catch (Exception ex)
            {
                throw new Exception( "row: " + (rowcount+2) + ", " + ex.Message);
            }
            //OleDbConnection connection = new OleDbConnection(string.Format(@"Provider=Microsoft.Jet.OLEDB.4.0;Data Source={0}; Extended Properties=""text;HDR=Yes;FMT=Delimited"";", FilePath + FileName));
            //OleDbCommand command = new OleDbCommand("SELECT * FROM " + FileName, connection);
            //OleDbDataAdapter adapter = new OleDbDataAdapter(command);
            //DataTable dt = new DataTable();
            //adapter.Fill(dt);
            //adapter.Dispose();
            return dt;
        }
        else
            return null;

        //OleDbConnection connection = new OleDbConnection(string.Format(@"Provider=Microsoft.Jet.OLEDB.4.0;Data Source={0}; Extended Properties=""text;HDR=Yes;FMT=Delimited"";", strFilePath));
        //OleDbCommand command = new OleDbCommand("SELECT * FROM " + strFileName, connection);
        //OleDbDataAdapter adapter = new OleDbDataAdapter(command);
        //DataTable dt = new DataTable();
        //adapter.Fill(dt);
        //return dt;
    }
Toluidine answered 22/1, 2019 at 8:18 Comment(0)
C
0

Just sharing this extension methods, I hope that it can help someone.

public static List<string> ToCSV(this DataSet ds, char separator = '|')
{
    List<string> lResult = new List<string>();

    foreach (DataTable dt in ds.Tables)
    {
        StringBuilder sb = new StringBuilder();
        IEnumerable<string> columnNames = dt.Columns.Cast<DataColumn>().
                                          Select(column => column.ColumnName);
        sb.AppendLine(string.Join(separator.ToString(), columnNames));

        foreach (DataRow row in dt.Rows)
        {
            IEnumerable<string> fields = row.ItemArray.Select(field =>
              string.Concat("\"", field.ToString().Replace("\"", "\"\""), "\""));
            sb.AppendLine(string.Join(separator.ToString(), fields));
        }

        lResult.Add(sb.ToString());
    }
    return lResult;
}

public static DataSet CSVtoDataSet(this List<string> collectionCSV, char separator = '|')
{
    var ds = new DataSet();

    foreach (var csv in collectionCSV)
    {
        var dt = new DataTable();

        var readHeader = false;
        foreach (var line in csv.Split(new[] { Environment.NewLine }, StringSplitOptions.None))
        {
            if (!readHeader)
            {
                foreach (var c in line.Split(separator))
                    dt.Columns.Add(c);
            }
            else
            {
                dt.Rows.Add(line.Split(separator));
            }
        }

        ds.Tables.Add(dt);
    }

    return ds;
}
Criminal answered 1/3, 2019 at 20:26 Comment(0)
S
0
 Public Function ReadCsvFileToDataTable(strFilePath As String) As DataTable
    Dim dtCsv As DataTable = New DataTable()
    Dim Fulltext As String
    Using sr As StreamReader = New StreamReader(strFilePath)
        While Not sr.EndOfStream
            Fulltext = sr.ReadToEnd().ToString()
            Dim rows As String() = Fulltext.Split(vbLf)
            For i As Integer = 0 To rows.Count() - 1 - 1
                Dim rowValues As String() = rows(i).Split(","c)
                If True Then
                    If i = 0 Then
                        For j As Integer = 0 To rowValues.Count() - 1
                            dtCsv.Columns.Add(rowValues(j))
                        Next
                    Else
                        Dim dr As DataRow = dtCsv.NewRow()
                        For k As Integer = 0 To rowValues.Count() - 1
                            dr(k) = rowValues(k).ToString()
                        Next
                        dtCsv.Rows.Add(dr)
                    End If
                End If
            Next
        End While
    End Using
    Return dtCsv
End Function
Stoneman answered 14/3, 2019 at 8:20 Comment(0)
M
0

Converter csv to DataTable. You can choose separator, isFirstRowHeaders and prefix for extra headers if your first row is not full list of headers, or you automaticle generate headers.

        public DataTable GetDataFromCsv(string path, char separator, bool isFirstRowHeaders = true,  string prefixAutoHeader = "AutoHeader_")
    {
        DataTable dt = new DataTable();
        string csvData;
        try
        {
            using (StreamReader sr = new StreamReader(path))
            {
                csvData = sr.ReadToEnd().ToString();

                //Split csvData by Rows
                List<string> csvRows = new List<string>(csvData.Split('\n'));

                //Split rows by cells with selected separator
                List<List<string>> csvCells = new List<List<string>>();
                csvRows.ForEach(r => csvCells.Add(new List<string>(r.Split(separator))));

                //definition row max size, for adding extra headers
                int maxSizeRow = csvCells.OrderByDescending(r => r.Count).First().Count;

                //if isFirstRowHeaders then filling datatable headers from first csvRow  
                if (isFirstRowHeaders)
                {
                    foreach (string header in csvCells[0])
                    {
                        dt.Columns.Add(header);
                    }
                }

                //Adding extra headers in datatable or create AutoHeaders if isFirstRowHeaders == false
                for (int i = dt.Columns.Count; i < maxSizeRow; i++)
                {
                    dt.Columns.Add(prefixAutoHeader + i);
                }

                //Filling datatable
                foreach (var row in csvCells)
                {
                    //Skip the first row if it is consist headers
                    if (isFirstRowHeaders)
                    {
                        isFirstRowHeaders = false;
                    }
                    else
                    {
                        //creating datatable row and Add to datatable
                        int i = 0;
                        DataRow toInsert = dt.NewRow();
                        foreach (string cell in row)
                        {
                            try
                            {
                                toInsert[i] = cell;
                            }
                            catch (Exception ex) { }
                            i++;
                        }
                        dt.Rows.Add(toInsert);
                    }
                }
            }
            return dt;
        }
        catch (Exception e)
        {
            return null;
        }
    }
Makeup answered 27/7, 2022 at 17:21 Comment(1)
I haven't tested this, but does this handle when a 'column' contains the same value as the separator (comma) and the value is wrapped in quotes?Supersede

© 2022 - 2024 — McMap. All rights reserved.