DataTable To CSV
This question is NOT about parsing a CSV.

I am using the following code to create a CSV from a DataTable, but it is slow: 100 rows by 14 columns takes 4 seconds. Is there a faster way?

StringBuilder sb = new StringBuilder();
bool first = true;
int colCount = 0;
foreach (DataColumn dc in DT.Columns)
{
    if (first) first = false; else sb.Append(",");
    sb.Append("\"" + dc.ColumnName + "\"");
    colCount++;
}
sb.AppendLine();
foreach (DataRow dr in DT.Rows)
{  
    first = true;
    for (int i = 0; i < colCount; i++)
    {
        if (first) first = false; else sb.Append(",");
        sb.Append("\"" + dr[i].ToString().Trim() + "\"");
    }
    sb.AppendLine();
}
return sb.ToString();
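For reference, a variant that avoids the intermediate string concatenations by appending char literals directly (one of the suggestions from the comments) would look like this sketch; it produces the same output:

```csharp
// Sketch: same output as above, but no "\"" + s + "\"" temporary strings.
StringBuilder sb = new StringBuilder();
for (int c = 0; c < DT.Columns.Count; c++)
{
    if (c > 0) sb.Append(',');
    sb.Append('"').Append(DT.Columns[c].ColumnName).Append('"');
}
sb.AppendLine();
foreach (DataRow dr in DT.Rows)
{
    for (int i = 0; i < DT.Columns.Count; i++)
    {
        if (i > 0) sb.Append(',');
        sb.Append('"').Append(dr[i].ToString().Trim()).Append('"');
    }
    sb.AppendLine();
}
return sb.ToString();
```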

StringBuilder is not the problem here. Appending i from 0 to 1 million runs in 300 milliseconds:

StringBuilder sb = new StringBuilder();
Stopwatch sw = new Stopwatch();
sw.Start();
for (int i = 0; i < 1000000; i++)
{
    sb.Append(i.ToString());
}
sw.Stop();
Debug.Write(sw.ElapsedMilliseconds.ToString());
Valedictory answered 14/2, 2013 at 19:45 Comment(12)
@DourHighArch The question is creating a CSV. NOT parsing. There is nothing in this question about parsing a CSV.Valedictory
My only comment would be that you are writing your output into a StringBuilder - why not write directly to a stream?Calculable
Not sure how much you would gain, but you could eliminate the string expressions, i.e. change sb.Append("\"" + dc.ColumnName + "\""); to sb.Append('"'); sb.Append(dc.ColumnName); sb.Append('"');. Same in your second loop. Also, single character literals may be faster than strings, where applicable. Finally (nit) first in your second loop can be eliminated in favor of a check against i==0.Lenticel
@Calculable I took out the SB calls and the time is the same. Loading a SB with 40,000 characters is not what is taking 4 seconds.Valedictory
@500-InternalServerError But would not (i==0) take longer than (first)?Valedictory
No, at the CPU level the complexity of the two is equivalent.Lenticel
@500-InternalServerError I tested and you are correct.Valedictory
Apologies @Blam, I misunderstood your question. You may remove the proposed answer, or leave it as a warning to others who may not read the questions correctly.Stronghold
That does seem kind of odd, that it's taking that long. What exactly is in those columns? Massive blocks of data that require transformation, or just simple int/string columns? Some example of the data itself might help. I have a project where I write CSVs from IDataReaders and it's reasonable even with millions of rows.Desecrate
@PaulFarry One report that is just 11053 characters takes 2.7 seconds. Yes, I know a DataReader is much faster. That is why I am confused by this speed.Valedictory
Haven't got an answer for you, but I mocked up some code and I found no issue with the DataTable stuff. Are you sure your DataTable is the actual problem, and not something else in your application maxing out the CPU/IO and not giving your code enough of a chance to run? The example I did is here: gist.github.com/PaulFarry/6e5e1e80f484cfad1218Desecrate
@PaulFarry Yes, your sample does run fast on my PC. Did not test it on the server. Not sure what is going on. It fills from a DataAdapter, but that should not make a difference. It is running on a server that is at high CPU, but other parts of the app are responsive. I will reboot and test again when the server is idle.Valedictory
There are faster ways to do the string concatenation and some of the other logic. Building the whole output in a StringBuilder may also be part of the slowdown. A library will have these details worked out already and will probably perform faster, as others have suggested.

Here is some code using CsvHelper (which I wrote).

using( var dt = new DataTable() )
{
    dt.Load( dataReader );
    foreach( DataColumn column in dt.Columns )
    {
        csv.WriteField( column.ColumnName );
    }
    csv.NextRecord();

    foreach( DataRow row in dt.Rows )
    {
        for( var i = 0; i < dt.Columns.Count; i++ )
        {
            csv.WriteField( row[i] );
        }
        csv.NextRecord();
    }
}

If you have the DataReader, you don't even need the DataTable, which should speed it up some more.

var hasHeaderBeenWritten = false;
while( dataReader.Read() )
{
    if( !hasHeaderBeenWritten )
    {
        for( var i = 0; i < dataReader.FieldCount; i++ )
        {
            csv.WriteField( dataReader.GetName( i ) );
        }
        csv.NextRecord();
        hasHeaderBeenWritten = true;
    }

    for( var i = 0; i < dataReader.FieldCount; i++ )
    {
        csv.WriteField( dataReader[i] );
    }
    csv.NextRecord();
}
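Both snippets above assume a csv writer already in scope. With current CsvHelper versions (which require a CultureInfo argument, unlike the versions current when this was written), that setup would look roughly like this sketch; the file path is just a placeholder:

```csharp
// Assumed setup for the snippets above.
// Needs: using System.IO; using System.Globalization; using CsvHelper;
using (var writer = new StreamWriter("export.csv"))
using (var csv = new CsvWriter(writer, CultureInfo.InvariantCulture))
{
    // ... header and row writing as shown above ...
}
```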
Congruence answered 22/3, 2015 at 20:51 Comment(1)
Thanks for the example. It's missing the using (var csv = new CsvWriter(writer)) bit, which might confuse some people.Acarpous
Here is a complete version as an extension method that converts the DataTable to a CSV string instead of writing to a file:

public static string ToCsv(this DataTable dt)
{
    using var writer = new StringWriter();
    using (var csv = new CsvWriter(writer, CultureInfo.InvariantCulture))
    {
        // write column names
        foreach (DataColumn column in dt.Columns)
        {
            csv.WriteField(column.ColumnName);
        }
        csv.NextRecord();
        
        foreach (DataRow row in dt.Rows)
        {
            for (var i = 0; i < dt.Columns.Count; i++)
            {
                csv.WriteField(row[i]);
            }
            csv.NextRecord();
        }
    }
    
    return writer.ToString();
}
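Usage is then a one-liner (the table and column names here are made up for illustration):

```csharp
// Hypothetical usage of the ToCsv() extension above.
var dt = new DataTable();
dt.Columns.Add("Id");
dt.Columns.Add("Name");
dt.Rows.Add("1", "Alice");

string csvText = dt.ToCsv();
// First line of csvText is the header: Id,Name
```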
Leclaire answered 14/2 at 9:49 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.