Writing a CSV file in .net
Asked Answered
C

11

44

I have a requirement to export a dataset as a CSV file.

I have spent a while searching for a set of rules to go by and realised there are quite a few rules and exceptions when writing a CSV file.

http://knab.ws/blog/index.php?/archives/3-CSV-file-parser-and-writer-in-C-Part-1.html http://bytes.com/topic/c-sharp/answers/236875-problems-streamwriter-output-csv http://social.msdn.microsoft.com/forums/en-US/csharpgeneral/thread/0073fcbb-adab-40f0-b768-4bba803d3ccd

So now it is not a simple process of separating strings with commas, I have searched for an existing CSV writer either 3rd party or (hopefully!) included in the .net framework.

Edit: New link: http://www.thinqlinq.com/Post.aspx/Title/LINQ-to-CSV-using-DynamicObject-and-TextFieldParser

The TextFieldParser is a VB object (can be referenced from C#) that will automatically parse CSV files. :)

I was wondering if anybody knows any handy .Net (2.0 -> 3.5 and 4.0) libraries that can be used to generate a correctly formatted CSV file.

Also, if there are any sets of rules for generating CSV files.

There are a lot of details of CSV readers and parsing CSV files, however not as many about writing (ok, I know it is just the opposite :P ).

http://www.codeproject.com/KB/database/CsvReader.aspx

Any help would be much appreciated :)

I found another article with some more detailed CSV rules: http://www.creativyst.com/Doc/Articles/CSV/CSV01.htm

A neat 3rd party library is Linq-to-CSV (not framework library): http://www.codeproject.com/KB/linq/LINQtoCSV.aspx

Thanks for your help everyone. I have decided that the best solution will be to create a simple static class that will do the special character replacement (that Chris mentioned).

If I had a need for Linq querying my CSV files, I would look at the CodeProjects implementation of Linq-to-CSV.

Thanks again :)

Cain answered 6/11, 2009 at 0:42 Comment(3)
I think since you are the one exporting you can be rather relaxed when you write, if you just follow the general rules most programs like Excel figure out how to read them.Pluvial
That's true. I am in the unfortunate position of writing an "Export" function that does not specify potential uses. I assume 99% of the time this will be excel or perhaps (unlikely) SSIS packages by other applications. I can only assume though.Cain
You can try my very light weight delimited file writer: gist.github.com/eranbetzalel/…Atheroma
D
21

If there are any commas in your cell, surround the entire cell with double quotes, eg:

cell 1,cell 2,"This is one cell, even with a comma",cell4,etc

And if you want a literal double quote, do two of them, eg:

cell 1,cell 2,"This is my cell and it has ""quotes"" in it",cell 4,etc

As for dates, stick to ISO format, and you should be fine (eg yyyy-mm-dd hh:mm:ss)

Demeter answered 6/11, 2009 at 0:52 Comment(7)
Are they the only "rules" per se? What about newlines for example. Do you know of any references for these rules/requirments? I assume (from searches) that there is no standard for these file types, just proprietry requirements (eg what will work with excel :P ). Thanks for your input.Cain
Basically, i go with whatever works with excel. As for newlines, C#'s various AppendLine and WriteLine functions all seem to append \r\n, which seems to agree with excel.Demeter
And remember to enclose values with newlines in double quotes.Senecal
As per en.wikipedia.org/wiki/Comma-separated_values this answer is way to simple for the reality of CSVs formatting requirements if you needto be strict.Atomics
BTW about dates: what you quoted isn't an ISO format (ISO 8601 uses T, not space for separation between date and time)Breunig
Take into account the fact that any CSV you may save will always be culture-dependent on the culture you used for the CSV. If you try to open a CSV formatted as @Demeter said into e.g. an italian/german copy of Excel (or Excel on an italian/german Windows) it will keep all your values on the first column, because in those cultures the list separator is the semicolon (not the comma). Try to use a library that adheres to RFC4180.Anticipatory
This link and the links in it provide some good info and also tell about the lack of standards in CSV - tools.ietf.org/html/rfc4180Foreshank
W
51

CsvHelper (a library I maintain) also available via NuGet.

CsvHelper can automatically write your class objects to a file for you.

var myObj = new MyCustomClass
{
    Prop1 = "one",
    Prop2 = 2
};
var streamWriter = // Create a writer to somewhere...
var csvWriter = new CsvWriter( streamWriter );

// You can write a single record.
csvWriter.WriteRecord( myObj );

// You can also write a collection of records.
var myRecords = new List<MyCustomClass>{ myObj };
csvWriter.WriteRecords( myRecords );
Wherein answered 22/2, 2010 at 23:51 Comment(1)
Btw, there were some mapping features added to CsvHelper that allow you to map your classes without using attributes. You can use a fluent mapping class instead, which allows you to map to classes that you don't have control over.Wherein
D
21

If there are any commas in your cell, surround the entire cell with double quotes, eg:

cell 1,cell 2,"This is one cell, even with a comma",cell4,etc

And if you want a literal double quote, do two of them, eg:

cell 1,cell 2,"This is my cell and it has ""quotes"" in it",cell 4,etc

As for dates, stick to ISO format, and you should be fine (eg yyyy-mm-dd hh:mm:ss)

Demeter answered 6/11, 2009 at 0:52 Comment(7)
Are they the only "rules" per se? What about newlines for example. Do you know of any references for these rules/requirments? I assume (from searches) that there is no standard for these file types, just proprietry requirements (eg what will work with excel :P ). Thanks for your input.Cain
Basically, i go with whatever works with excel. As for newlines, C#'s various AppendLine and WriteLine functions all seem to append \r\n, which seems to agree with excel.Demeter
And remember to enclose values with newlines in double quotes.Senecal
As per en.wikipedia.org/wiki/Comma-separated_values this answer is way to simple for the reality of CSVs formatting requirements if you needto be strict.Atomics
BTW about dates: what you quoted isn't an ISO format (ISO 8601 uses T, not space for separation between date and time)Breunig
Take into account the fact that any CSV you may save will always be culture-dependent on the culture you used for the CSV. If you try to open a CSV formatted as @Demeter said into e.g. an italian/german copy of Excel (or Excel on an italian/german Windows) it will keep all your values on the first column, because in those cultures the list separator is the semicolon (not the comma). Try to use a library that adheres to RFC4180.Anticipatory
This link and the links in it provide some good info and also tell about the lack of standards in CSV - tools.ietf.org/html/rfc4180Foreshank
O
18

I would just like to add there's an RFC that specifies the CSV format which is what I would regard as the canonical source.

Overword answered 17/2, 2010 at 7:14 Comment(1)
Thanks Richard, that is a very detailed piece of information :)Cain
S
6

I've used filehelpers extensively and it's pretty awesome for generating CSVs.

Sterilant answered 6/11, 2009 at 0:44 Comment(3)
Thanks, FileHelpers looks like a very handy (+ open-source) library. Unfortunately in this instance I am unable to add attributes to my objects which I would like to convert as CSV. Using .net reflector I couldn't see a way of doing this by passing in values/lists. Do you know if this is possible?Cain
You could just create some new "generator only" classes and use AutoMapper to map your real classes to the generator classes and then write those classes out using FileHelpers. I've done it before and it's pretty simple.Sterilant
Does filehelpers really do CSV? Sure they have Delimited but that's not the same as CSV with the quotation rules.Atomics
A
4

Here is the function you can use to generate a row of CSV file from string list (IEnumerable(Of String) or string array can be used as well):

Function CreateCSVRow(strArray As List(Of String)) As String
    Dim csvCols As New List(Of String)
    Dim csvValue As String
    Dim needQuotes As Boolean
    For i As Integer = 0 To strArray.Count() - 1
        csvValue = strArray(i)
        needQuotes = (csvValue.IndexOf(",", StringComparison.InvariantCulture) >= 0 _
                      OrElse csvValue.IndexOf("""", StringComparison.InvariantCulture) >= 0 _
                      OrElse csvValue.IndexOf(vbCrLf, StringComparison.InvariantCulture) >= 0)
        csvValue = csvValue.Replace("""", """""")
        csvCols.Add(If(needQuotes, """" & csvValue & """", csvValue))
    Next
    Return String.Join(",", csvCols.ToArray())
End Function

As I think, it won't be difficult to convert from VB.NET to C#)

Anxiety answered 19/11, 2013 at 18:24 Comment(1)
Thanks for the info Evgeny, I'm sure it will be useful for someone who comes across this question. :)Cain
R
3

For the specifications, see http://en.wikipedia.org/wiki/Comma-separated_values

Reeducate answered 6/11, 2009 at 1:52 Comment(0)
W
3

I know you said you found your answer, but I just wanted to give a vote for the LINQtoCSV library you mentioned. I've used it in a couple projects and it works really well for keeping your business code clean and not concerned with details / peculiarities of the file format.

Maybe in your specific case it is not too difficult to write the exporter, but the nice thing about this library is that it is bidirectional. If you find yourself having to consume the CSV down the road it's not much extra code, and/or it gives you a consistent library to use on future projects.

Welldisposed answered 6/11, 2009 at 3:35 Comment(1)
Thanks, it looks very handy indeed. Projects have different requirements and priorities so different solutions may be better suited to different project. Thanks and dont forget to vote up if you like it. :)Cain
E
2

You can use ODBC to read and write CSV files (via OdbcConnection and a suitable connection string). This should be reasonably good for generating CSV files, and will handle things like quoting for you; however I have run into some issues when using it to read CSV files generated by other programs.

Ellipticity answered 6/11, 2009 at 0:57 Comment(2)
Thanks, are there any references for getting started for this method?Cain
If you google for "odbc csv file" it throws up a good few links. The top one is c-sharpcorner.com/UploadFile/mahesh/… -- note you will need to scroll way down towards the bottom to find the C# code though!Ellipticity
F
2

Another rule to add to the others: Use the commas as field separators rather than as field terminators. The reason for this is that a trailing comma at the end of a line could be ambiguous: Does it have no significance or does it signify a NULL value following it?

Fiorenza answered 6/11, 2009 at 1:34 Comment(2)
Very good point. It would be nice to have an "end-of-row" separator instead of assuming a new-line. For example different OS's use different characters!Cain
CSV format specifies CRLF ("\r\n") as the terminator for each row.Liar
C
0

I found this important link which is quite neat. Haven't tried it yet, will let you know how it goes!

http://www.codeproject.com/KB/linq/LINQtoCSV.aspx

Looking more closely, this implementation essentially only uses basic rules too:

special chars = \n \" and the separator char.

if found special characters, then surround with quotes. Replace quote with double quote.

Essentially the rules Chris mentioned. I think the easiest way to do this is to create my helper method based on the simple rules and revise on a user-needs basis.

Cain answered 6/11, 2009 at 1:12 Comment(0)
P
0

Can you use a string array and then concatenate using:

string out = "";
string[] elements = { "1", "2" };
foreach(string s in elements) { out += s + "," };
out = out.substring(0, out.Length-1);
Polytonality answered 11/3, 2010 at 0:23 Comment(1)
you can replace the above code with String.Join(",", "1", "2", "etc...");Mydriasis

© 2022 - 2024 — McMap. All rights reserved.