Splitting Comma Separated Values (CSV)
Asked Answered
M

8

2

How to split the CSV file in c sharp? And how to display this?

Myrick answered 26/6, 2009 at 5:40 Comment(5)
Important reading regarding CSV: creativyst.com/Doc/Articles/CSV/CSV01.htm#CSVariationsShoemaker
Not very precise, your question... if you give more details, you'll get more useful answers.Conchiferous
simply, i mean i have to read a string that i entered, and split up and stores into an array.Myrick
@Nisam: what Benjol meant is that if you show actual examples of how the strings look like, it will be easier to provide examples of how to split them.Salamander
Ok. i have one CSV file named address.csv (i have no code example), I have to just split the contents of the file.Myrick
M
0

I had got the result for my query. its like simple like i had read a file using io.file. and all the text are stored into a string. After that i splitted with a seperator. The code is shown below.

using System;
using System.Collections.Generic;
using System.Text;

namespace CSV
{
    class Program
    {
        static void Main(string[] args)
        {

            string csv = "user1, user2, user3,user4,user5";

            string[] split = csv.Split(new char[] {',',' '});
            foreach(string s in split)
            {
                if (s.Trim() != "")
                    Console.WriteLine(s);
            }
            Console.ReadLine();
        }
    }
}
Myrick answered 26/6, 2009 at 6:45 Comment(1)
I'm glad you found a solution. If you need to look further in depth to CSV files, you may want to have a look at some of the other methods and links posted here. It seems like you are aware of the exact format that your CSV's will be in. Also, consider the file sizes. A string representing the entire CSV is okay up to a couple of megabytes, but take a look at the Codeproject link posted by Vinko if you're dealing with large files. Even if you don't need to use them in this example, you should make sure you understand what StringSplitOptions and the "count" parameter to String.Split do. :-)Access
U
12

I've been using the TextFieldParser Class in the Microsoft.VisualBasic.FileIO namespace for a C# project I'm working on. It will handle complications such as embedded commas or fields that are enclosed in quotes etc. It returns a string[] and, in addition to CSV files, can also be used for parsing just about any type of structured text file.

Unequaled answered 26/6, 2009 at 6:41 Comment(2)
Interesting. I've never looked here before. I recently wrote a complete CSV class in C#, and this would have helped. I wound up converting newlines to ~'s, and well, commas could only occur in the last field, so I used the maxentries parameter of String.Split to capture the entire last field, commas and all, but I'll have to look at this class. Thanks for the link.Access
Order for 'common' functionality 1) Core Framework 2) Extended Framework 3) Open Source Library 4) Cheap Paid for Library 5) Write it yourselfStandoff
M
5

Display where? About splitting, the best way is to use a good library to that effect.

This library is pretty good, I can recommend it heartily.

The problems using naïve methods is that the usually fail, there are tons of considerations without even thinking about performance:

  • What if the text contains commas
  • Support for the many existing formats (separated by semicolon, or text surrounded by quotes, or single quotes, etc.)
  • and many others
Mellins answered 26/6, 2009 at 5:44 Comment(0)
B
3

Import Micorosoft.VisualBasic as a reference (I know, its not that bad) and use Microsoft.VisualBasic.FileIO.TextFieldParser - this handles CSV files very well, and can be used in any .Net language.

Bede answered 29/6, 2009 at 14:32 Comment(0)
B
1

read the file one line at a time, then ...

foreach (String line in line.Split(new char[] { ',' }))
    Console.WriteLine(line);
Balmy answered 26/6, 2009 at 5:44 Comment(2)
Unfortunately this doesn't work if the values have embedded commas or fields are enclosed in quotes. Try the TextFieldParser class instead.Unequaled
Yeah I relalized that shortly after providing this example >.< It is easily overcome though, just keep a tally of the number of "'s and only split when there have been an even number, and don't count escaped ones :) So I guess you would have to tokenize the string before splitting. This complexity is what 3'rd party libraries help with.Balmy
C
1

This is a CSV parser I use on occasion.

Usage: (dgvMyView is a datagrid type.)

CSVReader reader = new CSVReader("C:\MyFile.txt");
reader.DisplayResults(dgvMyView);

Class:

using System.IO;
using System.Text.RegularExpressions;
using System.Windows.Forms;    
public class CSVReader
{
    private const string ESCAPE_SPLIT_REGEX = "({1}[^{1}]*{1})*(?<Separator>{0})({1}[^{1}]*{1})*";
    private string[] FieldNames;
    private List<string[]> Records;
    private int ReadIndex;

    public CSVReader(string File)
    {
        Records = new List<string[]>();
        string[] Record = null;
        StreamReader Reader = new StreamReader(File);
        int Index = 0;
        bool BlankRecord = true;

        FieldNames = GetEscapedSVs(Reader.ReadLine());
        while (!Reader.EndOfStream)
        {
            Record = GetEscapedSVs(Reader.ReadLine());
            BlankRecord = true;
            for (Index = 0; Index <= Record.Length - 1; Index++)
            {
                if (!string.IsNullOrEmpty(Record[Index])) BlankRecord = false;
            }
            if (!BlankRecord) Records.Add(Record);
        }
        ReadIndex = -1;
        Reader.Close();
    }

    private string[] GetEscapedSVs(string Data)
    {
        return GetEscapedSVs(Data, ",", "\"");
    }
    private string[] GetEscapedSVs(string Data, string Separator, string Escape)
    {
        string[] Result = null;
        int Index = 0;
        int PriorMatchIndex = 0;
        MatchCollection Matches = Regex.Matches(Data, string.Format(ESCAPE_SPLIT_REGEX, Separator, Escape));

        Result = new string[Matches.Count];


        for (Index = 0; Index <= Result.Length - 2; Index++)
        {
            Result[Index] = Data.Substring(PriorMatchIndex, Matches[Index].Groups["Separator"].Index - PriorMatchIndex);
            PriorMatchIndex = Matches[Index].Groups["Separator"].Index + Separator.Length;
        }
        Result[Result.Length - 1] = Data.Substring(PriorMatchIndex);

        for (Index = 0; Index <= Result.Length - 1; Index++)
        {
            if (Regex.IsMatch(Result[Index], string.Format("^{0}[^{0}].*[^{0}]{0}$", Escape))) Result[Index] = Result[Index].Substring(1, Result[Index].Length - 2);
            Result[Index] = Result[Index].Replace(Escape + Escape, Escape);
            if (Result[Index] == null) Result[Index] = "";
        }

        return Result;
    }

    public int FieldCount
    {
        get { return FieldNames.Length; }
    }

    public string GetString(int Index)
    {
        return Records[ReadIndex][Index];
    }

    public string GetName(int Index)
    {
        return FieldNames[Index];
    }

    public bool Read()
    {
        ReadIndex = ReadIndex + 1;
        return ReadIndex < Records.Count;
    }


    public void DisplayResults(DataGridView DataView)
    {
        DataGridViewColumn col = default(DataGridViewColumn);
        DataGridViewRow row = default(DataGridViewRow);
        DataGridViewCell cell = default(DataGridViewCell);
        DataGridViewColumnHeaderCell header = default(DataGridViewColumnHeaderCell);
        int Index = 0;
        ReadIndex = -1;

        DataView.Rows.Clear();
        DataView.Columns.Clear();

        for (Index = 0; Index <= FieldCount - 1; Index++)
        {
            col = new DataGridViewColumn();
            col.CellTemplate = new DataGridViewTextBoxCell();
            header = new DataGridViewColumnHeaderCell();
            header.Value = GetName(Index);
            col.HeaderCell = header;
            DataView.Columns.Add(col);
        }

        while (Read())
        {
            row = new DataGridViewRow();
            for (Index = 0; Index <= FieldCount - 1; Index++)
            {
                cell = new DataGridViewTextBoxCell();
                cell.Value = GetString(Index).ToString();
                row.Cells.Add(cell);
            }
            DataView.Rows.Add(row);
        }
    }
}
Cahan answered 26/6, 2009 at 6:14 Comment(0)
M
0

I had got the result for my query. its like simple like i had read a file using io.file. and all the text are stored into a string. After that i splitted with a seperator. The code is shown below.

using System;
using System.Collections.Generic;
using System.Text;

namespace CSV
{
    class Program
    {
        static void Main(string[] args)
        {

            string csv = "user1, user2, user3,user4,user5";

            string[] split = csv.Split(new char[] {',',' '});
            foreach(string s in split)
            {
                if (s.Trim() != "")
                    Console.WriteLine(s);
            }
            Console.ReadLine();
        }
    }
}
Myrick answered 26/6, 2009 at 6:45 Comment(1)
I'm glad you found a solution. If you need to look further in depth to CSV files, you may want to have a look at some of the other methods and links posted here. It seems like you are aware of the exact format that your CSV's will be in. Also, consider the file sizes. A string representing the entire CSV is okay up to a couple of megabytes, but take a look at the Codeproject link posted by Vinko if you're dealing with large files. Even if you don't need to use them in this example, you should make sure you understand what StringSplitOptions and the "count" parameter to String.Split do. :-)Access
F
0

The following function takes a line from a CSV file and splits it into a List<string>.

Arguments:
string line = the line to split
string textQualifier = what (if any) text qualifier (i.e. "" or "\"" or "'")
char delim = the field delimiter (i.e. ',' or ';' or '|' or '\t')
int colCount = the expected number of fields (0 means don't check)

Example usage:

List<string> fields = SplitLine(line, "\"", ',', 5);
// or
List<string> fields = SplitLine(line, "'", '|', 10);
// or
List<string> fields = SplitLine(line, "", '\t', 0);

Function:

private List<string> SplitLine(string line, string textQualifier, char delim, int colCount)
{
    List<string> fields = new List<string>();
    string origLine = line;

    char textQual = '"';
    bool hasTextQual = false;
    if (!String.IsNullOrEmpty(textQualifier))
    {
        hasTextQual = true;
        textQual = textQualifier[0];            
    }

    if (hasTextQual)
    {
        while (!String.IsNullOrEmpty(line))
        {
            if (line[0] == textQual) // field is text qualified so look for next unqualified delimiter
            {
                int fieldLen = 1;
                while (true)
                {
                    if (line.Length == 2) // must be final field (zero length)
                    {
                        fieldLen = 2;
                        break;
                    }
                    else if (fieldLen + 1 >= line.Length) // must be final field
                    {
                        fieldLen += 1;
                        break;
                    }
                    else if (line[fieldLen] == textQual && line[fieldLen + 1] == textQual) // escaped text qualifier
                    {
                        fieldLen += 2;
                    }
                    else if (line[fieldLen] == textQual && line[fieldLen + 1] == delim) // must be end of field
                    {
                        fieldLen += 1;
                        break;
                    }
                    else // not a delimiter
                    {
                        fieldLen += 1;
                    }
                }
                string escapedQual = textQual.ToString() + textQual.ToString();
                fields.Add(line.Substring(1, fieldLen - 2).Replace(escapedQual, textQual.ToString())); // replace escaped qualifiers
                if (line.Length >= fieldLen + 1)
                {
                    line = line.Substring(fieldLen + 1);
                    if (line == "") // blank final field
                    {
                        fields.Add("");
                    }
                }
                else
                {
                    line = "";
                }
            }
            else // field is not text qualified
            {
                int fieldLen = line.IndexOf(delim);
                if (fieldLen != -1) // check next delimiter position
                {
                    fields.Add(line.Substring(0, fieldLen));
                    line = line.Substring(fieldLen + 1);
                    if (line == "") // final field must be blank 
                    {
                        fields.Add("");
                    }
                }
                else // must be last field
                {
                    fields.Add(line);
                    line = "";
                }
            }
        }
    }
    else // if there is no text qualifier, then use existing split function
    {
        fields.AddRange(line.Split(delim));
    }      

    if (colCount > 0 && colCount != fields.Count) // count doesn't match expected so throw exception
    {
        throw new Exception("Field count was:" + fields.Count.ToString() + ", expected:" + colCount.ToString() + ". Line:" + origLine);

    }
    return fields;
}
Feckless answered 29/6, 2009 at 14:4 Comment(0)
C
0

Problem: Convert a comma separated string into an array where commas in "quoted strings,,," should not be considered as separators but as part of an entry

Input: String: First,"Second","Even,With,Commas",,Normal,"Sentence,with ""different"" problems",3,4,5

Output: String-Array: ['First','Second','Even,With,Commas','','Normal','Sentence,with "different" problems','3','4','5']

Code:

string sLine;
sLine = "First,\"Second\",\"Even,With,Commas\",,Normal,\"Sentence,with \"\"different\"\" problems\",3,4,5";

// 1. Split line by separator; do not split if separator is within quotes
string Separator = ",";
string Escape = '"'.ToString();
MatchCollection Matches = Regex.Matches(sLine,
    string.Format("({1}[^{1}]*{1})*(?<Separator>{0})({1}[^{1}]*{1})*", Separator, Escape));
string[] asColumns = new string[Matches.Count + 1];

int PriorMatchIndex = 0;
for (int Index = 0; Index <= asColumns.Length - 2; Index++)
{
    asColumns[Index] = sLine.Substring(PriorMatchIndex, Matches[Index].Groups["Separator"].Index - PriorMatchIndex);
    PriorMatchIndex = Matches[Index].Groups["Separator"].Index + Separator.Length;
}
asColumns[asColumns.Length - 1] = sLine.Substring(PriorMatchIndex);

// 2. Remove quotes
for (int Index = 0; Index <= asColumns.Length - 1; Index++)
{
    if (Regex.IsMatch(asColumns[Index], string.Format("^{0}[^{0}].*[^{0}]{0}$", Escape))) // If "Text" is sourrounded by quotes (but ignore double quotes => "Leave ""inside"" quotes")
    {
        asColumns[Index] = asColumns[Index].Substring(1, asColumns[Index].Length - 2); // "Text" => Text
    }
    asColumns[Index] = asColumns[Index].Replace(Escape + Escape, Escape); // Remove double quotes ('My ""special"" text' => 'My "special" text')
    if (asColumns[Index] == null) asColumns[Index] = "";
}

The output array is asColumns

Caveator answered 20/1, 2021 at 11:13 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.