Stream read line

3

7

I read a file line by line with a StreamReader (sr.ReadLine()). ReadLine treats both \r\n and a lone \n as a line ending.

        StreamReader sr = new System.IO.StreamReader(sPath, enc);

        while (!sr.EndOfStream)
        {
            // reading 1 line of datafile
            string sLine = sr.ReadLine();
            ...

How can I tell the code (instead of using the universal sr.ReadLine()) to treat only a full \r\n as a line break, and not a lone \n?

Orling answered 15/9, 2014 at 7:50 Comment(3)
So, do you want the number of occurrences of \r\n and the number of occurrences of a solo \n in the stream? - Adena
Exactly. I want to read each line, but a line ends only at a full \r\n, not at a lone \n. In other words, one row can contain something like: blah blah \n blah \r\n - Orling
It is important to know how big your file is in order to choose the correct way to handle the input. - Baudoin
9

It is not possible to do this using StreamReader.ReadLine. According to MSDN:

A line is defined as a sequence of characters followed by a line feed ("\n"), a carriage return ("\r"), or a carriage return immediately followed by a line feed ("\r\n"). The string that is returned does not contain the terminating carriage return or line feed. The returned value is null if the end of the input stream is reached.

So you have to read the stream character by character and return a line only when you've captured \r\n.

EDIT

Here is some code sample

private static IEnumerable<string> ReadLines(StreamReader stream)
{
    StringBuilder sb = new StringBuilder();

    int symbol;
    // Read() returns -1 at end of stream; testing it here avoids appending (char)(-1).
    while ((symbol = stream.Read()) != -1)
    {
        if (symbol == 13 && stream.Peek() == 10)
        {
            stream.Read();

            string line = sb.ToString();
            sb.Clear();

            yield return line;
        }
        else
            sb.Append((char)symbol);
    }

    yield return sb.ToString();
}

You can use it like this:

foreach (string line in ReadLines(stream))
{
   //do something
}
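
To check the behaviour on the asker's example row (blah blah \n blah \r\n), here is a minimal self-contained sketch. It is an illustration, not the answer's exact code: the names CrLfDemo and ReadCrLfLines are hypothetical, it takes a TextReader so that a StringReader can be used for testing, and it assumes you do not want an empty trailing entry when the input ends with \r\n.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class CrLfDemo
{
    // Hypothetical helper: yields rows terminated only by "\r\n".
    // A lone '\n' (or a lone '\r') stays inside the row.
    public static IEnumerable<string> ReadCrLfLines(TextReader reader)
    {
        var sb = new StringBuilder();
        int symbol;
        while ((symbol = reader.Read()) != -1)
        {
            if (symbol == '\r' && reader.Peek() == '\n')
            {
                reader.Read();               // consume the '\n'
                yield return sb.ToString();
                sb.Clear();
            }
            else
            {
                sb.Append((char)symbol);
            }
        }

        // Assumption: text after the last "\r\n" is a row only if it is non-empty.
        if (sb.Length > 0)
            yield return sb.ToString();
    }

    public static void Main()
    {
        const string data = "blah blah \n blah \r\nsecond row\r\n";
        foreach (string line in ReadCrLfLines(new StringReader(data)))
        {
            // Show the embedded line feed as a visible escape.
            Console.WriteLine("[" + line.Replace("\n", "\\n") + "]");
        }
        // Prints:
        // [blah blah \n blah ]
        // [second row]
    }
}
```

For a real file you would pass a StreamReader opened with the right encoding instead of the StringReader.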
Kieffer answered 15/9, 2014 at 7:58 Comment(3)
Yes, I was afraid of that :( Could you add a code sample to your answer, please? - Orling
What do those 13 and 10 signify in (symbol == 13 && stream.Peek() == 10)? - Phagocyte
@NoSaidTheCompiler They are the ASCII codes of the \r and \n characters. - Kieffer
5

You cannot do it with ReadLine, but you can do this instead:

stream.ReadToEnd().Split(new[] {"\r\n"}, StringSplitOptions.None)
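
A small sketch of what that call returns may help; the class and method names (SplitDemo, SplitOnCrLf) are made up for illustration. Note two properties of this approach: the whole input is held in memory, and an input that ends with \r\n produces an empty last entry.

```csharp
using System;
using System.IO;

public static class SplitDemo
{
    // Hypothetical wrapper around the one-liner above.
    public static string[] SplitOnCrLf(TextReader reader)
    {
        // Reads everything into one string, then splits on the two-character sequence only.
        return reader.ReadToEnd().Split(new[] { "\r\n" }, StringSplitOptions.None);
    }

    public static void Main()
    {
        string[] rows = SplitOnCrLf(new StringReader("one \n still one\r\ntwo\r\n"));

        Console.WriteLine(rows.Length);          // 3: the final \r\n leaves an empty last entry
        Console.WriteLine(rows[0].Contains("\n")); // True: the lone \n stays inside the first row
    }
}
```

If the empty last entry is unwanted, StringSplitOptions.RemoveEmptyEntries drops it, but it also drops genuinely empty rows in the middle of the data.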
Unclassified answered 15/9, 2014 at 8:10 Comment(2)
What if the stream data is really large? ReadToEnd doesn't seem to be a reasonable solution. - Kieffer
It always depends on the workload. If you need something highly optimized, you can start from the source of Mono's StreamReader.ReadLine implementation. - Unclassified
-1

For simplification, let's work over a byte array:

    static int NumberOfNewLines(byte[] data)
    {
        int count = 0;
        for (int i = 0; i < data.Length - 1; i++)
        {
            if (data[i] == '\r' && data[i + 1] == '\n')
                count++;
        }
        return count;
    }

If you care about efficiency, optimize away, but this should work.

You can get the bytes of a file by using System.IO.File.ReadAllBytes(string path).

Adena answered 15/9, 2014 at 8:10 Comment(2)
Don't use byte[] directly for text, because the encoding can cause problems. - Unclassified
@EnricoSada So is there some way that in Unicode/UTF-8 these bytes (\n, \r) would be just part of multibyte characters? - Adena
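
On the encoding question, a short experiment shows where raw byte matching holds and where it breaks. In UTF-8, every byte of a multibyte sequence has its high bit set, so 0x0D and 0x0A can only be real \r and \n. In UTF-16 that is not true. The sketch below is an illustration under those assumptions; EncodingDemo and CountCrLfBytes are hypothetical names, and the counter mirrors the byte-pair loop from the answer above.

```csharp
using System;
using System.Text;

public static class EncodingDemo
{
    // Counts adjacent 0x0D 0x0A byte pairs, like the answer's NumberOfNewLines.
    public static int CountCrLfBytes(byte[] data)
    {
        int count = 0;
        for (int i = 0; i < data.Length - 1; i++)
        {
            if (data[i] == 0x0D && data[i + 1] == 0x0A)
                count++;
        }
        return count;
    }

    public static void Main()
    {
        const string oneChar = "\u0D0A";  // a single character (U+0D0A), no line break at all

        // UTF-8 encodes it as E0 B4 8A, so no pair is found.
        Console.WriteLine(CountCrLfBytes(Encoding.UTF8.GetBytes(oneChar)));             // 0

        // UTF-16 big-endian encodes it as 0D 0A: a false positive.
        Console.WriteLine(CountCrLfBytes(Encoding.BigEndianUnicode.GetBytes(oneChar))); // 1

        // A real "\r\n" in UTF-16 little-endian is 0D 00 0A 00: the pair is missed.
        Console.WriteLine(CountCrLfBytes(Encoding.Unicode.GetBytes("\r\n")));           // 0
    }
}
```

So the byte loop is safe for ASCII and UTF-8 data, while for UTF-16 it is safer to decode first and compare characters.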
