How would you split by \r\n if String.Split(String[]) did not exist?

8

6

I'm using the .NET Micro Framework, which is a really cut-down version of C#. For instance, System.String has barely any of the goodies that we've enjoyed over the years.

I need to split a text document into lines, which means splitting by \r\n. However, String.Split only provides a split by char, not by string.

How can I split a document into lines in an efficient manner (e.g. not looping madly across each char in the doc)?

P.S. System.String is also missing a Replace method, so that won't work.
P.P.S. Regex is not part of the MicroFramework either.

Jury answered 10/1, 2010 at 22:33 Comment(1)
It's not a cut-down version of C#; it's a cut-down version of the .NET Framework. - Basilio
12

You can do

string[] lines = doc.Split('\n');
for (int i = 0; i < lines.Length; i += 1)
    lines[i] = lines[i].Trim();

This assumes that the µF supports Trim() at all. Trim() removes all leading and trailing whitespace, which might be useful. Otherwise, use TrimEnd('\r').
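
A minimal sketch of the TrimEnd('\r') variant, assuming the target's String offers Split(char) and TrimEnd(char); both exist in the desktop framework, but check what your Micro Framework version supports. The class and method names are made up for illustration.

```csharp
using System;

public static class LineSplitter
{
    // Splits on '\n', then strips any trailing '\r' from each piece,
    // so both "\r\n" and bare "\n" line endings are handled.
    // Leading and inner whitespace is left alone, unlike Trim().
    public static string[] SplitLines(string doc)
    {
        string[] lines = doc.Split('\n');
        for (int i = 0; i < lines.Length; i++)
        {
            lines[i] = lines[i].TrimEnd('\r');
        }
        return lines;
    }
}
```

With this, SplitLines("a\r\nb\nc") yields "a", "b", "c". An input that ends in a newline produces a final empty entry, which the caller may want to skip.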

Shunt answered 10/1, 2010 at 22:44 Comment(2)
Yep, this does the trick and keeps most of the performance characteristics. Thanks. Pretty clever. - Jury
I was just about to suggest something like this - beat me to the punch. You may need to recombine strings (theoretically) if '\n' is not always preceded by '\r' in the input. - Chinook
6

I would loop across each char in the document, because that's clearly required. How do you think String.Split works? I would try to do so only hitting each character once, however.

Keep a list of strings found so far. Use IndexOf repeatedly, passing in the current offset into the string (i.e. the previous match + 2).

Sephira answered 10/1, 2010 at 22:38 Comment(2)
True, but the full .NET implementation of Split(string[]) uses pointers and unsafe code to achieve the performance that it does. Otherwise, I would simply copy the code. I am operating on a really low-end chip and was hoping for something exceedingly clever. - Jury
If you're only splitting by a single delimiter string, and the delimiter is longer than 3 characters, you can get better performance using one of the Boyer-Moore search algorithm variants. Not that it helps the poster's problem, in this case. en.wikipedia.org/wiki/… - Chinook
3

How can I split a document into lines in an efficient manner (e.g. not looping madly across each char in the doc)?

How do you think the built-in Split works?

Just reimplement it yourself as an extension method.

Liquid answered 10/1, 2010 at 22:38 Comment(0)
2

What about:

string path = "yourfile.txt";
string[] lines = File.ReadAllLines(path);

Or

string content = File.ReadAllText(path);
string[] lines = content.Split(
    Environment.NewLine.ToCharArray(),
    StringSplitOptions.RemoveEmptyEntries);

Reading about .NET Micro Framework 3.0, it seems this code can work:

string line = String.Empty;
StreamReader reader = new StreamReader(path);
while ((line = reader.ReadLine()) != null)
{
    // do stuff
}
Wincer answered 10/1, 2010 at 22:36 Comment(4)
It splits by both items in the CharArray separately. Thus, you'd get a bunch of empty results in addition to legit results. - Jury
@AngryHacker: RemoveEmptyEntries should deal with them. - Wincer
File.ReadAllLines is not supported either. - Jury
See the link provided in the question for documentation on what's supported. - Jury
0

This may help in some scenarios:

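StreamReader reader = new StreamReader(file);
string _Line = reader.ReadToEnd();
// "entersign" is a placeholder (presumably for the line-break sequence to strip).
string IntMediateLine = _Line.Replace("entersign", "");
// Placeholder: substitute the single delimiter character to split on.
string[] ArrayLineSpliter = IntMediateLine.Split('any special character');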
Schwenk answered 1/9, 2012 at 12:39 Comment(1)
In what scenarios? Can you provide some details please. - Erigeron
0

If you'd like a MicroFramework-compatible split function that splits on an entire delimiter string rather than a single character, here's one that does the trick. It behaves like the regular framework's version with StringSplitOptions.None:

    private static string[] Split(string s, string delim)
    {
        if (s == null) throw new NullReferenceException();

        // Declarations
        var strings = new ArrayList();
        var start = 0;

        // Tokenize
        if (delim != null && delim != "")
        {
            int i;
            while ((i = s.IndexOf(delim, start)) != -1)
            {
                strings.Add(s.Substring(start, i - start));
                start = i + delim.Length;
            }
        }

        // Append left over
        strings.Add(s.Substring(start));

        return (string[]) strings.ToArray(typeof(string));
    }
Tenace answered 22/8, 2017 at 18:43 Comment(1)
I thought about turning this into an Extension Method as per this possibility. - Tenace
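
A rough sketch of what that extension-method form might look like. The name SplitBy is hypothetical, picked so it cannot collide with built-in Split overloads. It also assumes the target compiler and runtime support extension methods, which on a stripped-down framework may require declaring System.Runtime.CompilerServices.ExtensionAttribute yourself.

```csharp
using System;
using System.Collections;

public static class StringSplitExtensions
{
    // Hypothetical helper: splits s on every occurrence of delim and
    // keeps empty entries, like StringSplitOptions.None.
    public static string[] SplitBy(this string s, string delim)
    {
        if (s == null) throw new ArgumentNullException("s");

        var parts = new ArrayList();
        int start = 0;

        // A null or empty delimiter means "no split": return s as one entry.
        if (delim != null && delim.Length > 0)
        {
            int i;
            while ((i = s.IndexOf(delim, start)) != -1)
            {
                parts.Add(s.Substring(start, i - start));
                start = i + delim.Length; // resume just past the match
            }
        }

        // Whatever follows the last delimiter (possibly an empty string).
        parts.Add(s.Substring(start));

        return (string[])parts.ToArray(typeof(string));
    }
}
```

It would then be called as doc.SplitBy("\r\n"). Each IndexOf call resumes where the previous match ended, so the scan moves forward through the document without revisiting earlier text.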
-1

You can split your string by a substring.

string[] lines = text.Split(new string[] { "\r\n" }, StringSplitOptions.None);
Idolum answered 29/9, 2020 at 22:45 Comment(0)
-1
String theText = "word1 \n word2 \n word3";

string[] lines = theText.Split( new string[] { "\r\n", "\r", "\n"}, StringSplitOptions.None);

or

string[] lines = theText.Split( new string[] { "\r\n", "\r", "\n" }, StringSplitOptions.RemoveEmptyEntries );

https://spacetech.dk/how-to-split-on-newline-in-c.html

https://www.bytehide.com/blog/split-string-csharp

Skylar answered 29/2 at 3:20 Comment(0)
