Split String into smaller Strings by length variable
H

14

42

I'd like to break apart a String by a certain length variable.
It needs to bounds-check so that it doesn't blow up when the last section of the string is shorter than the length. Looking for the most succinct (yet understandable) version.

Example:

string x = "AAABBBCC";
string[] arr = x.SplitByLength(3);
// arr[0] -> "AAA";
// arr[1] -> "BBB";
// arr[2] -> "CC"
Heymann answered 9/6, 2010 at 18:29 Comment(0)
M
74

You need to use a loop:

public static IEnumerable<string> SplitByLength(this string str, int maxLength) {
    for (int index = 0; index < str.Length; index += maxLength) {
        yield return str.Substring(index, Math.Min(maxLength, str.Length - index));
    }
}

Alternative:

public static IEnumerable<string> SplitByLength(this string str, int maxLength) {
    int index = 0;
    while(true) {
        if (index + maxLength >= str.Length) {
            yield return str.Substring(index);
            yield break;
        }
        yield return str.Substring(index, maxLength);
        index += maxLength;
    }
}

2nd alternative: (For those who can't stand while(true))

public static IEnumerable<string> SplitByLength(this string str, int maxLength) {
    int index = 0;
    while(index + maxLength < str.Length) {
        yield return str.Substring(index, maxLength);
        index += maxLength;
    }

    yield return str.Substring(index);
}
Mestee answered 9/6, 2010 at 18:36 Comment(11)
Right on! I didn't want to limit folks by asking for the IEnumerable version. This rocks.Heymann
Great minds think alike, I guess! (Or more humbly: great minds and mediocre minds sometimes have the same thought?)Arvizu
Do not like the While (true) I think that is bad style, unless listening on a port.Alfieri
@Mike: That's why I included the first version.Mestee
@Mike: I think run-on sentences with no subject in the first clause are bad style. (Oh snap!)Arvizu
ugggghhhhh, I can't believe a +1 for oh snap, do you have an alternate account you vote yourself up. (period) Haven't heard that type of talk since grade schoolAlfieri
@Mike: I was just joking, geez. +1 on a comment is meaningless anyway (and no, I did not +1 myself). I just thought it was a little ridiculous that you'd downvote someone because you didn't like his/her style. "It answers the question, and it works perfectly, but... I don't like that style, so I'll downvote it."Arvizu
@SLaks: I think in your final example, you meant <= instead of >=.Arvizu
+1 Impressed by the performance (1st version) compared to what I've used (till now): string[] sa = (Regex.Split(s, "(...)")); sa = sa.Where(item => !string.IsNullOrEmpty(item)).ToArray();Islas
not that you'd use it intentionally so (usage could potentially be buried within other methods), but pointing out that maxLength == 0 results in an infinite loopOxyacid
Love the first solution with the yield - so elegant.Kernel
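Following up on the comment above about maxLength == 0 looping forever: here is a minimal sketch of a guarded variant of the first version. The class name StringSplitExtensions and the helper SplitByLengthIterator are made up for illustration. The split into a wrapper plus a private iterator is a choice, not a requirement; it is there because checks placed directly inside a yield-based method only run once enumeration starts.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class name, for illustration only.
public static class StringSplitExtensions
{
    // Public wrapper: validates eagerly, so bad arguments throw at the call site
    // instead of on the first MoveNext().
    public static IEnumerable<string> SplitByLength(this string str, int maxLength)
    {
        if (str == null)
            throw new ArgumentNullException(nameof(str));
        if (maxLength < 1)
            throw new ArgumentOutOfRangeException(nameof(maxLength), maxLength, "Must be at least 1.");

        return SplitByLengthIterator(str, maxLength);
    }

    // Hypothetical helper: same loop as the first version above.
    private static IEnumerable<string> SplitByLengthIterator(string str, int maxLength)
    {
        for (int index = 0; index < str.Length; index += maxLength)
        {
            yield return str.Substring(index, Math.Min(maxLength, str.Length - index));
        }
    }
}
```

With this, "AAABBBCC".SplitByLength(3) still yields "AAA", "BBB", "CC", while "abc".SplitByLength(0) throws immediately rather than hanging.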
O
14

Easy to understand version:

string x = "AAABBBCC";
List<string> a = new List<string>();
for (int i = 0; i < x.Length; i += 3)
{
    if((i + 3) < x.Length)
        a.Add(x.Substring(i, 3));
    else
        a.Add(x.Substring(i));
}

Though preferably the 3 should be a nice const.

Overprint answered 9/6, 2010 at 18:34 Comment(5)
If the int to split by is larger than the string length, this will not yield any result.Derwent
@JYelton: No, it will still enter the loop and end up in the else statement.Overprint
@JYelton: I think you're mistaken. Are you suggesting there's some way to sneak past an if/else without triggering either?Arvizu
Never mind, I was mistaken. I originally thought that the loop would not execute if the split length was longer than the string length, but the check mechanism is built-in.Derwent
beautiful - thanks. Much easier to follow than the IEnumerable for some of us newer developersReaves
M
6

It's not particularly succinct, but I might use an extension method like this:

public static IEnumerable<string> SplitByLength(this string s, int length)
{
    for (int i = 0; i < s.Length; i += length)
    {
        if (i + length <= s.Length)
        {
            yield return s.Substring(i, length);
        }
        else
        {
            yield return s.Substring(i);
        }
    }
}

Note that I return an IEnumerable<string>, not an array. If you want to convert the result to an array, use ToArray:

string[] arr = x.SplitByLength(3).ToArray();
Mccomas answered 9/6, 2010 at 18:36 Comment(0)
T
6

My solution:

public static string[] SplitToChunks(this string source, int maxLength)
{
    return source
        .Where((x, i) => i % maxLength == 0)
        .Select(
            (x, i) => new string(source
                .Skip(i * maxLength)
                .Take(maxLength)
                .ToArray()))
        .ToArray();
}

I'd actually rather use List<string> instead of string[].

Tumor answered 17/1, 2011 at 11:18 Comment(0)
A
5

Here's what I'd do:

public static IEnumerable<string> EnumerateByLength(this string text, int length) {
    int index = 0;
    while (index < text.Length) {
        int charCount = Math.Min(length, text.Length - index);
        yield return text.Substring(index, charCount);
        index += length;
    }
}

This method would provide deferred execution (which doesn't really matter on an immutable class like string, but it's worth noting).

Then if you wanted a method to populate an array for you, you could have:

public static string[] SplitByLength(this string text, int length) {
    return text.EnumerateByLength(length).ToArray();
}

The reason I would go with the name EnumerateByLength rather than SplitByLength for the "core" method is that string.Split returns a string[], so in my mind there's precedent for methods whose names start with Split to return arrays.

That's just me, though.

Arvizu answered 9/6, 2010 at 18:37 Comment(2)
I wanted to upvote your use of a separate Inner/Impl method, but then you had to go and spoil it by calling .ToArray() in the outer method.Lambrecht
@Joel: Ha, but that's what the OP requested! Fine, I'll change it up.Arvizu
H
1

Using Batch from MoreLinq, on .Net 4.0:

public static IEnumerable<string> SplitByLength(this string str, int length)
{
    return str.Batch(length, String.Concat);
}

On 3.5, Concat needs an array, so we can use Concat with ToArray or new String:

public static IEnumerable<string> SplitByLength(this string str, int length)
{
    return str.Batch(length, chars => new String(chars.ToArray()));
}

It may be a bit unintuitive to look at a string as a collection of characters, so string manipulation might be preferred.

Horsy answered 9/6, 2010 at 18:42 Comment(0)
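For readers on newer runtimes: a minimal sketch of the same idea without MoreLinq, assuming .NET 6 or later, where the built-in Enumerable.Chunk is available. The class name ChunkSplitExtensions and method name ChunkByLength are made up for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical class and method names, for illustration only.
public static class ChunkSplitExtensions
{
    // Assumes .NET 6 or later (Enumerable.Chunk does not exist in earlier versions).
    public static IEnumerable<string> ChunkByLength(this string str, int length)
    {
        // Chunk yields char[] buffers of at most `length` items; the last one may be shorter.
        return str.Chunk(length).Select(chars => new string(chars));
    }
}
```

"AAABBBCC".ChunkByLength(3) yields "AAA", "BBB", "CC". As with Batch, this treats the string as a sequence of chars and allocates a char[] per segment, so the Substring loops above are likely cheaper.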
F
1

UPD: Using some Linq to make it actually succinct


static IEnumerable<string> EnumerateByLength(string str, int len)
{
    Match m = (new Regex(string.Format("^(.{{1,{0}}})*$", len))).Match(str);
    if (m.Groups.Count <= 1)
        return Empty;
    return (from Capture c in m.Groups[1].Captures select c.Value);
}

Initial version:


static string[] Empty = new string[] {};

static string[] SplitByLength(string str, int len)
{
    Regex r = new Regex(string.Format("^(.{{1,{0}}})*$", len));
    Match m = r.Match(str);
    if (m.Groups.Count <= 1)
        return Empty;

    string[] result = new string[m.Groups[1].Captures.Count];
    int ix = 0;
    foreach (Capture c in m.Groups[1].Captures)
    {
        result[ix++] = c.Value;
    }
    return result;
}
Flyweight answered 9/6, 2010 at 20:21 Comment(0)
C
1

Yet another slight variant (classic but simple and pragmatic):

class Program
{
    static void Main(string[] args) {
        string msg = "AAABBBCC";

        string[] test = msg.SplitByLength(3);            
    }
}

public static class SplitStringByLength
{
    public static string[] SplitByLength(this string inputString, int segmentSize) {
        List<string> segments = new List<string>();

        int wholeSegmentCount = inputString.Length / segmentSize;

        int i;
        for (i = 0; i < wholeSegmentCount; i++) {
            segments.Add(inputString.Substring(i * segmentSize, segmentSize));
        }

        if (inputString.Length % segmentSize != 0) {
            segments.Add(inputString.Substring(i * segmentSize, inputString.Length - i * segmentSize));
        }

        return segments.ToArray();
    }
}
Cerussite answered 9/6, 2010 at 21:13 Comment(0)
D
0
    private string[] SplitByLength(string s, int d)
    {
        List<string> stringList = new List<string>();
        if (s.Length <= d) stringList.Add(s);
        else
        {
            int x = 0;
            for (; (x + d) < s.Length; x += d)
            {
                stringList.Add(s.Substring(x, d));
            }
            stringList.Add(s.Substring(x));
        }
        return stringList.ToArray();
    }
Derwent answered 9/6, 2010 at 18:44 Comment(0)
A
0
    private void button2_Click(object sender, EventArgs e)
    {
        string s = "AAABBBCCC";
        string[] a = SplitByLength(s,3);
    }

    private string[] SplitByLength(string s, int split)
    {
        //Like using List because I can just add to it 
        List<string> list = new List<string>();

        // Integer division
        int TimesThroughTheLoop = s.Length/split;


        for (int i = 0; i < TimesThroughTheLoop; i++)
        {
            list.Add(s.Substring(i * split, split));

        }

        // Pickup the end of the string
        if (TimesThroughTheLoop * split != s.Length)
        {
            list.Add(s.Substring(TimesThroughTheLoop * split));
        }

        return list.ToArray();
    }
Alfieri answered 9/6, 2010 at 18:45 Comment(0)
O
0

I had the strange scenario where I had segmented a string, then rearranged the segments (i.e. reversed) before concatenating them, and then I later needed to reverse the segmentation. Here's an update to the accepted answer by @SLaks:

    /// <summary>
    /// Split the given string into equally-sized segments (possibly with a 'remainder' if uneven division).  Optionally return the 'remainder' first.
    /// </summary>
    /// <param name="str">source string</param>
    /// <param name="maxLength">size of each segment (except the remainder, which will be less)</param>
    /// <param name="remainderFirst">if dividing <paramref name="str"/> into segments would result in a chunk smaller than <paramref name="maxLength"/> left at the end, instead take it from the beginning</param>
    /// <returns>list of segments within <paramref name="str"/></returns>
    /// <remarks>Original method at https://mcmap.net/q/382421/-split-string-into-smaller-strings-by-length-variable </remarks>
    private static IEnumerable<string> ToSegments(string str, int maxLength, bool remainderFirst = false) {
        // note: `maxLength == 0` would not only not make sense, but would result in an infinite loop
        if(maxLength < 1) throw new ArgumentOutOfRangeException("maxLength", maxLength, "Should be greater than 0");
        // correct for the infinite loop caused by a nonsensical request of `remainderFirst == true` and no remainder (`maxLength==1` or even division)
        if( remainderFirst && str.Length % maxLength == 0 ) remainderFirst = false;

        var index = 0;
        // note that we want to stop BEFORE we reach the end
        // because if it's exact we'll end up with an
        // empty segment
        while (index + maxLength < str.Length)
        {
            // do we want the 'final chunk' first or at the end?
            if( remainderFirst && index == 0 ) {
                // figure out remainder size
                var remainder = str.Length % maxLength;
                yield return str.Substring(index, remainder);

                index += remainder;
            }
            // normal stepthrough
            else {
                yield return str.Substring(index, maxLength);
                index += maxLength;
            }
        }

        yield return str.Substring(index);
    }//---  fn  ToSegments

(I also corrected a bug in the original while version that resulted in an empty segment if maxLength==1)

Oxyacid answered 11/3, 2014 at 16:27 Comment(0)
R
0

I have a recursive solution:

    public List<string> SplitArray(string item, int size)
    {
        if (item.Length <= size) return new List<string> { item };
        var temp = new List<string> { item.Substring(0,size) };
        temp.AddRange(SplitArray(item.Substring(size), size));
        return temp;
    }

Though it returns a List rather than an IEnumerable.

Ritualist answered 15/3, 2019 at 18:4 Comment(0)
S
0
static IEnumerable<string> SplitByLength(string str, int maxLength) {
    if (str is null)
        throw new ArgumentNullException(nameof(str));

    if (maxLength < 1)
        throw new ArgumentOutOfRangeException(nameof(maxLength));    

    int index = 0;

    while (index < str.Length)
    {
        var span = str.AsSpan(index, Math.Min(maxLength, str.Length - index));
        index += span.Length;
        yield return span.ToString();
    }
}
Sobriety answered 4/7 at 2:31 Comment(1)
Your answer could be improved with additional supporting information. Please edit to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers in the help center.Weave
M
0

I quite like this Regex approach:

string x = "AAABBBCC";

string[] arr =
    Regex
        .Match(x, "(...|..$|.$)*")
        .Groups[1]
        .Captures
        .Cast<Capture>()
        .Select(c => c.Value)
        .ToArray();

Or:

string[] arr =
    Regex
        .Split(x, "(...)")
        .Where(c => !String.IsNullOrEmpty(c))
        .ToArray();

They both give:

AAA
BBB
CC
Malleable answered 4/7 at 5:11 Comment(0)
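Both patterns above hard-code a length of 3. A minimal sketch of how the Regex idea might be generalized to a length variable, using Regex.Matches with a ".{1,N}" pattern built from the parameter. The class name RegexSplitExtensions and method name SplitByLengthRegex are made up for illustration, and length is assumed to be at least 1.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical class and method names, for illustration only.
public static class RegexSplitExtensions
{
    // Assumes length >= 1; smaller values produce an invalid quantifier.
    public static string[] SplitByLengthRegex(this string input, int length)
    {
        // ".{1,N}" greedily takes up to N characters per match, so only the
        // last match can be shorter. Singleline lets '.' match newlines too.
        string pattern = ".{1," + length + "}";

        return Regex.Matches(input, pattern, RegexOptions.Singleline)
            .Cast<Match>()
            .Select(m => m.Value)
            .ToArray();
    }
}
```

"AAABBBCC".SplitByLengthRegex(3) gives the same AAA, BBB, CC as above.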
