Why does "||".split("\\|").length return 0 and not 3?

When the input contains adjacent separators, I expect split to return null or an empty string for each gap, not to drop them.

The Java code is below:

public class splitter {
    public static void main(String[] args) {
        int size = "||".split("\\|").length;
        // Fails because size is 0; run with "java -ea" so the assertion is checked.
        assert size == 3 : "size should be 3 and not " + size;
    }
}

I expected to get either { "", "", "" } or { null, null, null }. Either would be fine.

Perhaps there's a regular expression that will not be fooled by empty words?

Boxwood answered 17/7, 2012 at 1:11 Comment(0)
K
14

According to the javadoc:

This method works as if by invoking the two-argument split method with the given expression and a limit argument of zero. Trailing empty strings are therefore not included in the resulting array.

The javadoc for split(String, int) elaborates:

The limit parameter controls the number of times the pattern is applied and therefore affects the length of the resulting array. If the limit n is greater than zero then the pattern will be applied at most n - 1 times, the array's length will be no greater than n, and the array's last entry will contain all input beyond the last matched delimiter. If n is non-positive then the pattern will be applied as many times as possible and the array can have any length. If n is zero then the pattern will be applied as many times as possible, the array can have any length, and trailing empty strings will be discarded.

(emphasis mine)

So to get an array of three empty strings, pass a negative limit: "||".split("\\|", -1).
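As a minimal sketch of the difference between the two limits, the snippet below compares the one-argument call with a call that passes -1. The class name SplitLimitDemo and the helper splitKeepingEmpties are hypothetical names chosen for illustration; only String.split itself comes from the JDK.

```java
import java.util.Arrays;

public class SplitLimitDemo {
    // Hypothetical helper: splits on a literal '|' and keeps trailing empty strings
    // by passing a negative limit.
    static String[] splitKeepingEmpties(String input) {
        return input.split("\\|", -1);
    }

    public static void main(String[] args) {
        // Limit 0 (the one-argument form): trailing empty strings are discarded.
        System.out.println("||".split("\\|").length);             // 0

        // Negative limit: every empty string between and after delimiters is kept.
        System.out.println(splitKeepingEmpties("||").length);     // 3
        System.out.println(Arrays.toString(splitKeepingEmpties("a|b|"))); // [a, b, ]
    }
}
```

Any negative value behaves the same way as -1 here; a positive limit would instead cap the array length.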

Kannada answered 17/7, 2012 at 1:26 Comment(2)
Thanks, Paul. I like your answer much better. - Boxwood
".. much better than mine." I meant to write. - Boxwood
B
0

I need to take a closer look at Paul's answer (his looks simpler), but I found that a look-ahead expression satisfies the assertions below. (I apologize that the code is in Apex; it just wraps Java.)

static testMethod void testPatternStringSplit() {
        Pattern aPattern = Pattern.Compile('(?=\\|)');
        system.assertEquals(3, aPattern.split('||').size());
        system.assertEquals(3, aPattern.split(' | | ').size());
        system.assertEquals(3, aPattern.split('a|b|c').size());
        system.assertEquals(3, aPattern.split('a|b|').size());
        system.assertEquals(3, aPattern.split('|b|c').size());
        system.assertEquals(3, aPattern.split('|b|').size());
}

I need to write some code to test Paul's ...

Boxwood answered 17/7, 2012 at 2:3 Comment(1)
That won't give him three empty strings when splitting "||" though - it will give him one empty string and two "|"s. - Internuncio
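The point of the comment above can be seen in plain Java: a look-ahead matches without consuming the delimiter, so each "|" stays attached to the token that follows it. This is a sketch only; the class name LookaheadSplitDemo and the helper splitBeforePipe are hypothetical, and the input "a|b|c" is used because the result for input that starts with a delimiter may differ between Java versions.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class LookaheadSplitDemo {
    // Hypothetical helper: splits at every position that is followed by '|'.
    // The look-ahead is zero-width, so the '|' itself is not removed.
    static String[] splitBeforePipe(String input) {
        return Pattern.compile("(?=\\|)").split(input);
    }

    public static void main(String[] args) {
        // The array has three entries, but two of them still carry the delimiter.
        System.out.println(Arrays.toString(splitBeforePipe("a|b|c"))); // [a, |b, |c]
    }
}
```

The count assertions pass for such inputs, but the tokens are not the fields between the separators, which is why the negative-limit form of split is the simpler fix.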
