This one is more correct, but it still does not work for nested parentheses: /[^(,]*(?:\([^)]+\))?[^),]*/
– DarkSide Mar 24, 2013 at 23:09
Your method cannot parse "one, two, three, ((five), (four(six))), seven, eight, nine". I think the correct regex would be a recursive one: /\(([^()]+|(?R))*\)/.
– Cristian Toma Jul 6, 2009 at 7:26
Yes, it's easier, but doesn't work in case of nested brackets, like so: one, two, three, (four, (five, six), (ten)), seven
– Cristian Toma Jul 6, 2009 at 7:41
Thank you very much, your help is much appreciated. But now I realize that I will also encounter nested brackets and your solution doesn't apply.
– Cristian Toma Jul 6, 2009 at 7:43
Sounds to me like we need a string-splitting algorithm that respects balanced parenthetical grouping. I'll give that a crack using a recursive regex pattern! The pattern respects the lowest-level balanced parentheticals and treats any higher-level unbalanced parentheses as non-grouping characters. Please leave a comment with any input strings that are not split correctly so that I can try to make improvements (test-driven development).
Code: (Demo)
$tests = [
    'one, two, three, (four, five, six), seven, (eight, nine)',
    '()',
    'one and a ),',
    '(one, two, three)',
    'one, (, two',
    'one, two, ), three',
    'one, (unbalanced, (nested, whoops ) two',
    'one, two, three and a half, ((five), (four(six))), seven, eight, nine',
    'one, (two, (three and a half, (four, (five, (six, seven), eight)))), nine, (ten, twenty twen twen)',
    'ten, four, (,), good buddy',
];
foreach ($tests as $test) {
    var_export(
        preg_split(
            '/(?>(\((?:(?>[^()]+)|(?1))*\))|[^,]+)\K,?\s*/',
            $test,
            0,
            PREG_SPLIT_NO_EMPTY
        )
    );
    echo "\n";
}
Output:
array (
0 => 'one',
1 => 'two',
2 => 'three',
3 => '(four, five, six)',
4 => 'seven',
5 => '(eight, nine)',
)
array (
0 => '()',
)
array (
0 => 'one and a )',
)
array (
0 => '(one, two, three)',
)
array (
0 => 'one',
1 => '(',
2 => 'two',
)
array (
0 => 'one',
1 => 'two',
2 => ')',
3 => 'three',
)
array (
0 => 'one',
1 => '(unbalanced',
2 => '(nested, whoops )',
3 => 'two',
)
array (
0 => 'one',
1 => 'two',
2 => 'three and a half',
3 => '((five), (four(six)))',
4 => 'seven',
5 => 'eight',
6 => 'nine',
)
array (
0 => 'one',
1 => '(two, (three and a half, (four, (five, (six, seven), eight))))',
2 => 'nine',
3 => '(ten, twenty twen twen)',
)
array (
0 => 'ten',
1 => 'four',
2 => '(,)',
3 => 'good buddy',
)
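Since the pattern is fairly dense, here is the same pattern again as a rough annotated sketch, spread over several lines with the `x` (extended) flag so that each piece can carry a comment. The variable names `$pattern`, `$input` and `$parts` and the sample string are only for illustration.

```php
<?php
// The same pattern as in the answer, written with the x flag so that
// whitespace is ignored and "#" starts a comment inside the pattern.
$pattern = '/
    (?>                     # atomic group: never backtrack into a matched token
        (                   # group 1: one balanced parenthetical
            \(              #   literal opening parenthesis
            (?:
                (?>[^()]+)  #   a run of characters that are not parentheses
                |
                (?1)        #   or a nested parenthetical (recursion into group 1)
            )*
            \)              #   literal closing parenthesis
        )
        |
        [^,]+               # otherwise: everything up to the next comma
    )
    \K                      # drop the token from the match so that it is kept
    ,?\s*                   # the part that is consumed: optional comma, then whitespace
/x';

$input = 'one, two, (three, (four, five)), six';
$parts = preg_split($pattern, $input, 0, PREG_SPLIT_NO_EMPTY);
var_export($parts);
```

The balanced-group branch comes first, so a token that starts with `(` is only treated as a group when a matching `)` exists; otherwise the `[^,]+` branch takes over and the parenthesis is handled like any other character.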
Here's a related answer which recursively traverses parenthetical groups and reverses the order of comma-separated values on each level: "Reverse the order of parenthetically grouped text and reverse the order of parenthetical groups".
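If a recursive pattern feels like overkill and the input is known to be balanced, a plain depth counter should give the same splits. This is only a minimal sketch under that assumption: `splitTopLevel` is a hypothetical helper name, and on unbalanced input it will not behave like the regex above (an unmatched `(` keeps the depth above zero, so the rest of the string ends up in one element).

```php
<?php
// Hypothetical helper: split on commas that sit outside any parentheses.
// Assumes the parentheses in $input are balanced.
function splitTopLevel(string $input): array
{
    $parts = [];
    $buffer = '';
    $depth = 0;

    foreach (str_split($input) as $char) {
        if ($char === '(') {
            $depth++;
        } elseif ($char === ')' && $depth > 0) {
            $depth--;
        }

        // A comma only separates values when we are not inside parentheses.
        if ($char === ',' && $depth === 0) {
            $parts[] = trim($buffer);
            $buffer = '';
            continue;
        }

        $buffer .= $char;
    }
    $parts[] = trim($buffer);

    // Drop empty elements, similar in spirit to PREG_SPLIT_NO_EMPTY.
    return array_values(array_filter($parts, fn ($part) => $part !== ''));
}

var_export(splitTopLevel('one, two, three, (four, (five, six), (ten)), seven'));
```

The loop works on single bytes, which is safe here because the comma and both parentheses are ASCII and the buffer is reassembled byte for byte.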