RegExp in PHP. Get text between first level parentheses
I have two types of strings in one text:

a(bc)de(fg)h

a(bcd(ef)g)h

I need to get the text between the first-level parentheses. In my example, that is:

bc

fg

bcd(ef)g

I tried the regular expression /\((.+)\)/ with the Ungreedy (U) flag, which gives:

bc

fg

bcd(ef

And without it:

bc)de(fg

bcd(ef)g

Neither variant does what I need. Does anyone know how to solve this?

Wanting answered 8/4, 2017 at 13:10 Comment(3)
Why regex? A simple iteration over the string would work. – Harrisonharrod
Are you trying to process each string separately, or are they part of arbitrary text? – Vermifuge
These strings are part of one text. – Wanting
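
The iteration suggested in the first comment could be sketched roughly as below. This is only an illustration, not code from the thread: the function name topLevelByScan is hypothetical, and the handling of unbalanced input (stray closing parentheses are ignored, an unclosed opening parenthesis yields no group) is an assumption about what would be wanted.

```php
<?php
// Hypothetical helper: collect the text inside each first-level pair of
// parentheses by tracking the nesting depth while scanning the string.
function topLevelByScan(string $text): array
{
    $groups = [];
    $depth = 0;   // current nesting depth
    $start = 0;   // offset just after the opening "(" of the current top-level pair
    $length = strlen($text);

    for ($i = 0; $i < $length; $i++) {
        $char = $text[$i];
        if ($char === '(') {
            if ($depth === 0) {
                $start = $i + 1;
            }
            $depth++;
        } elseif ($char === ')' && $depth > 0) {
            $depth--;
            if ($depth === 0) {
                // Closed a top-level pair: keep everything between its parentheses.
                $groups[] = substr($text, $start, $i - $start);
            }
        }
    }

    return $groups;
}

print_r(topLevelByScan('a(bc)de(fg)h some text a(bcd(ef)g)h'));
// Array ( [0] => bc [1] => fg [2] => bcd(ef)g )
```

Because parentheses are single-byte ASCII characters, the byte-wise scan should also be safe on UTF-8 text.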
R
2

This question pretty much has the answer, but the implementations are a little ambiguous. You can use the logic in the accepted answer without the ~s to get this regex:

\(((?:[^\(\)]++|(?R))*)\)

Tested with this output:

[image: screenshot of the test output]

Respectful answered 8/4, 2017 at 13:41 Comment(0)
V
3

Use a PCRE recursive pattern to match substrings in nested parentheses:

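$str = "a(bc)de(fg)h some text a(bcd(ef)g)h ";
// (?R) recurses into the whole pattern, so a nested pair stays inside its parent match
preg_match_all('/\((((?>[^()]+)|(?R))*)\)/', $str, $m);

// $m[1] holds the text inside each first-level pair of parentheses
print_r($m[1]);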

The output:

Array
(
    [0] => bc
    [1] => fg
    [2] => bcd(ef)g
)

\( ( (?>[^()]+) | (?R) )* \)

First it matches an opening parenthesis. Then it matches any number of substrings which can either be a sequence of non-parentheses, or a recursive match of the pattern itself (i.e. a correctly parenthesized substring). Finally, there is a closing parenthesis.
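
For readers who prefer the breakdown inline, the same pattern can be written in PCRE's extended mode (the x flag), which allows whitespace and # comments inside the pattern. This is a sketch only: the wrapper function topLevelGroups is a hypothetical name, and the inner alternation is made non-capturing here, which differs slightly from the pattern above but should not change the contents of group 1.

```php
<?php
// Hypothetical wrapper around the recursive pattern, written with the x flag
// so each part of the pattern can carry its own comment.
function topLevelGroups(string $text): array
{
    $pattern = '/
        \(                  # an opening parenthesis
        (                   # group 1: the text inside this pair
            (?:
                (?>[^()]+)  # a run of non-parenthesis characters (atomic group)
              |
                (?R)        # or a nested pair, matched by recursing into the whole pattern
            )*
        )
        \)                  # the matching closing parenthesis
    /x';

    preg_match_all($pattern, $text, $matches);

    return $matches[1];
}

print_r(topLevelGroups('a(bc)de(fg)h some text a(bcd(ef)g)h'));
// Array ( [0] => bc [1] => fg [2] => bcd(ef)g )
```

Note that the comments inside the pattern must not contain the delimiter character (here /), since PHP would treat it as the end of the pattern.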


Technical cautions:

If there are more than 15 capturing parentheses in a pattern, PCRE has to obtain extra memory to store data during a recursion, which it does by using pcre_malloc, freeing it via pcre_free afterwards. If no memory can be obtained, it saves data for the first 15 capturing parentheses only, as there is no way to give an out-of-memory error from within a recursion.

Vermifuge answered 8/4, 2017 at 13:48 Comment(2)
How would one match the second level here, i.e. ef (out of curiosity)? +1 anyway. – Panne
@Jan, (ef) becomes a recursive match of the pattern itself (i.e. of the "parent" pattern). – Vermifuge
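
One possible way to get at a deeper level such as ef, sketched here as an assumption rather than something proposed in the thread, is to apply the same recursive pattern again to each captured group. The function name groupsAtDepth and its $depth parameter are hypothetical.

```php
<?php
// Hypothetical helper: return the text inside every pair of parentheses that
// sits at the given nesting depth (1 = first level, 2 = second level, ...).
function groupsAtDepth(string $text, int $depth): array
{
    // Group 1 captures the contents of each outermost pair found in $text.
    preg_match_all('/\(((?:[^()]++|(?R))*)\)/', $text, $m);

    if ($depth <= 1) {
        return $m[1];
    }

    // Go one level deeper by searching inside each captured group.
    $result = [];
    foreach ($m[1] as $inner) {
        $result = array_merge($result, groupsAtDepth($inner, $depth - 1));
    }

    return $result;
}

print_r(groupsAtDepth('a(bc)de(fg)h some text a(bcd(ef)g)h', 2));
// Array ( [0] => ef )
```

With $depth set to 1 this returns the same first-level groups as the answer's pattern.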
D
-1

Could you try this:

preg_match("/\((.+)\)/", $input_line, $output_array);

You can test this code at http://www.phpliveregex.com/

Regex: \((.+)\)
Input: a(bcd(eaerga(er)gaergf)g)h
Output: array(2
   0    =>  (bcd(eaerga(er)gaergf)g)
   1    =>  bcd(eaerga(er)gaergf)g
)
Danuloff answered 8/4, 2017 at 13:27 Comment(0)
