Regular expression to replace content between parentheses ()
K

4

8

I tried this code:

string.replaceAll("\\(.*?)","");

But it returns null. What am I missing?

Keratin answered 12/4, 2011 at 13:34 Comment(1)
Note that the frequently offered expression \(.*?\) fails if parentheses are nested. Instead of the lazy dot-star, use the more precise expression [^()]*. See my answer for a better solution. - Riocard
E
21

Try:

string.replaceAll("\\(.*?\\)","");

You didn't escape the closing parenthesis. Both parentheses are regex metacharacters, and inside a Java string literal each regex backslash has to be written as "\\".
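As a minimal sketch of the difference, the following compares the pattern from the question with the corrected one. The class and method names (EscapedParensDemo, compiles, stripParens) are made up for illustration. The unescaped ")" is expected to be rejected when the pattern is compiled, rather than yielding null.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical demo class; the names are illustrative, not from the thread.
final class EscapedParensDemo {
    // Returns true if the regex compiles, false if Pattern rejects it.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    // Removes every "(...)" group with the lazy pattern; assumes no nesting.
    static String stripParens(String input) {
        return input.replaceAll("\\(.*?\\)", "");
    }

    public static void main(String[] args) {
        // The unescaped ")" closes a group that was never opened.
        System.out.println(compiles("\\(.*?)"));
        System.out.println(compiles("\\(.*?\\)"));
        System.out.println(stripParens("keep (drop) keep"));
    }
}
```

Note that removing a group leaves the spaces on both sides of it in place, so the result contains a double space.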

Enthusiasm answered 12/4, 2011 at 13:36 Comment(1)
Works fine if there are no nested parentheses. Otherwise it fails. - Riocard
R
11

First, do you wish to remove the parentheses along with their content? Although the title of the question suggests not, I am assuming that you do wish to remove the parentheses as well.

Secondly, can the content between the parentheses contain nested matching parentheses? This solution assumes yes. Since the Java regex flavor does not support recursive expressions, the solution is to first craft a regex which matches the "innermost" set of parentheses, and then apply this regex in an iterative manner replacing them from the inside-out. Here is a tested Java program which correctly removes (possibly nested) parentheses and their contents:

import java.util.regex.*;
public class TEST {
    public static void main(String[] args) {
        String s = "stuff1 (foo1(bar1)foo2) stuff2 (bar2) stuff3";
        // Matches one innermost pair: "(" + any non-parenthesis chars + ")"
        String re = "\\([^()]*\\)";
        Pattern p = Pattern.compile(re);
        Matcher m = p.matcher(s);
        // Each pass strips the current innermost pairs; repeat until none remain
        while (m.find()) {
            s = m.replaceAll("");
            m = p.matcher(s);
        }
        System.out.println(s);
    }
}

Test Input:

"stuff1 (foo1(bar1)foo2) stuff2 (bar2) stuff3"

Test Output:

"stuff1  stuff2  stuff3"

Note that the lazy-dot-star solution does not work when parentheses are nested, because it fails to match the innermost set of parentheses (i.e. it erroneously matches (foo1(bar1) in the example above). This is a very commonly made regex mistake: never use the dot when a more precise expression is available! In this case, the content between an "innermost" pair of matching parentheses consists of any characters that are not an opening or closing parenthesis (i.e. use [^()]* instead of .*?).
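To make the contrast concrete, here is a small sketch that runs both patterns over the same nested input. The class and method names (NestedParensDemo, stripLazy, stripNested) are invented for this illustration, and the loop is a variant of the program above that stops once a pass changes nothing.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative, not from the thread.
final class NestedParensDemo {
    // One innermost pair: "(" + any non-parenthesis characters + ")".
    private static final Pattern INNERMOST = Pattern.compile("\\([^()]*\\)");

    // Lazy dot-star: stops at the first ")", even if it belongs to an inner pair.
    static String stripLazy(String input) {
        return input.replaceAll("\\(.*?\\)", "");
    }

    // Strips innermost pairs repeatedly until a pass leaves the string unchanged.
    static String stripNested(String input) {
        String previous;
        String current = input;
        do {
            previous = current;
            current = INNERMOST.matcher(previous).replaceAll("");
        } while (!current.equals(previous));
        return current;
    }

    public static void main(String[] args) {
        String s = "stuff1 (foo1(bar1)foo2) stuff2 (bar2) stuff3";
        System.out.println(stripLazy(s));   // leaves a stray "foo2)" behind
        System.out.println(stripNested(s)); // removes both groups completely
    }
}
```

Unbalanced parentheses never form part of a matched pair, so this sketch leaves them in the string untouched.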

Riocard answered 12/4, 2011 at 15:35 Comment(4)
This is an absolutely brilliant answer. Java's regex engine has bitten me again with its lack of support for recursive expressions :/ - Pi
@Riocard What if I have something like this: "stuff1 [foo1[bar1]foo2] stuff2 [bar2] stuff3"? - Tumid
@user3767784 - What is your question? Your string has no parentheses. If you want to match matching [square brackets] instead of matching parentheses, then simply modify the regex and replace each literal parenthesis with a literal square bracket, i.e. String re = "\\[[^\\[\\]]*\\]"; - Riocard
Since 3.8, Apache commons-lang ships with RegExUtils, which provides null-safe helpers such as removeAll and replaceAll. It is built on java.util.regex, so it does not add recursive matching. - Willaims
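For the square-bracket variant raised in the comments, a sketch along the same lines might look like this. The class and method names (NestedBracketsDemo, stripBrackets) are hypothetical. Both brackets are escaped inside the character class, because Java treats an unescaped "[" there as the start of a nested class.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative, not from the thread.
final class NestedBracketsDemo {
    // One innermost pair: "[" + any non-bracket characters + "]".
    private static final Pattern INNERMOST = Pattern.compile("\\[[^\\[\\]]*\\]");

    // Strips innermost pairs repeatedly until a pass leaves the string unchanged.
    static String stripBrackets(String input) {
        String previous;
        String current = input;
        do {
            previous = current;
            current = INNERMOST.matcher(previous).replaceAll("");
        } while (!current.equals(previous));
        return current;
    }

    public static void main(String[] args) {
        System.out.println(stripBrackets("stuff1 [foo1[bar1]foo2] stuff2 [bar2] stuff3"));
    }
}
```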
S
2

Try string.replaceAll("\\(.*?\\)","").

Stonework answered 12/4, 2011 at 13:36 Comment(2)
The ".*?" above is non-greedy: it matches only up to the first closing parenthesis. The greedy form is just ".*". - Middy
@earcam: Yes, thanks, I've already spotted that and edited it out. - Stonework
T
1

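string.replaceAll("\\([^\\)]*\\)","");

This way you are saying: match an opening bracket, then all non-closing-bracket characters, and then a closing bracket. This is usually faster than reluctant or greedy .* matchers, because the negated character class can never run past a closing bracket and so avoids most of the backtracking.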

Trifocals answered 12/4, 2011 at 14:4 Comment(1)
You don't need to escape the parenthesis in a character class. - Nicolais
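A quick way to check the point about escaping is to run both spellings of the negated class side by side. This is only a sketch, and the class and method names (CharClassEscapeDemo, stripEscaped, stripUnescaped) are invented for it.

```java
// Hypothetical demo class; the names are illustrative, not from the thread.
final class CharClassEscapeDemo {
    // Negated class with the ")" escaped, as written in the answer.
    static String stripEscaped(String input) {
        return input.replaceAll("\\([^\\)]*\\)", "");
    }

    // Same class with the ")" left unescaped, as the comment suggests.
    static String stripUnescaped(String input) {
        return input.replaceAll("\\([^)]*\\)", "");
    }

    public static void main(String[] args) {
        String s = "a (b) c";
        System.out.println(stripEscaped(s));
        System.out.println(stripUnescaped(s));
    }
}
```

Both spellings should behave identically. Note that this class excludes only the closing parenthesis, so on nested input it still stops at the first ")" just as the lazy pattern does.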
