Escaping double-slashes with regular expressions in Java

I have this unit test:

public void testDeEscapeResponse() {
    final String[] inputs = new String[] {"peque\\\\u0f1o", "peque\\u0f1o"};
    final String[] expected = new String[] {"peque\\u0f1o", "peque\\u0f1o"};
    for (int i = 0; i < inputs.length; i++) {
        final String input = inputs[i];
        final String actual = QTIResultParser.deEscapeResponse(input);
        Assert.assertEquals(
            "deEscapeResponse did not work correctly", expected[i], actual);
    }
}

I have this method:

static String deEscapeResponse(String str) {
    return str.replaceAll("\\\\", "\\");
}

The unit test is failing with this error:

java.lang.StringIndexOutOfBoundsException: String index out of range: 1
    at java.lang.String.charAt(String.java:686)
    at java.util.regex.Matcher.appendReplacement(Matcher.java:703)
    at java.util.regex.Matcher.replaceAll(Matcher.java:813)
    at java.lang.String.replaceAll(String.java:2189)
    at com.acme.MyClass.deEscapeResponse
    at com.acme.MyClassTest.testDeEscapeResponse

Why?

Breaststroke answered 14/6, 2011 at 19:19

Use String.replace, which does a literal replacement, instead of String.replaceAll, which uses regular expressions.

Example:

"peque\\\\u0f1o".replace("\\\\", "\\")    //  gives  peque\u0f1o

String.replaceAll takes a regular expression, so the Java literal "\\\\" becomes the regex \\, which in turn matches a single \. The replacement string gives \ special treatment as well: it escapes the character that follows it, so the lone \ in "\\" has nothing left to escape, and that is what triggers the exception.

To make String.replaceAll work as you expect here, you would need to do

"peque\\\\u0f1o".replaceAll("\\\\\\\\", "\\\\")
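
To make the difference concrete, here is a minimal sketch that puts both variants side by side. The class name DeEscapeDemo and the method names are made up for illustration; only String.replace and String.replaceAll come from the JDK.

```java
public class DeEscapeDemo {

    // Literal replacement: neither argument is parsed as a regex,
    // so each Java literal only needs the usual string-level escaping.
    static String deEscapeLiteral(String str) {
        return str.replace("\\\\", "\\");
    }

    // Regex replacement: the pattern must match two literal backslashes
    // (four in the regex, eight in the Java literal), and the replacement
    // must escape its single backslash (two in the replacement string,
    // four in the Java literal).
    static String deEscapeRegex(String str) {
        return str.replaceAll("\\\\\\\\", "\\\\");
    }

    public static void main(String[] args) {
        // At runtime this string holds two backslashes before u0f1o.
        String input = "peque\\\\u0f1o";
        // Both calls print the text with a single backslash before u0f1o.
        System.out.println(deEscapeLiteral(input));
        System.out.println(deEscapeRegex(input));
    }
}
```

Both methods should return the same result for any input; the literal version is simply easier to read and harder to get wrong.
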
Staff answered 14/6, 2011 at 19:26 Comment(3)
You know, I thought about using replace, but I thought it was exactly the same as replaceAll, except that it only replaced the first instance. Thanks!! - Breaststroke
Ah, I can see why you thought so when there is a replaceAll method :-) - Staff
...though there is actually a String.replaceFirst method for that :-) - Staff

I think the problem is that you're using replaceAll() instead of replace(). replaceAll expects a regular expression as its first argument, and you're just trying to match a literal string.

Nauplius answered 14/6, 2011 at 19:27

See javadoc for Matcher:

Note that backslashes (\) and dollar signs ($) in the replacement string may cause the results to be different than if it were being treated as a literal replacement string. Dollar signs may be treated as references to captured subsequences as described above, and backslashes are used to escape literal characters in the replacement string.

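Thus with replaceAll, a backslash in the replacement string is never taken literally on its own: it must itself be escaped, or the whole replacement must be passed through Matcher.quoteReplacement. A rather convoluted workaround for your case, which avoids writing a backslash in the replacement at all, would be str.replaceAll("\\\\(\\\\)", "$1")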
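
As a rough sketch of the two regex-based options, assuming a hypothetical class QuoteReplacementDemo with made-up method names: the first method lets Pattern.quote and Matcher.quoteReplacement (both standard JDK methods) do the escaping, and the second is the capture-group workaround.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuoteReplacementDemo {

    // Let the JDK escape both sides: Pattern.quote makes the search text
    // a literal pattern, and Matcher.quoteReplacement escapes any
    // backslashes and dollar signs in the replacement.
    static String deEscapeQuoted(String str) {
        return str.replaceAll(Pattern.quote("\\\\"),
                              Matcher.quoteReplacement("\\"));
    }

    // Capture-group workaround: match a backslash followed by a captured
    // backslash, then put back only the captured one.
    static String deEscapeWithGroup(String str) {
        return str.replaceAll("\\\\(\\\\)", "$1");
    }

    public static void main(String[] args) {
        // At runtime this string holds two backslashes before u0f1o.
        String input = "peque\\\\u0f1o";
        // Both calls print the text with a single backslash before u0f1o.
        System.out.println(deEscapeQuoted(input));
        System.out.println(deEscapeWithGroup(input));
    }
}
```

Neither variant has any advantage over plain String.replace for this particular task; they matter mainly when the search side genuinely needs a regular expression.
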

Enaenable answered 14/6, 2011 at 19:26
