String.replaceAll single backslashes with double backslashes
Asked Answered
G

5

137

I'm trying to convert the String \something\ into the String \\something\\ using replaceAll, but I keep getting all kinds of errors. I thought this was the solution:

theString.replaceAll("\\", "\\\\");

But this gives the below exception:

java.util.regex.PatternSyntaxException: Unexpected internal error near index 1
Geez answered 9/11, 2009 at 15:40 Comment(0)
R
235

The String#replaceAll() interprets the argument as a regular expression. The \ is an escape character in both String and regex. You need to double-escape it for regex:

string.replaceAll("\\\\", "\\\\\\\\");

But you don't necessarily need regex for this, simply because you want an exact character-by-character replacement and you don't need patterns here. So String#replace() should suffice:

string.replace("\\", "\\\\");

Update: as per the comments, you appear to want to use the string in JavaScript context. You'd perhaps better use StringEscapeUtils#escapeEcmaScript() instead to cover more characters.

Russophobe answered 9/11, 2009 at 15:45 Comment(3)
Actually, it is used in a JavaScript AST that should be converted back to source. Your solution works. Thanks!Geez
If you want to use String#replaceAll() anyway, you can quote the replacement string with Matcher#quoteReplacement(): theString.replaceAll("\\", Matcher.quoteReplacement("\\\\"));Grapevine
Matcher.quoteReplacement( ... ) is a good way! Please see Pshemo's answer!Sarcous
U
21

TLDR: use theString = theString.replace("\\", "\\\\"); instead.


Problem

replaceAll(target, replacement) uses regular expression (regex) syntax for target and partially for replacement.

Problem is that \ is special character in regex (it can be used like \d to represents digit) and in String literal (it can be used like "\n" to represent line separator or \" to escape double quote symbol which normally would represent end of string literal).

In both these cases to create \ symbol we can escape it (make it literal instead of special character) by placing additional \ before it (like we escape " in string literals via \").

So to target regex representing \ symbol will need to hold \\, and string literal representing such text will need to look like "\\\\".

So we escaped \ twice:

  • once in regex \\
  • once in String literal "\\\\" (each \ is represented as "\\").

In case of replacement \ is also special there. It allows us to escape other special character $ which via $x notation, allows us to use portion of data matched by regex and held by capturing group indexed as x, like "012".replaceAll("(\\d)", "$1$1") will match each digit, place it in capturing group 1 and $1$1 will replace it with its two copies (it will duplicate it) resulting in "001122".

So again, to let replacement represent \ literal we need to escape it with additional \ which means that:

  • replacement must hold two backslash characters \\
  • and String literal which represents \\ looks like "\\\\"

BUT since we want replacement to hold two backslashes we will need "\\\\\\\\" (each \ represented by one "\\\\").

So version with replaceAll can look like

replaceAll("\\\\", "\\\\\\\\");

Easier way with replaceAll

To make out life easier Java provides tools to automatically escape text into target and replacement parts. So now we can focus only on strings, and forget about regex syntax:

replaceAll(Pattern.quote(target), Matcher.quoteReplacement(replacement))

which in our case can look like

replaceAll(Pattern.quote("\\"), Matcher.quoteReplacement("\\\\"))

Even better: use replace

If we don't really need regex syntax support lets not involve replaceAll at all. Instead lets use replace. Both methods will replace all targets, but replace doesn't involve regex syntax. So you could simply write

theString = theString.replace("\\", "\\\\");
Unicameral answered 20/10, 2016 at 20:39 Comment(0)
A
14

To avoid this sort of trouble, you can use replace (which takes a plain string) instead of replaceAll (which takes a regular expression). You will still need to escape backslashes, but not in the wild ways required with regular expressions.

Avertin answered 9/11, 2009 at 15:45 Comment(0)
E
8

You'll need to escape the (escaped) backslash in the first argument as it is a regular expression. Replacement (2nd argument - see Matcher#replaceAll(String)) also has it's special meaning of backslashes, so you'll have to replace those to:

theString.replaceAll("\\\\", "\\\\\\\\");
Evered answered 9/11, 2009 at 15:43 Comment(0)
R
3

Yes... by the time the regex compiler sees the pattern you've given it, it sees only a single backslash (since Java's lexer has turned the double backwhack into a single one). You need to replace "\\\\" with "\\\\", believe it or not! Java really needs a good raw string syntax.

Respire answered 9/11, 2009 at 15:44 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.