Escape string for use in Javascript regex [duplicate]

Possible Duplicate:
Is there a RegExp.escape function in Javascript?

I am trying to build a javascript regex based on user input:

function FindString(input) {
    var reg = new RegExp('' + input + '');
    // [snip] perform search
}

But the regex will not work correctly when the user input contains a ? or * because they are interpreted as regex specials. In fact, if the user puts an unbalanced ( or [ in their string, the regex isn't even valid.
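For illustration, here is a minimal sketch of both failure modes. The sample inputs are made up, not taken from the original question:

```javascript
// Hypothetical inputs, chosen only to illustrate the two failure modes.

// 1. Metacharacters silently change the meaning of the search.
var loose = new RegExp('a.c');             // "." matches any character
var matchesWrongText = loose.test('abc');  // true, although "a.c" does not occur in "abc"

// 2. Unbalanced brackets make the pattern invalid, so the constructor throws.
var threw = false;
try {
    new RegExp('(');
} catch (e) {
    threw = e instanceof SyntaxError;
}
```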

What is the javascript function to correctly escape all special characters for use in regex?

Garda answered 10/8, 2010 at 5:1 Comment(2)
Lodash has a dedicated escapeRegExp function: lodash.com/docs#escapeRegExp - Zurek
@YvesM. Maybe this is correct behavior, but that function doesn't escape `` characters. That would have unwanted effects on most regex strings I want to escape. - Pugilism

To escape the RegExp itself:

function escapeRegExp(string) {
    return string.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'); // $& means the whole matched string
}

To escape a replacement string:

function escapeReplacement(string) {
    return string.replace(/\$/g, '$$$$');
}

Example

All escaped RegExp characters:

escapeRegExp("All of these should be escaped: \\ ^ $ * + ? . ( ) | { } [ ]"); // "\\" in the literal is one backslash
>>> "All of these should be escaped: \\ \^ \$ \* \+ \? \. \( \) \| \{ \} \[ \]"

Find & replace a string:

var haystack = "I love $x!";

var needle = "$x";
var safeNeedle = escapeRegExp(needle); // "\\$x"

var replacement = "$100 bills";
var safeReplacement = escapeReplacement(replacement); // "$$100 bills"

haystack.replace(
  new RegExp(safeNeedle, 'g'),
  safeReplacement // already escaped above; escaping it a second time would produce "$$100 bills"
);
// "I love $100 bills!"
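As an alternative sketch (not part of the MDN snippet above): when the replacement is passed as a function, its return value is inserted literally, so $ patterns are not interpreted and escapeReplacement is not needed. The escapeRegExp here is the same function as above, repeated only so that the snippet runs on its own. The sample replacement string is made up.

```javascript
function escapeRegExp(string) {
    return string.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

var haystack = "I love $x!";
var needle = "$x";
var replacement = "$& bills"; // in a string replacement, "$&" expands to the matched text

// String replacement: "$&" is expanded to the match ("$x").
var expanded = haystack.replace(new RegExp(escapeRegExp(needle), 'g'), replacement);

// Function replacement: the returned string is used as-is.
var literal = haystack.replace(new RegExp(escapeRegExp(needle), 'g'), function () {
    return replacement;
});
```

Here expanded ends up containing the matched text, while literal keeps the "$&" characters untouched.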

(NOTE: the above is not the original answer; it was edited to show the version from MDN. As a result, it does not match the code in the npm package below, and it does not match the long answer below. The comments are also now confusing. My recommendation: use the above, or get it from MDN, and ignore the rest of this answer. - Darren, Nov 2019)

Install

Available on npm as escape-string-regexp

npm install --save escape-string-regexp

Note

See MDN: Javascript Guide: Regular Expressions

Other symbols (~`!@# ...) MAY be escaped without consequence, but are not required to be.

Test Case: A typical url

escapeRegExp("/path/to/resource.html?search=query");

>>> "\/path\/to\/resource\.html\?search=query"
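The escaped slashes in that output come from the long version below; the short MDN-based function leaves / untouched. A quick sketch, under the assumption that the pattern is passed to the RegExp constructor rather than pasted into a regex literal, suggesting that both forms match the same text (escapeShort is an illustrative name):

```javascript
// escapeShort is an illustrative name for the MDN-based function shown earlier.
function escapeShort(string) {
    return string.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

var url = "/path/to/resource.html?search=query";
var escaped = escapeShort(url); // slashes are left alone; only "." and "?" gain a backslash

// Both patterns match the literal URL and nothing looser.
var withPlainSlash = new RegExp('^' + escaped + '$');
var withEscapedSlash = new RegExp('^\\/path\\/to\\/resource\\.html\\?search=query$');

var bothMatch = withPlainSlash.test(url) && withEscapedSlash.test(url);
var rejectsLookalike = !withPlainSlash.test("/path/to/resourceXhtml?search=query");
```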

The Long Answer

If you're going to use the function above, at least link to this Stack Overflow post in your code's documentation so that it doesn't look like crazy, hard-to-test voodoo.

var escapeRegExp;

(function () {
  // Referring to the table here:
  // https://developer.mozilla.org/en/JavaScript/Reference/Global_Objects/regexp
  // these characters should be escaped
  // \ ^ $ * + ? . ( ) | { } [ ]
  // These characters only have special meaning inside of brackets
  // they do not need to be escaped, but they MAY be escaped
  // without any adverse effects (to the best of my knowledge and casual testing)
  // : ! , = 
  // my test "~!@#$%^&*(){}[]`/=?+\|-_;:'\",<.>".match(/[\#]/g)

  var specials = [
        // order matters for these
          "-"
        , "["
        , "]"
        // order doesn't matter for any of these
        , "/"
        , "{"
        , "}"
        , "("
        , ")"
        , "*"
        , "+"
        , "?"
        , "."
        , "\\"
        , "^"
        , "$"
        , "|"
      ]

      // I choose to escape every character with '\'
      // even though only some strictly require it when inside of []
    , regex = RegExp('[' + specials.join('\\') + ']', 'g')
    ;

  escapeRegExp = function (str) {
    return str.replace(regex, "\\$&");
  };

  // test escapeRegExp("/path/to/res?search=this.that")
}());
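One caveat worth sketching, prompted by the comments about stricter checking: the function above also escapes - and /, and a pattern containing \- outside a character class is rejected once the u (unicode) flag is set. The snippet below writes the same set of characters as a single literal so that it runs on its own; escapeLong is an illustrative name, not part of the original answer.

```javascript
// escapeLong is an illustrative stand-in for the long version above:
// same set of escaped characters, written as a single regex literal.
function escapeLong(str) {
    return str.replace(/[-[\]/{}()*+?.\\^$|]/g, '\\$&');
}

var pattern = escapeLong('a-b'); // the hyphen gains a backslash

// Without the u flag the extra escape is tolerated.
var plainOk = new RegExp(pattern).test('a-b');

// With the u flag, "\-" outside a character class is a syntax error.
var unicodeThrows = false;
try {
    new RegExp(pattern, 'u');
} catch (e) {
    unicodeThrows = e instanceof SyntaxError;
}
```

If the u flag is needed, the short MDN-based function at the top of this answer avoids the problem because it does not escape the hyphen.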
Cute answered 6/8, 2011 at 22:47 Comment(26)
Wow, that's verbose. I prefer bobince's version. But anything that works without escaping things unnecessarily... - Conal
I expect all of the characters that SHOULD be escaped, not just the ones that MUST be escaped, which is what linters such as JSLint understand. - Cute
If someone knows, please: why is / to be escaped? It's not in the list of characters, but it is in the regex both in this example and in the included MDN page's regex. - Michaelamichaele
A literal regex is like /blah/i. A literal comment is // blah. So to prevent ambiguity /// becomes /\// and /blah/i/ becomes /blah\/i/. Make sense? - Cute
Why is it replaced by '\\$&'? What is that supposed to mean? I am sorry, I am a JS newbie. - Fistic
@SushantGupta The "\\" adds the new backslash which escapes the matched special regex character. The "$&" is a back-reference to the contents of the current pattern match, adding the original special regex character. - Vivisect
I think the ":" as e.g. used in "(?:x)" should also be escaped. The same applies for "=" as in "x(?=y)", the "!" as in "x(?!y)" and the "," as in "{n,m}". Or do you think, since the brackets are masked, these characters don't need to be masked??? - Krefetz
? and [ are escaped, which means that :, =, and ! have no special meaning. Check the comments in the code in the "Long Answer" block. - Cute
You should also add the "m" for multiline. - Thorma
Most of these characters don't need to be escaped within a character class. Dash and forward slash don't need to be escaped at all. So, this can be simplified as: return str.replace(/[[{}()*+?^$|\]\.\\]/g, "\\$&"); - Pastry
I have a fairly complex regex (that needed to be broken down to 80 chars per line due to employer's coding conventions) and this version didn't work, but bobince's version did. - Protractor
@CoolAJ86 forward slashes shouldn't be escaped, they are not control characters in strings passed to RegExp. The only time you need to escape them is if you're generating JavaScript source code for use within a regex literal, which is a completely different problem and which your post doesn't even start to solve (for example, you don't escape newlines). - Eggcup
Is there a saner way in 2016? - Cherrylchersonese
The short one doesn't work for escaping "c++", which will be converted to c\\\\\\...(endless) - Panatella
Nice answer. I looked for an online tool that does this but couldn't find one so have knocked up an online version at jsfiddle: jsfiddle.net/bwp2m5Lp - Lavonia
Due to the recent ESLint release (no-useless-escape) it fails with this RegExp now by default. - Hightest
Yes most of the escaped chars are redundant as eslint keeps telling me. I've tested and this would seem equivalent: str.replace(/[-[\]/{}()*+?.\\^$|]/g, '\\$&'); - Akins
Why is this hilariously wrong answer upvoted? It fails for even this simple regex: [\s\S]* - Unconditioned
@Kal_Torak: var re = /[\s\S]*/; escapeRegExp(re.toString()); - Cute
@Unconditioned it does not fail, this is because you need to escape the escape in your string as shown by @CoolAJ86 - Rossman
I just added a note to this answer to hopefully be less confusing; linting and checking is more strict in the latest browsers, and escaping - (as shown in the long answer, and the pre-2018 short answer) was causing runtime errors. Using the regex from MDN fixed it for me. - Hulahula
@DarrenCook Assuming that [ is escaped I suppose it should be safe to not escape -. However, if you concatenate a string with - inside of a RegExp with [ before ], then you will in fact need it to be escaped. - Cute
@CoolAJ86 Escaping something inside a character class (square brackets) makes no sense. If you are deliberately inserting end user text into just the character class part of a regex, you need to do something different: detect if any hyphens, and if so remove them all, then put the hyphen at the very start or end of the character class; an inserted close square bracket does need escaping, though. - Hulahula
If you're already using lodash, you can always import and use _.escapeRegExp() - Logogriph
@RichardWilliams They are not redundant with the u flag. See my article: abareplace.com/blog/escape-regexp - Epergne
The NPM package doesn't even come with an escaper for the replacement, so useless. - Storeroom
