How to split a long regular expression into multiple lines in JavaScript?

H

11

196

I have a very long regular expression that I want to split across multiple lines in my JavaScript code, so that each line stays within 80 characters as required by JSLint rules. I think it's also better for readability. Here's a sample pattern:

var pattern = /^(([^<>()[\]\\.,;:\s@\"]+(\.[^<>()[\]\\.,;:\s@\"]+)*)|(\".+\"))@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\])|(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}))$/;

Hendrix answered 7/9, 2012 at 11:17 Comment(6)

It seems you're (trying to) validate e-mail addresses. Why not simply do /\S+@\S+\.\S+/ ? – Ramiform 7/9, 2012 at 11:21

You should probably look to find a way to do that without a regular expression, or with multiple smaller regular expressions. That would be much more readable than a regular expression that long. If your regular expression is more than about 20 characters, there's probably a better way to do it. – Defrock 7/9, 2012 at 11:22

Isn't 80 characters kind of obsolete nowadays with wide monitors? – Damon 7/9, 2012 at 12:4

@OlegV.Volkov No. A person could be using split windows in vim, a virtual terminal in a server room. It is wrong to assume everyone will be coding in the same viewport as you. Furthermore, limiting your lines to 80 chars forces you to break up your code into smaller functions. – Expedition 10/10, 2012 at 21:23

Well, I certainly see your motivation for wanting to do this here - once this regex is split over multiple lines, as demonstrated by Koolilnc, it immediately becomes a perfect example of readable, self-documenting code. ¬_¬ – Wooer 10/6, 2014 at 13:41

@OlegV.Volkov, it can still be convenient to split a wide monitor into several windows. For instance, in one window you have your text editor, and in another your unit tests run – Pitts 4/8, 2016 at 10:13

M

142

[Edit 2022/08] Created a small github repository to create regular expressions with spaces, comments and templating.

You could convert it to a string and create the expression by calling new RegExp():

var myRE = new RegExp (['^(([^<>()[\\]\\\\.,;:\\s@"]+(\\.[^<>()[\\]\\\\.,;:\\s@"]+)*)',
                        '|(".+"))@((\\[[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.',
                        '[0-9]{1,3}\\])|(([a-zA-Z\\-0-9]+\\.)+',
                        '[a-zA-Z]{2,}))$'].join(''));

Notes:

when converting the expression literal to a string you need to escape all backslashes as backslashes are consumed when evaluating a string literal. (See Kayo's comment for more detail.)
RegExp accepts modifiers as a second parameter

/regex/g => new RegExp('regex', 'g')
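The backslash note above can be checked directly. The sketch below is only an illustration (the variable names are made up for this example): it compares a literal with the equivalent constructor call, and shows String.raw as one possible way to avoid doubling the backslashes.

```javascript
// Illustrative names only; none of these come from the answer above.
const literal = /Hey\sthere/g;

// In a normal string literal, "\\s" is needed to produce the two characters \s.
const fromString = new RegExp('Hey\\sthere', 'g');

// String.raw leaves backslashes untouched, so the pattern can be pasted as-is.
const fromRaw = new RegExp(String.raw`Hey\sthere`, 'g');

// A single backslash is consumed by the string literal: this pattern is "Heysthere".
const broken = new RegExp('Hey\sthere');

console.log(fromString.source === literal.source); // true
console.log(fromRaw.source === literal.source);    // true
console.log(broken.test('Hey there'));             // false
```

With String.raw, a pattern that itself contains a backtick or the sequence ${ would still need special handling, so this is not a drop-in replacement for every literal.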

[Addition ES20xx (tagged template)]

In ES20xx you can use tagged templates. See the snippet.

Note:

Disadvantage here is that you can't use plain whitespace in the regular expression string (always use \s, \s+, \s{1,x}, \t, \n etc).

(() => {
  const createRegExp = (str, opts) => 
    new RegExp(str.raw[0].replace(/\s/gm, ""), opts || "");
  const yourRE = createRegExp`
    ^(([^<>()[\]\\.,;:\s@\"]+(\.[^<>()[\]\\.,;:\s@\"]+)*)|
    (\".+\"))@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\])|
    (([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}))$`;
  console.log(yourRE);
  const anotherLongRE = createRegExp`
    (\byyyy\b)|(\bm\b)|(\bd\b)|(\bh\b)|(\bmi\b)|(\bs\b)|(\bms\b)|
    (\bwd\b)|(\bmm\b)|(\bdd\b)|(\bhh\b)|(\bMI\b)|(\bS\b)|(\bMS\b)|
    (\bM\b)|(\bMM\b)|(\bdow\b)|(\bDOW\b)
    ${"gi"}`;
  console.log(anotherLongRE);
})();
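To make the whitespace limitation concrete, here is a small sketch. stripWs is a hypothetical tag written for this illustration, not the createRegExp helper above: it ignores interpolations and removes all whitespace from the raw template text. A literal space therefore has to be spelled as an escape such as \x20.

```javascript
// Hypothetical helper for illustration: drops every whitespace character
// from the raw template text, then compiles what is left.
const stripWs = (strings) => new RegExp(strings.raw[0].replace(/\s/g, ''));

// The space between the words is removed along with the line breaks.
const lost = stripWs`
  hello world
`;

// \x20 is a hex escape for the space character and survives the stripping.
const kept = stripWs`
  hello \x20 world
`;

console.log(lost.source);              // helloworld
console.log(kept.source);              // hello\x20world
console.log(kept.test('hello world')); // true
```

Note that a character class like [ ] would not work here, because the space inside the brackets is stripped as well.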

Marijn answered 7/9, 2012 at 11:20 Comment(6)

A new RegExp is a great way for multiline regular expressions. Instead of joining arrays, you can just use a string concatenation operator: var reg = new RegExp('^([a-' + 'z]+)$','i'); – Ellary 22/4, 2014 at 12:29

Caution: A long regular expression literal could be broken into multiple lines using the above answer. However it needs care because you can't simply copy the regular expression literal (defined with //) and paste it as the string argument to the RegExp constructor. This is because backslash characters get consumed when evaluating the string literal. Example: /Hey\sthere/ cannot be replaced by new RegExp("Hey\sthere"). Instead it should be replaced by new RegExp("Hey\\sthere") Note the extra backslash! Hence I prefer to just leave a long regex literal on one long line – Byronbyrum 27/4, 2014 at 4:37

An even clearer way to do this is to create named variables holding meaningful subsections, and joining those as strings or in an array. That lets you construct the RegExp in a way that is much easier to understand. – Kathie 3/10, 2014 at 17:12

Also MDN recommends to use literal notation when the regex will remain constant, versus the constructor notation when the regex can change. developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/… – Meyer 15/9, 2021 at 17:2

Replacing .replace(/\s/gm, "") with .replace(/( #.*|\s)/gm, "") will also enable # comments (like Ruby), which require at least one space before the #. – Babirusa 11/4, 2022 at 7:28

@AkinHwan I don't believe it says that anymore. But in any case, I would almost always optimize for maintainability. It's very easy (and pretty much normalized) to create regexes that look like hieroglyphics. It's almost a badge of honor. Far better to write something that can easily be parsed by Future Me, let alone another dev. – Rochet 18/12, 2022 at 4:9

P

169

Extending @KooiInc's answer, you can avoid manually escaping every special character by using the source property of the RegExp object.

Example:

var urlRegex = new RegExp(
  /(?:(?:(https?|ftp):)?\/\/)/.source       // protocol
  + /(?:([^:\n\r]+):([^@\n\r]+)@)?/.source  // user:pass
  + /(?:(?:www\.)?([^/\n\r]+))/.source      // domain
  + /(\/[^?\n\r]+)?/.source                 // request
  + /(\?[^#\n\r]*)?/.source                 // query
  + /(#?[^\n\r]*)?/.source                  // anchor
);

or if you want to avoid repeating the .source property you can do it using the Array.map() function:

var urlRegex = new RegExp([
  /(?:(?:(https?|ftp):)?\/\/)/,     // protocol
  /(?:([^:\n\r]+):([^@\n\r]+)@)?/,  // user:pass
  /(?:(?:www\.)?([^/\n\r]+))/,      // domain
  /(\/[^?\n\r]+)?/,                 // request
  /(\?[^#\n\r]*)?/,                 // query
  /(#?[^\n\r]*)?/,                  // anchor
].map(function (r) { return r.source; }).join(''));

In ES6 the map function can be reduced to: .map(r => r.source).
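Note that .source holds only the pattern text, so any flags on the individual pieces are dropped when they are joined. Below is a minimal sketch of passing flags explicitly; joinSources and dateRegex are names invented for this example, not part of the answer.

```javascript
// Hypothetical helper: joins the sources of several RegExp objects and
// applies the flags given as the second argument.
function joinSources(parts, flags) {
  return new RegExp(parts.map(function (r) { return r.source; }).join(''), flags);
}

var dateRegex = joinSources([
  /(\d{4})-/,   // year
  /(\d{2})-/,   // month
  /(\d{2})/i    // day (this piece's own "i" flag is ignored)
], 'g');

console.log(dateRegex.source); // (\d{4})-(\d{2})-(\d{2})
console.log(dateRegex.flags);  // g

// With the "g" flag, match() returns every full match:
// 2012-09-07 and 2016-01-12
console.log('2012-09-07 and 2016-01-12'.match(dateRegex));
```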

Piotrowski answered 12/1, 2016 at 22:34 Comment(5)

This is really convenient for adding comments to a long regexp. However, it is limited by having matching parentheses on the same line. – Rabid 12/7, 2018 at 2:22

Definitely, this! Super nice with the ability to comment each sub-regex. – Cypripedium 26/3, 2019 at 0:52

Thanks, it helped putting source in regex function – Lucie 15/1, 2020 at 6:17

Very clever. Thanks, this idea helped me a lot. Just as a side note: I encapsulated the whole thing in a function to make it even cleaner: combineRegex = (...regex) => new RegExp(regex.map(r => r.source).join("")) Usage: combineRegex(/regex1/, /regex2/, ...) – Hakeem 30/4, 2020 at 12:33

Also you should probably add the limitations of this method. (I.e. matching parenthesis, etc.) – Hakeem 30/4, 2020 at 12:35

S

34

Using strings in new RegExp is awkward because you must escape all the backslashes. You may write smaller regexes and concatenate them.

Let's split this regex

/^foo(.*)\bar$/

We will use a function to make things more beautiful later

function multilineRegExp(regs, options) {
    return new RegExp(regs.map(
        function(reg){ return reg.source; }
    ).join(''), options);
}

And now let's rock

var r = multilineRegExp([
     /^foo/,  // we can add comments too
     /(.*)/,
     /\bar$/
]);

Since it has a cost, try to build the real regex just once and then use that.
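One way to pay that cost only once is to build the combined regex at script or module level and reuse it inside functions. A rough sketch follows; joinParts, FOO_BAR and isFooBar are hypothetical names standing in for the helper above and your own code.

```javascript
// Stand-in for the helper above, repeated here so the sketch runs on its own.
function joinParts(regs, options) {
  return new RegExp(regs.map(function (reg) { return reg.source; }).join(''), options);
}

// Built once, when the script is loaded.
var FOO_BAR = joinParts([
  /^foo/,
  /(.*)/,
  /\bar$/
]);

// Reuses the prebuilt regex instead of rebuilding it on every call.
function isFooBar(text) {
  return FOO_BAR.test(text);
}

console.log(isFooBar('foo ar')); // true  (word boundary before "ar")
console.log(isFooBar('foobar')); // false (no boundary between "b" and "a")
```

One thing to keep in mind: a shared regex built with the g or y flag keeps state in lastIndex between test() calls, so this pattern is simplest with regexes that do not use those flags.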

Stigmatize answered 14/6, 2015 at 23:37 Comment(3)

This is very cool -- not only you don't have to do additional escaping, but also you keep the special syntax highlight for the sub-regexes! – Playbill 24/7, 2020 at 13:41

one caveat though: you need to make sure your sub-regexes are self-contained, or wrap each in a new bracket group. Example: multilineRegExp([/a|b/, /c|d/]) results in /a|bc|d/, while you meant (a|b)(c|d). – Playbill 24/7, 2020 at 13:52

this makes it impossible to break a big, complex regex group across multiple lines, as @Playbill mentioned, since you can't do multilineRegExp([/a (/, /cold/, /|hot/, /) drink/]) – Muscid 31/1, 2022 at 11:1
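The alternation caveat from these comments can be sketched as follows. joinGrouped is a hypothetical variant, not part of the answer: it wraps each piece in a non-capturing group so that an alternation inside one piece cannot leak into its neighbours. It does not help with a group that has to open in one piece and close in another.

```javascript
// Plain joining, as in the answer: alternations in adjacent pieces merge.
function joinPlain(regs) {
  return new RegExp(regs.map(function (reg) { return reg.source; }).join(''));
}

// Hypothetical variant: each piece becomes (?:piece), which keeps it
// self-contained without adding capture groups.
function joinGrouped(regs) {
  return new RegExp(regs.map(function (reg) {
    return '(?:' + reg.source + ')';
  }).join(''));
}

console.log(joinPlain([/a|b/, /c|d/]).source);   // a|bc|d
console.log(joinGrouped([/a|b/, /c|d/]).source); // (?:a|b)(?:c|d)
```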

O

19

Thanks to the wondrous world of template literals you can now write big, multi-line, well-commented, and even semantically nested regexes in ES6.

//build regexes without worrying about
// - double-backslashing
// - adding whitespace for readability
// - adding in comments
let clean = (piece) => (piece
    .replace(/((^|\n)(?:[^\/\\]|\/[^*\/]|\\.)*?)\s*\/\*(?:[^*]|\*[^\/])*(\*\/|)/g, '$1')
    .replace(/((^|\n)(?:[^\/\\]|\/[^\/]|\\.)*?)\s*\/\/[^\n]*/g, '$1')
    .replace(/\n\s*/g, '')
);
window.regex = ({raw}, ...interpolations) => (
    new RegExp(interpolations.reduce(
        (regex, insert, index) => (regex + insert + clean(raw[index + 1])),
        clean(raw[0])
    ))
);

Using this you can now write regexes like this:

let re = regex`I'm a special regex{3} //with a comment!`;

Outputs

/I'm a special regex{3}/

Or what about multiline?

'123hello'
    .match(regex`
        //so this is a regex

        //here I am matching some numbers
        (\d+)

        //Oh! See how I didn't need to double backslash that \d?
        ([a-z]{1,3}) /*note to self, this is group #2*/
    `)
    [2]

Outputs hel, neat!
"What if I need to actually search a newline?", well then use \n silly!
Working on my Firefox and Chrome.

Okay, "how about something a little more complex?"
Sure, here's a piece of an object destructuring JS parser I was working on:

let r = regex`^\s*
    (
        //closing the object
        (\})|

        //starting from open or comma you can...
        (?:[,{]\s*)(?:
            //have a rest operator
            (\.\.\.)
            |
            //have a property key
            (
                //a non-negative integer
                \b\d+\b
                |
                //any unencapsulated string of the following
                \b[A-Za-z$_][\w$]*\b
                |
                //a quoted string
                //this is #5!
                ("|')(?:
                    //that contains any non-escape, non-quote character
                    (?!\5|\\).
                    |
                    //or any escape sequence
                    (?:\\.)
                //finished by the quote
                )*\5
            )
            //after a property key, we can go inside
            \s*(:|)
      |
      \s*(?={)
        )
    )
    ((?:
        //after closing we expect either
        // - the parent's comma/close,
        // - or the end of the string
        \s*(?:[,}\]=]|$)
        |
        //after the rest operator we expect the close
        \s*\}
        |
        //after diving into a key we expect that object to open
        \s*[{[:]
        |
        //otherwise we saw only a key, we now expect a comma or close
        \s*[,}{]
    ).*)
$`

It outputs /^\s*((\})|(?:[,{]\s*)(?:(\.\.\.)|(\b\d+\b|\b[A-Za-z$_][\w$]*\b|("|')(?:(?!\5|\\).|(?:\\.))*\5)\s*(:|)|\s*(?={)))((?:\s*(?:[,}\]=]|$)|\s*\}|\s*[{[:]|\s*[,}{]).*)$/

And running it with a little demo?

let input = '{why, hello, there, "you   huge \\"", 17, {big,smelly}}';
for (
    let parsed;
    parsed = input.match(r);
    input = parsed[parsed.length - 1]
) console.log(parsed[1]);

Successfully outputs

{why
, hello
, there
, "you   huge \""
, 17
,
{big
,smelly
}
}

Note the successful capturing of the quoted string.
I tested it on Chrome and Firefox, works a treat!

If curious, you can check out what I was doing, and its demonstration.

Though it only works on Chrome, because Firefox doesn't support lookbehinds or named groups (at the time of writing). So note the example given in this answer is actually a neutered version and might get easily tricked into accepting invalid strings.

Oppose answered 2/2, 2020 at 14:53 Comment(9)

you should think of exporting this as a NodeJS package, it is marvelous – Phantom 26/5, 2020 at 4:20

Although I've never done it myself, there's a pretty thorough tutorial here: zellwk.com/blog/publish-to-npm. I'd suggest checking np, at the end of the page. I've never used it, but Sindre Sorhus is a magician with these things, so I wouldn't pass it up. – Phantom 26/5, 2020 at 8:14

Hey @Oppose , do you mind if I make this a package? I'll give you attribution of course – Ciracirca 16/5, 2021 at 5:4

@Siddharth go for it. I haven't seemed to get around to it. Hashbrown777 on github too – Oppose 16/5, 2021 at 9:56

@Siddharth I've already got a gist using it in practice – Oppose 18/5, 2021 at 12:33

How would you add regex flags here? – Muscid 31/1, 2022 at 11:10

@Muscid well this is what I actually use. See how I use the last value as an [optional] options object? So if your case was needing 'g' flag it would be {global:true} but you could change the code slightly to just accept a string ('g') directly I suppose. You could even add other objects elsewhere for instance to change how clean() works, maybe temporarily disabling it mid-regex. As an example of something like that I inject an array which is just turned into ord strings. – Oppose 27/2, 2022 at 12:10

@Hashbrown, I'll have to look at that. This was my approach: // RE* to indicate the flags.

const REgis = ({raw} : any, ...interpolations : string[]) : RegExp => (new RegExp(interpolations.reduce((regex, insert, index) => (regex + insert + cleanRegexTemplate(raw[index+1])), cleanRegexTemplate(raw[0])), 'gis'));

– Sufferable 3/12, 2023 at 12:35

@RyanBeesley yeah so that approach only worked if you were okay with all interpolations as regex strings. What I ended up doing in that comment link was being much more declarative; strings are interpreted as raw strings, if you want an inner-regex you interpolate actual RegExp objects. It's completely protected against trying to break out of inner contexts if say you do something like this; regex`hello(${'(|!|['})there` (which will break in yours/the original). In the new one you would do regex`hello(${/\(|!|\[/})there` unless you simply want it to match hello(![there literally – Oppose 3/12, 2023 at 16:30

S

11

There are good answers here, but for completeness someone should mention JavaScript's core feature of inheritance with the prototype chain. Something like this illustrates the idea:

RegExp.prototype.append = function(re) {
  return new RegExp(this.source + re.source, this.flags);
};

let regex = /[a-z]/g
.append(/[A-Z]/)
.append(/[0-9]/);

console.log(regex); //=> /[a-z][A-Z][0-9]/g
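If modifying RegExp.prototype is a concern (changes to built-in prototypes are visible to all code that shares them), the same idea can be approximated with a plain function. This is a sketch only; appendRegex and combined are invented names, and keeping the first regex's flags is an assumption that mirrors the append method above.

```javascript
// Hypothetical standalone version of the append idea: the result keeps the
// flags of the first regex and ignores the flags of the others.
function appendRegex(first, ...rest) {
  const source = rest.reduce((acc, re) => acc + re.source, first.source);
  return new RegExp(source, first.flags);
}

const combined = appendRegex(
  /[a-z]/g,
  /[A-Z]/,
  /[0-9]/
);

console.log(combined.source); // [a-z][A-Z][0-9]
console.log(combined.flags);  // g
```

Because the sources are concatenated first, only one RegExp object is constructed regardless of how many pieces are passed.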

Scrumptious answered 21/1, 2019 at 13:42 Comment(3)

This is the best answer here. – Spinster 6/4, 2020 at 19:41

This compiles a new RegExp object each time you use .append, so the other answers, which compile the combined array in one go, are slightly better. The difference is insignificant, I guess, but it's worth noting. – Muscid 31/1, 2022 at 10:37

@Muscid This is true. In my tests it's about 80% slower than the accepted solution on my 8-year-old workstation with a 6-line multiline regex. Still, my PC came in at ~220,000 ops/sec jsbench.me/sfkz4e7mjf/2 – Scrumptious 1/2, 2022 at 23:4

A

6

The regex above was missing some backslashes and did not work properly, so I edited it. Please consider this regex, which works for email validation in 99.99% of cases.

let EMAIL_REGEXP = 
new RegExp (['^(([^<>()[\\]\\\\.,;:\\s@"]+(\\.[^<>()\\[\\]\\\\.,;:\\s@"]+)*)',
                    '|(".+"))@((\\[[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.',
                    '[0-9]{1,3}\\])|(([a-zA-Z\\-0-9]+\\.)+',
                    '[a-zA-Z]{2,}))$'].join(''));

Agace answered 27/12, 2016 at 16:18 Comment(1)

"Above" ... votes and sorting can change what is "above". – Steer 17/2, 2022 at 20:27

N

2

To avoid the Array join, you can also use the following syntax:

// Backslashes are doubled because the string literal consumes one level of escaping.
var pattern = new RegExp('^(([^<>()[\\]\\\\.,;:\\s@"]+' +
  '(\\.[^<>()[\\]\\\\.,;:\\s@"]+)*)|(".+"))@' +
  '((\\[[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\])|' +
  '(([a-zA-Z\\-0-9]+\\.)+[a-zA-Z]{2,}))$');

Nikolenikoletta answered 7/3, 2018 at 12:0 Comment(0)

H

2

You can simply use string concatenation.

// Backslashes are doubled because the string literal consumes one level of escaping.
var patternString = "^(([^<>()[\\]\\\\.,;:\\s@\"]+(\\.[^<>()[\\]\\\\.,;:\\s@\"]+)*)|"+
"(\".+\"))@((\\[[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\])|"+
"(([a-zA-Z\\-0-9]+\\.)+[a-zA-Z]{2,}))$";
var pattern = new RegExp(patternString);

Haematoid answered 23/11, 2018 at 10:45 Comment(0)

H

2

I tried improving korun's answer by encapsulating everything and implementing support for splitting capturing groups and character sets - making this method much more versatile.

To use this snippet you need to call the variadic function combineRegex whose arguments are the regular expression objects you need to combine. Its implementation can be found at the bottom.

Capturing groups can't be split directly that way though as it would leave some parts with just one parenthesis. Your browser would fail with an exception.

Instead I'm simply passing the contents of the capture group inside an array. The parentheses are automatically added when combineRegex encounters an array.

Furthermore quantifiers need to follow something. If for some reason the regular expression needs to be split in front of a quantifier you need to add a pair of parentheses. These will be removed automatically. The point is that an empty capture group is pretty useless and this way quantifiers have something to refer to. The same method can be used for things like non-capturing groups (/(?:abc)/ becomes [/()?:abc/]).

This is best explained using a simple example:

var regex = /abcd(efghi)+jkl/;

would become:

var regex = combineRegex(
    /ab/,
    /cd/,
    [
        /ef/,
        /ghi/
    ],
    /()+jkl/    // Note the added '()' in front of '+'
);

If you must split character sets you can use objects ({"":[regex1, regex2, ...]}) instead of arrays ([regex1, regex2, ...]). The key's content can be anything as long as the object only contains one key. Note that instead of () you have to use ] as dummy beginning if the first character could be interpreted as quantifier. I.e. /[+?]/ becomes {"":[/]+?/]}

Here is the snippet and a more complete example:

function combineRegexStr(dummy, ...regex)
{
    return regex.map(r => {
        if(Array.isArray(r))
            return "("+combineRegexStr(dummy, ...r).replace(dummy, "")+")";
        else if(Object.getPrototypeOf(r) === Object.getPrototypeOf({}))
            return "["+combineRegexStr(/^\]/, ...(Object.entries(r)[0][1]))+"]";
        else 
            return r.source.replace(dummy, "");
    }).join("");
}
function combineRegex(...regex)
{
    return new RegExp(combineRegexStr(/^\(\)/, ...regex));
}

//Usage:
//Original:
console.log(/abcd(?:ef[+A-Z0-9]gh)+$/.source);
//Same as:
console.log(
  combineRegex(
    /ab/,
    /cd/,
    [
      /()?:ef/,
      {"": [/]+A-Z/, /0-9/]},
      /gh/
    ],
    /()+$/
  ).source
);

Hakeem answered 30/4, 2020 at 16:38 Comment(1)

Can you publish an npm package or something? This is an awesome concept, and allows linters/formatters to help keep it readable... – Steer 17/2, 2022 at 20:33

R

1

Personally, I'd go for a less complicated regex:

/\S+@\S+\.\S+/

Sure, it is less accurate than your current pattern, but what are you trying to accomplish? Are you trying to catch accidental errors your users might enter, or are you worried that your users might try to enter invalid addresses? If it's the first, I'd go for an easier pattern. If it's the latter, some verification by responding to an e-mail sent to that address might be a better option.

However, if you want to use your current pattern, it would be (IMO) easier to read (and maintain!) by building it from smaller sub-patterns, like this:

var box1 = "([^<>()[\\]\\\\.,;:\\s@\"]+(\\.[^<>()[\\]\\\\.,;:\\s@\"]+)*)";
var box2 = "(\".+\")";

var host1 = "(\\[[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\])";
var host2 = "(([a-zA-Z\\-0-9]+\\.)+[a-zA-Z]{2,})";

var regex = new RegExp("^(" + box1 + "|" + box2 + ")@(" + host1 + "|" + host2 + ")$");

Ramiform answered 7/9, 2012 at 11:39 Comment(2)

Downvoting - Although your comments about reducing regex complexity are valid, OP specifically is asking how to "split long regex over multiple lines". So although your advice is valid, it has been given for the wrong reasons. e.g. changing business logic to work around a programming language. Furthermore, the code example you gave is quite ugly. – Udele 14/10, 2014 at 15:13

@sleepycal I think Bart has answered the question. See the last section of his answer. He has answered the question as well as given an alternative. – Rectangle 14/1, 2016 at 6:14

D

1

@Hashbrown's great answer got me on the right track. Here's my version, also inspired by this blog.

function regexp(...args) {
  function cleanup(string) {
    // remove whitespace, single and multi-line comments
    return string.replace(/\s+|\/\/.*|\/\*[\s\S]*?\*\//g, '');
  }

  function escape(string) {
    // escape regular expression
    return string.replace(/[-.*+?^${}()|[\]\\]/g, '\\$&');
  }

  function create(flags, strings, ...values) {
    let pattern = '';
    for (let i = 0; i < values.length; ++i) {
      pattern += cleanup(strings.raw[i]);  // strings are cleaned up
      pattern += escape(values[i]);        // values are escaped
    }
    pattern += cleanup(strings.raw[values.length]);
    return RegExp(pattern, flags);
  }

  if (Array.isArray(args[0])) {
    // used as a template tag (no flags)
    return create('', ...args);
  }

  // used as a function (with flags)
  return create.bind(void 0, args[0]);
}

Use it like this:

regexp('i')`
  //so this is a regex

  //here I am matching some numbers
  (\d+)

  //Oh! See how I didn't need to double backslash that \d?
  ([a-z]{1,3}) /*note to self, this is group #2*/
`

To create this RegExp object:

/(\d+)([a-z]{1,3})/i

Duotone answered 2/6, 2020 at 14:2 Comment(0)
