JavaScript RegExp non-capturing groups
I am writing a set of RegExps to translate a CSS selector into arrays of ids and classes.

For example, I would like '#foo#bar' to return ['foo', 'bar'].

I have been trying to achieve this with

"#foo#bar".match(/((?:#)[a-zA-Z0-9\-_]*)/g)

but it returns ['#foo', '#bar'], when the non-capturing prefix ?: should ignore the # character.

Is there a better solution than slicing each one of the returned strings?

Selfinterest answered 2/6, 2012 at 18:16 Comment(3)
Here's a one-liner: str.replace(/[^#]+|(#[a-zA-Z0-9\-_]*)/g, '$1').split('#').slice(1) - Naamana
split doesn't work in IE8 - Samurai
@Samurai Why would IE8 even be relevant for anything in September 2014 unless it's a specific request? - Omophagia
12

You could use .replace() or .exec() in a loop to build an Array.

With .replace():

var arr = [];
"#foo#bar".replace(/#([a-zA-Z0-9\-_]*)/g, function(s, g1) {
    arr.push(g1);
});

With .exec():

var arr = [],
    s = "#foo#bar",
    re = /#([a-zA-Z0-9\-_]*)/g,
    item;

while (item = re.exec(s))
    arr.push(item[1]);
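
Either approach can be wrapped in a small helper so the same code collects ids or classes. This is only a sketch: the function name collectByPrefix is made up for illustration, it assumes a single-character prefix such as # or ., and it uses + instead of * so that a bare # yields nothing.

```javascript
// Hypothetical helper (not part of the original answer): collects every name
// that follows the given single-character prefix, e.g. '#' or '.'.
function collectByPrefix(selector, prefix) {
  // Escape the prefix so that characters like '.' are matched literally.
  var escaped = prefix.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  var re = new RegExp(escaped + '([a-zA-Z0-9\\-_]+)', 'g');
  var found = [];
  var item;
  // With the g flag, each exec() call resumes from re.lastIndex.
  while ((item = re.exec(selector)) !== null) {
    found.push(item[1]); // first capture group, i.e. the name without the prefix
  }
  return found;
}

console.log(collectByPrefix('#foo#bar.baz', '#')); // ['foo', 'bar']
console.log(collectByPrefix('#foo#bar.baz', '.')); // ['baz']
```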
Lacunar answered 2/6, 2012 at 18:16 Comment(0)
5

It matches #foo and #bar because the outer group (#1) is capturing and encloses the #. The inner group is not capturing, but that's probably not the one you are checking.

If you were not using global matching mode, an immediate fix would be to use /(?:#)([a-zA-Z0-9\-_]*)/ instead and read the first capture group.

With global matching mode the result cannot be had in just one line with match, because with the g flag match returns only the full matches and drops the capture groups. Using regular expressions only (i.e. no string operations) you would need to do it this way:

var re = /(?:#)([a-zA-Z0-9\-_]*)/g;
var matches = [], match;
while (match = re.exec("#foo#bar")) {
    matches.push(match[1]);
}
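
To make the difference between the two modes concrete, here is a small comparison of what match returns with and without the g flag. The variable names are only for illustration.

```javascript
var selector = '#foo#bar';

// Without g: match() returns one result array that includes the capture groups.
var single = selector.match(/(?:#)([a-zA-Z0-9\-_]*)/);
console.log(single[0]); // '#foo' (the full match)
console.log(single[1]); // 'foo'  (the first capture group)

// With g: match() returns every full match and drops the capture groups.
var all = selector.match(/(?:#)([a-zA-Z0-9\-_]*)/g);
console.log(all); // ['#foo', '#bar']
```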


Interceptor answered 2/6, 2012 at 18:19 Comment(1)
No need with this to group the hash key at all (and then exclude it). - Limoges
2

I'm not sure if you can do that using match(), but you can do it by using the RegExp's exec() method:

// The backslash must be doubled inside a string literal; a lone '\-' reaches the RegExp as '-'.
var pattern = new RegExp('#([a-zA-Z0-9\\-_]+)', 'g');
var matches, ids = [];

while (matches = pattern.exec('#foo#bar')) {
    ids.push( matches[1] ); // -> 'foo' and then 'bar'
}
Limoges answered 2/6, 2012 at 18:32 Comment(0)
1

Unfortunately there is no lookbehind assertion in JavaScript RegExp at the time of writing; otherwise you could do this:

/(?<=#)[a-zA-Z0-9\-_]*/g

Unless it gets added in some new version of JavaScript, I think post-processing with split is your best bet.
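
One possible reading of "split post-processing" is sketched below: match the ids with their leading #, then join and split on # to strip it. This is an interpretation rather than necessarily what the answer had in mind, and the variable names are made up.

```javascript
// Step 1: match() with the g flag returns the full matches, '#' included.
// The "|| []" fallback covers the case where nothing matches (match returns null).
var withHash = '#foo#bar.baz'.match(/#[a-zA-Z0-9\-_]+/g) || [];

// Step 2: join the matches and split on '#'. The first piece is the empty
// string before the first '#', so slice(1) drops it.
var idsFromSplit = withHash.join('').split('#').slice(1);

console.log(idsFromSplit); // ['foo', 'bar']
```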

Hemihydrate answered 2/6, 2012 at 18:31 Comment(0)
1

You can use a negative lookahead assertion:

"#foo#bar".match(/(?!#)[a-zA-Z0-9\-_]+/g);  // ["foo", "bar"]
Arrowwood answered 2/6, 2012 at 18:44 Comment(1)
It does return ['foo', 'bar'], but won't search for the #, so "#foo#bar.foobar".match(/(?!#)[a-zA-Z0-9\-_]+/g); will return ['foo', 'bar', 'foobar'] - Selfinterest
1

The lookbehind assertion mentioned some years ago by mVChr was added in ECMAScript 2018. This allows you to do this:

'#foo#bar'.match(/(?<=#)[a-zA-Z0-9\-_]*/g)  // returns ["foo", "bar"]

(A negative lookbehind is also possible: use (?<!#) to assert that the match is not preceded by #. Like every lookaround, it consumes no characters.)
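
Applied to the original goal of splitting a selector into ids and classes, a lookbehind-based version might look like the sketch below. The function name parseSelector is hypothetical, it needs an engine with ES2018 lookbehind support, and it uses + rather than * so that empty names are never returned.

```javascript
// Hypothetical helper: splits a simple selector into id and class names
// using lookbehind assertions, so the '#' and '.' are never part of a match.
function parseSelector(selector) {
  return {
    // "|| []" because match() returns null when nothing matches.
    ids: selector.match(/(?<=#)[a-zA-Z0-9\-_]+/g) || [],
    classes: selector.match(/(?<=\.)[a-zA-Z0-9\-_]+/g) || []
  };
}

const parsed = parseSelector('#foo.baz#bar.qux');
console.log(parsed.ids);     // ['foo', 'bar']
console.log(parsed.classes); // ['baz', 'qux']
```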

Therefor answered 19/4, 2018 at 0:50 Comment(0)
0

MDN does document that "Capture groups are ignored when using match() with the global /g flag", and recommends using matchAll(). At the time of writing (May 2019), matchAll() isn't available on Edge or Safari iOS, and you still need to skip the complete match (including the #).
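
Where matchAll() is available, a minimal sketch looks like this. Each result is a match array, so index 0 is the complete match and index 1 is the first capture group; the variable name is only for illustration.

```javascript
// matchAll() requires the g flag and returns an iterator of match arrays.
// Array.from's second argument maps each match to its first capture group.
const idsViaMatchAll = Array.from(
  '#foo#bar'.matchAll(/#([\w-]+)/g),
  (m) => m[1] // m[0] would be the complete match, e.g. '#foo'
);

console.log(idsViaMatchAll); // ['foo', 'bar']
```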

A simpler solution is to slice off the leading prefix, if you know its length - here, 1 for #.

const results = ('#foo#bar'.match(/#\w+/g) || []).map(s => s.slice(1));
console.log(results);

The ... || [] part is necessary in case there is no match: match then returns null, and null.map would throw.

const results = ('nothing matches'.match(/#\w+/g) || []).map(s => s.slice(1));
console.log(results);
Forgetmenot answered 12/5, 2019 at 21:4 Comment(0)
