Using regex lookbehinds in C++11

Why can't I use lookbehinds in C++11? Lookahead works fine.

std::regex e("(?<=a)b");

This will throw the following exception:

The expression contained mismatched ( and ).

This won't throw any exception:

std::regex e("a(?=b)");

What am I missing?

Spavined answered 26/1, 2013 at 16:2 Comment(3)
If you are using gcc, note that most regex features are not yet implemented. - Sudoriferous
@Spavined How did you set up your working environment? Which software have you downloaded and installed? - Enter
Interesting. The documentation I have found for std::regex_search doesn't indicate that it will throw exceptions. I would have thought that this would work using the extended regex qualifier, but it doesn't. Good question. - Mano
22

C++11 <regex> uses ECMAScript (ECMA-262) regex syntax by default, and that syntax, as adopted by C++11, has no lookbehind. The other regex grammars that C++11 supports (basic, extended, awk, grep, egrep) don't have lookbehind either.

If your use case requires the use of look-behind, you may consider using Boost.Regex instead.
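To check how your own standard library treats a given pattern, one option is to construct the std::regex inside a try block and catch std::regex_error. The following is only a minimal sketch; the helper name `compiles` is made up for illustration and is not part of <regex>.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not a library function): returns true if `pattern`
// compiles under the default ECMAScript grammar, false if std::regex
// rejects it by throwing std::regex_error.
bool compiles(const std::string& pattern) {
    try {
        std::regex re(pattern);  // the constructor throws on unsupported syntax
        (void)re;                // silence unused-variable warnings
        return true;
    } catch (const std::regex_error&) {
        return false;
    }
}
```

With this, `compiles("a(?=b)")` should report true and `compiles("(?<=a)b")` false. The error code and the text returned by `what()` differ between implementations, which is why the sketch only reports success or failure.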

Fendig answered 26/1, 2013 at 17:23 Comment(4)
Do you know if Boost.Regex supports it? - Spavined
@Carlj901: A quick Google search shows that Boost supports lookbehind under Perl syntax. - Fendig
Surprisingly, ECMAScript mode on regex101.com has lookbehind working. - Bolero
That's expected: lookbehind was added in ES2018. github.com/tc39/proposal-regexp-lookbehind - Mohawk
2

A positive lookbehind (?<=a) matches a location in a string that is immediately preceded by the lookbehind pattern. When overlapping matches are not expected, as is the case here, you can simply use a capturing group and extract just Group 1 (or more group values if you specify more than one):

a(b)

Here is a way to extract all matches using std::sregex_token_iterator:

#include <iostream>
#include <regex>
#include <string>
#include <vector>

int main() {
    std::regex rx("a(b)");             // A pattern with a capturing group
    std::string sentence("abba abec"); // A test string
    std::vector<std::string> names(std::sregex_token_iterator(
        sentence.begin(), sentence.end(), rx, 1), // "1" makes it return Group 1 values
        std::sregex_token_iterator()
    );
    for( auto & p : names ) std::cout << p << std::endl; // Print matches
    return 0;
}

If you only need to extract the first match, use regex_search (not regex_match, which requires the whole string to match):

std::smatch sm;
if (std::regex_search(sentence, sm, rx)) {
    std::cout << sm[1] << std::endl;
}
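
The same capturing-group idea carries over to replacement, where a lookbehind would normally leave the preceding text untouched. As a rough sketch, assuming the goal is the effect of replacing (?<=a)b with some text: capture the prefix and write it back with $1. The function name `replace_b_after_a` is hypothetical.

```cpp
#include <regex>
#include <string>

// Hypothetical helper emulating "replace (?<=a)b with `with`".
// Group 1 captures the 'a' that a lookbehind would only have checked,
// and "$1" writes it back, so only the 'b' is effectively replaced.
// Assumption: `with` contains no '$' and does not start with a digit,
// since regex_replace would treat those as part of the format string.
std::string replace_b_after_a(const std::string& input, const std::string& with) {
    static const std::regex rx("(a)b");
    return std::regex_replace(input, rx, "$1" + with);
}
```

For example, `replace_b_after_a("abba abec", "X")` yields "aXba aXec". As with extraction, this only stands in for a lookbehind when matches do not overlap, because the captured prefix is consumed as part of each match.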


Awhirl answered 26/2, 2022 at 12:15 Comment(0)
