Why doesn't boost regex '.{2}' match '??'

I'm trying to match some chunks of interesting data within a data stream.

There should be a leading <, then four alphanumeric characters, two characters of checksum (or ?? if no checksum was specified), and a trailing >.

If the last two characters are alphanumeric, the following code works as expected. If they're ??, though, it fails.

// Set up a pre-populated data buffer as an example
std::string haystack = "Fli<data??>bble";

// Set up the regex
static const boost::regex e("<\\w{4}.{2}>");
std::string::const_iterator start, end;
start = haystack.begin();
end = haystack.end();
boost::match_flag_type flags = boost::match_default;

// Try and find something of interest in the buffer
boost::match_results<std::string::const_iterator> what;
bool succeeded = regex_search(start, end, what, e, flags); // <-- returns false

I've not spotted anything in the documentation which suggests this should be the case (as I understand it, . should match everything except NULL and newline).

So what have I missed?

Gesticulate answered 8/11, 2016 at 9:52 Comment(2)
What compiler are you using? Mine (gcc) gives an explicit warning saying "trigraph ??> converted to }". – Sicken
I'm using Visual Studio 2013 with the 2008 tool chain. – Gesticulate

Because ??> is a trigraph, it is replaced with } in the first phase of translation, before the string literal is tokenized. Your code is therefore equivalent to:

// Set up a pre-populated data buffer as an example
std::string haystack = "Fli<data}bble";

// Set up the regex
static const boost::regex e("<\\w{4}.{2}>");
std::string::const_iterator start, end;
start = haystack.begin();
end = haystack.end();
boost::match_flag_type flags = boost::match_default;

// Try and find something of interest in the buffer
boost::match_results<std::string::const_iterator> what;
bool succeeded = regex_search(start, end, what, e, flags); // <-- returns false

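You can split the literal so that the two question marks are no longer adjacent in the source; adjacent string literals are concatenated only after trigraph replacement has taken place: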

std::string haystack = "Fli<data?" "?>bble";

Demo (note: I used std::regex, which behaves more or less the same here)

NOTE: trigraphs were removed from the standard in C++17; older language modes, and some compilers via an option, still process them.
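
As a rough sketch of the workarounds, the snippet below builds the same buffer three ways without ever putting two question marks directly before > in an ordinary literal. It uses std::regex rather than boost::regex, as the demo does. The helper names (contains_frame, split_literal, escaped_literal, raw_literal) are made up for illustration and are not part of the original post. The raw string variant assumes a C++11 or later compiler, so it would not apply to the 2008 tool chain mentioned in the question.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: true if the buffer contains '<', four word
// characters, any two characters, and then '>'.
bool contains_frame(const std::string& haystack) {
    static const std::regex frame("<\\w{4}.{2}>");
    return std::regex_search(haystack, frame);
}

// Workaround 1: split the literal so the question marks are never adjacent
// in the source. The compiler concatenates the two pieces afterwards.
std::string split_literal() {
    return "Fli<data?" "?>bble";
}

// Workaround 2: escape the second question mark. The escape sequence
// backslash-question-mark is standard and yields a plain '?'.
std::string escaped_literal() {
    return "Fli<data?\?>bble";
}

// Workaround 3 (C++11 and later): trigraph replacement is reverted inside
// a raw string literal, so the text is kept exactly as written.
std::string raw_literal() {
    return R"(Fli<data??>bble)";
}
```

All three helpers should produce the same 15-character string, and contains_frame should accept it, while the trigraph-mangled "Fli<data}bble" is rejected because no > follows the two wildcard characters.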

Diversification answered 8/11, 2016 at 10:0 Comment(3)
You got it. Very interesting - I'd not heard of trigraphs before! – Gesticulate
They got removed (or deprecated?) in latest standards – Flaxman
@Flaxman deprecated from C++11, will be removed by C++17 – Diversification
