C++ Regular Expressions with Boost Regex
3

8

I am trying to take a string in C++, find all IP addresses contained in it, and put them into a new vector of strings.

I've read a lot of documentation on regex, but I just can't work out how to do this simple task.

I believe I can use this Perl expression to find any IP address:

re("\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b");

But I am still stumped on how to do the rest.

Hematology answered 27/4, 2011 at 12:55 Comment(4)
Did you try the Boost Regex tutorial and documentation? Got some code so far to share with us? - Trixy
What exactly are you trying to match with that regex? First try to match a single IP address. - Conde
Have a look at John D Cook's excellent tutorial Getting started with C++ TR1 regular expressions. It's designed for those who already understand RegEx but can't figure out how to make it do stuff in C++. - Anglaangle
That regex has bugs: it allows 00, and it does not work with left-justified or right-justified IP addresses. It is also not syntactically factored for maximum speed. The correct one is at #5804953. - Vantage
16

Perhaps you're looking for something like this. It uses regex_iterator to get all matches of the current pattern. See reference.

#include <boost/regex.hpp>
#include <iostream>
#include <string>

int main()
{
    std::string text(" 192.168.0.1 abc 10.0.0.255 10.5.1 1.2.3.4a 5.4.3.2 ");
    const char* pattern =
        "\\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"
        "\\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"
        "\\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"
        "\\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\b";
    boost::regex ip_regex(pattern);

    boost::sregex_iterator it(text.begin(), text.end(), ip_regex);
    boost::sregex_iterator end;
    for (; it != end; ++it) {
        std::cout << it->str() << "\n";
        // to collect the matches instead: v.push_back(it->str());
    }
}

Output:

192.168.0.1
10.0.0.255
5.4.3.2

Side note: you probably meant \\b instead of \b. In a C++ string literal, \b is the backspace escape, and I doubt you wanted to match a backspace character.
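
To get the matches into a vector of strings, as the question asks, the push_back hinted at in the loop comment could look like the sketch below. Assumptions: it uses std::regex (C++11) instead of Boost.Regex so that it builds without linking Boost, the function name find_ips is made up for illustration, and the pattern is written as a raw string literal so that \b and \. need no doubled backslashes.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not part of the answer): returns every substring of
// `text` that matches the answer's IPv4 pattern, in order of appearance.
std::vector<std::string> find_ips(const std::string& text)
{
    // Raw string literals: \b here is the regex word boundary,
    // not the C++ backspace escape.
    static const std::regex ip_regex(
        R"(\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?))"
        R"(\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?))"
        R"(\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?))"
        R"(\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b)");

    std::vector<std::string> found;
    const std::sregex_iterator end;  // default-constructed: end of sequence
    for (std::sregex_iterator it(text.begin(), text.end(), ip_regex);
         it != end; ++it) {
        found.push_back(it->str());  // whole match, i.e. the dotted quad
    }
    return found;
}
```

Called on the sample text from the program above, this sketch should return the same three addresses that the program prints.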

Nosewheel answered 27/4, 2011 at 17:9 Comment(3)
Iterator "end" is not initialized. Is that OK? - Fleta
@truthseeker: It is initialized by the default constructor. - Nosewheel
The end is default constructed, but this algorithm has bugs (adjacency, left justification, right justification) and the regex is wrong (and otherwise slow). - Vantage
-1

The offered solution is quite good, thanks for it, though I found a slight mistake in the pattern itself.

For example, something like 49.000.00.01 would be accepted as a valid IPv4 address, and as far as I understand it shouldn't be (this happened to me while processing a dump).

I suggest changing the pattern to:

"\\b(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]?|0)"
"\\.(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]?|0)"
"\\.(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]?|0)"
"\\.(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]?|0)\\b";

This should allow 0.0.0.0 as the only all-zero address, which I believe is correct, and it rejects zero-padded octets such as .00. or .000.
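
A quick way to compare the two patterns is to run both against the problem input. The following is only a sketch: it assumes std::regex (C++11) in place of Boost.Regex, and the names kLooseOctet, kStrictOctet, dotted_quad and contains_ip are invented for this illustration.

```cpp
#include <regex>
#include <string>

// Octet sub-pattern from the earlier answer: leading zeros are allowed.
const std::string kLooseOctet = "(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)";
// Octet sub-pattern suggested here: no leading zeros, a lone 0 is fine.
const std::string kStrictOctet =
    "(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]?|0)";

// Hypothetical helper: builds \b octet \. octet \. octet \. octet \b
std::string dotted_quad(const std::string& octet)
{
    return R"(\b)" + octet + R"(\.)" + octet + R"(\.)" + octet +
           R"(\.)" + octet + R"(\b)";
}

// Hypothetical helper: true if `text` contains a match for the dotted-quad
// pattern built from the given octet sub-pattern.
bool contains_ip(const std::string& text, const std::string& octet)
{
    const std::regex re(dotted_quad(octet));
    return std::regex_search(text, re);
}
```

Under these assumptions, contains_ip("49.000.00.01", kLooseOctet) is true while contains_ip("49.000.00.01", kStrictOctet) is false, and 0.0.0.0 is still accepted by the strict variant.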

Torquemada answered 20/10, 2013 at 16:12 Comment(0)
-1
#include <string>
#include <list>
#include <boost/regex.hpp>
typedef std::string::const_iterator ConstIt;

int main()
{
    // input text, expected result, & proper address pattern
    const std::string sInput
    (
            "192.168.0.1 10.0.0.255 abc 10.5.1.00"
            " 1.2.3.4a 168.72.0 0.0.0.0 5.4.3.2"
    );
    const std::string asExpected[] =
    {
        "192.168.0.1",
        "10.0.0.255",
        "0.0.0.0",
        "5.4.3.2"
    };
    boost::regex regexIPs
    (
        "(^|[ \t])("
        "(2(5[0-5]|[0-4][0-9])|1[0-9][0-9]|[1-9]?[0-9])[.]"
        "(2(5[0-5]|[0-4][0-9])|1[0-9][0-9]|[1-9]?[0-9])[.]"
        "(2(5[0-5]|[0-4][0-9])|1[0-9][0-9]|[1-9]?[0-9])[.]"
        "(2(5[0-5]|[0-4][0-9])|1[0-9][0-9]|[1-9]?[0-9])"
        ")($|[ \t])"
    );

    // collect every address delimited by whitespace or the string boundaries
    boost::smatch what;
    std::list<std::string> ns;
    ConstIt end = sInput.end();
    for (ConstIt begin = sInput.begin();
                boost::regex_search(begin, end, what, regexIPs);
                begin = what[0].second)
    {
        ns.push_back(std::string(what[2].first, what[2].second));
    }

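    // check results and return number of errors (zero)
    const std::size_t nExpected = sizeof(asExpected) / sizeof(asExpected[0]);
    int iErrors = 0;
    std::size_t i = 0;
    for (const std::string & s : ns)
    {
        // a surplus match counts as an error instead of reading past the array
        if (i >= nExpected || s != asExpected[i])
            ++ iErrors;
        ++ i;
    }
    // expected addresses that were never found also count as errors
    if (i < nExpected)
        iErrors += static_cast<int>(nExpected - i);
    return iErrors;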
}
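
The loop above advances by hand instead of using a regex_iterator, and the pattern requires whitespace or a string boundary on both sides of an address. A minimal sketch of the same idea follows, assuming std::regex (C++11) in place of Boost.Regex; the function name find_delimited_ips is made up for this illustration.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collects addresses that are delimited by a space,
// a tab, or the start/end of the searched range.
std::vector<std::string> find_delimited_ips(const std::string& text)
{
    static const std::string octet =
        "(2(5[0-5]|[0-4][0-9])|1[0-9][0-9]|[1-9]?[0-9])";
    // Group 1 is the leading delimiter, group 2 the address itself.
    static const std::regex re(
        "(^|[ \t])(" + octet + "[.]" + octet + "[.]" + octet + "[.]" + octet +
        ")($|[ \t])");

    std::vector<std::string> found;
    std::smatch what;
    std::string::const_iterator begin = text.begin();
    const std::string::const_iterator end = text.end();
    // Restart each search right after the previous match. The trailing
    // delimiter has been consumed, but ^ matches at the new range start,
    // so an address that follows immediately is still found.
    while (std::regex_search(begin, end, what, re)) {
        found.push_back(what[2].str());
        begin = what[0].second;
    }
    return found;
}
```

With this sketch, find_delimited_ips("1.2.3.4.5 7.7.7.7 8.8.8.8") yields 7.7.7.7 and 8.8.8.8. The five-part token is skipped entirely, whereas the \b-anchored pattern from the earlier answer would pick 1.2.3.4 out of it.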
Vantage answered 21/3, 2017 at 6:17 Comment(0)