Regex vs brute-force for small strings
F

8

6

When testing small strings (e.g. isPhoneNumber or isHexadecimal) is there a performance benefit from using regular expressions, or would brute forcing them be faster? Wouldn't brute forcing them by just checking whether or not the given string's chars are within a specified range be faster than using a regex?

For example:

public static boolean isHexadecimal(String value)
{
    if (value.startsWith("-"))
    {
        value = value.substring(1);
    }

    value = value.toLowerCase();

    if (value.length() <= 2 || !value.startsWith("0x"))
    {
        return false;
    }

    for (int i = 2; i < value.length(); i++)
    {
        char c = value.charAt(i);

        if (!(c >= '0' && c <= '9' || c >= 'a' && c <= 'f'))
        {
            return false;
        }
    }

    return true;
}

vs.

Regex.match(/0x[0-9a-f]+/, "0x123fa") // returns true if regex matches whole given expression

It seems like there would be some overhead associated with the regex, even when the pattern is pre-compiled, simply because regular expressions have to work in many general cases. In contrast, the brute-force method does exactly what is required and no more. Am I missing some optimization that regular expressions have?

Foreplay answered 23/10, 2016 at 18:26 Comment(11)
Even if there was a performance benefit, I would much rather see a regex than parsing code.Churr
RegEx is usually the slowest way to parse, but also the shortest so one line of code can be easier to maintain and change than several lines. I am not sure what you mean by brute force, but guessing will probably be slower than RegEx.Tuba
There isn't really an answer to this question in particular for interpreted languages when you must put in balance several tests in "pure code" against a single regex. Most of the time, with compiled languages, the "pure code" way is faster (but all depends what you need to test)Zamarripa
"Brute force" sounds like you want to test the input against all possible valid strings. "checking whether or not the given string's chars are within a specified range" is exactly what a regex like /^[range]*$/ does.Ochlocracy
Most importantly, your two solutions do something entirely different. The regular expression equivalent to your isHexadecimal method would be /-?0x[0-9a-f]*/i (though I believe that you actually want /-0x[0-9a-fA-F]*/).Ochlocracy
Your regex does a different job than your brute force code, for some implementations. What result do you expect from Regex.match(/0x[0-9a-f]+/, "x0x123fa") or from Regex.match(/0x[0-9a-f]+/, "0x123fag")?Clue
It depends on the regex engine. A DFA-based engine could be almost as fast as efficient hand-coded parsing. Unfortunately the standard regex implementations are NFA-based and thus slower. Yet the difference should not be that significant for linear patterns (without alternatives).Shunt
Since you didn't tag this with a language (which is bad BTW): this depends on the capabilities of the regex engine. Some of them (.NET, PCRE, maybe others) are capable of compiling the regex pattern down to executable machine code, which would get you the same result as if you wrote the equivalent code manually.Loquat
I think you should use a term other than "brute force". As far as I understand, you are asking whether "pure code" is faster than regex. "Brute force" is a particular approach that tests all possibilities to solve a problem (a crude approach, but it's used for example to crack passwords). Whatever the alternative, "brute force" is the slowest and least clever algorithm. You should reword your question if you want pertinent answers.Zamarripa
(1) Benchmark it. (2) My guess - the regex will be slower. (3) Why do you care?Sleuth
If you are talking about a language that uses an interpreting matcher (Java, Python, Perl, etc.) - even if you precompile the regex - it's likely that a custom pattern matcher will be much faster. The compiled regex is a byte code "executed" by the interpreter to match the string. Machine instructions spent running interpreter are overhead that's missing in the custom code. The advantage of the regex is conciseness and (maybe) less error-proneness. If you're talking about compiling regexes to native code (e.g. lex/flex, PCRE, etc.), then the two will have similar performance.Inductee
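The comments above point out that the regex and the hand-written method do not accept the same inputs. A minimal Java sketch of that point, assuming java.util.regex (the question names no language, and the class and method names below are made up for illustration): find() accepts a match anywhere in the input, matches() requires the whole input to match, and a pattern that mirrors the hand-written isHexadecimal also needs the optional sign and case-insensitivity.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative and not from the question.
final class HexRegexSemantics {
    // The pattern from the question, without anchors.
    static final Pattern QUESTION = Pattern.compile("0x[0-9a-f]+");

    // Assumed equivalent of the hand-written isHexadecimal:
    // optional leading '-', any letter case, at least one digit.
    static final Pattern EQUIVALENT =
            Pattern.compile("-?0x[0-9a-f]+", Pattern.CASE_INSENSITIVE);

    // True if the question's pattern matches somewhere inside the input.
    static boolean foundAnywhere(String s) {
        return QUESTION.matcher(s).find();
    }

    // True only if the question's pattern matches the entire input.
    static boolean matchesWhole(String s) {
        return QUESTION.matcher(s).matches();
    }

    // Whole-input match against the pattern that mirrors the hand-written method.
    static boolean likeHandWritten(String s) {
        return EQUIVALENT.matcher(s).matches();
    }
}
```

With this sketch, "x0x123fa" and "0x123fag" are accepted by foundAnywhere but rejected by matchesWhole, which is the ambiguity the commenters are asking about.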
A
7

I've written a small benchmark to estimate the performance of the:

  • NOP method (to get an idea of the baseline iteration speed);
  • Original method, as provided by the OP ;
  • RegExp;
  • Compiled Regexp;
  • The version provided by @maraca (w/o toLowerCase and substring);
  • "fastIsHex" version (switch-based), I've added just for fun.

The test machine configuration is as follows:

  • JVM: Java(TM) SE Runtime Environment (build 1.8.0_101-b13)
  • CPU: Intel(R) Core(TM) i5-2500 CPU @ 3.30GHz

And here are the results I got for the original test string "0x123fa" and 10.000.000 iterations:

Method "NOP" => #10000000 iterations in 9ms
Method "isHexadecimal (OP)" => #10000000 iterations in 300ms
Method "RegExp" => #10000000 iterations in 4270ms
Method "RegExp (Compiled)" => #10000000 iterations in 1025ms
Method "isHexadecimal (maraca)" => #10000000 iterations in 135ms
Method "fastIsHex" => #10000000 iterations in 107ms

As you can see, even the original method by the OP is faster than the RegExp method (at least when using the JDK-provided RegExp implementation).

(for your reference)

Benchmark code:

public static void main(String[] argv) throws Exception {
    //Number of ITERATIONS
    final int ITERATIONS = 10000000;

    //NOP
    benchmark(ITERATIONS,"NOP",() -> nop(longHexText));

    //isHexadecimal
    benchmark(ITERATIONS,"isHexadecimal (OP)",() -> isHexadecimal(longHexText));

    //Un-compiled regexp
    benchmark(ITERATIONS,"RegExp",() -> longHexText.matches("0x[0-9a-fA-F]+"));

    //Pre-compiled regexp
    final Pattern pattern = Pattern.compile("0x[0-9a-fA-F]+");
    benchmark(ITERATIONS,"RegExp (Compiled)", () -> {
        pattern.matcher(longHexText).matches();
    });

    //isHexadecimal (maraca)
    benchmark(ITERATIONS,"isHexadecimal (maraca)",() -> isHexadecimalMaraca(longHexText));

    //FastIsHex
    benchmark(ITERATIONS,"fastIsHex",() -> fastIsHex(longHexText));
}

public static void benchmark(int iterations,String name,Runnable block) {
    //Start Time
    long stime = System.currentTimeMillis();

    //Benchmark
    for(int i = 0; i < iterations; i++) {
        block.run();
    }

    //Done
    System.out.println(
        String.format("Method \"%s\" => #%d iterations in %dms",name,iterations,(System.currentTimeMillis()-stime))
    );
}

NOP method:

public static boolean nop(String value) { return true; }

fastIsHex method:

public static boolean fastIsHex(String value) {

    //Value must be at least 4 characters long (0x00)
    if(value.length() < 4) {
        return false;
    }

    //Compute where the data starts
    int start = ((value.charAt(0) == '-') ? 1 : 0) + 2;

    //Check prefix
    if(value.charAt(start-2) != '0' || value.charAt(start-1) != 'x') {
        return false;
    }

    //Verify data
    for(int i = start; i < value.length(); i++) {
        switch(value.charAt(i)) {
            case '0':case '1':case '2':case '3':case '4':case '5':case '6':case '7':case '8':case '9':
            case 'a':case 'b':case 'c':case 'd':case 'e':case 'f':
            case 'A':case 'B':case 'C':case 'D':case 'E':case 'F':
                continue;

            default:
                return false;
        }
    }

    return true;
}

So the answer is no: for short strings and the task at hand, RegExp is not faster.

When it comes to longer strings, the balance is quite different. Below are the results for the 8192 long hex string, which I generated with:

hexdump -n 8196 -v -e '/1 "%02X"' /dev/urandom

and 10.000 iterations:

Method "NOP" => #10000 iterations in 2ms
Method "isHexadecimal (OP)" => #10000 iterations in 1512ms
Method "RegExp" => #10000 iterations in 1303ms
Method "RegExp (Compiled)" => #10000 iterations in 1263ms
Method "isHexadecimal (maraca)" => #10000 iterations in 553ms
Method "fastIsHex" => #10000 iterations in 530ms

As you can see, the hand-written methods (the one by maraca and my fastIsHex) still beat the RegExp, but the original method does not (due to substring() and toLowerCase()).

Sidenote:

This benchmark is very simple and only tests the "worst case" scenario (i.e. a fully valid string); real-life results, with mixed data lengths and a non-zero valid/invalid ratio, might be quite different.

Update:

I also gave a try to the char[] array version:

 char[] chars = value.toCharArray();
 for (idx += 2; idx < chars.length; idx++) { ... }

and it was even a bit slower than the charAt(i) version:

  Method "isHexadecimal (maraca) char[] array version" => #10000000 iterations in 194ms
  Method "fastIsHex, char[] array version" => #10000000 iterations in 164ms

My guess is that this is due to the array copy inside toCharArray().

Update (#2):

I've run an additional 8k/100.000 iterations test to see if there is any real difference in speed between the "maraca" and "fastIsHex" methods, and have also normalized them to use exactly the same precondition code:

Run #1

Method "isHexadecimal (maraca) *normalized" => #100000 iterations in 5341ms
Method "fastIsHex" => #100000 iterations in 5313ms

Run #2

Method "isHexadecimal (maraca) *normalized" => #100000 iterations in 5313ms
Method "fastIsHex" => #100000 iterations in 5334ms

I.e. the speed difference between these two methods is marginal at best, and is probably due to measurement error (as I'm running this on my workstation and not a specially set up clean test environment).

Aircondition answered 3/11, 2016 at 11:39 Comment(4)
Very nice, did you also check a version where you convert the String to a charArray first? It could be worth it if the .length() and .charAt() calls are not that efficient.Nor
Btw. to really compare your version to mine, you have to remove the check for unary plus in my code or add it to yours. It seems strange that the switch is faster, because in my mind this would be the same as c == '1' || c == '2' || ... || c == 'F' which are more checks on average.Nor
AFAIK Java will generate a different byte-code for the switch statements see: #768321 and #10288200. (and will not do boolean '||'). But in this case, I think that these two methods are pretty much on par, and the difference is around the margin of error (maybe that even due to that extra '+' check, I will give it a try, once I have some time).Aircondition
I tested some implementations using your code - yet I cannot explain why compiled regexes with brics are that fast. Maybe the jit-compiler does some magic. You might consider to turn this benchmark to a jmh micro benchmark.Shunt
G
8

Checking whether string characters are within a certain range is exactly what regular expressions are built to do. They convert the expression into an atomic series of instructions; they're essentially writing out your manual parsing steps, but at a lower level.

What tends to be slow with regular expressions is the conversion of the expression into instructions. You can see real performance gains when a regex is used more than once. That's when you can compile the expression ahead of time and then simply apply the resulting compiled instructions in a match, search, replace, etc.
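As a minimal sketch of the difference, assuming Java's java.util.regex (the class and method names are hypothetical): String.matches compiles the pattern on every call, while a Pattern held in a static field is compiled once and reused.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
final class HexValidator {
    // Compiled once when the class is initialized and reused by every call.
    private static final Pattern HEX = Pattern.compile("0x[0-9a-fA-F]+");

    // Compiles the pattern again on each call (String.matches does this internally).
    static boolean isHexUncompiled(String s) {
        return s.matches("0x[0-9a-fA-F]+");
    }

    // Only creates a Matcher per call; the compiled pattern is shared.
    static boolean isHexPrecompiled(String s) {
        return HEX.matcher(s).matches();
    }
}
```

Both methods return the same result for any input; only the amount of work done per call differs.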

As is the case with anything to do with performance, perform some tests and measure the results.

Guernsey answered 23/10, 2016 at 18:34 Comment(0)
V
4

A brute-force approach to a problem systematically tests all combinations. That is not your case.

You can get better performance from a hand-written procedure. You can take advantage of the data distribution if you know it in advance. Or you can make some clever shortcuts that apply to your case. But it really is not guaranteed that what you write will automatically be faster than a regex. Regex implementations are optimized too, and you can easily end up with code that is worse than that.

The code in Your question is really nothing special and most probably it would be on par with the regex. As I tested it, there was no clear winner, sometimes one was faster, sometimes the other, the difference was small. Your time is limited, think wisely where You spend it.

Venator answered 23/10, 2016 at 22:58 Comment(0)
I
4

You're misusing the term "brute force." A better term is ad hoc custom matching.

Regex interpreters are generally slower than custom pattern matchers. The regex is compiled into a byte code, and compilation takes time. Even ignoring compilation (which might be fine if you compile only once and match a very long string and/or many times so the compilation cost isn't important), machine instructions spent in the matching interpreter are overhead that the custom matcher doesn't have.

In cases where the regex matcher wins out, it's normally that the regex engine is implemented in very fast native code, while the custom matcher is written in something slower.

Now you can compile regexes to native code that runs just as fast as a well-done custom matcher. This is the approach of e.g. lex/flex and others. But the most common libraries and built-in regex engines (Java, Python, Perl, etc.) don't take this approach. They use interpreters.

Native code-generating libraries tend to be cumbersome to use except maybe in C/C++ where they've been part of the air for decades.

In other languages, I'm a fan of state machines. To me they are easier to understand and get correct than either regexes or custom matchers. Below is one for your problem. State 0 is the start state, and D stands for a hex digit.

[Figure: state machine diagram for the hex matcher]

Implementation of the machine can be extremely fast. In Java, it might look like this:

static boolean isHex(String s) {
  int state = 0;
  for (int i = 0; i < s.length(); i++) {
    char c = s.charAt(i);
    switch (state) {
      case 0:
        if (c == '-') state = 1;
        else if (c == '0') state = 2;
        else return false;
        break;
      case 1:
        if (c == '0') state = 2;
        else return false;
        break;
      case 2:
        if (c == 'x') state = 3;
        else return false;
        break;
      case 3:
        if (isHexDigit(c)) state = 4;
        else return false;
        break;
      case 4:
        if (isHexDigit(c)) ; // state already = 4
        else return false;
        break;
    }
  }
  return state == 4; // accept only if at least one hex digit followed "0x"
}

static boolean isHexDigit(char c) {
  return '0' <= c && c <= '9' || 'A' <= c && c <= 'F' || 'a' <= c && c <= 'f';
}

The code isn't super short, but it's a direct translation of the diagram. There's nothing to mess up short of simple typographical errors.

In C, you can implement states as goto labels:

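#include <ctype.h>  /* isxdigit */

int isHex(const char *s) {
  unsigned char c;  /* unsigned so isxdigit() never receives a negative value */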
  s0:
    c = *s++;
    if (c == '-') goto s1;
    if (c == '0') goto s2;
    return 0;
  s1:
    c = *s++;
    if (c == '0') goto s2;
    return 0;
  s2:
    c = *s++;
    if (c == 'x') goto s3;
    return 0;
  s3:
    c = *s++;
    if (isxdigit(c)) goto s4;
    return 0;
  s4: 
    c = *s++;
    if (isxdigit(c)) goto s4;
    if (c == '\0') return 1;
    return 0;
}

This kind of goto matcher written in C is generally the fastest I've seen. On my MacBook using an old gcc (4.6.4), this one compiles to only 35 machine instructions.

Inductee answered 5/11, 2016 at 4:37 Comment(5)
I agree - in theory you are right. Yet practice (see my posting) showed that efficient regex implementations in Java could outperform your Java implementation (hard to believe for me, maybe there is an error in the benchmark, yet I cannot find one).Shunt
@Shunt You should read #504603 . Your times can't really be trusted. Also oracle.com/technetwork/articles/java/…Inductee
... and the jmh benchmark shows the same results, astonishingly. If needed I can publish snippets on github for better reproducing these results.Shunt
@Gene, IMO a singular automaton which handles both the prefix (0x00) and the hex string body, might not be the best option in this case, as there will be a penalty for checking the state, for an each input hex digit, despite the state is not going to change any longer, once you are over the prefix. Your C implementation effectively works around this, by looping inside the D state with GOTO, but in Java it should be more effective to loop over the "body" separately.Aircondition
@Aircondition You're right that Java does not admit fastest-possible state machine implementation. Just reasonably fast. If you're willing to trade space for speed, you can use a rectangular array char x state -> state to get back some speed.Inductee
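The transition-table idea from the last comment can be sketched as follows. This is only an illustration under stated assumptions: the class name, the ASCII-only table, and the DEAD marker are hypothetical choices, and the states 0 to 4 are taken from the Java state machine in the answer.

```java
import java.util.Arrays;

// Hypothetical table-driven variant of the answer's state machine.
final class TableHexMatcher {
    private static final int STATES = 5;      // states 0..4, state 4 is accepting
    private static final int ALPHABET = 128;  // ASCII only; other chars are rejected
    private static final byte DEAD = -1;      // marks "no transition"

    // NEXT[state][char] gives the next state, or DEAD.
    private static final byte[][] NEXT = new byte[STATES][ALPHABET];

    static {
        for (byte[] row : NEXT) {
            Arrays.fill(row, DEAD);
        }
        NEXT[0]['-'] = 1;
        NEXT[0]['0'] = 2;
        NEXT[1]['0'] = 2;
        NEXT[2]['x'] = 3;
        for (char c : "0123456789abcdefABCDEF".toCharArray()) {
            NEXT[3][c] = 4;  // first hex digit
            NEXT[4][c] = 4;  // further hex digits
        }
    }

    static boolean isHex(String s) {
        int state = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= ALPHABET) return false;  // outside the table
            state = NEXT[state][c];
            if (state == DEAD) return false;
        }
        return state == 4;  // must end in the accepting state
    }
}
```

The per-character work is one bounds check and one array lookup, at the cost of a 5 x 128 byte table. Whether this is actually faster than the switch version would have to be measured.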
N
1

Usually what's better depends on your goals. If readability is the main goal (which it should be, unless you have detected a performance issue), then regexes are just fine.

If performance is your goal, then you have to analyze the problem first. E.g. if you know it's either a phone number or a hexadecimal number (and nothing else) then the problem becomes much simpler.

Now let's have a look at your function (performance-wise) to detect hexadecimal numbers:

  1. Getting the substring is bad (it creates a new object in general); it is better to work with an index and advance it.
  2. Instead of using toLowerCase() it's better to compare against both upper and lower case letters (the string is only iterated once, no superfluous substitutions are performed and no new object is created).

So a performance-optimized version could look something like this (you can maybe optimize further by using the charArray instead of the string):

public static final boolean isHexadecimal(String value) {
  if (value.length() < 3)
    return false;
  int idx;
  if (value.charAt(0) == '-' || value.charAt(0) == '+') { // also supports unary plus
    if (value.length() < 4) // necessary because -0x and +0x are not valid
      return false;
    idx = 1;
  } else {
    idx = 0;
  }
  if (value.charAt(idx) != '0' || value.charAt(idx + 1) != 'x')
    return false;
  for (idx += 2; idx < value.length(); idx++) {
    char c = value.charAt(idx);
    if (!((c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F')))
      return false;
  }
  return true;
}
Nor answered 30/10, 2016 at 13:29 Comment(0)
Q
0

Well implemented regular expressions can be faster than naive brute force implementation of the same pattern. On the other hand you always can implement a faster solution for a specific case. Also as stated in the article above most implementations in popular languages are not efficient (in some cases).

I'd implement my own solutions only when performance is an absolute priority, and with extensive testing and profiling.
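One way to approach the testing part is to check a hand-written validator against a regex used as a reference. The sketch below is only an assumption about how that could look in Java: the class, the handWritten method, and the chosen pattern are illustrative and not taken from any answer in this thread.

```java
import java.util.regex.Pattern;

// Hypothetical differential check: hand-written validator vs. regex reference.
final class HexDifferentialCheck {
    // Reference behaviour: optional '-', lowercase "0x", one or more hex digits.
    private static final Pattern ORACLE = Pattern.compile("-?0x[0-9a-fA-F]+");

    // Hand-written candidate under test (illustrative).
    static boolean handWritten(String s) {
        int i = (!s.isEmpty() && s.charAt(0) == '-') ? 1 : 0;
        // Need the "0x" prefix plus at least one digit after the optional sign.
        if (s.length() < i + 3 || s.charAt(i) != '0' || s.charAt(i + 1) != 'x') {
            return false;
        }
        for (i += 2; i < s.length(); i++) {
            char c = s.charAt(i);
            boolean hex = (c >= '0' && c <= '9')
                    || (c >= 'a' && c <= 'f')
                    || (c >= 'A' && c <= 'F');
            if (!hex) return false;
        }
        return true;
    }

    // Returns the first input on which the two disagree, or null if they always agree.
    static String firstDisagreement(String... inputs) {
        for (String s : inputs) {
            if (handWritten(s) != ORACLE.matcher(s).matches()) {
                return s;
            }
        }
        return null;
    }
}
```

Feeding firstDisagreement a mix of valid strings, near misses, and edge cases such as "", "-" and "0x" gives a cheap safety net before any profiling starts.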

Quiteris answered 31/10, 2016 at 14:50 Comment(1)
The document to refer to points out that well implemented regular expressions do scale (in opposite to those that are proposed by Java,C#,Perl,PHP,Python ...). In practice there is probably no regex implementation that outperforms the already mentioned handcoded implementationsShunt
S
0

To get performance that is better than naive hand-coded validators, you may use a regular expression library that is based on deterministic automata, e.g. Brics Automaton.

I wrote a short jmh benchmark:

@State(Scope.Thread)
public abstract class MatcherBenchmark {

   private String longHexText;

   @Setup
   public void setup() {
     initPattern("0x[0-9a-fA-F]+");
     this.longHexText = "0x123fa";
   }

   public abstract void initPattern(String pattern);

   @Benchmark
   @BenchmarkMode(Mode.AverageTime)
   @OutputTimeUnit(TimeUnit.MICROSECONDS)
   @Warmup(iterations = 10)
   @Measurement(iterations = 10)
   @Fork(1)
   public void benchmark() {
     boolean result =  benchmark(longHexText);
     if (!result) {
        throw new RuntimeException();
     }
   }

   public abstract boolean benchmark(String text);

   @TearDown
   public void tearDown() {
     donePattern();
     this.longHexText = null;
   }

   public abstract void donePattern();

}

and implemented it with:

@Override
public void initPattern(String pattern) {
    RegExp r = new RegExp(pattern);
    this.automaton = new RunAutomaton(r.toAutomaton(true));
}

@Override
public boolean benchmark(String text) {
    return automaton.run(text);
}

I also created benchmarks for Zeppelin's and Gene's solutions, the compiled java.util.regex solution, and a solution with rexlex. These are the results of the jmh benchmark on my machine:

BricsMatcherBenchmark.benchmark      avgt   10  0,014 ±  0,001  us/op
GenesMatcherBenchmark.benchmark      avgt   10  0,017 ±  0,001  us/op
JavaRegexMatcherBenchmark.benchmark  avgt   10  0,097 ±  0,005  us/op
RexlexMatcherBenchmark.benchmark     avgt   10  0,061 ±  0,002  us/op
ZeppelinsBenchmark.benchmark         avgt   10  0,008 ±  0,001  us/op

Running the same benchmark with an input that ends in a non-hex digit, 0x123fax, produces the following results (note: I inverted the validation in benchmark() for this run):

BricsMatcherBenchmark.benchmark      avgt   10  0,015 ±  0,001  us/op
GenesMatcherBenchmark.benchmark      avgt   10  0,019 ±  0,001  us/op
JavaRegexMatcherBenchmark.benchmark  avgt   10  0,102 ±  0,001  us/op
RexlexMatcherBenchmark.benchmark     avgt   10  0,052 ±  0,002  us/op
ZeppelinsBenchmark.benchmark         avgt   10  0,009 ±  0,001  us/op
Shunt answered 5/11, 2016 at 5:27 Comment(0)
T
-2

Regexes have a lot of advantages, but they still have a performance cost.

Tagliatelle answered 23/10, 2016 at 18:59 Comment(0)
