Is using a Regular Expression faster than IndexOf?

9

13

I have an app running which looks at items in a queue, then based upon certain keywords a category is applied - then it is inserted into a database.

I'm using IndexOf to determine if a certain keyword is present.

Is this the ideal way, or would a Regex be faster?

There are about 10 items per second being processed.

Danuloff answered 21/2, 2012 at 15:19 Comment(3)
You should try both approaches and measure which is faster. Also, 10 times per second is nothing; you shouldn't worry about performance here. - Down
Also, we'd need to know more about the relative complexity of the parsing. If you need to call String.IndexOf 10 times to achieve the same effect as the RegEx, the performance ratio will be different than if it is 1 for 1. - Insatiable
10 items per second is nothing? When would you actually start to care about performance then? - Danuloff
20

For just finding a keyword the IndexOf method is faster than using a regular expression. Regular expressions are powerful, but their power lies in flexibility, not raw speed. They don't beat string methods at simple string operations.

Anyway, if the strings are not huge, it shouldn't really matter as you are not doing it so often.

Keese answered 21/2, 2012 at 15:23 Comment(0)
14

http://ayende.com/blog/2930/regex-vs-string-indexof

It seems that efficiency may depend on the length of the string.

Partisan answered 21/2, 2012 at 15:22 Comment(0)
3

The only way to know for sure is to test it. As an educated guess, it depends on the number of keywords you are testing, the length of the text, etc. IndexOf would probably win.

So write a test for your specific scenario.

Inexhaustible answered 21/2, 2012 at 15:24 Comment(0)
2

I doubt it - IndexOf is a very simple algorithm that just seeks through your string and returns the first occurrence it finds.

Regex is a far more complex mechanism: the pattern has to be parsed first and then matched against the string. If your string is very large, you are better off with IndexOf.

Kiele answered 21/2, 2012 at 15:24 Comment(0)
1

First of all, with 10 items per second you probably don't even need to think about performance.

IndexOf is probably faster than regex in most cases, especially if you don't use a precompiled regex.

Its performance might also depend on the chosen string comparison/culture. I expect StringComparison.Ordinal to be fastest.
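As a minimal sketch of the two variants mentioned above (an ordinal IndexOf and a regex built once with RegexOptions.Compiled), something like the following could be used. The class name, the helper ContainsOrdinal, the keyword "sale" and the sample text are all made up for illustration; they are not from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class KeywordCheck
{
    // Built once and reused for every item. RegexOptions.Compiled costs more
    // at construction time in exchange for faster matching afterwards.
    // Regex.Escape makes the keyword match literally.
    public static readonly Regex SaleRegex =
        new Regex(Regex.Escape("sale"), RegexOptions.Compiled);

    // Ordinal comparison compares raw character values and skips culture rules.
    public static bool ContainsOrdinal(string text, string keyword) =>
        text.IndexOf(keyword, StringComparison.Ordinal) >= 0;

    public static void Main()
    {
        string item = "Big summer sale on garden tools"; // hypothetical queue item

        Console.WriteLine(ContainsOrdinal(item, "sale")); // True
        Console.WriteLine(SaleRegex.IsMatch(item));       // True
    }
}
```

Both calls answer the same yes/no question here; which one is faster for your data is something only a measurement can tell you.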

Rooks answered 21/2, 2012 at 15:22 Comment(0)
1

Why not experiment and measure the elapsed time using the System.Diagnostics.Stopwatch class? http://msdn.microsoft.com/en-us/library/system.diagnostics.stopwatch.aspx

Start a Stopwatch before your IndexOf operation and read the elapsed time after it. Then swap the IndexOf for a regular expression and measure again. Finally, report back with your findings so that we can see them too!
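A rough sketch of such a measurement might look like this. The helper TimeMs, the sample text, the keyword and the iteration count are assumptions chosen for illustration; replace them with real items from your queue. No timings are quoted because they depend on your machine and runtime.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

public static class SearchTiming
{
    // Runs the action `iterations` times and returns the elapsed milliseconds.
    public static long TimeMs(Action action, int iterations)
    {
        action(); // warm-up call so one-time JIT cost is not counted
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    public static void Main()
    {
        // Hypothetical sample input: filler text with the keyword at the end.
        string text = new string('x', 10000) + " keyword";
        const string keyword = "keyword";
        var regex = new Regex(Regex.Escape(keyword)); // created once, outside the timed loop
        const int iterations = 10000;

        long indexOfMs = TimeMs(() => text.IndexOf(keyword, StringComparison.Ordinal), iterations);
        long regexMs = TimeMs(() => regex.IsMatch(text), iterations);

        Console.WriteLine("IndexOf (ordinal): " + indexOfMs + " ms");
        Console.WriteLine("Regex.IsMatch:     " + regexMs + " ms");
    }
}
```

Running each operation many times inside the timed loop matters: a single IndexOf call is usually too short for Stopwatch to measure meaningfully.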

Easternmost answered 21/2, 2012 at 15:24 Comment(0)
1

At least this programmer finds it faster to understand the code that uses IndexOf!

Does saving a little CPU time justify the extra time it takes the next person to understand the code?

Proposition answered 21/2, 2012 at 15:24 Comment(4)
A regex that would find the first occurrence of a string to emulate IndexOf wouldn't put any programmer into serious trouble if he wanted to understand it. - Kiele
@FlorianPeschka, agreed the cost is low, but there is still a cost of looking at the RegEx. - Proposition
RegEx.Match is hard to understand? - Danuloff
If RegEx is hard to understand, then developers need to learn a little. It's like a mechanic saying that hex keys are hard to use, so they use something else instead. Learn the tools of your profession. There's no excuse for that. - Geri
1

It seems correct that regex can be faster on longer strings, at least compared with the default IndexOf. My example: the content of a 364 kB file is searched for the string "<product ". The starting point is moved forward to find the next occurrence, and the next, and so on. In this test, however, the searched string does not occur anywhere in the value.

I used three test commands:

         i = value.IndexOf("<" & tag & " ", xstart)

         i = value.IndexOf("<" & tag & " ", xstart, StringComparison.Ordinal)

         i = Regex.IsMatch(value.Substring(xstart), "<" & tag & " ", RegexOptions.Singleline)

Command one (IndexOf, standard) needs ~7500 ms to search the string.
Command two (IndexOf with ordinal) needs ~300 ms!
Command three (regex) needs ~650 ms (~1000 ms with the IgnoreCase option).

Inhaul answered 3/6, 2021 at 14:23 Comment(0)
0

You can find information about this very question at this link: http://ayende.com/blog/2930/regex-vs-string-indexof

In summary, it seems to indicate that the larger the search pattern, the better RegEx performs comparatively.

Experiential answered 21/2, 2012 at 15:27 Comment(0)
