I'm trying to use the timeit module in Python (EDIT: We are using Python 3) to decide between a couple of different code flows. In our code, we have a series of if-statements that test for the existence of a character code in a string, and if it's there replace it like this:
if "<substring>" in str_var:
    str_var = str_var.replace("<substring>", "<new_substring>")
We do this a number of times for different substrings. We're debating between that and using just the replace like this:
str_var = str_var.replace("<substring>", "<new_substring>")
We tried to use timeit to determine which one was faster. If the first code-block above is "stmt1" and the second is "stmt2", and our setup string looks like
str_var = '<string><substring><more_string>',
our timeit statements will look like this:
timeit.timeit(stmt=stmt1, setup=setup)
and
timeit.timeit(stmt=stmt2, setup=setup)
Running it just like that on two of our laptops (same hardware, similar processing load), stmt1 (the one with the if-statement) is consistently faster across multiple runs: 3-4 hundredths of a second, versus about a quarter of a second for stmt2.
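For reference, here is a self-contained sketch of the comparison described above. It uses the placeholder text literally as the test strings, and the iteration count `NUMBER` is an assumption chosen to keep the run short (timeit's default is 1,000,000).

```python
import timeit

# timeit executes `setup` once per timeit() call, then runs the
# statement `number` times.
setup = "str_var = '<string><substring><more_string>'"

# stmt1: guard the replace with a membership test.
stmt1 = (
    'if "<substring>" in str_var:\n'
    '    str_var = str_var.replace("<substring>", "<new_substring>")'
)

# stmt2: call replace unconditionally.
stmt2 = 'str_var = str_var.replace("<substring>", "<new_substring>")'

NUMBER = 100_000  # assumed iteration count, not from the original post

t1 = timeit.timeit(stmt=stmt1, setup=setup, number=NUMBER)
t2 = timeit.timeit(stmt=stmt2, setup=setup, number=NUMBER)
print(f"with if:    {t1:.4f} s")
print(f"without if: {t2:.4f} s")
```

Absolute timings will differ by machine and Python version; only the relative ordering is of interest here.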
However, if we define functions to do both things (including the setup creating the variable) like so:
def foo():
    str_var = '<string><substring><more_string>'
    if "<substring>" in str_var:
        str_var = str_var.replace("<substring>", "<new_substring>")
and
def foo2():
    str_var = '<string><substring><more_string>'
    str_var = str_var.replace("<substring>", "<new_substring>")
and run timeit like:
timeit.timeit("foo()", setup="from __main__ import foo")
timeit.timeit("foo2()", setup="from __main__ import foo2")
the version without the if-statement (foo2) runs faster, contradicting the results from the non-function version.
Are we missing something about how timeit works? Or about how Python handles a case like this?
Edit: here is our actual code:
>>> def foo():
...     s = "hi 1 2 3"
...     s = s.replace('1','5')
...
>>> def foo2():
...     s = "hi 1 2 3"
...     if '1' in s:
...         s = s.replace('1','5')
...
>>> timeit.timeit(foo, "from __main__ import foo")
0.4094226634183542
>>> timeit.timeit(foo2, "from __main__ import foo2")
0.4815539780738618
vs this code:
>>> timeit.timeit("""s = s.replace("1","5")""", setup="s = 'hi 1 2 3'")
0.18738432400277816
>>> timeit.timeit("""if '1' in s: s = s.replace('1','5')""", setup="s = 'hi 1 2 3'")
0.02985000199987553
… `foo` function method, the `if` statement method is always around 0.06, but the non-`if` method is around 0.3. When I do use the `foo` functions, then, in that case, the `if` statement method is around 0.61 and the non-`if` method is around 0.53. (Those are the means from using `timeit` 10 times for each of the four possibilities.) I'm on a pretty fast Desktop machine using IPython with Python 2.7.3. – Calctufa
… `foo2` still ends up taking longer, though it is down to a far smaller difference (.311 for `foo` vs .326 for `foo2`). – Assonance
… `if` one is shorter at all. It's actually doing more work than the `replace` alone since the `if` test is `True` (so the `replace` runs in both cases). The only way the `if` can save you time is if the string does not contain the string-to-replace, thereby saving you some time in the `replace` call itself, and that should only be true if `in` has quicker time order than `replace`. I don't know why it would make much of a difference, but you're using a different string-to-replace in the non-function with `if` version. – Erida
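Following up on the last comment's point that the `if` can only save time when the string no longer contains the text to replace: one way to check whether that is happening in the stmt/setup form is to count how often the guarded branch actually runs. This is a minimal sketch, not a definitive diagnosis. The `hits` list is a hypothetical counter added purely for illustration, and the `globals=` argument assumes Python 3.5 or later.

```python
import timeit

hits = []  # hypothetical counter: one entry per execution of the guarded branch

stmt = (
    "if '1' in s:\n"
    "    hits.append(1)\n"
    "    s = s.replace('1', '5')"
)

# `setup` runs once; `s` then persists across all 1000 iterations.
# globals= makes the `hits` list visible to the timed statement.
timeit.timeit(stmt, setup="s = 'hi 1 2 3'", number=1000, globals={"hits": hits})

print(len(hits))  # prints 1
```

The count is 1 rather than 1000: after the first iteration `s` has already been rewritten to `'hi 5 2 3'`, so every later iteration only pays for the `in` test. That would suggest the stmt/setup timings and the function timings (which rebuild the string on every call) are not measuring the same work.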