How to use timeit module

C

15

469

How do I use timeit to compare the performance of my own functions such as "insertion_sort" and "tim_sort"?

Cockswain answered 22/11, 2011 at 1:15 Comment(0)

A

323

The way timeit works is to run the setup code once and then make repeated calls to a series of statements. So, if you want to test sorting, some care is required so that one pass at an in-place sort doesn't affect the next pass with already sorted data (that, of course, would make the Timsort really shine because it performs best when the data is already partially ordered).

Here is an example of how to set up a test for sorting:

>>> import timeit

>>> setup = '''
import random

random.seed('slartibartfast')
s = [random.random() for i in range(1000)]
timsort = list.sort
'''

>>> print(min(timeit.Timer('a=s[:]; timsort(a)', setup=setup).repeat(7, 1000)))
0.04485079200821929

Note that the series of statements makes a fresh copy of the unsorted data on every pass.

Also, note the timing technique of running the measurement suite seven times and keeping only the best time — this can really help reduce measurement distortions due to other processes running on your system.
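Applied to the question's own functions, the same pattern might look like the sketch below. The bodies of `insertion_sort` and `tim_sort` are assumptions (the question does not show them), `best_time` is a hypothetical helper, and the `globals` argument requires Python 3.5 or later.

```python
import random
import timeit


def insertion_sort(a):
    """Illustrative in-place insertion sort (stand-in for the asker's function)."""
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        # Shift larger elements one slot to the right.
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key


def tim_sort(a):
    """Stand-in for the asker's function: delegates to the built-in list.sort."""
    a.sort()


random.seed('slartibartfast')
s = [random.random() for _ in range(200)]


def best_time(func, data, repeat=5, number=20):
    """Return the best per-call time in seconds (hypothetical helper)."""
    # The statement copies the data on every pass, so each call sorts unsorted input.
    timer = timeit.Timer('a = data[:]; func(a)',
                         globals={'func': func, 'data': data})
    return min(timer.repeat(repeat, number)) / number


for func in (insertion_sort, tim_sort):
    print(func.__name__, best_time(func, s))
```

Passing the functions through `globals` avoids having to import them inside a setup string; the input size and repeat counts here are kept small only so the example runs quickly.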

Assurgent answered 22/11, 2011 at 1:38 Comment(9)

Yes, it includes the list copy (which is very fast compared to the sort itself). If you don't copy though, the first pass sorts the list and the remaining passes don't have to do any work. If you want to know the time just for the sort, then run the above with and without the timsort(a) and take the difference :-) – Assurgent 7/2, 2012 at 1:58

I'd recommend repeating 7 times for each setup and then averaging, rather than the other way round. This way, each spike due to other processes has a good chance of being ignored entirely, rather than averaged out. – Ivey 3/4, 2012 at 18:8

@Ivey Use the min() rather than the average of the timings. That is a recommendation from me, from Tim Peters, and from Guido van Rossum. The fastest time represents the best an algorithm can perform when the caches are loaded and the system isn't busy with other tasks. All the timings are noisy -- the fastest time is the least noisy. It is easy to show that the fastest timings are the most reproducible and therefore the most useful when timing two different implementations. – Assurgent 3/4, 2012 at 19:43

You calculate an average (well, total, but it's equivalent) for 1000 inputs; then repeat 7 times, and take the minimum. You need the averaging over 1000 inputs because you want the average (not best-case) algorithm complexity. You need the minimum for precisely the reason you gave. I thought I could improve your approach by choosing one input, running the algorithm 7 times, taking the minimum; then repeating it for 1000 different inputs, and taking the average. What I didn't realize is that your .repeat(7,1000) already does this (by using the same seed)! So your solution is perfect IMO. – Ivey 3/4, 2012 at 21:4

I can only add that how you allocate your budget of 7000 executions (e.g., .repeat(7, 1000) vs. .repeat(2, 3500) vs. .repeat(35, 200)) should depend on how the error due to system load compares to the error due to input variability. In the extreme case, if your system is always under heavy load, and you see a long thin tail on the left of the execution time distribution (when you catch it in a rare idle state), you might even find .repeat(7000, 1) to be more useful than .repeat(7, 1000) if you can't budget more than 7000 runs. – Ivey 3/4, 2012 at 21:19
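One rough way to try the allocations mentioned in this comment is to divide each best total by its `number`, so the results are comparable per call. This is only a sketch: the setup reuses the answer's seed, and the dictionary name `per_call` is an assumption for illustration.

```python
import timeit

setup = '''
import random
random.seed('slartibartfast')
s = [random.random() for i in range(1000)]
'''

timer = timeit.Timer('a = s[:]; a.sort()', setup=setup)

# Three ways of spending the same budget of 7000 executions.
per_call = {}
for repeat, number in [(7, 1000), (2, 3500), (35, 200)]:
    # repeat() returns one total per repetition; normalise the best one per call.
    per_call[(repeat, number)] = min(timer.repeat(repeat, number)) / number

for (repeat, number), seconds in per_call.items():
    print(f'.repeat({repeat}, {number}): {seconds * 1e6:.2f} us per call')
```

Which split gives the most stable figure depends on the machine and its load, so the printed numbers are best read as a comparison on your own system rather than as a general result.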

How about duplicating the array already in the setup, creating an iterator it over them, and then timing 'a=next(it); timsort(a)'? – Salleysalli 26/12, 2017 at 3:13
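The iterator idea in this comment could be sketched roughly as follows. It relies on the setup code being executed once per repetition, so each of the 7 repetitions gets a fresh batch of copies; the names `NUMBER`, `copies`, and `it` are assumptions for illustration.

```python
import timeit

NUMBER = 1000  # executions per repetition; the setup must prepare at least this many copies

setup = f'''
import random
random.seed('slartibartfast')
s = [random.random() for i in range(1000)]
copies = [s[:] for _ in range({NUMBER})]  # copying happens here, outside the timed statement
it = iter(copies)
timsort = list.sort
'''

timer = timeit.Timer('a = next(it); timsort(a)', setup=setup)
times = timer.repeat(7, NUMBER)
print(min(times))
```

The trade-off is memory: all the copies exist at once. If the setup prepares fewer than `NUMBER` copies, `next(it)` raises `StopIteration` partway through the run.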

Why repeat one thousand executions 7 times? Is the idea not to run the setup once per execution? In which case surely the correct way to do it is repeat one execution 7000 times using .repeat(7000, 1)? – Freeboard 4/6, 2021 at 18:36

Some elaboration: creating a copy of the original parameters defined in setup in each iteration is only necessary if your function changes those parameters, as in-place sorting does. E.g., for sorted() you wouldn't need the copy. Of course, one wants to be able to test the performance of any function, with or without side effects. But if you avoid functions with side effects, it should be safe to exclude the copy statement. Now if you want to compare both types of functions and want really accurate measurements, you can subtract the duration of the copy mechanism from the tests where you applied it. – Carlenacarlene 8/6, 2021 at 18:6
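The subtraction described here (and in the earlier "with and without" comment) might look like the sketch below. The helper `best` and the variable names are assumptions, and the difference is only an estimate, since both measurements carry their own noise.

```python
import random
import timeit

random.seed('slartibartfast')
s = [random.random() for _ in range(1000)]
namespace = {'s': s}


def best(stmt, repeat=7, number=1000):
    """Best total time in seconds for `number` executions of stmt (hypothetical helper)."""
    return min(timeit.repeat(stmt, globals=namespace, repeat=repeat, number=number))


copy_and_sort = best('a = s[:]; a.sort()')  # in-place sort needs a fresh copy each pass
copy_only = best('a = s[:]')                # cost of the copy alone
sorted_only = best('sorted(s)')             # no side effects on s, so no explicit copy

print('in-place sort, copy subtracted:', copy_and_sort - copy_only)
print('sorted(s):', sorted_only)
```

Note that `sorted()` builds a new list internally, so its figure still includes some allocation work even though the statement has no explicit copy.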

I just saw one of the above comments quoted about using the min time as it’s the least noisy. That advice is unsound. A low runtime can arise because of luck - for instance, that memory happened to arrive in a cache at just the right time. Anomalously low execution times are just that: anomalies. Repeating runs also has the effect of warming up the hardware and software caches, leading to likely unrealistic results. The best approach is to run an entire application end to end, and then forcibly flush any caches (including OS caches). Then report the median or mean and a confidence interval. – Gillyflower 10/3, 2023 at 13:58

V

348

If you want to use timeit in an interactive Python session, there are two convenient options:

Use the IPython shell. It features the convenient %timeit special function:

In [1]: def f(x):
   ...:     return x*x
   ...: 

In [2]: %timeit for x in range(100): f(x)
100000 loops, best of 3: 20.3 us per loop

In a standard Python interpreter, you can access functions and other names you defined earlier during the interactive session by importing them from __main__ in the setup statement:

>>> def f(x):
...     return x * x 
... 
>>> import timeit
>>> timeit.repeat("for x in range(100): f(x)", "from __main__ import f",
...               repeat=3, number=100000)
[2.0640320777893066, 2.0876040458679199, 2.0520210266113281]
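On Python 3.5 and later, the `from __main__ import f` setup can be replaced by the `globals` argument, or the statement can be given as a callable. A minimal sketch, with the repeat and loop counts reduced so it runs quickly:

```python
import timeit


def f(x):
    return x * x


# Hand the names the statement needs to timeit directly.
# In an interactive session, globals=globals() works the same way.
times = timeit.repeat("for x in range(100): f(x)",
                      globals={'f': f}, repeat=3, number=10000)
print(min(times))


# A zero-argument callable also works as the statement.
def run():
    for x in range(100):
        f(x)


callable_times = timeit.repeat(run, repeat=3, number=10000)
print(min(callable_times))
```

The callable form adds one extra function call per execution, so its timings tend to come out slightly higher than the string form.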

Virtuoso answered 22/11, 2011 at 1:41 Comment(0)
