How can I time a code segment for testing performance with Python's timeit?

9

248

I have a Python script that works just as it should, but I need to record the execution time. I've read that I should use timeit, but I can't seem to get it to work.

My Python script looks like this:

import sys
import getopt
import timeit
import random
import os
import re
import ibm_db
import time
from string import maketrans
myfile = open("results_update.txt", "a")

for r in range(100):
    rannumber = random.randint(0, 100)

    update = "update TABLE set val = %i where MyCount >= '2010' and MyCount < '2012' and number = '250'" % rannumber
    #print rannumber

    conn = ibm_db.pconnect("dsn=myDB","usrname","secretPWD")

for r in range(5):
    print "Run %s\n" % r        
    ibm_db.execute(query_stmt)
 query_stmt = ibm_db.prepare(conn, update)

myfile.close()
ibm_db.close(conn)

What I need is the time it takes to execute the query and write it to the file results_update.txt. The purpose is to test an update statement for my database with different indexes and tuning mechanisms.

Mantra answered 19/5, 2010 at 14:24 Comment(1)
Was / is your question specific about timeit? I guess not. In that case, you should probably remove "with Pythons timeit" from the title.Carthusian
L
403

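You can use time.time() or time.clock() before and after the block you want to time. (Note that time.clock() was deprecated in Python 3.3 and removed in Python 3.8; on Python 3, time.perf_counter() is the usual replacement.)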

import time

t0 = time.time()
code_block
t1 = time.time()

total = t1-t0

This method is not as exact as timeit (it does not average several runs) but it is straightforward.

time.time() (on Windows and Linux) and time.clock() (on Linux) are not precise enough for fast functions (you get total = 0). In this case, or if you want to average the time elapsed over several runs, you have to call the function multiple times manually (as I think you already do in your example code, and as timeit does automatically when you set its number argument).

import time

def myfast():
   code

n = 10000
t0 = time.time()
for i in range(n): myfast()
t1 = time.time()

total_n = t1-t0

In Windows, as Corey stated in the comment, time.clock() has much higher precision (microsecond instead of second) and is preferred over time.time().
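As several comments below point out, on Python 3.3+ time.perf_counter() is generally the better clock for this pattern. Here is a minimal sketch of the same before/after approach using it, with the results appended to a file as the question asks. The helper names time_block and sample_work are made up for illustration, and sample_work is only a stand-in for the code you actually want to measure.

```python
import time

def time_block(func, repeats=5):
    """Call func `repeats` times and return each call's duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()  # monotonic, high-resolution clock
        func()
        durations.append(time.perf_counter() - start)
    return durations

def sample_work():
    # Hypothetical stand-in for the code being measured.
    sum(i * i for i in range(10000))

durations = time_block(sample_work)

# Append one line per run to the results file named in the question.
with open("results_update.txt", "a") as results:
    for run, seconds in enumerate(durations):
        results.write("Run %d: %.6f s\n" % (run, seconds))
```

Keeping the individual durations (rather than only a total) lets you look at the minimum or the spread afterwards.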

Lietuva answered 19/5, 2010 at 14:32 Comment(12)
fyi on windows, use time.clock() instead of time.time()Mcbroom
Thanks Corey, why? because clock is more precise (microseconds) or there is something more?Lietuva
You can use timeit.default_timer() to make your code platform independent; it returns either time.clock() or time.time() as appropriate for the OS.Scalable
Rather than select a clock by hand, use timeit.default_timer; Python has already done the work for you. But really, you should use timeit.timeit(myfast, number=n) instead of re-inventing the repetitive call wheel (and miss the fact that timeit disables the garbage collector while running the code repeatedly).Usher
update: time.clock() is now deprecated. You should now use time.time(). Actually, since version 3.3, the best option would be time.perf_counter()Sideline
I think it is worth mentioning that time.time() outputs the time in seconds, as a floating point number.Parotic
time.clock() is hell of an accurate timer, e.g. insertion sort: 0.000394667314326, reverse insertion sort: 0.000821744938248, merge_sort sort: 0.000551385520222, recursive_insertion_sort sort: 0.0013575406893Unsuspected
For what it is worth, timeit takes the min time of repeated runs not the average.Seclusion
I believe time.monotonic() should be strictly better than time.time(): Always use monotonic time when measuring time differences if you like to avoid elusive bugs.Jori
@Lietuva Could you add time.perf_counter() to the answer as better alternative in Python 3? It does not misbehave when the system changes the time or leap seconds occur and always has the best precision available (except using perf_counter_ns() to avoid floating point inaccuracies).Algorism
I agree with @Algorism - especially since the question is asking for performance, you will get much more precise and accurate results using time.perf_counter(). Python's time.time() is not meant to be used for timing code performance.Dusk
Let's totally recommend a monotonic time source for such measurements!Brighton
D
73

If you are profiling your code and can use IPython, it has the magic function %timeit.

%%timeit operates on cells.

In [2]: %timeit cos(3.14)
10000000 loops, best of 3: 160 ns per loop

In [3]: %%timeit
   ...: cos(3.14)
   ...: x = 2 + 3
   ...: 
10000000 loops, best of 3: 196 ns per loop
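
If you are not in IPython, something roughly similar can be done with the standard timeit module. This is only a sketch of the "N loops, best of 3" idea, not a description of how %timeit is implemented; it assumes Python 3.6+ for Timer.autorange().

```python
import timeit

timer = timeit.Timer("cos(3.14)", setup="from math import cos")

# autorange() picks a loop count large enough that one run takes >= 0.2 s.
loops, _ = timer.autorange()

# Repeat the measurement three times and keep the fastest run.
best = min(timer.repeat(repeat=3, number=loops))

print("%d loops, best of 3: %.1f ns per loop" % (loops, best / loops * 1e9))
```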
Dupe answered 2/3, 2014 at 23:26 Comment(0)
R
45

Quite apart from the timing, this code you show is simply incorrect: you execute 100 connections (completely ignoring all but the last one), and then when you do the first execute call you pass it a local variable query_stmt which you only initialize after the execute call.

First, make your code correct, without worrying about timing yet: i.e. a function that makes or receives a connection and performs 100 or 500 or whatever number of updates on that connection, then closes the connection. Once you have your code working correctly is the correct point at which to think about using timeit on it!

Specifically, if the function you want to time is a parameter-less one called foobar you can use timeit.timeit (2.6 or later -- it's more complicated in 2.5 and before):

timeit.timeit('foobar()', setup='from __main__ import foobar', number=1000)

Since 3.5 the globals parameter makes it straightforward to use timeit with functions that take parameters:

timeit.timeit('foobar(x,y)', number=1000, globals=globals())

You'd better specify the number of runs because the default, a million, may be high for your use case (leading to spending a lot of time in this code;-).
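
To make the "correct first, then time" advice concrete, here is a rough sketch of the restructured script. It assumes an in-memory sqlite3 database as a stand-in for ibm_db (so it runs anywhere), and the table layout and the names make_connection and run_update are hypothetical. The point is only the shape: connect once, then time just the update.

```python
import random
import sqlite3
import timeit

def make_connection():
    # sqlite3 in memory is a stand-in for the real ibm_db connection.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (val INTEGER, number INTEGER)")
    conn.executemany(
        "INSERT INTO t VALUES (?, ?)",
        [(0, n % 500) for n in range(5000)],
    )
    conn.commit()
    return conn

def run_update(conn):
    # One parameterized UPDATE; this is the only work being timed.
    conn.execute(
        "UPDATE t SET val = ? WHERE number = 250",
        (random.randint(0, 100),),
    )
    conn.commit()

conn = make_connection()  # connect once, outside the timed code
total = timeit.timeit(
    "run_update(conn)",
    number=100,
    globals={"run_update": run_update, "conn": conn},
)
conn.close()

print("100 updates took %.4f s (%.6f s each)" % (total, total / 100))
```

Passing an explicit dict as globals keeps the timed statement from depending on whatever else happens to be defined in the module.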

Reahard answered 19/5, 2010 at 14:33 Comment(3)
After struggling with this for the last few minutes I want to let future viewers know that you also probably want to pass a setup variable if your function foobar is in a main file. Like this: timeit.timeit('foobar()','from __main__ import foobar',number=1000)Rillings
In Python 2.7.8, you could simply use timeit.timeit( foobar, number=1000 )Ohmmeter
since 3.5 with the globals parameter you can pass a function that takes parameters timeit.timeit('foobar(x,y)', number=1000, globals = globals())Borrego
U
18

Focus on one specific thing. Disk I/O is slow, so I'd take that out of the test if all you are going to tweak is the database query.

And if you need to time your database execution, look for database tools instead, like asking for the query plan, and note that performance varies not only with the exact query and what indexes you have, but also with the data load (how much data you have stored).

That said, you can simply put your code in a function and run that function with timeit.timeit():

import timeit

def function_to_repeat():
    # ... the code you want to time goes here
    pass

duration = timeit.timeit(function_to_repeat, number=1000)

This would disable the garbage collection, repeatedly call the function_to_repeat() function, and time the total duration of those calls using timeit.default_timer(), which is the most accurate available clock for your specific platform.
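
A small sketch of those two details, assuming Python 3.3+: it checks which clock timeit uses by default, and shows how to re-enable the garbage collector from the setup string if collection is a relevant part of what you are measuring. The statement being timed is an arbitrary example.

```python
import time
import timeit

# On Python 3.3+, timeit's default timer is time.perf_counter.
uses_perf_counter = timeit.default_timer is time.perf_counter
print(uses_perf_counter)

stmt = "[object() for _ in range(1000)]"

# Default behaviour: the garbage collector is disabled during timing.
without_gc = timeit.timeit(stmt, number=200)

# Re-enable the collector in setup to include its cost in the measurement.
with_gc = timeit.timeit(stmt, setup="import gc; gc.enable()", number=200)

print("gc disabled: %.4f s, gc enabled: %.4f s" % (without_gc, with_gc))
```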

You should move setup code out of the repeated function; for example, you should connect to the database first, then time only the queries. Use the setup argument to either import or create those dependencies, and pass them into your function:

def function_to_repeat(var1, var2):
    # ...

duration = timeit.timeit(
    'function_to_repeat(var1, var2)',
    'from __main__ import function_to_repeat, var1, var2', 
    number=1000)

This would grab the globals function_to_repeat, var1 and var2 from your script and pass those to the function on each repetition.
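
If you would rather avoid the import string, two alternatives are sketched below: binding the arguments with functools.partial and passing the resulting callable, or (Python 3.5+) passing a namespace through the globals parameter. The function body and the values of var1 and var2 are placeholders.

```python
import functools
import timeit

def function_to_repeat(var1, var2):
    return var1 + var2  # placeholder body for illustration

var1, var2 = 2, 3

# Option 1: bind the arguments up front and pass a zero-argument callable.
duration_partial = timeit.timeit(
    functools.partial(function_to_repeat, var1, var2),
    number=1000,
)

# Option 2: keep the statement as a string and supply the names it needs.
duration_globals = timeit.timeit(
    "function_to_repeat(var1, var2)",
    globals={"function_to_repeat": function_to_repeat, "var1": var1, "var2": var2},
    number=1000,
)

print(duration_partial, duration_globals)
```

Both variants add a little call overhead of their own, so they are best used for comparing alternatives rather than for absolute numbers on very fast functions.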

Usher answered 26/3, 2016 at 16:0 Comment(1)
Putting the code into a function is a step I was looking for -since simply making code a string and evaling is not going to fly for anything not completely trivial. thxWoozy
A
9

Here's a simple wrapper for steven's answer. This function doesn't do repeated runs/averaging, just saves you from having to repeat the timing code everywhere :)

import time

def time_func(func, *args):  # *args can take 0 or more positional arguments
    '''Print the wall time it takes to execute the given function.'''
    start_time = time.time()
    func(*args)
    end_time = time.time()
    print("it took this long to run: {}".format(end_time - start_time))
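
A possible variation on the same idea, sketched as a decorator: it accepts keyword arguments, uses time.perf_counter() instead of time.time(), and passes the wrapped function's return value through. The names timed and add are made up for this example.

```python
import functools
import time

def timed(func):
    """Decorator that prints how long each call to func takes."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Runs even if func raises, so failed calls are reported too.
            elapsed = time.perf_counter() - start
            print("{} took {:.6f} s".format(func.__name__, elapsed))
    return wrapper

@timed
def add(a, b=0):
    return a + b

result = add(1, b=2)
```

Like the wrapper above, this measures a single call only; it does no repetition or averaging.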
Always answered 22/5, 2019 at 13:33 Comment(1)
Bonus: support keyword arguments with def time_func(func, *args, **kwargs): and func(*args, **kwargs)Triparted
B
5

How to time a function using timeit:

import timeit

def time_this():
    return 'a' + 'b'

timeit.timeit(time_this, number=1000)

It returns the time it took in seconds to run time_this() 1000 times.
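
When the function takes arguments, one way to still pass a callable is to wrap the call in a lambda. The sketch below uses a hypothetical join_strings function and also derives the per-call time from the total. Note that the lambda has to actually call the function: `lambda: join_strings` without parentheses would only time returning the function object.

```python
import timeit

def join_strings(a, b):
    return a + b

number = 1000

# The lambda calls the function with its arguments on every iteration.
total = timeit.timeit(lambda: join_strings("a", "b"), number=number)

# timeit returns the total for all iterations; divide for a per-call figure.
per_call = total / number

print("total: %.6f s, per call: %.9f s" % (total, per_call))
```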

Biota answered 21/2, 2021 at 20:41 Comment(2)
I think the lambda here is unnecessary, you could just timeit.timeit(time_this, number=1000)Haymow
I think the lambda is needed, it should look like this: timeit.timeit(lambda: time_this, number=1000)Blase
P
4

Another simple timeit example:

import timeit

def your_function_to_test():
    # do some stuff...
    pass

time_to_run_100_times = timeit.timeit(your_function_to_test, number=100)
Papacy answered 23/7, 2020 at 7:48 Comment(4)
This won't work, you have to either call the function inside your lambda function, like timeit.timeit(lambda: your_function_to_test, number=100), or simply pass the actual function to test directly : timeit.timeit(your_function_to_test, number=100)Haymow
@Haymow , you meant timeit.timeit(lambda: your_function_to_test(), number=100), right?Acker
I had a good reason at the time for including lambda but seems to work without nowPapacy
@AiratK, yes it was a typo thank you for correcting.Haymow
B
3

I see the question has already been answered, but still want to add my 2 cents for the same.

I also faced a similar scenario, in which I had to test the execution times of several approaches, so I wrote a small script that calls timeit on all the functions written in it.

The script is also available as github gist here.

Hope it will help you and others.

from random import random
import types

def list_without_comprehension():
    l = []
    for i in range(1000):
        l.append(int(random()*100 % 100))
    return l

def list_with_comprehension():
    # 1K random numbers between 0 to 100
    l = [int(random()*100 % 100) for _ in range(1000)]
    return l


# operations on list_without_comprehension
def sort_list_without_comprehension():
    list_without_comprehension().sort()

def reverse_sort_list_without_comprehension():
    list_without_comprehension().sort(reverse=True)

def sorted_list_without_comprehension():
    sorted(list_without_comprehension())


# operations on list_with_comprehension
def sort_list_with_comprehension():
    list_with_comprehension().sort()

def reverse_sort_list_with_comprehension():
    list_with_comprehension().sort(reverse=True)

def sorted_list_with_comprehension():
    sorted(list_with_comprehension())


def main():
    objs = globals()
    funcs = []
    f = open("timeit_demo.sh", "w+")

    for objname in objs:
        if objname != 'main' and type(objs[objname]) == types.FunctionType:
            funcs.append(objname)
    funcs.sort()
    for func in funcs:
        f.write('''echo "Timing: %(funcname)s"
python -m timeit "import timeit_demo; timeit_demo.%(funcname)s();"\n\n
echo "------------------------------------------------------------"
''' % dict(
                funcname = func,
                )
            )

    f.close()

if __name__ == "__main__":
    main()

    from os import system

    #Works only for *nix platforms
    system("/bin/bash timeit_demo.sh")

    #un-comment below for windows
    #system("cmd timeit_demo.sh")
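
For comparison, here is a sketch of doing the same kind of comparison in-process, without generating a shell script. It calls timeit.repeat on each function directly and keeps the fastest run. It assumes Python 3 and only includes two of the functions above to stay short; the repeat and number values are arbitrary.

```python
import timeit
from random import random

def list_without_comprehension():
    l = []
    for i in range(1000):
        l.append(int(random() * 100 % 100))
    return l

def list_with_comprehension():
    return [int(random() * 100 % 100) for _ in range(1000)]

candidates = [list_without_comprehension, list_with_comprehension]

results = {}
for func in candidates:
    # Best of 5 runs, each run calling the function 200 times.
    results[func.__name__] = min(timeit.repeat(func, repeat=5, number=200))

# Print the fastest candidate first.
for name, seconds in sorted(results.items(), key=lambda item: item[1]):
    print("%-30s %.4f s per 200 calls" % (name, seconds))
```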
Berey answered 19/12, 2015 at 11:7 Comment(0)
P
3

The testing suite doesn't make an attempt at using the imported timeit so it's hard to tell what the intent was. Nonetheless, this is a canonical answer so a complete example of timeit seems in order, elaborating on Martijn's answer.

The docs for timeit offer many examples and flags worth checking out. The basic usage on the command line is:

$ python -mtimeit "all(True for _ in range(1000))"
2000 loops, best of 5: 161 usec per loop
$ python -mtimeit "all([True for _ in range(1000)])"
2000 loops, best of 5: 116 usec per loop

Run with -h to see all options. Python MOTW has a great section on timeit that shows how to run modules via import and multiline code strings from the command line.

In script form, I typically use it like this:

import argparse
import copy
import dis
import inspect
import random
import sys
import timeit

def test_slice(L):
    L[:]

def test_copy(L):
    L.copy()

def test_deepcopy(L):
    copy.deepcopy(L)

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--n", type=int, default=10 ** 5)
    parser.add_argument("--trials", type=int, default=100)
    parser.add_argument("--dis", action="store_true")
    args = parser.parse_args()
    n = args.n
    trials = args.trials
    namespace = dict(L = random.sample(range(n), k=n))
    funcs_to_test = [x for x in locals().values() 
                     if callable(x) and x.__module__ == __name__]
    print(f"{'-' * 30}\nn = {n}, {trials} trials\n{'-' * 30}\n")

    for func in funcs_to_test:
        fname = func.__name__
        fargs = ", ".join(inspect.signature(func).parameters)
        stmt = f"{fname}({fargs})"
        setup = f"from __main__ import {fname}"
        time = timeit.timeit(stmt, setup, number=trials, globals=namespace)
        print(inspect.getsource(globals().get(fname)))

        if args.dis:
            dis.dis(globals().get(fname))

        print(f"time (s) => {time}\n{'-' * 30}\n")

You can pretty easily drop in the functions and arguments you need. Use caution when using impure functions and take care of state.
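
To illustrate the caution about state, here is a small sketch using an in-place sort. The setup code runs once per repeat, not once per loop iteration, so with number greater than 1 every iteration after the first sorts an already sorted list. Using number=1 together with repeat gives each measurement fresh data. The list size and repeat counts are arbitrary choices.

```python
import timeit

setup = "import random; data = [random.random() for _ in range(10000)]"

# number=1: setup runs before every repeat, so each sort sees unsorted data.
fresh = min(timeit.repeat("data.sort()", setup=setup, repeat=20, number=1))

# number=20: setup runs once per repeat, so 19 of the 20 sorts in each run
# operate on data that the first sort already put in order.
reused = min(timeit.repeat("data.sort()", setup=setup, repeat=3, number=20)) / 20

print("fresh data:  %.6f s per sort" % fresh)
print("reused data: %.6f s per sort (mostly already sorted)" % reused)
```

The second figure typically looks much better than the first, which is an artifact of the mutated input rather than a property of the sort.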

Sample output:

$ python benchmark.py --n 10000
------------------------------
n = 10000, 100 trials
------------------------------

def test_slice(L):
    L[:]

time (s) => 0.015502399999999972
------------------------------

def test_copy(L):
    L.copy()

time (s) => 0.01651419999999998
------------------------------

def test_deepcopy(L):
    copy.deepcopy(L)

time (s) => 2.136012
------------------------------
Peal answered 12/7, 2020 at 18:50 Comment(0)
