How can I capture return value with Python timeit module?
I'm running several machine learning algorithms with sklearn in a for loop and want to see how long each of them takes. The problem is that I also need the return value and don't want to run any algorithm more than once, because each one takes so long. Is there a way to capture the return value 'clf' using Python's timeit module, or a similar one, with a function like this...

def RandomForest(train_input, train_output):
    clf = ensemble.RandomForestClassifier(n_estimators=10)
    clf.fit(train_input, train_output)
    return clf

when I call the function like this

t = Timer(lambda : RandomForest(trainX,trainy))
print t.timeit(number=1)

P.S. I also don't want to set a global 'clf' because I might want to do multithreading or multiprocessing later.

Iodometry answered 17/7, 2014 at 19:47 Comment(4)
Why do you even use timeit if you force number=1? timeit is useful to automatically handle repetitive timing, where you don't know how many times you should run the function to get a good timing etc. In your case simply using time would be fine and you wouldn't need any hack to get the return value. - Hamlen
Can you provide an example link for me to see what you are referring to? I googled time and the module you might be talking about only seems to involve formatting dates, timezones, etc. - Iodometry
Never heard of time.time()? Or time.clock()? The timeit module uses these functions to perform the timings. If you only have to do one timing you can simply call them directly, in the same way as the _timer function is used in unutbu's answer (that is actually a reference to time.time or time.clock depending on the OS). - Hamlen
@Hamlen I understood that timeit also does other things, like turning off garbage collection to make sure that we're doing a fair comparison, i.e., that we're looking at execution time, not wall time. - Jeffreys
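For a single run, the direct approach suggested in the comments above can be sketched as follows. This is a minimal sketch, assuming Python 3.3 or later (for time.perf_counter); train is a hypothetical stand-in for an expensive call such as RandomForest(trainX, trainy).

```python
import time

def train():
    """Hypothetical stand-in for an expensive call such as RandomForest(...)."""
    time.sleep(0.01)
    return "model"

start = time.perf_counter()
clf = train()                          # the function runs exactly once
elapsed = time.perf_counter() - start  # elapsed wall-clock seconds

print(f"{elapsed:.4f}s", clf)
```

Unlike timeit, this does not disable garbage collection while the call runs, so it measures whatever happens between the two clock reads.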
H
18

The problem boils down to timeit._template_func not returning the function's return value:

def _template_func(setup, func):
    """Create a timer function. Used if the "statement" is a callable."""
    def inner(_it, _timer, _func=func):
        setup()
        _t0 = _timer()
        for _i in _it:
            _func()
        _t1 = _timer()
        return _t1 - _t0
    return inner

We can bend timeit to our will with a bit of monkey-patching:

import timeit
import time

def _template_func(setup, func):
    """Create a timer function. Used if the "statement" is a callable."""
    def inner(_it, _timer, _func=func):
        setup()
        _t0 = _timer()
        for _i in _it:
            retval = _func()
        _t1 = _timer()
        return _t1 - _t0, retval
    return inner

timeit._template_func = _template_func

def foo():
    time.sleep(1)
    return 42

t = timeit.Timer(foo)
print(t.timeit(number=1))

prints

(1.0010340213775635, 42)

The first value is the timeit result (in seconds), the second value is the function's return value.

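Note that the monkey-patch above only affects the behavior of timeit when a callable is passed to timeit.Timer. If you pass a string statement, then you'd have to (similarly) monkey-patch the timeit.template string. It also depends on the private function timeit._template_func, so it only works on Python versions that still have it.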

Hylton answered 17/7, 2014 at 20:0 Comment(5)
Hmmm, this seems to be returning me the function and not the function's return value. What I have to do is capture it with ret_val = t.timeit(number=1)[1]() to actually run the function and get back the value. Isn't that running the function twice though? - Iodometry
Given the code you posted, I don't see why t.timeit should be returning a function. Do you get the same result as I do when you run the code I posted? If so, then you need to compare what's different between that code and your code (paying particular attention to the type of the objects passed and returned). - Hylton
You are right, I was still using timeit.Timer( lambda: dummy) instead of just timeit.Timer( dummy). There are some exceptionally smart ppl on StackOverflow. Damn I love this site. - Iodometry
From looking at the source for timeit, it appears the purpose of the module is to be used at the command line as a testing tool for optimization of your code and for Python itself. If you are writing an app to test something, say the speed of an API call, you may be better off using time.perf_counter twice and subtracting the two numbers. - Mckinleymckinney
This may have worked, but as of 2023-07-11 it doesn't. The provided code only returns the time taken, as it would normally. The answer by Brendan Cody-Kenny resolves that (although it ain't pretty). - Pileous
G
25

For Python 3.5 you can override the value of timeit.template:

timeit.template = """
def inner(_it, _timer{init}):
    {setup}
    _t0 = _timer()
    for _i in _it:
        retval = {stmt}
    _t1 = _timer()
    return _t1 - _t0, retval
"""

unutbu's answer works for Python 3.4 but not 3.5, as the _template_func function appears to have been removed in 3.5.
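Because this changes timeit for the whole process, it may be worth limiting the patch to the code that needs it. The following is only a sketch, assuming Python 3.5 or later; returning_template and foo are hypothetical names, and the approach relies on timeit internals rather than a documented API, so it could break in a future release.

```python
import contextlib
import timeit

PATCHED_TEMPLATE = """
def inner(_it, _timer{init}):
    {setup}
    _t0 = _timer()
    for _i in _it:
        retval = {stmt}
    _t1 = _timer()
    return _t1 - _t0, retval
"""

@contextlib.contextmanager
def returning_template():
    """Temporarily swap in the patched template, then restore the original."""
    original = timeit.template
    timeit.template = PATCHED_TEMPLATE
    try:
        yield
    finally:
        timeit.template = original

def foo():
    """Hypothetical function to time."""
    return 42

with returning_template():
    # The template is read when the Timer is constructed,
    # so only the construction has to happen inside the block.
    timer = timeit.Timer(foo)

elapsed, value = timer.timeit(number=1)
print(elapsed, value)
```

With this template, number must be at least 1 (otherwise retval is never assigned), and helpers that expect timeit() to return a plain float, such as Timer.autorange, should not be used on the patched timer.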

Geisel answered 2/11, 2016 at 17:20 Comment(0)
C
8

Funnily enough, I'm also doing machine-learning, and have a similar requirement ;-)

I solved it by writing a function that:

  • runs your function
  • prints the running time, along with the name of your function
  • returns the results

Let's say you want to time:

clf = RandomForest(train_input, train_output)

Then do:

clf = time_fn( RandomForest, train_input, train_output )

Stdout will show something like:

mymodule.RandomForest: 0.421609s

Code for time_fn:

import time

def time_fn( fn, *args, **kwargs ):
    # time.perf_counter() is used because time.clock() no longer exists in Python 3.8+
    start = time.perf_counter()
    results = fn( *args, **kwargs )
    end = time.perf_counter()
    fn_name = fn.__module__ + "." + fn.__name__
    print(fn_name + ": " + str(end-start) + "s")
    return results
Cychosz answered 20/12, 2014 at 3:37 Comment(0)
P
3

If I understand correctly, from Python 3.5 onward you can pass a globals dictionary to each Timer instance without having to define the names in your block of code. I am not sure whether it would have the same issues with parallelization.

My approach would be something like:

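from timeit import Timer
from sklearn import ensemble

# trainX and trainy are assumed to be defined already
clf = ensemble.RandomForestClassifier(n_estimators=10)
myGlobals = globals()
myGlobals.update({'clf': clf})
t = Timer(stmt='clf.fit(trainX,trainy)', globals=myGlobals)
print(t.timeit(number=1))
print(clf)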
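In the snippet above the classifier is created outside the timed statement, which is why it is still available afterwards. When the value is produced inside the timed statement, one option is to store it in a mutable container that is passed in through globals. A minimal sketch, assuming Python 3.5+; foo, captured and namespace are hypothetical names:

```python
import timeit

def foo():
    """Hypothetical stand-in for the expensive function."""
    return 42

captured = {}                                   # filled in by the timed statement
namespace = {"foo": foo, "captured": captured}  # the only names the statement can see

timer = timeit.Timer(stmt="captured['value'] = foo()", globals=namespace)
elapsed = timer.timeit(number=1)

print(elapsed, captured["value"])
```

A plain assignment such as result = foo() in the statement would not work here, because the statement is compiled into the body of a function and result would be a local variable of that function.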
Pisciculture answered 1/5, 2018 at 20:12 Comment(1)
Nice shot, definitely the more elegant solution; it also allows passing a dictionary to timeit.Timer. Thank you for sharing. - Dropsonde
C
3

As of 2020, in ipython or jupyter notebook it is

t = %timeit -n1 -r1 -o RandomForest(trainX, trainy)
t.best
Celebes answered 26/11, 2020 at 18:16 Comment(1)
You're mixing results: The OP wants the result of the timed function clf in order to not run this function twice (once to get the result, once to get the time), not the result of the "magic" timeit IPython function (which -o indeed provides). - Calvinna
H
1

If you don't want to monkey-patch timeit, you could try using a global list, as below. This will also work in python 2.7, which doesn't have globals argument in timeit():

from timeit import timeit
import time

# Function to time - plagiarised from answer above :-)
def foo():
    time.sleep(1)
    return 42

result = []
# print(...) with a single argument behaves the same in Python 2.7 and 3
print(timeit('result.append(foo())', setup='from __main__ import result, foo', number=1))
print(result[0])

will print the time and then the result.

Hillie answered 21/9, 2020 at 20:20 Comment(0)
T
0

An approach I use is to "append" the running time to the results of the timed function, which must return a tuple for this to work. So, I wrote a very simple decorator using the "time" module:

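import time

def timed(func):
    def func_wrapper(*args, **kwargs):
        # time.perf_counter() is used because time.clock() no longer exists in Python 3.8+
        s = time.perf_counter()
        result = func(*args, **kwargs)
        e = time.perf_counter()
        # assumes func returns a tuple; the elapsed seconds become its last item
        return result + (e-s,)
    return func_wrapper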

And then I use the decorator for the function I want to time.
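As an illustration of how such a decorator might be applied, here is a variation that returns a (result, elapsed) pair, so the decorated function does not have to return a tuple itself. This is a sketch and not the answer's exact code; build_model is a hypothetical function.

```python
import functools
import time

def timed(func):
    """Return a wrapper that yields (result, elapsed_seconds)."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        return result, elapsed
    return wrapper

@timed
def build_model(n):
    """Hypothetical stand-in for a slow training function."""
    return sum(range(n))

model, seconds = build_model(1000)
print(model, seconds)
```

Every caller of a decorated function has to unpack the pair, so this fits best when the timing is wanted on each call.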

Trapeziform answered 13/3, 2018 at 16:43 Comment(0)
S
0

The original question wanted allowance for multiple results, multithreading, and multiprocessing. For all those, a queue will do the trick.

# put the result on the queue inside the function, via the global name resultq
def RandomForest(train_input, train_output):
    clf = ensemble.RandomForestClassifier(n_estimators=10)
    clf.fit(train_input, train_output)
    global resultq
    resultq.put(clf)
    return clf

# put the result to the queue inside the function, to a queue parameter
def RandomForest(train_input, train_output,resultq):
    clf = ensemble.RandomForestClassifier(n_estimators=10)
    clf.fit(train_input, train_output)
    resultq.put(clf)
    return clf

# put the result to the queue outside the function
def RandomForest(train_input, train_output):
    clf = ensemble.RandomForestClassifier(n_estimators=10)
    clf.fit(train_input, train_output)
    return clf


#usage:
#     global resultq
#     t=RandomForest(train_input, train_output)
#     resultq.put(t)

# in a timeit usage, add an import for the resultq into the setup.
setup="""
from __main__ import resultq
"""

# # in __main__  # #

# for multiprocessing and/or multithreading
import multiprocessing as mp
resultq = mp.Queue()  # plain module-level assignment; no global statement is needed here

# Alternatively,

# for multithreading only
import queue
resultq = queue.Queue()  # plain module-level assignment; no global statement is needed here

#   do processing

# eventually, drain the queue

while not resultq.empty():
  aclf=resultq.get()
  print(aclf)
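The queue idea can be combined with manual timing so that each worker reports both its elapsed time and its result. The following is a threads-only sketch under stated assumptions: train and timed_worker are hypothetical names, and time.perf_counter replaces timeit for the measurement.

```python
import queue
import threading
import time

def train(name):
    """Hypothetical stand-in for a slow training function."""
    time.sleep(0.01)
    return f"model-{name}"

def timed_worker(name, results):
    """Run train() once and put (name, elapsed_seconds, model) on the queue."""
    start = time.perf_counter()
    model = train(name)
    results.put((name, time.perf_counter() - start, model))

results = queue.Queue()
threads = [threading.Thread(target=timed_worker, args=(n, results))
           for n in ("rf", "svm")]
for t in threads:
    t.start()
for t in threads:
    t.join()                  # all workers have finished after this loop

# Drain the queue into a dict keyed by algorithm name.
collected = {}
while not results.empty():
    name, elapsed, model = results.get()
    collected[name] = (elapsed, model)
print(collected)
```

With processes instead of threads, a multiprocessing.Queue would take the place of queue.Queue, and the results put on it would need to be picklable.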
Sentry answered 25/7, 2023 at 9:45 Comment(0)
A
0

You can create a callable class that wraps your function and captures its return value, like this:

class CaptureReturnValue:
    def __init__(self, func):
        self.func = func
        self.return_value = None

    def __call__(self, *args, **kwargs):
        self.return_value = self.func(*args, **kwargs)

Then call timeit like this:

    crv = CaptureReturnValue(f1)
    elapsed_time = timeit.timeit(lambda: crv(your_parameters), number=1, globals=globals())
    print(crv.return_value)
    print(elapsed_time)

Note that the timing overhead is a little larger in this case because of the extra function calls.
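A complete, runnable version of the idea might look like the sketch below, where add and its arguments are hypothetical stand-ins for f1 and your_parameters:

```python
import timeit

class CaptureReturnValue:
    """Callable wrapper that remembers the wrapped function's last return value."""
    def __init__(self, func):
        self.func = func
        self.return_value = None

    def __call__(self, *args, **kwargs):
        self.return_value = self.func(*args, **kwargs)

def add(a, b):
    """Hypothetical function to time."""
    return a + b

crv = CaptureReturnValue(add)
elapsed_time = timeit.timeit(lambda: crv(2, 3), number=1)
print(crv.return_value, elapsed_time)
```

If number is greater than 1, only the return value of the last call is kept.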

Ardin answered 27/11, 2023 at 10:4 Comment(0)
I
-1

For Python 3.X I use this approach:

import timeit

# Redefine the default Timer template so that timeit returns both
#     the elapsed time and the statement's return value
new_template = """
def inner(_it, _timer{init}):
    {setup}
    _t0 = _timer()
    for _i in _it:
        ret_val = {stmt}
    _t1 = _timer()
    return _t1 - _t0, ret_val
"""
timeit.template = new_template
Infrared answered 9/5, 2019 at 9:46 Comment(0)