Importing a local variable in a function into timeit

I need to time the execution of a function across variable amounts of data.

def foo(raw_data):
   preprocessed_data = preprocess_data(raw_data)
   time = timeit.Timer('module.expensive_func(preprocessed_data)', 'import module').timeit()

However, preprocessed_data is not a global variable. It cannot be imported with from __main__. It is local to this subroutine.

How can I import data into the timeit.Timer environment?

Taishataisho answered 10/9, 2014 at 15:4 Comment(5)
Why not time = timeit.Timer('module.expensive_func(data)', 'import module;data = generate_data()').timeit()? Also, if you need something more complicated you may actually want a profiler. - Richardricharda
@StevenRumbalski: Works for this scenario, but what if you need the data outside the timer too? - Zamir
Bingo. Sorry, @StevenRumbalski, this is indeed the case: the data is needed outside the timer too. I've updated the question to reflect this. - Taishataisho
@EMiller: 'import module;from __main__ import otherdata1, otherdata2;data = generate_data()' You can put as much code as you want inside the setup string. If you have a lot of setup code, define setup as a multiline string before the timeit call. - Richardricharda
It's more than just an answer to your question, but I feel I should advertise my guide to the timeit module: https://mcmap.net/q/46595/-how-to-use-timeit-module - Butyrate
6

Pass it a callable to time, rather than a string. (Unfortunately, this introduces some extra function call overhead, so it's only viable when the thing to time swamps that overhead.)

time = timeit.timeit(lambda: module.expensive_func(data))
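As a minimal sketch of the callable approach applied to the question's situation: expensive_func and the list comprehension below are hypothetical stand-ins for module.expensive_func and preprocess_data, which the question does not show. The lambda closes over the local variable, so nothing has to be imported into the timer. Note that number is set explicitly here, because timeit runs the statement 1,000,000 times by default.

```python
import timeit


def expensive_func(values):
    # Hypothetical stand-in for module.expensive_func
    return sum(v * v for v in values)


def foo(raw_data):
    # Hypothetical stand-in for preprocess_data(raw_data);
    # the result is local to foo, as in the question
    preprocessed_data = [x + 1 for x in raw_data]
    # The lambda captures the local variable through its closure
    return timeit.timeit(lambda: expensive_func(preprocessed_data), number=100)


elapsed = foo(range(1000))
print(f"{elapsed:.6f} s for 100 runs")
```

The returned value is the total time for all runs, so divide by number to get a per-call figure.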

In Python 3.5 and up, you can also specify an explicit globals dictionary with a string statement to time:

time = timeit.timeit('module.expensive_func(data)',
                     globals={'module': module, 'data': data})
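A small self-contained sketch of the globals variant, assuming Python 3.5 or later: expensive_func and time_with_globals are made-up names for illustration, not part of the question's module. The dictionary only needs to contain the names that the statement string refers to.

```python
import timeit


def expensive_func(values):
    # Hypothetical stand-in for module.expensive_func
    return sorted(values)


def time_with_globals(raw_data):
    # data is a local variable; it reaches the timed statement
    # only through the explicit namespace dictionary below
    data = list(reversed(raw_data))
    namespace = {"expensive_func": expensive_func, "data": data}
    return timeit.timeit("expensive_func(data)", globals=namespace, number=100)


elapsed_globals = time_with_globals(list(range(1000)))
print(f"{elapsed_globals:.6f} s for 100 runs")
```

Because the statement is compiled as a string, this avoids the extra call through a wrapper that the lambda version incurs.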
Zamir answered 10/9, 2014 at 15:16 Comment(3)
FWIW, using functools.partial removes about half of the function call overhead. - Butyrate
@Veedrac: I don't think that's actually removing the function call overhead, though. I think it's removing name lookup overhead that would occur in the real case. - Zamir
@Zamir True, but the point is that it's faster :P. Proper timings should time and subtract call overhead, so the smaller it is the less error it produces. - Butyrate
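One rough way to compare the two wrappers discussed in these comments, and to get a feel for the wrapper cost, is sketched below. expensive_func is a hypothetical stand-in, and timing an empty callable is only an approximation of the overhead, not an exact correction. No particular ratio between the results is assumed; run it to see the numbers on your machine.

```python
import functools
import timeit


def expensive_func(values):
    # Hypothetical stand-in for the function being measured
    return sum(values)


data = list(range(1000))
runs = 1000

# Wrapper 1: lambda, which looks up expensive_func and data on every call
via_lambda = timeit.timeit(lambda: expensive_func(data), number=runs)

# Wrapper 2: functools.partial, which binds the function and argument once
via_partial = timeit.timeit(functools.partial(expensive_func, data), number=runs)

# Approximate cost of the timing loop plus one trivial call
baseline = timeit.timeit(lambda: None, number=runs)

print(f"lambda:   {via_lambda:.6f} s")
print(f"partial:  {via_partial:.6f} s")
print(f"baseline: {baseline:.6f} s")
```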
1

The accepted answer didn't work for me inside the pdb debugger, in a class method. The solution that worked was to add the variables to globals():

globals()['data'] = data
globals()['self'] = self
timeit.timeit(lambda: self.function(data))

Note that the timing overhead is a little larger in this case because of the extra function calls.
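If you would rather not write into the module's globals(), a possible alternative on Python 3.5+ is to hand timeit a throwaway namespace that contains self and data. This is only a sketch: Worker, function and time_it are hypothetical names, and it has not been verified under pdb here.

```python
import timeit


class Worker:
    def function(self, data):
        # Hypothetical stand-in for the method being timed
        return sum(data)

    def time_it(self, data):
        # Explicit namespace, so the module globals stay untouched
        namespace = {"self": self, "data": data}
        return timeit.timeit("self.function(data)", globals=namespace, number=100)


elapsed_method = Worker().time_it(list(range(1000)))
print(f"{elapsed_method:.6f} s for 100 runs")
```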

Zook answered 9/2, 2017 at 2:22 Comment(0)
