I cannot understand the difference between number and repeat in the timeit library. Could you kindly tell me what the difference between them is?
repeat specifies the number of samples to take.
number specifies the number of times to repeat the code for each sample.
Internally there is a loop like this:
    samples = []
    for _ in range(repeat):
        # start timer
        for _ in range(number):
            do_work()
        # end timer
        samples.append(duration)
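The loop above can be written out by hand and compared with timeit.repeat, which returns one total duration per sample. This is a minimal sketch, not the library's actual implementation: do_work is a stand-in workload and manual_repeat is a hypothetical helper, neither of which is part of timeit.

```python
import time
import timeit


def do_work():
    # Stand-in workload (hypothetical); replace with the code under test.
    sum(range(100))


def manual_repeat(func, repeat, number):
    """Hand-rolled version of the loop above: one total duration per sample."""
    samples = []
    for _ in range(repeat):
        start = time.perf_counter()
        for _ in range(number):
            func()
        samples.append(time.perf_counter() - start)
    return samples


manual = manual_repeat(do_work, repeat=3, number=1000)
builtin = timeit.repeat(do_work, repeat=3, number=1000)

# Both lists hold `repeat` totals; divide by `number` for a per-call time.
per_call = [total / 1000 for total in builtin]
print(len(manual), len(builtin), per_call)
```

Each entry is the time for all number calls together, so the per-call figure only appears after the division in the last step.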
end_timer - start_timer-type approach for measuring the execution time of some code. – Alcatraz
n iterations and then average of r experiments? – Polard
Whenever you do a statistical experiment (in this case a timing experiment), you want to repeat (or replicate) the experiment in order to be able to quantify uncertainty.
Now, IPython's %timeit has two parameters:
n, the number of loops (samples)
r, the number of repeats (replications of the experiment)
One single experiment returns the total timing of n loops (that means you need to divide that value by n to obtain the average timing per loop).
The experiment is repeated r times.
Uncertainty, or uncontrolled variation, is given by the standard deviation over the r experiments.
This can be seen in this line in the source code (self.loops is the n):
    timings = [dt / self.loops for dt in all_runs]
From https://github.com/ipython/ipython/blob/master/IPython/core/magics/execution.py:
    import math  # used by average and stdev below

    class TimeitResult(object):
        """
        Object returned by the timeit magic with info about the run.
        Contains the following attributes :
        loops: (int) number of loops done per measurement
        repeat: (int) number of times the measurement has been repeated
        best: (float) best execution time / number
        all_runs: (list of float) execution time of each run (in s)
        compile_time: (float) time of statement compilation (s)
        """
        def __init__(self, loops, repeat, best, worst, all_runs, compile_time, precision):
            self.loops = loops
            self.repeat = repeat
            self.best = best
            self.worst = worst
            self.all_runs = all_runs
            self.compile_time = compile_time
            self._precision = precision
            self.timings = [dt / self.loops for dt in all_runs]

        @property
        def average(self):
            return math.fsum(self.timings) / len(self.timings)

        @property
        def stdev(self):
            mean = self.average
            return (math.fsum([(x - mean) ** 2 for x in self.timings]) / len(self.timings)) ** 0.5
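The same per-loop average and standard deviation can be reproduced with the standard library alone. This is a sketch under assumptions: summarize is a hypothetical helper and the timed statement is arbitrary. statistics.pstdev is used because the stdev property above divides by len(self.timings), which is the population form.

```python
import math
import statistics
import timeit


def summarize(all_runs, loops):
    """Per-loop mean and population standard deviation, as in TimeitResult."""
    timings = [dt / loops for dt in all_runs]
    mean = math.fsum(timings) / len(timings)
    return mean, statistics.pstdev(timings)


loops, repeat = 1000, 7
# Each entry of all_runs is the total time for `loops` executions.
all_runs = timeit.repeat("sorted(range(50))", repeat=repeat, number=loops)
mean, std = summarize(all_runs, loops)
print(f"{mean * 1e6:.2f} us +- {std * 1e6:.2f} us per loop "
      f"(mean +- std. dev. of {repeat} runs, {loops} loops each)")
```

The printed line imitates the shape of the %timeit report; the exact numbers depend on the machine.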
Default n and r
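Note that the default repeat for %timeit is 7, and when -n is not given %timeit chooses the number of loops automatically (the one-million default applies to the number argument of the timeit module, not to the magic). For daily needs it can still be convenient to pass smaller values to %timeit explicitly, otherwise the timing might take too long or use up too many resources.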
Should I use %time instead?
Still, even for a quick timing I wouldn't use the basic %time (just one run), because the timing of a single execution could be influenced by many contingent factors.
self.timings and an average over these is computed (average over runs). – Albumose
timeit.repeat documentation states as much: docs.python.org/3/library/timeit.html#timeit.Timer.repeat. You should only take the min, not the mean. – Uhf
std is calculated? – Shadbush
I agree that the docs are confusing. For those coming to this article from a search engine as I did, you may want to look at the detailed answer to a related question at https://stackoverflow.com/a/59543135.
Basically, the issue is clock granularity. If your function is so fast that its run time is on the same order as the clock tick, you need to make number large enough that the total time for running your code that many times, between the start and stop of the timer, is long enough to measure reliably.
In addition to making number high enough to deal with clock granularity, you may see interference from other processes, or you may simply want to replicate the experiment; that is what repeat is for.
TL;DR: Make number large enough so your test is above the clock granularity (e.g., make number large enough for your test to be at least 0.01 seconds). Beyond that, repeat is just a convenience so you can get a list of timings to do your own statistics.
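One way to apply this is to let timeit.Timer.autorange pick number and then call repeat with that value. autorange increases number until a single run takes at least 0.2 seconds, which is a stricter threshold than the 0.01 seconds suggested above. This is only a sketch with an arbitrary statement; taking the minimum follows the advice in the timeit documentation.

```python
import timeit

timer = timeit.Timer("sum(range(100))")

# autorange() grows `number` until a single run takes at least 0.2 s.
number, elapsed = timer.autorange()

# Take several samples with that `number`; each entry is a total duration.
samples = timer.repeat(repeat=5, number=number)

# The minimum is the sample least disturbed by other processes.
best_per_loop = min(samples) / number
print(f"number={number}, best={best_per_loop * 1e9:.1f} ns per loop")
```

If you prefer your own statistics, the samples list can be fed to any summary you like instead of min.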
repeat over number, or like what is the use case of each? – Lolita