The reason is compilation overhead. To see this, define:
julia> h() = quadgk(x -> x, 0., 1.)
h (generic function with 1 method)
julia> @time h()
1.151921 seconds (915.60 k allocations: 48.166 MiB, 1.64% gc time)
(0.5, 0.0)
julia> @time h()
0.000013 seconds (21 allocations: 720 bytes)
(0.5, 0.0)
as opposed to
julia> @time quadgk(x -> x, 0., 1.)
0.312454 seconds (217.94 k allocations: 11.158 MiB, 2.37% gc time)
(0.5, 0.0)
julia> @time quadgk(x -> x, 0., 1.)
0.279686 seconds (180.17 k allocations: 9.234 MiB)
(0.5, 0.0)
What happens here is that in the first case, where quadgk is wrapped in a function, the anonymous function x->x
is defined only once (when h is defined), and thus quadgk
is compiled only once. In the second case, x->x
is defined anew with every call. Each new definition has its own type, so quadgk has to be compiled for it each time.
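To make the "defined anew" point concrete, here is a minimal sketch that needs no packages. It relies on the fact that Julia specializes compiled code on argument types and that every anonymous-function expression gets its own type. The names f1, f2 and make_identity are made up for this illustration.

```julia
# Two textually identical anonymous functions evaluated separately
# are two different types, so a function that receives them is
# compiled once per type.
f1 = x -> x
f2 = x -> x
println(typeof(f1) == typeof(f2))  # false

# Inside a named function the anonymous function's type is created once,
# when the enclosing function is defined, and reused on every call.
make_identity() = x -> x           # hypothetical helper, for illustration only
println(typeof(make_identity()) == typeof(make_identity()))  # true
```

This is why the second @time h() is fast while every @time quadgk(x -> x, 0., 1.) at the prompt pays for compilation again.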
And now the crucial point is that BenchmarkTools.jl wraps your code in a function, which you can check by inspecting how the generate_benchmark_definition
function works in this package. So benchmarking with it is equivalent to the first approach presented above.
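As a rough hand-written analogue (not the code BenchmarkTools.jl actually generates), you can put the timed expression inside a function yourself and keep the fastest of several runs. The name run_samples is hypothetical, and this sketch assumes QuadGK is installed.

```julia
using QuadGK

# Hypothetical helper: because x -> x appears inside this function, it is
# defined once, so quadgk is compiled for it at most once no matter how
# many samples are taken.
function run_samples(n)
    best = Inf
    for _ in 1:n
        t = @elapsed quadgk(x -> x, 0.0, 1.0)
        best = min(best, t)   # keep the fastest sample, in seconds
    end
    return best
end

run_samples(100)
```

The returned minimum should be of the same order as the second @time h() above, since compilation no longer dominates the samples.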
Another way to run the code without redefining the function passed to quadgk would be:
julia> g(x) = x
g (generic function with 1 method)
julia> @time quadgk(g, 0., 1.)
1.184723 seconds (951.18 k allocations: 49.977 MiB, 1.58% gc time)
(0.5, 0.0)
julia> @time quadgk(g, 0., 1.)
0.000020 seconds (23 allocations: 752 bytes)
(0.5, 0.0)
(though this is not what BenchmarkTools.jl does; I add it to show that when you use the named function g
you do not pay the compilation tax twice)