Python equivalent of inline functions or macros

7

74

I just realized that doing

x.real*x.real+x.imag*x.imag

is three times faster than doing

abs(x)**2

where x is a numpy array of complex numbers. For code readability, I could define a function like

def abs2(x):
    return x.real*x.real+x.imag*x.imag

which is still far faster than abs(x)**2, but it comes at the cost of a function call. Is it possible to inline such a function, as I would do in C using a macro or the inline keyword?

Demars answered 22/6, 2011 at 15:2 Comment(8)
If you need this kind of optimisation, you probably need to use something like Cython.Niobic
PyPy to the rescue!Matta
If you care about such small optimisations, you should be using C, not Python. Python is not about speed, really.Evacuee
Have you tried timing the statement vs. function call to see if there's really a difference?Illstarred
In addition to the very correct and important comments above (seriously, listen to them), note that due to the dynamic nature of Python, the only time inlining could possibly happen is at runtime. This is one of the many optimizations PyPy does (although it doesn't have a remotely complete NumPy yet; but at least it's being worked on), and PyPy works best on idiomatic Python code, not on code written to shave tiny bits of time off execution overhead.Khamsin
@vartec: I measured the same for small arrays (100 elements). For large arrays (10000 elements), however, he is probably right.Weekley
Obviously extracting square root is much slower than doing two multiplications and one addition. Why not just x*x.conj(), by the way?Unmask
@MarkRansom Python function calls are notoriously expensive, it's the price paid for the monkey-patching capabilities of Python, etc. https://mcmap.net/q/138668/-why-is-a-function-method-call-in-python-expensiveUndue
46

Is it possible to inline such a function, as I would do in C using macro or using inline keyword?

No. Before reaching this specific instruction, Python interpreters don't even know if there's such a function, much less what it does.
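
To make that point concrete, here is a minimal sketch (the names `abs2` and `caller` are only illustrative). It assumes CPython run as a plain script and shows that the callee is resolved by name each time the call executes, so it can be rebound between two calls. That is why the compiler cannot safely paste the function body into the caller ahead of time.

```python
import dis

def abs2(z):
    return z.real * z.real + z.imag * z.imag

def caller(z):
    # The name `abs2` is resolved when this line runs, not when it is compiled.
    return abs2(z)

first = caller(3 + 4j)

def abs2(z):
    # Rebinding the name changes what `caller` does without touching `caller`.
    return -1.0

second = caller(3 + 4j)
print(first, second)

# The disassembly of `caller` shows a name lookup followed by a generic call.
dis.dis(caller)
```

Any ahead-of-time inlining of the first `abs2` would have made the second call return the wrong value.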

As noted in the comments, PyPy will inline automatically (the above still holds: it "simply" generates an optimized version at runtime, benefits from it, and abandons it when it is invalidated), although in this specific case that doesn't help, because the NumPy implementation for PyPy was started only recently and isn't even at beta level to this day. But the bottom line is: Don't worry about optimizations on this level in Python. Either the implementations optimize it themselves or they don't, it's not your responsibility.

Khamsin answered 22/6, 2011 at 15:31 Comment(6)
+1 "Don't worry about optimizations on this level in Python. Either the implementations optimize it themselves or they don't, it's not your responsibility."Duma
@Duma Not sure why you guys like that quote so much... It basically says you cannot optimize without making code ugly. I just had to inline several calls to make my program twice as fast. At least it was worth it...Daphie
I also find it a bit hard to accept that last comment. It's nice and all that it's "not my responsibility", but at the end of the day, I can't tell my boss that it's somebody else's fault if my code misses performance targets.Pownall
If PyPy inlines automatically, does that allow it to do optimizations such as omitting to return values that will not be used by the caller, so that those variables can be destructed earlier to free up memory or perhaps not be computed at all if not needed? For some applications where the variables take up a lot of space such optimizations can be critical.Auvergne
Does it really matter whose responsibility you think it is to optimize the code? What you're expressing is your opinion. The fact is that whether you choose to optimize or not comes with real consequences, both those that are beneficial and those that are not so beneficial.Auvergne
I can't believe this is the most up-voted answer... 😭 There's plenty of scope for AST transformations in Python, including something very close to what the OP asked for. The next answer is a lot better, and I'm surprised that kind of inlining isn't used more often.Subacute
37

Not exactly what the OP has asked for, but close:

Inliner inlines Python function calls. Proof of concept for this blog post

from inliner import inline

@inline
def add_stuff(x, y):
    return x + y

def add_lots_of_numbers():
    results = []
    for i in xrange(10):
        results.append(add_stuff(i, i+1))

In the above code the add_lots_of_numbers function is converted into this:

def add_lots_of_numbers():
    results = []
    for i in xrange(10):
        results.append(i + i + 1)
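
For readers curious how such a rewrite can work, here is a rough sketch using only the standard `ast` module. It is not the Inliner package's actual implementation, which I have not inspected. The class names `InlineCalls` and `_Substitute` are made up for illustration, the example is adapted to Python 3 (`range`, plus a `return` so the result can be checked), and it assumes Python 3.9+ for `ast.unparse`. It only handles functions whose body is a single `return <expression>` and calls with positional arguments.

```python
import ast
import copy

SOURCE = """
def add_stuff(x, y):
    return x + y

def add_lots_of_numbers():
    results = []
    for i in range(10):
        results.append(add_stuff(i, i + 1))
    return results
"""

class _Substitute(ast.NodeTransformer):
    """Replace parameter names with copies of the argument expressions."""
    def __init__(self, mapping):
        self.mapping = mapping

    def visit_Name(self, node):
        if node.id in self.mapping:
            return copy.deepcopy(self.mapping[node.id])
        return node

class InlineCalls(ast.NodeTransformer):
    """Inline calls to one function whose body is a single `return <expr>`."""
    def __init__(self, func_def):
        self.name = func_def.name
        self.params = [a.arg for a in func_def.args.args]
        self.body_expr = func_def.body[0].value  # assumption: single return

    def visit_Call(self, node):
        self.generic_visit(node)  # handle nested calls first
        if (isinstance(node.func, ast.Name) and node.func.id == self.name
                and not node.keywords and len(node.args) == len(self.params)):
            mapping = dict(zip(self.params, node.args))
            return _Substitute(mapping).visit(copy.deepcopy(self.body_expr))
        return node

tree = ast.parse(SOURCE)
target = next(n for n in tree.body
              if isinstance(n, ast.FunctionDef) and n.name == "add_stuff")
tree = ast.fix_missing_locations(InlineCalls(target).visit(tree))

caller_src = ast.unparse(tree.body[1])  # the rewritten add_lots_of_numbers
print(caller_src)

namespace = {}
exec(compile(tree, "<inlined>", "exec"), namespace)
print(namespace["add_lots_of_numbers"]())
```

Note that an argument expression is duplicated wherever its parameter appears, so arguments with side effects, or expensive ones, would be evaluated more than once. A real tool has to deal with that, as well as with the rebinding problem discussed in the other answers.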

Anyone interested in this question, and in the complications involved in implementing such an optimizer in CPython, might also want to have a look at:

Lithographer answered 26/10, 2016 at 11:53 Comment(2)
Sorry what is the difference between your solution and the question?Cafeteria
@RogerS, the OP had asked about something similar to C macros (inline keyword) which are very flexible and efficient. This library has some limitations and has a startup time cost, but other than those, it does what the question asks.Lithographer
10

I agree with everyone else that such optimizations will just cause you pain on CPython, and that if you care about performance you should consider PyPy (though our NumPy may be too incomplete to be useful). However, I disagree on one point: you can care about such optimizations on PyPy. Not this one specifically, since, as has been said, PyPy does it automatically; but if you know PyPy well, you really can tune your code to make PyPy emit the assembly you want. Not that you will almost ever need to.

Rabbitfish answered 24/6, 2011 at 3:28 Comment(0)
9

No.

The closest you can get to C macros is a script (awk or similar) that you run from a makefile and that replaces a certain pattern, such as abs(x)**2, in your Python scripts with the long form.
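
As an illustration of what such a preprocessing step might look like, here is a small sketch written in Python rather than awk. The function name `expand_abs2` and the pattern are my own assumptions, and it deliberately rewrites only the simplest case, `abs(<identifier>)**2`, leaving everything else untouched.

```python
import re

# Matches abs(<identifier>)**2 only. The lookbehind skips np.abs / fabs,
# the lookaheads skip exponents such as 2.5, 20 or 2**3.
PATTERN = re.compile(
    r"(?<![\w.])abs\(\s*([A-Za-z_]\w*)\s*\)\s*\*\*\s*2(?!\s*\*\*)(?![\w.])"
)

def expand_abs2(source: str) -> str:
    """Rewrite abs(name)**2 into the explicit real/imag form."""
    return PATTERN.sub(r"(\1.real*\1.real + \1.imag*\1.imag)", source)

before = "power = abs(x)**2 + abs(y) ** 2\n"
after = expand_abs2(before)
print(after)
```

Being purely textual, it knows nothing about strings, comments, or a shadowed `abs`. That is the kind of obscure breakage the comments below warn about.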

Rocco answered 22/6, 2011 at 15:9 Comment(3)
... which is a horrible idea, a lot of extra work and a decent chance of obscure breakage for nearly zero practical gain.Khamsin
Python is not the fastest language there is anyway, which is ok because of its fast development cycles. Adding a "preprocessing" step for a new python project is indeed strongly discouraged.Rocco
He did not claim that this was a good idea. Technically, he is correct.Weekley
7

Actually, it might be even faster to calculate it like this:

x.real** 2+ x.imag** 2

Thus, the relative cost of the extra function call is likely to diminish. Let's see:

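In []: from numpy.random import rand, randn
In []: n= 10000  # an int: current NumPy rejects a float such as 1e4 as a dimension
In []: x= randn(n, 1)+ 1j* rand(n, 1)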
In []: %timeit x.real* x.real+ x.imag* x.imag
10000 loops, best of 3: 100 us per loop
In []: %timeit x.real** 2+ x.imag** 2
10000 loops, best of 3: 77.9 us per loop

And encapsulating the calculation in a function:

In []: def abs2(x):
   ..:     return x.real** 2+ x.imag** 2
   ..: 
In []: %timeit abs2(x)
10000 loops, best of 3: 80.1 us per loop

Anyway, as others have pointed out, this kind of micro-optimization (done to avoid a function call) is not really a productive way to write Python code.
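
To judge whether the call itself matters in your case, you can estimate the fixed per-call overhead separately from the array work. The following is a minimal sketch using only the standard `timeit` module. It uses a single Python complex number instead of a NumPy array so that the call cost is not drowned out, and the figure it prints will vary with machine and interpreter.

```python
import timeit

def abs2(z):
    return z.real * z.real + z.imag * z.imag

z = 3 + 4j
calls = 200_000

# Time the bare expression and the same expression behind a function call.
inline_time = timeit.timeit(
    "z.real * z.real + z.imag * z.imag", globals={"z": z}, number=calls
)
call_time = timeit.timeit(
    "abs2(z)", globals={"z": z, "abs2": abs2}, number=calls
)

overhead_ns = (call_time - inline_time) / calls * 1e9
print(f"inline: {inline_time:.4f} s, call: {call_time:.4f} s")
print(f"approximate overhead per call: {overhead_ns:.0f} ns")
```

If the overhead is a small fraction of the time one array operation takes, as in the timings above, wrapping the expression in a function costs next to nothing.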

Touraine answered 22/6, 2011 at 16:58 Comment(2)
~3us might not be a lot if you do something 100 times, or 10000. Do something a million times and you'll want to shave thatInstauration
@Instauration there is C for thatRaid
3

You can try to use lambda:

abs2 = lambda x : x.real*x.real+x.imag*x.imag

then call it by:

y = abs2(x)
Pentothal answered 24/6, 2019 at 13:38 Comment(1)
Good thought but I just tried it... That didn't improve performance at all: def foo(bar): return bar vs foo = lambda bar: bar both execute in 57.5 nanoseconds on my system. Measured with timeit. So lambdas are exactly like regular functions and their calls. At least on CPython 3.8.Tiruchirapalli
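
The observation in that comment can also be checked without timing anything. A small sketch, assuming CPython (the names `foo_def` and `foo_lambda` are illustrative), compares the compiled bytecode of the two forms:

```python
import dis

def foo_def(bar):
    return bar

foo_lambda = lambda bar: bar

# A lambda is an ordinary function object; only the way it is written differs.
same_bytecode = foo_def.__code__.co_code == foo_lambda.__code__.co_code
print("identical bytecode:", same_bytecode)

dis.dis(foo_lambda)
```

Calling either one goes through the same function-call machinery, so a lambda cannot avoid the call overhead.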
0

Python is a dynamic programming language, but it does compile source code to bytecode before execution, so you can inline code at the source level before it is compiled. For simple cases that don't justify heavy external packages, you can use Python's standard library:

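from inspect import getsource

abs2 = lambda z: z.real * z.real + z.imag * z.imag

def loop(zz, zs):
    for z in zs:
        zz += abs2(z)

print(f"loop code:\n{getsource(loop)}")

# Textually replace the call with the lambda's body (the text after the first ":").
body = getsource(abs2).split(":", 1)[1].strip()
inlined = getsource(loop).replace("abs2(z)", body)

print(f"inlined loop code:\n{inlined}")

def function_bytecode(source):
    # Compiling the source yields module-level code; the function's own
    # code object is stored among that code's constants.
    module_code = compile(source, '<string>', 'exec')
    return next(c for c in module_code.co_consts if hasattr(c, "co_code")).co_code

compiled = function_bytecode(inlined)

def loop2(zz, zs):
    for z in zs:
        zz += z.real * z.real + z.imag * z.imag

compiled2 = function_bytecode(getsource(loop2))

print(f"compiled loop  code: {compiled}")
print(f"compiled loop2 code: {compiled2}")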

Note: this only supports one-line lambdas whose parameter names are the same as the names of the variables passed at the call site, and getsource needs the code to live in a file. It is a simple and very hackish solution, but Python would not be much of an interpreted language if it did not support editing code at run time.

Camire answered 28/8, 2023 at 21:18 Comment(0)
