In Python, why is a module implemented in C faster than a pure Python module, and how do I write one?
Asked Answered
A

5

12

The python documentation states, that the reason cPickle is faster than Pickle is, that the former is implemented in C. What does that mean exactly?

I am making a module for advanced mathematics in Python, and some calculations take a significant amount of time. Does that mean that if my program is implemented in C it can be made much faster?

I wish to import this module from other Python programs, just the way I can import cPickle.

Can you explain how to do implement a Python module in C?

Aldrin answered 6/1, 2011 at 14:46 Comment(0)
S
18

You can write fast C code and then use it in your python scripts, so your program will run faster.[1] http://docs.python.org/extending/index.html#extending-index

An example is Numpy, written in C ( https://numpy.org/ )

Typical use is to implement the bottleneck in C (or to use a library written in C, of course ;) ), due to its speed, and to use python for the remaining code

[1] by the way, this is why cPickle is faster than pickle

edit:

take a look at Pyrex: http://www.cosc.canterbury.ac.nz/greg.ewing/python/Pyrex/version/Doc/About.html

'Pyrex is a language specially designed for writing Python extension modules. It's designed to bridge the gap between the nice, high-level, easy-to-use world of Python and the messy, low-level world of C. '

It's not the 'official' way but it may be useful

Safir answered 6/1, 2011 at 14:54 Comment(2)
+1 You should benchmark a pure C implementation of your core calculations against a Python implementation that uses Numpy to do the heavy lifting. There's a good chance that the Numpy implementation will be competitive.Lewls
+1 for Numpy. There is a very good chance that if Numpy does what you want, it will be faster than anything you could write without investing a substantial amount of time optimizing the code.Stipple
H
11

As mentioned, numpy is excellent for vector computations. (Could be better still, but the comment that it's better than anything you could write without actually doing work is definitely true.)

Not everything can be easily vectorized, though, so if you do have tight inner loops with lots of function calls (say a heavily recursive algorithm) you still have a couple of options: probably the most popular is Cython, which allows you to write modules and functions in a kind of annotated Python and get C-like speed when you need it.

Or maybe your time is all dominated by library calls to compute eigenvalues or invert matrices or evaluate special functions or divide really large integers -- many of which the Sage project handles very well, by the way, if what you're doing is more mathematical than pure crunching -- in which case the time spent in Python might not even matter. It all depends on the details of the kind of numerics you're doing.

Hutt answered 6/1, 2011 at 15:26 Comment(0)
R
10

When you write a function in python, a new function object is created, the function code is parsed and bytecompiled[and saved in the "func_code" attribute], so when you call that function the interpreter reads its bytecode and executes it.

If you write the same function in C, following C/Python API to make it avaiable in python, the interpreter will create the function object, but this function won't have a bytecode. When the interpreter finds a call to that function it calls the real C function, thus it executes at "machine" speed and not at "python-machine" speed.

You can verify this checking functions written in C:

>>> map.func_code
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'builtin_function_or_method' object has no attribute 'func_code'
>>> def mymap():pass
... 
>>> mymap.func_code
<code object mymap at 0xcfb5b0, file "<stdin>", line 1>

To understand how you can write C code for python use follow the guides in the official site.

Anyway, if you are simply doing N-dimensional array calculations numpy ought to be sufficient.

Royall answered 8/1, 2011 at 17:33 Comment(0)
N
7

Besides Pyrex/Cython, already mentioned, you have other alternatives:

Shed Skin: Translates (a restricted subset of) Python to C++. Can automatically generate an extension for you. You'd create an extension doing this (assuming Linux):

wget http://shedskin.googlecode.com/files/shedskin-0.7.tgz
tar -xzf shedskin-0.7.tgz
# On your code folder:
PYTHONPATH=/path/to/shedskin-0.7 python shedskin -e yourmodule.py
# The above generates a Makefile and a yourmodule.h/.cpp pair
make
# Now you can "import yourmodule" from Python and check it's from the .so by "print yourmodule.__file__

PyPy: A faster Python, with a JIT compiler. You could simply run your code on it instead of CPython. Only supports Python 2.5 now, 2.7 support soon. Can give huge speedups on math-heavy code. To install and run it (assuming Linux 32-bit):

wget http://pypy.org/download/pypy-1.4.1-linux.tar.bz2
tar -xjf pypy-1.4.1-linux.tar.bz2
sudo ln -s /path/to/pypy-1.4.1-linux/bin/pypy /usr/local/bin
# Then, instead of "python yourprogram.py" you'll just run "pypy yourprogram.py"

Weave: Allows you to write C inline, the compiles it.

Edit: If you want us to run these tools for you and benchmark, just post your code ;)

Nevus answered 6/1, 2011 at 20:30 Comment(0)
B
0

I am Very late, but if anyone needs to know this in 2023, here is the solution:

  1. Make you library (in C)
  2. Compile it to an Object file (If using gcc, use gcc --Wall --save-temps library.c) There should now be a .o file in the same directory.
  3. Convert the object file (.o) to a shared object (.so) with gcc -shared library.o -fPIC
  4. Open your Python Project, and import the ctypes module.
  5. type clibrary = ctypes.CDLL("path\to\file.so")
  6. Your're done! Now type clibrary.function(args) and you called a C funtion in Python!
Biron answered 21/4, 2023 at 23:9 Comment(1)
This works only for simple arguments (int, for example). Set .argtypes and .restype on the function to explicitly specify the arguments and return type.Monogenic

© 2022 - 2025 — McMap. All rights reserved.