How do I specify different compiler flags for just one Python/C extension source file?

I have a Python extension which uses CPU-specific features, if available. This is done through a run-time check. If the hardware supports the POPCNT instruction, it selects one implementation of my inner loop; if SSSE3 is available, it selects another; otherwise it falls back to a generic version of my performance-critical kernel. (Some 95%+ of the time is spent in this kernel.)

Unfortunately, there's a failure mode I didn't expect. I use -mssse3 and -O3 to compile all of the C code, even though only one file needs that -mssse3 option. As a result, the other files are compiled with the expectation that SSSE3 will exist. This causes a segfault for the line:

start_target_popcount = (int)(query_popcount * threshold);

because the compiler used fisttpl, which is an SSE3 instruction (-mssse3 also enables SSE3). After all, I told it to assume that SSSE3 exists.

The Debian packager for my package recently ran into this problem, because the test machine has a GCC which understands -mssse3 and generates code with that in mind, but the machine itself has an older CPU without those instructions.

I want a solution where the same binary can work on older machines and on newer ones, that the Debian maintainer can use for that distro.

Ideally, I would like to say that only one file is compiled with the -mssse3 option. Since my CPU-specific selector code isn't part of this file, no SSSE3 code will ever be executed unless the CPU supports it.

However, I can't figure out any way to tell distutils that a set of compiler options is specific to a single file.
Is that even possible?

Fineable answered 20/3, 2013 at 15:20 Comment(2)
In thinking about danodonovan's answer, I realized that a hack is to have a "CC" wrapper, which inserts the right flags for a specific file. Inelegant, but it might be enough for Debian. – Fineable
I've recently found some happiness merging CMake and distutils, using CMake to generate a static library that is linked with the extension. You could do something similar. See our setup.py here: github.com/CoolProp/CoolProp/blob/master/wrappers/Python/… – Ebner
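The "CC" wrapper idea from the first comment could look roughly like the sketch below. This is only an illustration of the approach, not the wrapper the asker actually wrote: the REAL_CC variable name, the PER_FILE_FLAGS table and both function names are made up for the example, and the source file name is borrowed from the asker's later answer.

```python
import os
import subprocess

# Hypothetical table: source file name -> extra flags for that file only.
PER_FILE_FLAGS = {"popcount_SSSE3.c": ["-mssse3"]}


def rewrite_args(args, per_file_flags=PER_FILE_FLAGS):
    """Return a copy of args with the per-file flags appended when one of
    the listed source files appears on the command line."""
    extra = []
    for arg in args:
        extra.extend(per_file_flags.get(os.path.basename(arg), []))
    return list(args) + extra


def main(argv):
    # REAL_CC is an assumed variable name naming the actual compiler.
    real_cc = os.environ.get("REAL_CC", "gcc")
    return subprocess.call([real_cc] + rewrite_args(argv[1:]))
```

Saved as an executable script that ends with sys.exit(main(sys.argv)), this would stand in for the compiler, for example through the CC environment variable, while -mssse3 is dropped from the global compile flags.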
E
6

A very ugly solution would be to create two (or more) Extension instances, one to hold the SSSE3 code and the other for everything else. You could then tidy the interface up in the Python layer.

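from setuptools import Extension

# my_files is the full list of C source files for the package
c_src = [f for f in my_files if f != 'ssse3_file.c']

# Everything except the SSSE3 kernel: no CPU-specific flags
c_gen = Extension('c_general', sources=c_src,
                 libraries=[], extra_compile_args=['-O3'])

# Only the SSSE3 kernel is built with -mssse3
c_ssse3 = Extension('c_ssse_three', sources=['ssse3_file.c'],
                 libraries=[], extra_compile_args=['-O3', '-mssse3'])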

and in an __init__.py somewhere

from c_general import *
from c_ssse_three import *

Of course you don't need me to write out that code! I know this isn't DRY, and I look forward to reading a better answer!

Earth answered 20/3, 2013 at 15:46 Comment(1)
Unfortunately it's C code which decides which compute kernel to run, so your suggestion, while doable, becomes rather more difficult. Basically, I would need to implement C shared libraries, or I would have to have some sort of dynamic API to register available compute kernels. Both are a lot of work, compared to the ideal solution of specifying per-file flags. – Fineable
F
4

It's been 5 years but I figured out a solution which I like better than my "CC" wrapper.

The "build_ext" command creates a self.compiler instance. The compiler.compile() method takes the list of all source files to compile. The base class does some setup, then has a compiler._compile() hook for a concrete compiler subclass to implement the actual per-file compilation step.

I felt that this was stable enough that I could intercept the code at that point.

I derived a new command from distutils.command.build_ext.build_ext which tweaks self.compiler._compile to wrap the bound class method with a one-off function attached to the instance:

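# On Python 3.12+, where distutils has been removed, import build_ext
# from setuptools.command.build_ext instead.
from distutils.command.build_ext import build_ext

class build_ext_subclass(build_ext):
    def build_extensions(self):

        # Keep the bound method so the wrapper can delegate to it.
        original__compile = self.compiler._compile
        def new__compile(obj, src, ext, cc_args, extra_postargs, pp_opts):
            # Strip -mssse3 for every file except the SSSE3 kernel.
            # (The flag has to be in the extension's extra_compile_args.)
            if src != "src/popcount_SSSE3.c":
                extra_postargs = [s for s in extra_postargs if s != "-mssse3"]
            return original__compile(obj, src, ext, cc_args, extra_postargs, pp_opts)
        # Instance attribute shadows the class method.
        self.compiler._compile = new__compile
        try:
            build_ext.build_extensions(self)
        finally:
            # Remove the instance attribute; the class method is visible again.
            del self.compiler._compile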

I then told setup() to use this command-class:

setup(
   ...
   cmdclass = {"build_ext": build_ext_subclass}
)
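The instance-level patching used in build_ext_subclass can be tried without running a build. The sketch below swaps the real compiler for a stand-in: FakeCompiler, patch_compile and "src/other.c" are invented for the demonstration, and only the _compile signature and the filtering logic are taken from the code above.

```python
class FakeCompiler:
    """Stand-in for a distutils compiler; records what _compile receives."""

    def __init__(self):
        self.calls = []

    def _compile(self, obj, src, ext, cc_args, extra_postargs, pp_opts):
        self.calls.append((src, list(extra_postargs)))


def patch_compile(compiler, ssse3_source, flag="-mssse3"):
    """Shadow compiler._compile with a wrapper that drops `flag`
    for every source file except `ssse3_source`."""
    original = compiler._compile  # bound method, captured before patching

    def new_compile(obj, src, ext, cc_args, extra_postargs, pp_opts):
        if src != ssse3_source:
            extra_postargs = [s for s in extra_postargs if s != flag]
        return original(obj, src, ext, cc_args, extra_postargs, pp_opts)

    # A plain function stored on the instance is not bound, so it takes
    # no `self`; it shadows the method defined on the class.
    compiler._compile = new_compile


compiler = FakeCompiler()
patch_compile(compiler, "src/popcount_SSSE3.c")
for src in ["src/other.c", "src/popcount_SSSE3.c"]:
    compiler._compile(src + ".o", src, ".c", [], ["-O3", "-mssse3"], [])

# Deleting the instance attribute uncovers the class method again.
del compiler._compile
```

After the loop, compiler.calls shows -mssse3 only for src/popcount_SSSE3.c, which is the per-file behaviour the question asked for.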
Fineable answered 11/4, 2018 at 10:39 Comment(2)
I just burned an hour reading through the distutils source, and this is likely the only good solution on Unix, so +1 to Andrew Dalke. However, I don't think it will work on Windows because the MSVC compiler does not appear to have a _compile method. – Episode
If you want access to the Extension object in new__compile, you can override build_extension(self, extension), which is what build_ext.build_extensions(self) calls for each extension in self.distribution.ext_modules (i.e., the ext_modules set in setup()). – Episode
P
2

Unfortunately, the OP's solution works only for Unix compilers. Here is a cross-compiler one:
(MSVC has no option for automatic SSSE3 code generation, so I'll use AVX as the example.)

from setuptools import setup, Extension
import distutils.ccompiler


filename = 'example_avx'

compiler_options = {
    'unix': ('-mavx',),
    'msvc': ('/arch:AVX',)
}

def spawn(self, cmd, **kwargs):
    extra_options = compiler_options.get(self.compiler_type)
    if extra_options is not None:
        # filenames are closer to the end of command line
        for argument in reversed(cmd):
            # Check if argument contains a filename. We must check for all
            # possible extensions; checking for target extension is faster.
            if not argument.endswith(self.obj_extension):
                continue

            # check for a filename only to avoid building a new string
            # with variable extension
            off_end = -len(self.obj_extension)
            off_start = -len(filename) + off_end
            if argument.endswith(filename, off_start, off_end):
                if self.compiler_type == 'bcpp':
                    # Borland accepts a source file name at the end,
                    # insert the options before it
                    cmd[-1:-1] = extra_options
                else:
                    cmd += extra_options

                # we're done, restore the original method
                self.spawn = self.__spawn

            # filename is found, no need to search any further
            break

    distutils.ccompiler.spawn(cmd, dry_run=self.dry_run, **kwargs)

distutils.ccompiler.CCompiler.__spawn = distutils.ccompiler.CCompiler.spawn
distutils.ccompiler.CCompiler.spawn = spawn


setup(
    ...
    ext_modules = [
        Extension('extension_name', ['example.c', 'example_avx.c'])
    ],
    ...
)
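The file-name test inside spawn() relies on the optional start and end arguments of str.endswith, which is easy to get wrong, so here it is pulled out into a small helper that can be checked on its own. The helper name names_object_for is made up for this sketch; the offset arithmetic is the same as in the code above.

```python
def names_object_for(argument, filename, obj_extension):
    """True if `argument` ends with filename + obj_extension, checked
    without building the concatenated string."""
    if not argument.endswith(obj_extension):
        return False
    off_end = -len(obj_extension)          # where the extension starts
    off_start = -len(filename) + off_end   # where the file name would start
    return argument.endswith(filename, off_start, off_end)


unix_match = names_object_for("build/temp/example_avx.o", "example_avx", ".o")
unix_other = names_object_for("build/temp/example.o", "example_avx", ".o")
msvc_match = names_object_for("/Fobuild\\example_avx.obj", "example_avx", ".obj")
# Only the suffix is compared, so a longer name with the same ending matches too.
longer_name = names_object_for("my_example_avx.o", "example_avx", ".o")
```

Because only the suffix is compared, the chosen file name should not be the tail of another source file's name in the same extension.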

See my answer here for a cross-compiler way to specify compiler/linker options in general.

Philander answered 24/7, 2021 at 9:42 Comment(0)
