I have a Python extension which uses CPU-specific features,
if available. This is done through a run-time check. If the
hardware supports the POPCNT instruction then it selects one
implementation of my inner loop; if SSSE3 is available then
it selects another; otherwise it falls back to generic versions
of my performance-critical kernel. (Some 95%+ of the time is
spent in this kernel.)
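A minimal sketch of this kind of run-time selection, assuming GCC or Clang on x86, where the `__builtin_cpu_supports` builtin is available. The names `popcount_fn`, `popcount_generic`, `popcount_ssse3`, `popcount_popcnt`, and `select_popcount` are hypothetical and not taken from the actual extension, and the two CPU-specific kernels are placeholders that simply call the generic code:

```c
#include <stddef.h>

/* Hypothetical kernel signature: count the set bits in a byte buffer. */
typedef int (*popcount_fn)(const unsigned char *data, size_t n);

/* Generic fallback: plain C, safe on any CPU. */
static int popcount_generic(const unsigned char *data, size_t n) {
    int total = 0;
    for (size_t i = 0; i < n; i++) {
        unsigned char b = data[i];
        while (b) {
            b &= (unsigned char)(b - 1);  /* clear the lowest set bit */
            total++;
        }
    }
    return total;
}

/* Placeholders for the CPU-specific kernels. In a real build each would
 * live in its own source file, and only that file would be compiled with
 * the matching -m option. */
static int popcount_ssse3(const unsigned char *data, size_t n) {
    return popcount_generic(data, n);
}

static int popcount_popcnt(const unsigned char *data, size_t n) {
    return popcount_generic(data, n);
}

/* Selector: this code must be compiled WITHOUT -mssse3 or -mpopcnt,
 * so that it can run on any CPU before the check has been made. */
popcount_fn select_popcount(void) {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("popcnt")) {
        return popcount_popcnt;
    }
    if (__builtin_cpu_supports("ssse3")) {
        return popcount_ssse3;
    }
#endif
    return popcount_generic;
}
```

The selector would be called once at module initialization and the returned function pointer used for every later call into the kernel.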
Unfortunately, there's a failure mode I didn't expect. I
use -mssse3 and -O3 to compile all of the C code, even though
only one file needs the -mssse3 option. As a result, the other
files are compiled with the expectation that SSSE3 will exist.
This causes a segfault for the line:
start_target_popcount = (int)(query_popcount * threshold);
because the compiler used fisttpl, which is an SSE3 instruction
(-mssse3 enables SSE3 as well). After all, I told it to assume
that SSSE3 exists.
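The failing statement can be isolated in a small function for inspection. This is only a sketch: the function name `start_popcount` is hypothetical, and the types of `query_popcount` and `threshold` are assumed to be `int` and `double`, which the original code may not match.

```c
/* Minimal reproduction sketch (assumed types: int and double).
 *
 * To look at the generated code on a 32-bit x86 target, something like
 *     gcc -m32 -O3 -mssse3 -S repro.c
 * writes assembly to repro.s, which can then be searched for fisttp.
 * Whether that instruction appears depends on the compiler version
 * and target, so treat this as a way to check, not a guarantee. */
int start_popcount(int query_popcount, double threshold) {
    /* Truncating double-to-int conversion, as in the original line. */
    return (int)(query_popcount * threshold);
}
```

Compiling the same file without -mssse3 and comparing the two assembly listings should show whether the option changes how the conversion is done.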
The Debian packager for my package recently ran into this problem,
because the test machine has a GCC which understands -mssse3 and
generates code with that in mind, but the machine itself has an
older CPU without those instructions.
I want a solution where the same binary works on both older and newer machines, and which the Debian maintainer can use for that distro.
Ideally, I would like to say that only one file is compiled
with the -mssse3 option. Since my CPU-specific selector code
isn't part of this file, no SSSE3 code will ever be executed
unless the CPU supports it.
However, I can't figure out any way to tell distutils that
a set of compiler options is specific to a single file.
Is that even possible?