How can I create a library that will dynamically switch between SSE, AVX, and AVX2 code paths depending on the host processor/OS? I am using Agner Fog's VCL (Vector Class Library) and compiling with GCC for Linux.
See the section "Instruction sets and CPU dispatching" in the manual to the Vector Class Library. In that section Agner writes:
The file dispatch_example.cpp shows an example of how to make a CPU dispatcher that selects the appropriate code version.
Read the source code of dispatch_example.cpp. At the start of the file you should see the comment:
# Compile dispatch_example.cpp five times for different instruction sets:
| g++ -O3 -msse2 -c dispatch_example.cpp -od2.o
| g++ -O3 -msse4.1 -c dispatch_example.cpp -od5.o
| g++ -O3 -mavx -c dispatch_example.cpp -od7.o
| g++ -O3 -mavx2 -c dispatch_example.cpp -od8.o
| g++ -O3 -mavx512f -c dispatch_example.cpp -od9.o
| g++ -O3 -msse2 -otest instrset_detect.cpp d2.o d5.o d7.o d8.o d9.o
| ./test
Note the file instrset_detect.cpp in the final compile command. You should read its source code as well; it is what calls CPUID.
Here is a summary of some, but not all of, my questions and answers on CPU dispatchers.
dispatch_example.cpp calls instrset_detect, which is declared in instrset.h and defined in instrset_detect.cpp. – Hypogastrium
The assembly instruction cpuid can give you this information at runtime. Someone has helpfully created a library based on this to do just what you need.
You could create a function dispatch table, and populate it with the correct code path functions based on the results of querying using this code.
UPDATE: (answer to question in comments)
To create the different code paths in the first place, you need to compile the different code paths separately, and then link them together. For each one, you specify the architecture needed by using various values of the -march switch on your compile line.
-D switch on the compile line and use the given define to append to your function name in the code so they have distinct names in the object files). – Pammie
if (supportsAvx2) { func1 = func1_AVX2; func2 = func2_AVX2; ... } else if (supportsSSE2) { func1 = func1_SSE2; func2 = func2_SSE2; ... } ... – Pammie
The file https://github.com/vectorclass/version2/blob/master/dispatch_example2.cpp shows how to make automatic dispatching into different code versions with one namespace for each. This works on all x86 platforms.