Compile multi-architecture code using Agner's Vector Class Library

How can I create a library that will dynamically switch between SSE, AVX, and AVX2 code paths depending on the host processor/OS? I am using Agner Fog's VCL (Vector Class Library) and compiling with GCC for Linux.

Cantonment answered 7/6, 2016 at 13:34 Comment(2)
Sounds like a makefile solution to me. You know about the host processor/OS when you build. No need to do so at runtime.Hyperboloid
For readers whose use case is not limited to VCL and GCC: the Intel compilers offer a family of "-axcode" compilation flags that generate several code paths targeting multiple instruction set architectures (e.g. SSE, AVX and AVX-512) in the same library/executable and dispatch between them automatically (invisibly) at run time. Look at the bottom of this page: software.intel.com/en-us/blogs/2016/01/13/…Deflate

See the section "Instruction sets and CPU dispatching" in the Vector Class Library manual. In that section Agner writes:

The file dispatch_example.cpp shows an example of how to make a CPU dispatcher that selects the appropriate code version.

Read the source code of dispatch_example.cpp. At the start of the file you should see this comment:

# Compile dispatch_example.cpp five times for different instruction sets:
| g++ -O3 -msse2    -c dispatch_example.cpp -od2.o
| g++ -O3 -msse4.1  -c dispatch_example.cpp -od5.o
| g++ -O3 -mavx     -c dispatch_example.cpp -od7.o
| g++ -O3 -mavx2    -c dispatch_example.cpp -od8.o
| g++ -O3 -mavx512f -c dispatch_example.cpp -od9.o
| g++ -O3 -msse2 -otest instrset_detect.cpp d2.o d5.o d7.o d8.o d9.o
| ./test

The last g++ line also compiles the file instrset_detect.cpp. You should read its source code as well, since this is the code that calls CPUID.
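
As a rough illustration of the dispatcher idea (not a copy of dispatch_example.cpp), the sketch below routes calls through a function pointer that initially points at a dispatcher; the first call picks a version and overwrites the pointer. The names `detect_level`, `sum`, and `sum_*` are hypothetical. `detect_level` is only a stand-in for `instrset_detect` built on the GCC/Clang builtin `__builtin_cpu_supports`, and its level numbers simply mirror the d2/d5/d7/d8 object file names in the comment above. The per-level versions are scalar stand-ins for what would be the same source compiled with different `-m` flags.

```cpp
#include <cstddef>

typedef float sum_fn(const float* data, std::size_t n);

// Hypothetical stand-in for instrset_detect(): asks the compiler runtime
// which instruction sets are usable instead of calling CPUID directly.
static int detect_level() {
#if defined(__x86_64__) || defined(__i386__)
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2"))   return 8;
    if (__builtin_cpu_supports("avx"))    return 7;
    if (__builtin_cpu_supports("sse4.1")) return 5;
#endif
    return 2;  // assumed baseline
}

// Shared scalar body so the sketch runs anywhere.
static float scalar_sum(const float* d, std::size_t n) {
    float s = 0.0f;
    for (std::size_t i = 0; i < n; ++i) s += d[i];
    return s;
}

// Stand-ins: in a real build each of these lives in its own object file,
// produced from the same source with -msse2, -mavx, -mavx2, ...
static float sum_sse2(const float* d, std::size_t n) { return scalar_sum(d, n); }
static float sum_avx (const float* d, std::size_t n) { return scalar_sum(d, n); }
static float sum_avx2(const float* d, std::size_t n) { return scalar_sum(d, n); }

static float sum_dispatch(const float* d, std::size_t n);

// Starts at the dispatcher; replaced by the chosen version on first use.
static sum_fn* sum_pointer = &sum_dispatch;

static float sum_dispatch(const float* d, std::size_t n) {
    const int level = detect_level();
    if      (level >= 8) sum_pointer = &sum_avx2;
    else if (level >= 7) sum_pointer = &sum_avx;
    else                 sum_pointer = &sum_sse2;
    return (*sum_pointer)(d, n);  // forward the first call
}

// Public entry point: callers never see the individual versions.
float sum(const float* d, std::size_t n) {
    return (*sum_pointer)(d, n);
}
```

Calling `sum(data, n)` pays for detection only once; later calls go straight to the selected version through the pointer.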

Here is a summary of some, but not all of, my questions and answers on CPU dispatchers.

Hypogastrium answered 8/6, 2016 at 7:19 Comment(1)
dispatch_example.cpp calls instrset_detect which is declared in instrset.h and defined in instrset_detect.cpp.Hypogastrium

The assembly instruction cpuid can give you this information at runtime. Someone has helpfully created a library based on this that does just what you need.

You could create a function dispatch table, and populate it with the correct code path functions based on the results of querying using this code.

UPDATE: (answer to question in comments)

To create the different code paths in the first place, you need to compile the different code paths separately, and then link them together. For each one, you specify the architecture needed by using various values of the -march switch in your compile line.

Pammie answered 7/6, 2016 at 13:43 Comment(5)
The trouble is that I can't (easily?) create different code paths, because VCL uses intrinsics (not inline assembly) that the compiler converts to whatever instruction set was specified in the compiler arguments. I suppose I should have mentioned I am using GCC.Cantonment
How can I mangle the function names for each invocation of the compiler?Cantonment
You give them different names for each invocation (use a -D switch on the compile line and use the given define to append to your function name in the code so they have distinct names in the object files).Pammie
Then have some central dispatch code that populates some function pointers: if(supportsAvx2) { func1 = func1_AVX2; func2 = func2_AVX2; ... } else if(supportsSSE2) { func1 = func1_SSE2; func2 = func2_SSE2; ... } ...Pammie
Thank you, you've been very helpful.Cantonment
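
A minimal sketch of the naming trick from the comments above, assuming a hypothetical macro `ISA_SUFFIX` passed on the compile line (for example `-DISA_SUFFIX=_AVX2` together with `-mavx2`). The function `scale` and the helper macros are illustrative names, not part of VCL. Compiling this same file once per instruction set would give each object file a distinctly named function.

```cpp
#include <cstddef>

// Normally supplied by the build, e.g. g++ -mavx2 -DISA_SUFFIX=_AVX2 ...
// The default below is an assumption so the file also compiles on its own.
#ifndef ISA_SUFFIX
#define ISA_SUFFIX _SSE2
#endif

// Two-level paste so ISA_SUFFIX is expanded before token concatenation.
#define PASTE2(a, b) a##b
#define PASTE(a, b) PASTE2(a, b)
#define VERSIONED(name) PASTE(name, ISA_SUFFIX)

// Expands to scale_SSE2, scale_AVX2, ... depending on ISA_SUFFIX.
// With VCL, the body would use vector classes and the compiler would emit
// whatever instruction set the -m flag for this invocation allows.
void VERSIONED(scale)(float* data, std::size_t n, float k) {
    for (std::size_t i = 0; i < n; ++i) data[i] *= k;
}
```

The central dispatch code would then declare each variant (`scale_SSE2`, `scale_AVX2`, ...) and assign the matching one to a function pointer, as described in the comment above.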

The file https://github.com/vectorclass/version2/blob/master/dispatch_example2.cpp shows how to make automatic dispatching into different code versions with one namespace for each. This works on all x86 platforms.
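
To give a feel for the one-namespace-per-version layout, here is a hedged sketch. It is not taken from dispatch_example2.cpp: the namespaces `ns_sse2` and `ns_avx2`, the function `dot`, and the helper `cpu_has_avx2` are all hypothetical, and detection uses the GCC/Clang builtin `__builtin_cpu_supports` rather than VCL's own detection code. In a real build each namespace body would come from the same source compiled with a different instruction-set flag; here both are plain scalar stand-ins in one file.

```cpp
#include <cstddef>

// One namespace per code version; identical function names inside each.
namespace ns_sse2 {
    float dot(const float* a, const float* b, std::size_t n) {
        float s = 0.0f;
        for (std::size_t i = 0; i < n; ++i) s += a[i] * b[i];
        return s;
    }
}

namespace ns_avx2 {
    float dot(const float* a, const float* b, std::size_t n) {
        float s = 0.0f;
        for (std::size_t i = 0; i < n; ++i) s += a[i] * b[i];
        return s;
    }
}

// Hypothetical detection helper; returns false on non-x86 targets.
static bool cpu_has_avx2() {
#if defined(__x86_64__) || defined(__i386__)
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx2") != 0;
#else
    return false;
#endif
}

// Unqualified entry point: picks a namespace once, then reuses the choice.
float dot(const float* a, const float* b, std::size_t n) {
    typedef float dot_fn(const float*, const float*, std::size_t);
    static dot_fn* const chosen =
        cpu_has_avx2() ? &ns_avx2::dot : &ns_sse2::dot;
    return chosen(a, b, n);
}
```

Because every version keeps the same function name inside its own namespace, no per-function name mangling is needed; only the namespace name changes between compilations.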

Suckle answered 25/2, 2020 at 15:3 Comment(0)
