Modern x86_64 Linux with glibc detects that the CPU supports the AVX extension and switches many string functions from the generic implementation to an AVX-optimized version (with the help of ifunc dispatchers).
This feature can be good for performance, but it prevents several tools, such as valgrind (older libVEX, before valgrind-3.8) and gdb's "target record" (Reverse Execution), from working correctly (Ubuntu "Z" 17.04 beta, gdb 7.12.50.20170207-0ubuntu2, gcc 6.3.0-8ubuntu1 20170221, Ubuntu GLIBC 2.24-7ubuntu2):
$ cat a.c
#include <string.h>
#define N 1000
int main(){
    char src[N], dst[N];
    memcpy(dst, src, N);
    return 0;
}
$ gcc a.c -o a -fno-builtin
$ gdb -q ./a
Reading symbols from ./a...(no debugging symbols found)...done.
(gdb) start
Temporary breakpoint 1 at 0x724
Starting program: /home/user/src/a
Temporary breakpoint 1, 0x0000555555554724 in main ()
(gdb) record
(gdb) c
Continuing.
Process record does not support instruction 0xc5 at address 0x7ffff7b60d31.
Process record: failed to record execution log.
Program stopped.
__memmove_avx_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:416
416 VMOVU (%rsi), %VEC(4)
(gdb) x/i $pc
=> 0x7ffff7b60d31 <__memmove_avx_unaligned_erms+529>: vmovdqu (%rsi),%ymm4
gdb's implementation of "target record" prints the error message "Process record does not support instruction 0xc5" because the record/replay engine does not support AVX instructions (sometimes the problem is detected in the _dl_runtime_resolve_avx function): https://sourceware.org/ml/gdb/2016-08/msg00028.html "some AVX instructions are not supported by process record", https://bugs.launchpad.net/ubuntu/+source/gdb/+bug/1573786, https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=836802, https://bugzilla.redhat.com/show_bug.cgi?id=1136403
The solution proposed in https://sourceware.org/ml/gdb/2016-08/msg00028.html is "You can recompile libc (thus ld.so), or hack __init_cpu_features and thus __cpu_features at runtime (see e.g. strcmp)." or to set LD_BIND_NOW=1, but the recompiled glibc still has AVX, and ld bind-now doesn't help.
I heard that there are /etc/ld.so.nohwcap and LD_HWCAP_MASK configurations in glibc. Can they be used to disable ifunc dispatching to AVX-optimized string functions in glibc?
How does glibc (rtld?) detect AVX: using cpuid, with /proc/cpuinfo (probably not), or with the HWCAP aux vector entry (the LD_SHOW_AUXV=1 /bin/echo |grep HWCAP command gives AT_HWCAP: bfebfbff)?
ENTRY(__new_memcpy) .type __new_memcpy, @gnu_indirect_function .. .HAS_ARCH_FEATURE (Prefer_ERMS) where ..feature are defined at github.com/bminor/glibc/blob/master/sysdeps/x86/cpu-features.h; the tested field is filled by init_cpu_features by using the cpuid instruction with eax=7,ecx=0. How to hack into init_cpu_features and mask out AVX/ERMS in cpu_features->cpuid[COMMON_CPUID_INDEX_7].ecx? – Asarum
… sysdeps/x86/libc-start.c (__libc_start_main calls init_cpu_features (&_dl_x86_cpu_features)), but at that point symbols already seem resolved (based on p *memcpy pointing to __memmove_avx_unaligned_erms). – Bonkers
… dpkg-buildpackage, without strip) AND binary patching in the __get_cpu_features function (get_common_indeces / get_common_indeces.constprop.1), cpuid,.., then just after cmp 0xf,.. ;je ..; cmp 0x6 replaced jle with jg (0x7e to 0x7f) - probably disabling all the code after if .. max_cpuid>=7 of sysdeps/x86/cpu-features.c. Or try to use more recent valgrind & gdb record tools or an older glibc, or implement the missing instruction emulation in gdb record if it is not done. – Asarum
rr works with AVX: stackoverflow.com/questions/40125154/… – Vaduz