How can I determine whether a NEON engine exists on a given ARM processor? Is there a status/flag register that can be queried for this purpose?
I believe unixsmurf's answer is about as good as you'll get if using an OS with privileged kernel. For general purpose feature detection, it seems ARM has made it a requirement to get this from the OS, and so you must use an OS API to get it.
- On Android NDK, use #include <cpu-features.h> with (android_getCpuFamily() == ANDROID_CPU_FAMILY_ARM) && (android_getCpuFeatures() & ANDROID_CPU_ARM_FEATURE_NEON). Note this is for 32 bit ARM. ARM 64 bit has different flags but the idea is the same. See the sources/docs.
- On Linux, if available, use #include <sys/auxv.h> and #include <asm/hwcap.h> with getauxval(AT_HWCAP) & HWCAP_NEON.
- On iOS, I'm not sure there is a dynamic call. The methodology seems to be that you build your app targeting NEON, then make sure your app is flagged to require NEON so it will only install on devices which support it. Of course you should use the pre-defined preprocessor flag __ARM_NEON__ to make sure everything is in order at compile time.
- On whatever Microsoft does or if you are using some other RTOS... I don't know...
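The Linux bullet above can be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: the function name linux_has_neon is hypothetical, and the use of HWCAP_ASIMD as the 64-bit counterpart of HWCAP_NEON is an assumption based on the kernel's arm64 hwcap header. On anything that is not ARM Linux it simply reports false.

```c
#include <stdbool.h>

#if defined(__linux__)
#include <sys/auxv.h>   /* getauxval, AT_HWCAP */
#endif
#if defined(__linux__) && (defined(__arm__) || defined(__aarch64__))
#include <asm/hwcap.h>  /* HWCAP_NEON (32-bit) or HWCAP_ASIMD (64-bit) */
#endif

/* Hypothetical helper: true if the kernel reports NEON / Advanced SIMD
 * to user space through the auxiliary vector. */
bool linux_has_neon(void)
{
#if defined(__linux__) && defined(__aarch64__)
    /* Assumption: on 64-bit ARM the relevant bit is HWCAP_ASIMD. */
    return (getauxval(AT_HWCAP) & HWCAP_ASIMD) != 0;
#elif defined(__linux__) && defined(__arm__)
    return (getauxval(AT_HWCAP) & HWCAP_NEON) != 0;
#else
    return false;  /* not ARM Linux: nothing to query this way */
#endif
}
```

A caller would typically evaluate linux_has_neon() once at startup and pick a NEON or scalar code path based on the result.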
Actually you'll see a lot of Android implementations which just parse /proc/cpuinfo in order to implement android_getCpuFeatures().... Heh. But still it seems to be getting improved and newest versions use the getauxval method.
sysctl if you really wanted to). – Air
getauxval(AT_HWCAP) & HWCAP_NEON works for Aarch64, but not Aarch32. I believe there's other defines you need to check. – Misspeak
In C/C++:
uint64_t id_aa64pfr0_el1;
__asm__ ("mrs %0, ID_AA64PFR0_EL1" : "=r" (id_aa64pfr0_el1));
const uint8_t AdvSIMD = (id_aa64pfr0_el1 >> 20) & (1 << 0 | 1 << 1 | 1 << 2 | 1 << 3);
if (AdvSIMD == 15) {
    // NEON is not present, Float16 is not present (not even with 1<<0)
} else {
    // NEON is present
    if (AdvSIMD & (1 << 0)) {
        // Float16 is also available
    } else {
        // Float16 is missing
    }
}
Can also do
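#include <stdbool.h>  // bool
#include <stdint.h>   // uint64_t

bool is_neon_available() {
    // Cache the result so the register is read only once.
    static bool is_cached = false;
    static bool cached;
    if (is_cached) {
        return cached;
    }
    uint64_t id_aa64pfr0_el1;
    __asm__ ("mrs %0, ID_AA64PFR0_EL1" : "=r" (id_aa64pfr0_el1));
    // AdvSIMD field is bits [23:20]; the value 15 (0b1111) means not implemented.
    cached = (15 != ((id_aa64pfr0_el1 >> 20) & (1 << 0 | 1 << 1 | 1 << 2 | 1 << 3)));
    is_cached = true;
    return cached;
}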
Related documentation: https://developer.arm.com/documentation/ddi0595/2021-12/AArch64-Registers/ID-AA64PFR0-EL1--AArch64-Processor-Feature-Register-0?lang=en
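To make the bit manipulation in the two snippets above easier to check, here is a minimal sketch that separates the decoding from the register read. The function names are hypothetical, and the code only interprets a value already read from ID_AA64PFR0_EL1, so it can be exercised on any machine. It follows the same convention as the snippets: the AdvSIMD field is bits [23:20], 15 means not implemented, and bit 0 of the field indicates Float16.

```c
#include <stdbool.h>
#include <stdint.h>

/* Extract the 4-bit AdvSIMD field, bits [23:20] of ID_AA64PFR0_EL1. */
unsigned advsimd_field(uint64_t id_aa64pfr0_el1)
{
    return (unsigned)((id_aa64pfr0_el1 >> 20) & 0xF);
}

/* A field value of 15 (0b1111) means Advanced SIMD is not implemented. */
bool advsimd_present(uint64_t id_aa64pfr0_el1)
{
    return advsimd_field(id_aa64pfr0_el1) != 0xF;
}

/* Float16 is reported by bit 0 of the field, but only if NEON is present. */
bool advsimd_fp16(uint64_t id_aa64pfr0_el1)
{
    return advsimd_present(id_aa64pfr0_el1) &&
           (advsimd_field(id_aa64pfr0_el1) & 1u) != 0;
}
```

For example, a register value of 0x00100000 decodes to field 1 (NEON with Float16), while 0x00F00000 decodes to field 15 (no NEON).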
One reliable way is to check the architectural feature trap register. For example, on ARM Cortex A35, you can check the value of the HCPTR register to see whether NEON is implemented (0x000033FF) or not (0x0000BFFF). The register name and the indicating value are platform dependent, so make sure to check the technical reference manual.
If /proc/config.gz is present, you can test for NEON support in the kernel using the command:
zcat /proc/config.gz | grep NEON
If the kernel was built with NEON support, the command outputs:
CONFIG_NEON=y
To ensure that the processor supports the NEON extension, you can issue the command:
cat /proc/cpuinfo | grep neon
If it supports the NEON extension, the output shows neon, for example:
# cat /proc/cpuinfo | grep neon
Features : half thumb fastmult vfp edsp thumbee neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm
Features : half thumb fastmult vfp edsp thumbee neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm
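The same /proc/cpuinfo check can be done from a program. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the flag is assumed to appear as a whitespace-separated token on a line starting with "Features" (as in the output above), and "asimd" is assumed to be the name used on 64-bit ARM kernels. Matching whole tokens avoids false hits from a plain substring search.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* True if `flag` appears in `line` as a whole whitespace-delimited token. */
bool line_has_flag(const char *line, const char *flag)
{
    size_t n = strlen(flag);
    if (n == 0)
        return false;
    for (const char *p = strstr(line, flag); p != NULL; p = strstr(p + 1, flag)) {
        bool starts = (p == line) || p[-1] == ' ' || p[-1] == '\t';
        bool ends = p[n] == '\0' || p[n] == ' ' || p[n] == '\t' || p[n] == '\n';
        if (starts && ends)
            return true;
    }
    return false;
}

/* Scans /proc/cpuinfo for a "Features" line that lists neon or asimd.
 * Returns false if the file is missing or no such line is found. */
bool cpuinfo_has_neon(void)
{
    FILE *f = fopen("/proc/cpuinfo", "r");
    if (f == NULL)
        return false;
    char line[4096];  /* assumed large enough for one Features line */
    bool found = false;
    while (!found && fgets(line, sizeof line, f) != NULL) {
        if (strncmp(line, "Features", 8) == 0)
            found = line_has_flag(line, "neon") || line_has_flag(line, "asimd");
    }
    fclose(f);
    return found;
}
```

On a machine whose /proc/cpuinfo has no "Features" line, or no /proc at all, cpuinfo_has_neon() returns false rather than failing.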
/proc/cpuinfo to look for the NEON or Advanced SIMD flag. For privileged code, look at the ARMv7 Architecture Reference Manual, Section B3.12.19 c1, Coprocessor Access Control Register (CPACR); Bit 31 of that register is what you want. – Circumfuse
UNK/SBZP, which the glossary reports as Reads Unknown/Writes Should Be Zero or Preserve (I don't know why). And yet CPACR is the register that boot software must configure in order to enable CP10 and CP11, which are the Advanced SIMD coprocessors. – Circumfuse
HWCAPS does not scale. It only works for Linux and NEON, but not other platforms or other extensions like CRC32 and Crypto. What I found in practice is: determine compiler support with __ARM_FEATURE_XXX; and determine runtime support by trying an instruction with a SIGILL handler in place. The compile time/runtime strategy is the only thing I have found that works well across platforms and compilers. – Misspeak
HWCAP2 for the 32-bit ARM architecture. – Iapetus
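The compile time/runtime strategy mentioned in the comments (check a compiler macro, then try an instruction with a SIGILL handler in place) might look roughly like this on a POSIX system. This is only a sketch under assumptions: the function name probe_neon_by_sigill is hypothetical, the probe instructions are arbitrary NEON register moves chosen for illustration, and the compile-time gate uses __ARM_NEON / __ARM_NEON__. It is not thread-safe, because the jump buffer is a single static variable.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
static sigjmp_buf probe_env;

static void on_sigill(int sig)
{
    (void)sig;
    siglongjmp(probe_env, 1);  /* jump back out of the faulting probe */
}
#endif

/* Hypothetical helper: executes one NEON instruction and reports whether
 * it ran without raising SIGILL. Returns false when the compiler does not
 * target NEON at all (compile-time check). */
bool probe_neon_by_sigill(void)
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    struct sigaction sa, old;
    sa.sa_handler = on_sigill;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGILL, &sa, &old) != 0)
        return false;

    volatile bool ok = false;
    if (sigsetjmp(probe_env, 1) == 0) {
#if defined(__aarch64__)
        __asm__ volatile("orr v0.16b, v0.16b, v0.16b" ::: "v0");
#else
        __asm__ volatile("vorr q0, q0, q0" ::: "q0");
#endif
        ok = true;  /* reached only if the instruction did not trap */
    }
    sigaction(SIGILL, &old, NULL);  /* restore the previous handler */
    return ok;
#else
    return false;
#endif
}
```

The same pattern could be repeated with a different probe instruction per extension (for example CRC32 or Crypto), which is what makes it portable beyond NEON; the result is best cached after the first call.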