How to check the existence of NEON on arm?
R

4

17

How can I determine whether the NEON engine exists on a given ARM processor? Is there a status or flag register that can be queried for this purpose?

Rimrock answered 2/11, 2014 at 15:58 Comment(13)
there are a ton of coprocessor registers that are there for that purpose to give you the gory details on what is supported in that core and what isnt. get the TRM for that or a similar core to see where these registers live.Fibroid
I believe that ARM processors are designed s.t. this information and those registers are actually privileged; Under Linux, therefore, you must look at /proc/cpuinfo to look for the NEON or Advanced SIMD flag. For privileged code, look at the ARMv7 Architecture Reference Manual, Section B3.12.19 c1, Coprocessor Access Control Register (CPACR); Bit 31 of that register is what you want.Circumfuse
Bit 31 of CPACR disables NEON instruction decoding when set to 1, which does not seem to be a direct way to detect the NEON engine.Rimrock
@Rimrock Read immediately below; On an implementation that: As well, the bit resets to zero if supported.Circumfuse
@Iwillnotexist Idonotexist, you are right. It seems to be a good option.Rimrock
@Rimrock Although on second thought I'm not entirely pleased with it now; for a processor that supports neither VFP nor NEON the bit is UNK/SBZP, which the glossary defines as Unknown on reads, Should-Be-Zero-or-Preserved on writes (I don't know why). And yet CPACR is the register that boot software must configure in order to enable CP10 and CP11, the coprocessors used by VFP and Advanced SIMD.Circumfuse
Ah; I'm digging in B5.3 Advanced SIMD and VFP feature identification registers now.Circumfuse
Got it. See Linux's VFP initialization code here lxr.free-electrons.com/source/arch/arm/vfp/…; It examines the MVFR1 described in the ARM architecture reference manual Section B5.3, and if Advanced SIMD hardware supports all of a) Single-Precision Floating Point Operations b) Integer Operations and c) Load-Store Operations, then the HWCAP_NEON flag is set.Circumfuse
Although the comment appears irrelevant to @Thomson's scenario, I will just keep doing my broken record thing and point out that parsing /proc/cpuinfo is never the correct answer. HWCAPS is the way to determine CPU features from a Linux userland process. community.arm.com/groups/android-community/blog/2014/10/10/…Iapetus
@unixsmurf, I have no idea if Mr or Ms Thomson has access to all the registers, but for most people looking for this type of info, they are just using Android/Linux/iOS or whatever and that is the correct answer. I will use your link.Benjamin
@PeterM: thanks for being less lazy than me :)Iapetus
@Iapetus - HWCAPS does not scale. It only works for Linux and NEON, but not other platforms or other extensions like CRC32 and Crypto. What I found in practice is: determine compiler support with __ARM_FEATURE_XXX; and determine runtime support by trying an instruction with a SIGILL handler in place. The compile time/runtime strategy is the only thing I have found that works well across platforms and compilers.Misspeak
@jww: a little bit confused by the "does not scale" statement, given that CRC32 and Crypto are explicitly supported through the hwcaps strategy - albeit using HWCAP2 for the 32-bit ARM architecture.Iapetus
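The MVFR1 test described in the comments above can be sketched as a pure decoding function. This is only a sketch: reading MVFR1 itself needs privileged code, so the function takes an already-read value. The name mvfr1_indicates_neon and the two constants are hypothetical, and the field positions (load/store at bits [11:8], integer at [15:12], single-precision at [19:16]) are my reading of Section B5.3, so check them against the manual.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: three 4-bit Advanced SIMD fields in MVFR1 bits [19:8]. */
#define MVFR1_ASIMD_FIELDS   UINT32_C(0x000fff00)
/* Each of the three fields equal to 1: load/store, integer, single-precision. */
#define MVFR1_ASIMD_ALL_ONE  UINT32_C(0x00011100)

/* Hypothetical helper: true if an MVFR1 value reports all three
 * Advanced SIMD capabilities mentioned in the comment thread. */
bool mvfr1_indicates_neon(uint32_t mvfr1)
{
    return (mvfr1 & MVFR1_ASIMD_FIELDS) == MVFR1_ASIMD_ALL_ONE;
}
```

A value with other fields set still passes as long as those three fields are each 1, because everything outside bits [19:8] is masked off.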
B
17

I believe unixsmurf's answer is about as good as you'll get if you are using an OS with a privileged kernel. For general-purpose feature detection, it seems ARM has made it a requirement to get this information from the OS, so you must use an OS API to get it.

  • With the Android NDK, use #include <cpu-features.h> with (android_getCpuFamily() == ANDROID_CPU_FAMILY_ARM) && (android_getCpuFeatures() & ANDROID_CPU_ARM_FEATURE_NEON). Note this is for 32-bit ARM. 64-bit ARM has different flags but the idea is the same. See the sources/docs.
  • On Linux, if available, use #include <sys/auxv.h> and #include <asm/hwcap.h> with getauxval(AT_HWCAP) & HWCAP_NEON. HWCAP_NEON is the 32-bit ARM flag; on AArch64 the flag is HWCAP_ASIMD.
  • On iOS, I'm not sure there is a dynamic call. The methodology seems to be that you build your app targeting NEON, then make sure your app is flagged to require NEON so that it will only install on devices which support it. Of course you should use the predefined preprocessor macro __ARM_NEON__ to make sure everything is in order at compile time.
  • On whatever Microsoft does or if you are using some other RTOS... I don't know...
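For the Linux case, a minimal sketch of the getauxval approach might look like the following. The function name neon_from_hwcap and the -1 "cannot tell" return value are my own inventions. The AArch64 branch uses HWCAP_ASIMD, as pointed out in the comments below. On anything that is not ARM Linux the sketch just reports "unknown".

```c
#if defined(__linux__) && (defined(__arm__) || defined(__aarch64__))
#include <sys/auxv.h>   /* getauxval, AT_HWCAP */
#include <asm/hwcap.h>  /* HWCAP_NEON (32-bit) or HWCAP_ASIMD (64-bit) */
#endif

/* Hypothetical helper.
 * Returns 1 if the kernel reports NEON / Advanced SIMD,
 *         0 if it reports it as absent,
 *        -1 if this sketch cannot tell on the current platform. */
int neon_from_hwcap(void)
{
#if defined(__linux__) && defined(__aarch64__)
    return (getauxval(AT_HWCAP) & HWCAP_ASIMD) ? 1 : 0;
#elif defined(__linux__) && defined(__arm__)
    return (getauxval(AT_HWCAP) & HWCAP_NEON) ? 1 : 0;
#else
    return -1;
#endif
}
```

Callers would typically run this once at startup and pick a NEON or scalar code path based on the result.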

Actually you'll see a lot of Android implementations which just parse /proc/cpuinfo in order to implement android_getCpuFeatures().... Heh. But it seems to be improving, and the newest versions use the getauxval method.

Benjamin answered 14/11, 2014 at 0:11 Comment(7)
All iOS hardware supported by iOS 5 and later have NEON; you can simply assume that NEON is present, there's no need for any check (but you could check dynamically using sysctl if you really wanted to).Air
On Android NDK check if (android_getCpuFamily() == ANDROID_CPU_FAMILY_ARM && (android_getCpuFeatures() & ANDROID_CPU_ARM_FEATURE_NEON) != 0)Marybethmaryellen
Thanks, yeah, I updated.... obviously more complicated now with ARM64 etc.Benjamin
I think getauxval(AT_HWCAP) & HWCAP_NEON works for Aarch64, but not Aarch32. I believe there's other defines you need to check.Misspeak
HWCAP_NEON is for Aarch32, and HWCAP_ASIMD is for Aarch64Korten
The unixsmurf link is down, but it looks like there's an archived copy here: web.archive.org/web/20160629111439/https://community.arm.com/…Knorr
On iOS i think it's safe to assume it's available, i don't think Apple has made anything lacking NEON since 2011's iPad2/iPhone 4 (with iPad1/iPhone3 lacking NEON)Lillylillywhite
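If you did want the dynamic sysctl check mentioned in the comments above, a sketch could look like this. The key name "hw.optional.neon" is an assumption on my part (verify it with sysctl on the target device), and neon_from_sysctl is a hypothetical name. Outside Apple platforms the function only returns "unknown".

```c
#include <stddef.h>

#if defined(__APPLE__)
#include <sys/types.h>
#include <sys/sysctl.h>  /* sysctlbyname */
#endif

/* Hypothetical helper.
 * Returns 1 if the OS reports NEON, 0 if it reports it as absent,
 * -1 if the query is unavailable or fails. */
int neon_from_sysctl(void)
{
#if defined(__APPLE__)
    int value = 0;
    size_t size = sizeof(value);
    /* Assumed key name; not confirmed by the thread above. */
    if (sysctlbyname("hw.optional.neon", &value, &size, NULL, 0) != 0)
        return -1;
    return value != 0;
#else
    return -1;
#endif
}
```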
L
2

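In C/C++ (AArch64 only):

#include <stdint.h>

// check_advsimd is just an illustrative wrapper name.
void check_advsimd(void) {
    uint64_t id_aa64pfr0_el1;
    // Read AArch64 Processor Feature Register 0.
    __asm__ ("mrs %0, ID_AA64PFR0_EL1" : "=r" (id_aa64pfr0_el1));
    // AdvSIMD is the 4-bit field at bits [23:20].
    const uint8_t AdvSIMD = (id_aa64pfr0_el1 >> 20) & 0xF;
    if (AdvSIMD == 0xF) {
        // NEON is not present, and neither is Float16
        // (bit 0 is set in 0xF, but it does not mean Float16 here)
    } else {
        // NEON is present
        if (AdvSIMD & 1) {
            // Float16 is also available
        } else {
            // Float16 is missing
        }
    }
}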

You can also cache the result:

#include <stdbool.h>
#include <stdint.h>

bool is_neon_available(void) {
    static bool is_cached = false;
    static bool cached;
    if (is_cached) {
        return cached;
    }
    uint64_t id_aa64pfr0_el1;
    __asm__ ("mrs %0, ID_AA64PFR0_EL1" : "=r" (id_aa64pfr0_el1));
    // AdvSIMD field (bits [23:20]) is 0xF when NEON is not implemented.
    cached = (0xF != ((id_aa64pfr0_el1 >> 20) & 0xF));
    is_cached = true;
    return cached;
}

Related documentation: https://developer.arm.com/documentation/ddi0595/2021-12/AArch64-Registers/ID-AA64PFR0-EL1--AArch64-Processor-Feature-Register-0?lang=en
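The field decoding can also be separated from the register read, which makes it testable on any machine. The sketch below assumes the AdvSIMD field encodings from the linked register description (0 for Advanced SIMD, 1 for Advanced SIMD with half-precision arithmetic, 0xF for not implemented). The enum and function names are hypothetical. Unlike the code above, it reports other field values as unknown instead of treating them as "present". As far as I know, a user-space mrs read of this register only works where the kernel emulates it (on Linux that is advertised through HWCAP_CPUID), so treat the read itself as platform dependent.

```c
#include <stdint.h>

/* Hypothetical result type for the decoded AdvSIMD field. */
enum advsimd_level {
    ADVSIMD_NONE,     /* field == 0xF: not implemented */
    ADVSIMD_BASE,     /* field == 0x0: Advanced SIMD implemented */
    ADVSIMD_FP16,     /* field == 0x1: also has half-precision arithmetic */
    ADVSIMD_UNKNOWN   /* any other value: not decoded by this sketch */
};

/* Decode bits [23:20] of an already-read ID_AA64PFR0_EL1 value. */
enum advsimd_level decode_advsimd(uint64_t id_aa64pfr0_el1)
{
    switch ((id_aa64pfr0_el1 >> 20) & 0xF) {
    case 0x0: return ADVSIMD_BASE;
    case 0x1: return ADVSIMD_FP16;
    case 0xF: return ADVSIMD_NONE;
    default:  return ADVSIMD_UNKNOWN;
    }
}
```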

Lillylillywhite answered 6/2 at 12:7 Comment(0)
P
1

One reliable way is to check the architectural feature trap register. For example, on the ARM Cortex-A35 you can check the value of the HCPTR register to see whether NEON is implemented (0x000033FF) or not (0x0000BFFF). The register name and the indicating values are platform dependent, so make sure to check the technical reference manual.
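A sketch of how the two HCPTR values quoted above could be told apart: they differ in bits 10, 11 and 15, which I believe are the TCP10, TCP11 and TASE trap bits (an assumption; confirm it in the TRM for your core). The function name is hypothetical and the value has to be read by privileged code. A hypervisor may also set these bits on purpose to trap accesses, so this is only meaningful for a value nothing has modified yet.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions of the trap controls in HCPTR. */
#define HCPTR_TCP10 (UINT32_C(1) << 10)
#define HCPTR_TCP11 (UINT32_C(1) << 11)
#define HCPTR_TASE  (UINT32_C(1) << 15)

/* Hypothetical helper: true if none of the three trap bits is set,
 * which matches the "NEON implemented" value quoted above. */
bool hcptr_suggests_neon(uint32_t hcptr)
{
    return (hcptr & (HCPTR_TCP10 | HCPTR_TCP11 | HCPTR_TASE)) == 0;
}
```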

Palumbo answered 29/10, 2018 at 21:58 Comment(0)
C
0

If /proc/config.gz is present, you can test for NEON support in the kernel using the command:

zcat /proc/config.gz | grep NEON

If the kernel was built with NEON support, the output includes:

CONFIG_NEON=y

To check whether the processor supports the NEON extension, you can issue the command:

cat /proc/cpuinfo | grep neon

If it supports the NEON extension, the output shows neon (AArch64 kernels report the flag as asimd instead), for example:

# cat /proc/cpuinfo | grep neon
Features        : half thumb fastmult vfp edsp thumbee neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm 
Features        : half thumb fastmult vfp edsp thumbee neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm
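If you need the same check from a program rather than from the shell, a sketch of matching a flag in a Features line could look like this. Keep in mind that other comments on this page argue that HWCAPS is preferable to parsing /proc/cpuinfo. The function name features_has_flag is hypothetical, and the sketch only handles a single line that has already been read from the file.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true if `flag` appears as a whole
 * whitespace-separated word after the ':' of a "Features" line. */
bool features_has_flag(const char *line, const char *flag)
{
    const char *p = strchr(line, ':');
    size_t flag_len = strlen(flag);

    if (p == NULL || flag_len == 0)
        return false;
    p++;  /* skip the ':' */

    while (*p != '\0') {
        p += strspn(p, " \t\n");                 /* skip separators */
        size_t word_len = strcspn(p, " \t\n");   /* length of next word */
        if (word_len == flag_len && strncmp(p, flag, flag_len) == 0)
            return true;
        p += word_len;
    }
    return false;
}
```

Matching whole words avoids false positives such as finding vfp inside vfpv3, which a plain substring search would report.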

See Arm's "Learn the architecture - Neon programmers' guide".

Committee answered 30/8, 2023 at 10:45 Comment(0)
