ARM Cortex-A8: How to make use of both NEON and vfpv3

I'm using a Cortex-A8 processor and I don't understand how to use the -mfpu flag.

The Cortex-A8 has both VFPv3 and NEON co-processors. Previously I did not know how to use NEON, so I was only using

gcc -marm -mfloat-abi=softfp -mfpu=vfpv3

Now I understand how SIMD processors work and I have written some code using NEON intrinsics. To use the NEON co-processor, my -mfpu flag has to change to -mfpu=neon, so my compiler command line now looks like this

gcc -marm -mfloat-abi=softfp -mfpu=neon

Does this mean that VFPv3 is no longer used? I have a lot of code that does not use NEON; do those parts no longer make use of VFPv3?

If both neon and vfpv3 are still used then I have no issues, but if only one of them is used how can I make use of both?

Outhaul answered 18/11, 2010 at 10:2 Comment(0)

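Selecting NEON implies traditional VFP support as well. VFP is still used for "normal" (non-vector) floating-point calculations. Also, NEON on the Cortex-A8 does not support double-precision floating point, so only VFP instructions can be used for that.
To check, add -S to gcc's command line and inspect the generated assembly. Instructions starting with V (e.g. vld1.32, vmla.f32) are NEON instructions, and those starting with F (e.g. fldd, fmacd) are VFP. (ARM's documentation now prefers the V prefix even for VFP instructions, but GCC releases from this period do not follow it. If your toolchain does emit the V-prefixed form for VFP, the prefix alone no longer tells the two apart, so look at the operands instead: instructions on s registers, or .f64 operations on d registers, are scalar VFP.)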
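
As a rough illustration of the check described above, here is a minimal sketch of a single C file that mixes plain scalar float code with NEON intrinsics. The function names (scale_and_sum, add_f32) and the file name example.c are made up for this example, and the scalar fallback exists only so the file also builds on targets without NEON. It is not taken from the asker's code.

```c
#include <stddef.h>

/* Assumption: GCC defines __ARM_NEON__ (and newer releases __ARM_NEON)
   when NEON is enabled, e.g. via -mfpu=neon. */
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON 1
#else
#define HAVE_NEON 0
#endif

/* Plain scalar single-precision code. Even when built with -mfpu=neon,
   the compiler is free to emit scalar VFP instructions for this loop. */
float scale_and_sum(const float *x, size_t n, float k)
{
    float acc = 0.0f;
    for (size_t i = 0; i < n; ++i)
        acc += x[i] * k;
    return acc;
}

/* Element-wise add. With NEON available, four floats are processed per
   iteration through intrinsics; leftover elements (and non-NEON builds)
   go through the scalar tail loop. */
void add_f32(float *dst, const float *a, const float *b, size_t n)
{
    size_t i = 0;
#if HAVE_NEON
    for (; i + 4 <= n; i += 4) {
        float32x4_t va = vld1q_f32(a + i);   /* load 4 floats from a */
        float32x4_t vb = vld1q_f32(b + i);   /* load 4 floats from b */
        vst1q_f32(dst + i, vaddq_f32(va, vb));
    }
#endif
    for (; i < n; ++i)
        dst[i] = a[i] + b[i];
}
```

Compiling this with something like gcc -marm -mfloat-abi=softfp -mfpu=neon -O2 -S example.c and reading example.s should show NEON loads and adds on q registers inside add_f32, and scalar floating-point instructions on s registers inside scale_and_sum. The exact mnemonics depend on the GCC version.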

Hoch answered 18/11, 2010 at 10:48 Comment(5)
Igor, I have only single-precision floating-point values. I do see a lot of F-instructions (fadds, fsitos), so I think the compiler is still issuing VFP instructions. - Outhaul
An unrelated question: what does "dual-issue processor" mean? The Cortex-A8 is described as dual-issue. Can you point me to any links? My own search was not very productive. - Outhaul
Vikram, the Cortex-A8 is a dual-issue, in-order superscalar processor. Dual issue means it can decode and issue up to two instructions per clock, in program order, when the pair has no dependency or resource conflict. In other words, under the best conditions it can sustain two instructions per clock. Best conditions occur only when there are no cache misses, branch mispredictions, etc. - Harts
Thank you, Jan. I found a wiki page on the TI website (processors.wiki.ti.com/index.php/Cortex-A8_Features). It is a very good link. - Outhaul
I was going to ask the same for the SABRE Lite, a Cortex-A9 board. I guess -mfpu=neon is the way to go, as it lets both co-processors be used. Thank you. - Lennie
