Common SIMD techniques

2

17

Where can I find information about common SIMD tricks? I have an instruction set reference and know how to write straightforward SIMD code, but I know that SIMD is now much more powerful: it can express complex conditional logic without branches.
For example (ARMv6), the following sequence of instructions sets each byte of Rd equal to the unsigned minimum of the corresponding bytes of Ra and Rb:

USUB8 Rd, Ra, Rb
SEL Rd, Rb, Ra
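
For readers without an ARMv6 target at hand, the behaviour of that instruction pair can be sketched in portable C. This is only an emulation under stated assumptions: the function name `min_u8x4` is made up for illustration, and the loop models the four per-byte GE flags that USUB8 sets and SEL consumes; it is not expected to compile down to those two instructions.

```c
#include <stdint.h>

/* Hypothetical helper: emulates "USUB8 Rd, Ra, Rb" followed by
 * "SEL Rd, Rb, Ra" on four unsigned bytes packed into a 32-bit word. */
static uint32_t min_u8x4(uint32_t a, uint32_t b)
{
    uint32_t result = 0;

    for (int lane = 0; lane < 4; ++lane) {
        uint32_t shift = 8u * (uint32_t)lane;
        uint32_t xa = (a >> shift) & 0xFFu;
        uint32_t xb = (b >> shift) & 0xFFu;

        /* USUB8 sets GE[lane] when xa - xb does not borrow, i.e. xa >= xb. */
        uint32_t ge = (xa >= xb) ? 1u : 0u;

        /* Turn the flag into an all-ones or all-zeros mask (no branch needed). */
        uint32_t mask = 0u - ge;

        /* SEL Rd, Rb, Ra: take the byte of b where GE is set, else the byte of a. */
        uint32_t selected = (xb & mask) | (xa & ~mask);

        result |= selected << shift;
    }
    return result;
}
```

With these definitions, `min_u8x4(0x01FF8000, 0x0200807F)` evaluates to `0x01008000`: each byte of the result is the smaller of the two corresponding input bytes.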

Links to tutorials or to less common SIMD techniques are welcome too :) ARMv6 interests me most, but x86 (SSE, ...), NEON (in ARMv7), and others are good too.

Momus answered 28/1, 2010 at 17:4 Comment(0)
14

One of the best SIMD resources ever was the old AltiVec mailing list. Although it was PowerPC/AltiVec-specific, I suspect that a lot of the material on the list would be of general interest to anyone working with other SIMD architectures. Sadly, the list now seems to be defunct after being moved to a forum on power.org, but you may be able to find archived versions of it. (If not, let me know: I have pretty much all the posts from 2000 to 2007.)

There is also a lot of potentially useful info on AltiVec, SSE, SIMD vectorization and performance in general at developer.apple.com/hardwaredrivers/ve, a good deal of which may be transferable to other SIMD architectures.

Lindell answered 28/1, 2010 at 17:19 Comment(3)
Agreed on the AltiVec page: some of the sections on "writing AltiVec code" are quite helpful on general vector programming best practices. - Frontal
Unfortunately, that page now redirects to the Apple Advanced Computation Group. The page is still in the Google cache: 209.85.229.132/search?q=cache:eHR6ni6SROoJ:developer.apple.com/… and the subpages don't redirect, so the content is still available, though I've archived it Just In Case(TM). - Participial
@Pierre: that's a real shame. I guess Apple feels that no one cares about AltiVec any more. Unfortunately, the old AltiVec mailing list also seems to have disappeared; apparently it was moved to power.org, but I can't find it now. Fortunately, I have most of it archived. - Lindell
7

Try AMD's SSEPlus project on SourceForge.

Incomprehensible answered 28/5, 2010 at 9:7 Comment(0)
