Do all 64-bit Intel architectures support SSSE3/SSE4.1/SSE4.2 instructions?

I searched the web and the Intel software manual, but I am unable to confirm whether all Intel 64 architectures support up to SSSE3, up to SSE4.1, up to SSE4.2, AVX, etc. I want to know this so that I can target the minimum SIMD instruction set supported in my program. Please help.

Archil answered 28/1, 2015 at 6:14 Comment(2)
I think that's probably correct (for SSSE3 only), but watch out for older AMD64 CPUs, which often don't have SSSE3. - Puberulent
Intel first-gen Core 2 (Merom/Conroe, from 2006) has SSSE3. AMD Phenom II (K10) is the most recent microarchitecture that lacks SSSE3 (it does have SSE3). If you're doing runtime CPU detection to enable vectorized functions, you might not bother hand-writing an SSE2 version with intrinsics, and maybe only write an SSSE3 and an AVX version, for example. If anything in SSE4.1 helps a lot for your code (e.g. 32-bit integer stuff), you might also make a version for SSE4.1 without AVX, for Penryn/Nehalem/Silvermont and for crippled Pentium / Celeron SnB-family CPUs (which have AVX disabled). - Constantia

An x64-native (AMD64 or Intel 64) processor is only required to support SSE and SSE2.

SSE3 is supported by Intel Pentium 4 processors (“Prescott”), AMD Athlon 64 (“revision E”), AMD Phenom, and later processors. This means most, but not quite all, x64 capable CPUs should support SSE3.

Supplemental SSE3 (SSSE3) is supported by Intel Core 2 Duo, Intel Core i7/i5/i3, Intel Atom, AMD Bulldozer, AMD Bobcat, and later processors.

SSE4.1 is supported on Intel Core 2 (“Penryn”), Intel Core i7 (“Nehalem”), Intel Atom (Silvermont core), AMD Bulldozer, AMD Jaguar, and later processors.

SSE4.2 (together with SSE4.1) is supported on Intel Core i7 (“Nehalem”), Intel Atom (Silvermont core), AMD Bulldozer, AMD Jaguar, and later processors.

AVX is supported by Intel “Sandy Bridge”, AMD Bulldozer, AMD Jaguar, and later processors.

See this blog series.
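
Because only SSE and SSE2 are guaranteed on x64, anything newer has to be checked at run time. Here is a minimal sketch, assuming GCC or Clang on x86: the __builtin_cpu_init / __builtin_cpu_supports builtins are specific to those compilers, and MSVC would need the __cpuid intrinsic instead (not shown). The function name best_simd_level is made up for illustration.

```c
#include <stddef.h>

/* Returns the name of the highest SIMD level (of those listed here)
 * that the running CPU reports. best_simd_level is a hypothetical
 * helper name, not part of any library. */
const char *best_simd_level(void) {
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
    __builtin_cpu_init();  /* GCC/Clang builtin: populate the CPU feature data */
    /* __builtin_cpu_supports needs a string literal, hence the chain. */
    if (__builtin_cpu_supports("avx2"))   return "avx2";
    if (__builtin_cpu_supports("avx"))    return "avx";
    if (__builtin_cpu_supports("sse4.2")) return "sse4.2";
    if (__builtin_cpu_supports("sse4.1")) return "sse4.1";
    if (__builtin_cpu_supports("ssse3"))  return "ssse3";
    if (__builtin_cpu_supports("sse3"))   return "sse3";
    if (__builtin_cpu_supports("sse2"))   return "sse2";
    return "baseline";
#else
    /* Other compilers or architectures: this sketch cannot tell. */
    return "unknown";
#endif
}
```

A caller could print the result once at startup, or use it to decide which of several hand-written code paths to enable.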

A CPU with x64 native support but no SSE3 is going to be a 'first-generation' 64-bit part, which Windows 8.1 x64 does not support because it requires CMPXCHG16b, PrefetchW, and LAHF/SAHF; so in practice SSE3 is highly likely on newer machines. Requiring SSSE3 or later is more restrictive, depending on exactly who you are targeting. For example, the Valve Hardware Survey puts SSE4.1 at 77% and SSE4.2 at 72% (anything from AMD or Intel with SSE4.1 will also have SSE3 and SSSE3).

UPDATE: Per the comment below, SSE3 support among PC gamers in the Valve survey is now 100%. SSSE3, SSE4.1, and SSE4.2 are all in the 97-98% range. AVX is around 92%, and the current-generation gaming consoles from Sony and Microsoft support up through AVX. The primary value of AVX is that you can use the MSVC /arch:AVX switch, which lets all SSE code generation use the three-operand VEX encoding and so makes register scheduling more efficient. See this blog post.

AVX2 is approaching 75% which is really good, but still potentially a blocker for a game to rely on without a fallback path. AVX2 is supported by Intel “Haswell”, AMD Excavator, and later processors. See this blog post.
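
The fallback path mentioned above can be sketched as a function pointer chosen once at run time. This is only an illustration under assumptions: it relies on GCC/Clang extensions (the target("avx2") function attribute and __builtin_cpu_supports), and the names add_i32_scalar, add_i32_avx2, and select_add_i32 are hypothetical. The target attribute lets one function use AVX2 intrinsics while the rest of the program is still compiled for baseline x64.

```c
#include <stddef.h>
#include <stdint.h>

#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
#include <immintrin.h>
#define HAVE_X86_DISPATCH 1
#endif

/* Portable fallback: runs on any CPU. */
void add_i32_scalar(int32_t *dst, const int32_t *a, const int32_t *b, size_t n) {
    for (size_t i = 0; i < n; i++) dst[i] = a[i] + b[i];
}

#ifdef HAVE_X86_DISPATCH
/* AVX2 path: only this function is compiled with AVX2 enabled. */
__attribute__((target("avx2")))
void add_i32_avx2(int32_t *dst, const int32_t *a, const int32_t *b, size_t n) {
    size_t i = 0;
    for (; i + 8 <= n; i += 8) {  /* 8 x 32-bit lanes per 256-bit vector */
        __m256i va = _mm256_loadu_si256((const __m256i *)(a + i));
        __m256i vb = _mm256_loadu_si256((const __m256i *)(b + i));
        _mm256_storeu_si256((__m256i *)(dst + i), _mm256_add_epi32(va, vb));
    }
    for (; i < n; i++) dst[i] = a[i] + b[i];  /* scalar tail */
}
#endif

typedef void (*add_i32_fn)(int32_t *, const int32_t *, const int32_t *, size_t);

/* Pick the AVX2 version only if the running CPU reports AVX2. */
add_i32_fn select_add_i32(void) {
#ifdef HAVE_X86_DISPATCH
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2")) return add_i32_avx2;
#endif
    return add_i32_scalar;
}
```

In a real program you would call select_add_i32() once, keep the returned pointer, and call through it in the hot loop, so a machine without AVX2 never executes an AVX2 instruction.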

Windows on ARM: Note that the x86 emulation for Windows on ARM64 only supports up to SSE4.1, and the x64 emulation in Windows 11 only supports up to SSE4.2. AVX/AVX2 is not supported on these platforms.

Tobi answered 28/1, 2015 at 7:13 Comment(4)
Celeron and Pentium processors from Sandy Bridge and Haswell don't support AVX (or AVX2). I don't think the Atom processors support AVX either. - Handbook
This answer is about 5 years old, to date. Current Steam Hardware Survey lists SSE3 at 100%, Supplemental SSE3 at 98.61%, SSE4.1 at 97.94% and SSE4.2 at 97.25%. - Dry
Note that brand-new Pentium and Celeron CPUs unfortunately still don't support AVX (or BMI1 / BMI2, because(?) those also include VEX-coded instructions), so that 8% isn't just old machines. For example, the Pentium® Gold G6600 (launched Q2 2020, Comet Lake uarch) only has up to SSE4.2. Some budget gamers may buy a very cheap CPU and spend more on a GPU. - Constantia
The other part of that SSE4.2-without-AVX group is crusty old Nehalem machines, and maybe some low-power Silvermont-family netbooks or micro-desktops. TL;DR: requiring AVX can exclude some brand-new machines, which will make some people unhappy because they probably aren't CPU-architecture experts and will blame you, instead of Intel for selling them a crippled CPU. - Constantia

I have been trying to figure this out because I failed to compile third-party software that uses SSE. I found that this might be helpful:

cat /proc/cpuinfo

Then pay attention to the flags line:

flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm cpuid_fault epb invpcid_single pti intel_ppin ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm xsaveopt cqm_llc cqm_occup_llc dtherm ida arat pln pts md_clear flush_l1d

I can see:

sse4_1 sse4_2

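If you are trying to write some code to detect this automatically, the following might be useful. Match whole flag names: Linux lists SSE3 as pni, and a substring match on "sse" would also hit ssse3 and report it as sse3.

grep -m1 '^flags' /proc/cpuinfo | sed 's/.*: //' | tr ' ' '\n' | grep -E '^(pni|s?sse[0-9_]*)$'
sse
sse2
pni
ssse3
sse4_1
sse4_2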
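
If the check has to live inside a program rather than a shell pipeline, a whole-token match on the flags line avoids confusing ssse3 with sse3. This is a rough sketch that assumes a Linux /proc/cpuinfo layout like the one shown above; flags_contain and read_cpu_flags are hypothetical helper names.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* True if `name` appears as a whole whitespace-separated token in `flags`. */
bool flags_contain(const char *flags, const char *name) {
    size_t len = strlen(name);
    if (len == 0) return false;
    const char *p = flags;
    while ((p = strstr(p, name)) != NULL) {
        bool starts = (p == flags) || p[-1] == ' ' || p[-1] == '\t';
        char end = p[len];
        bool ends = end == '\0' || end == ' ' || end == '\t' || end == '\n';
        if (starts && ends) return true;
        p += 1;  /* partial hit (e.g. "sse3" inside "ssse3"): keep scanning */
    }
    return false;
}

/* Copies the first line of /proc/cpuinfo that begins with "flags" into buf.
 * Returns false if the file or the line is missing. A line longer than
 * `size` is truncated, so pass a generous buffer (a few KiB). */
bool read_cpu_flags(char *buf, size_t size) {
    FILE *f = fopen("/proc/cpuinfo", "r");
    if (f == NULL) return false;
    bool found = false;
    while (fgets(buf, (int)size, f) != NULL) {
        if (strncmp(buf, "flags", 5) == 0) { found = true; break; }
    }
    fclose(f);
    return found;
}
```

Typical use would be to call read_cpu_flags with a buffer of around 4096 bytes and then test flags_contain(buf, "sse4_2"), remembering that SSE3 shows up as pni.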
Artisan answered 29/6, 2021 at 0:19 Comment(0)
