What the question asks for is impossible as formulated, for the reasons listed below. However, by generalizing the idea it may become something worth including in a future revision of the language.
The reasons why it will not work:
1. `asm` blocks are opaque to the C++ compiler, and their syntax is compiler-specific. I do not think MSVC accepts clobber lists the way GCC and the Intel compiler do. Moreover, Microsoft's x86-64 compilers stopped supporting inline assembly blocks altogether and force people to use intrinsics instead. Incidentally, perhaps relying on the presence of intrinsic functions could be used to offer compile-time CPU dispatching instead? That idea could be worth exploring (see the sketch after this list).
2. `asm` blocks are target architecture-specific. There are other ways to detect the target architecture at compile time.
3. The very notion of an instruction being present or absent is vague. Which entity is authorized to decide for any given `asm` expression: the assembler program that translates its text into machine code, or the target processor that runs the actual code? Both choices are problematic.
   - As an example, "MOV" is a popular mnemonic across a multitude of architectures. But is it the same instruction in all cases? The semantics bound to the mnemonic are unlikely to match between unrelated architectures.
   - Merely assembling successfully does not mean the code will execute fine. For example, on the Intel 64 architecture an instruction may fault with #UD (invalid opcode exception) even if it is correctly encoded, because its behavior depends on the runtime values of the CR0 and CR4 registers, which are controlled by the operating system. An assembler will process it just fine in any case; one has to actually run the code. But what if we cross-compile and cannot run it, because the target processor does not match the host processor?
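A minimal sketch of the intrinsics-plus-macros route from reason 1, assuming GCC or Clang, which define `__SSE4_2__` when SSE4.2 code generation is enabled; the `crc32_step` wrapper name and the scalar fallback are illustrative, not an established API:

```cpp
#include <cstdint>

#if defined(__SSE4_2__)   // GCC/Clang define this when SSE4.2 codegen is enabled
  #include <nmmintrin.h>
  // Hardware CRC32C via the SSE4.2 intrinsic.
  inline std::uint32_t crc32_step(std::uint32_t crc, std::uint32_t data) {
      return _mm_crc32_u32(crc, data);
  }
#else
  // Portable fallback: bitwise CRC32C (reflected Castagnoli polynomial 0x82F63B78).
  inline std::uint32_t crc32_step(std::uint32_t crc, std::uint32_t data) {
      crc ^= data;
      for (int i = 0; i < 32; ++i)
          crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
      return crc;
  }
#endif
```

Note that MSVC exposes the intrinsic but does not define `__SSE4_2__`, so a different feature test is needed there.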
As it is, there is no way to know the outcome of an opaque block without executing it first. So the compiler would have to call an arbitrary external program and use its return value for template expansion. Such a program could do processor- or instruction-sensing and report its findings to guide the rest of compilation.
Now this looks abstract enough to be a language feature, since we make no assumptions about the nature of such an external program. There are still portability (plus cross-compilation) issues and security issues (running an external program is risky). All in all, it looks better to me to rely on the existing macro definitions that the compiler gets from its environment.
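To make the external-program idea concrete without a language change, the same effect is often approximated at build time. A minimal sketch, assuming GCC or Clang (for `__builtin_cpu_supports`) and assuming the build host is the machine the code will run on, which is exactly the cross-compilation caveat above; the `HAVE_SSE42` macro is a made-up name:

```cpp
// probe.cpp -- tiny host-sensing program run by the build system.
// Prints 1 if the CPU it runs on supports SSE4.2, 0 otherwise.
#include <cstdio>

int main() {
    std::printf("%d\n", __builtin_cpu_supports("sse4.2") ? 1 : 0);
}
```

The build system then feeds the answer back into the compilation, e.g. `g++ -DHAVE_SSE42=$(./probe) main.cpp`, and the C++ code branches on `#if HAVE_SSE42`; morally this is the same as the predefined macros, only driven by the actual host CPU rather than by `-march` flags.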
Comments:
- … `__SSE4_2__`? – Brierwood
- An `asm` block is basically saying to the C++ front-end "ignore this bit, it goes directly to the backend". SFINAE is fully in the front-end, and eliminated code doesn't make it to the back-end. – Dvandva
- … `__SSE4_2__`) but this only parrots back at you what you told your build system in the first place. Fun. – Ailanthus
- … `cmov` and 686 features. But if you care about your code running on ancient CPUs, CPUID will fault there. See wiki.osdev.org/CPUID#Checking_CPUID_availability for a detection sequence, and the notes in sandpile.org/x86/cpuid.htm. Basically, checking which bits in FLAGS stay set after writing can detect 386 vs. 486 vs. 586, and specifically support for CPUID, which appeared in 486-SL and Pentium. – Forde
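A minimal sketch of that CPUID-availability check, assuming GCC/Clang inline assembly; only the 32-bit x86 branch performs the EFLAGS test, since CPUID is architecturally always present in 64-bit mode, and the function name is illustrative:

```cpp
// Sketch: detect whether the CPUID instruction exists by testing whether
// the ID bit (bit 21) of EFLAGS can be toggled (the osdev.org sequence).
inline bool cpuid_available() {
#if defined(__x86_64__)
    return true;                        // CPUID always exists in 64-bit mode
#elif defined(__i386__)
    unsigned int toggled, original;
    asm volatile(
        "pushfl\n\t"                    // save the original EFLAGS
        "pushfl\n\t"
        "popl   %0\n\t"                 // %0 = current EFLAGS
        "movl   %0, %1\n\t"             // %1 = unmodified copy
        "xorl   $0x200000, %0\n\t"      // flip the ID bit
        "pushl  %0\n\t"
        "popfl\n\t"                     // try to load the modified EFLAGS
        "pushfl\n\t"
        "popl   %0\n\t"                 // %0 = EFLAGS after the attempt
        "popfl"                         // restore the saved EFLAGS
        : "=&r"(toggled), "=&r"(original)
        :
        : "cc");
    return ((toggled ^ original) & 0x200000) != 0;  // bit changed: CPUID exists
#else
    return false;                       // not an x86 target
#endif
}
```

Only after this returns true is it safe to execute CPUID itself (or helpers such as `__builtin_cpu_supports`) to query individual feature bits.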