(This question was originally about the CVTSI2SD
instruction and the fact that I thought it didn't work on the Pentium M CPU, but in fact it's because I'm using a custom OS and I need to manually enable SSE.)
I have a Pentium M CPU and a custom OS which so far used no SSE instructions, but I now need to use them.
Trying to execute any SSE instruction results in an interruption 6, illegal opcode (which in Linux would cause a SIGILL
, but this isn't Linux), also referred to in the Intel architectures software developer's manual (which I refer from now on as IASDM) as #UD - Invalid Opcode (UnDefined Opcode).
Edit: Peter Cordes actually identified the right cause, and pointed me to the solution, which I resume below:
If you're running an ancient OS that doesn't support saving XMM regs on context switches, the SSE-enabling bit in one of the machine control registers won't be set.
Indeed, the IASDM mentions this:
If an operating system did not provide adequate system level support for SSE, executing an SSE or SSE2 instructions can also generate #UD.
Peter Cordes pointed me to the SSE OSDev wiki, which describes how to enable SSE by writing to both CR0
and CR4
control registers:
clear the CR0.EM bit (bit 2) [ CR0 &= ~(1 << 2) ]
set the CR0.MP bit (bit 1) [ CR0 |= (1 << 1) ]
set the CR4.OSFXSR bit (bit 9) [ CR4 |= (1 << 9) ]
set the CR4.OSXMMEXCPT bit (bit 10) [ CR4 |= (1 << 10) ]
Note that, in order to be able to write to these registers, if you are in protected mode, then you need to be in privilege level 0. The answer to this question explains how to test it: if in protected mode, that is, when bit 0 (PE
) in CR0
is set to 1, then you can test bits 0 and 1 from the CS
selector, which should be both 0.
Finally, the custom OS must properly handle XMM registers during context switches, by saving and restoring them when necessary.
CVTSI2SD—Convert Dword Integer to Scalar Double-Precision FP Value
belongs to the SSE2 instruction set, and this is confirmed in the Intel Software Developer Manuals. – TransgressSIGILL
("illegal instruction") or something else ? – Galloromance(gdb) disas /r
at the crash site? – Transgresseax
after executingmov eax, 1 / cpuid
? – Trousseau0010 0111 1110 1001 1111 1011 1011 1111
in binary. – Aurelia