There is no way to use Amd64 instructions (Long mode) in 32-bit general-purpose OS (without kernel modification/special drivers/hypervisor).
This is because:
1) to use native 64-bit instructions you need to switch into long mode. This is privileged action. 32-bit OS kernel can't continue to work if the CPU is switched into 64-bit mode, so you should switch back before entering a kernel
2) But kernel are often called asynchronously, for timer (scheduler) and other hardware interrupts (drivers). It will not save 64-bit registers nor change mode from long into protected.
May be it is possible to write special driver, which will do the 64-bit tasks on 32-bit OS, but such driver is more like 64-bit kernel and dynamic patcher of kernel. I know no one such solution.
You can only use MMX, SSE, SSE2, SSE3, AVX for accessing 64-bit ALU and registers of your CPU when running in 32-bit OS.
I can say, that Linux, some BSD, Mac OS X have a mode, when 64-bit kernel is used, but user-space software is 32-bit. In such case it will be possible to run both 32-bit and 64-bit applications, because kernel knows about 64-bit mode and can access 64-bit registers to do a task switch. As far as I know, MS Windows have not such mode itself (the W7 emulates 32bit mode, but this is called my MS as simulator so I assume it is not an in-kernel feature).
Other possibility (this is better, is your CPU has support of hardware virtualization), is to use 64-bit hypervisor (VMware/Xen, other overpriced solutions) with 32-bit and 64-bit guest OSes. VirtualBox is other option of using hypervisor, and it is free to use.