I am studying RDTSC and learning how it is virtualized for hypervisors like VirtualBox and VMware. Why did Intel/AMD go to all the trouble of adding hardware virtualization support for this instruction?
I feel like it could easily be emulated with a trap, and it's not exactly a super-common instruction (I tested, and there's no noticeable slow-down for general usage in a virtual machine where hardware RDTSC virtualization is disabled).
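To put rough numbers on that intuition, here is a back-of-the-envelope sketch. The trap cost, clock speed, and call rates are assumed figures for illustration, not measurements, and `trap_overhead_fraction` is just a name I made up:

```python
def trap_overhead_fraction(calls_per_sec: float, trap_cycles: float, cpu_hz: float) -> float:
    """Fraction of one core's cycles spent handling trapped RDTSC calls."""
    return calls_per_sec * trap_cycles / cpu_hz


# Assumed figures for illustration only, not measurements.
CPU_HZ = 3.0e9        # assumed 3 GHz core
TRAP_CYCLES = 2000.0  # assumed round-trip cost of one trap into the hypervisor

for rate in (1e3, 1e5, 1e6):
    frac = trap_overhead_fraction(rate, TRAP_CYCLES, CPU_HZ)
    print(f"{rate:>9.0f} RDTSC/s -> {frac:.4%} of one core")
```

Under these assumed numbers, a thousand calls per second costs well under 0.1% of a core, which fits with my not noticing any slow-down; the cost would only become significant for code calling RDTSC hundreds of thousands of times per second or more.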
However, I know Intel/AMD wouldn't have gone to all the trouble of adding support for this instruction to the virtualization hardware unless it was important for it to execute very fast.
Does anyone know why?