Although this doesn't answer the question, this is related to removal of segment limit checks in 64 bit mode supposedly making protection of hypervisor trap handlers 'impossible' without hardware virtualisation, which is a discussion people may expect to see seeing this question title. 'breaking some existing implementations' I agree with, but not 'impossible'.
The initial version of x86-64 (AMD64) did not allow for a software-only full virtualization due to the lack of segmentation support in long mode, which made the protection of the hypervisor's memory impossible, in particular, the protection of the trap handler that runs in the guest kernel address space.
I see this all over but I'm not convinced. You don't need segmentation to protect hypervisor trap handlers or the IDT. You can do it with paging.
Make a certain virtual address range in the SPT always map the hypervisor trap handlers, and the IDT at the virtual address it is at on the host. The IDT requires 1 4KiB page. These pages in the SPT are made supervisor, meaning the guest kernel in ring 1 cannot write to them because it will cause a trap right into the IDT which is mapped into the guest. Now ring 0, the code can be executed. When the guest does read/write to those reserved virtual address ranges, it needs to be unaware that it is reserved, i.e. the hypervisor driver silently redirects the accesses with certain CR3s to a clean page. A read/write from a ring 1 guest kernel will cause a GPF, and the CR2 will contain the address that was attempted to be written to. Usually, the page is faulted in and then the RIP pushed is of the instruction to be reattempted, but it can't do this in this case. It needs to decode and perform the read/write itself at the RIP in the trap frame to a new host physical page by translating it using a special internal table purely for those reserved regions built when the guest modifies the PTEs, and then increase the RIP.