In https://github.com/qemu/qemu/blob/stable-4.2/cpus.c#L1290 lies a very important piece of Qemu. I guess it's the event loop for a CPU on KVM.
Here is the code:
static void *qemu_kvm_cpu_thread_fn(void *arg)
{
CPUState *cpu = arg;
int r;
rcu_register_thread();
qemu_mutex_lock_iothread();
qemu_thread_get_self(cpu->thread);
cpu->thread_id = qemu_get_thread_id();
cpu->can_do_io = 1;
current_cpu = cpu;
r = kvm_init_vcpu(cpu);
if (r < 0) {
error_report("kvm_init_vcpu failed: %s", strerror(-r));
exit(1);
}
kvm_init_cpu_signals(cpu);
/* signal CPU creation */
cpu->created = true;
qemu_cond_signal(&qemu_cpu_cond);
qemu_guest_random_seed_thread_part2(cpu->random_seed);
do {
if (cpu_can_run(cpu)) {
r = kvm_cpu_exec(cpu);
if (r == EXCP_DEBUG) {
cpu_handle_guest_debug(cpu);
}
}
qemu_wait_io_event(cpu);
} while (!cpu->unplug || cpu_can_run(cpu));
qemu_kvm_destroy_vcpu(cpu);
cpu->created = false;
qemu_cond_signal(&qemu_cpu_cond);
qemu_mutex_unlock_iothread();
rcu_unregister_thread();
return NULL;
}
I'm interested on the do
loop. It calls kvm_cpu_exec
in a loop, which is defined here: https://github.com/qemu/qemu/blob/stable-4.2/accel/kvm/kvm-all.c#L2285
At one point of kvm_cpu_exec
, it calls run_ret = kvm_vcpu_ioctl(cpu, KVM_RUN, 0);
, which calls the KVM_RUN
ioctl documented here: https://www.kernel.org/doc/Documentation/virtual/kvm/api.txt
4.10 KVM_RUN
Capability: basic
Architectures: all
Type: vcpu ioctl
Parameters: none
Returns: 0 on success, -1 on error
Errors:
EINTR: an unmasked signal is pendingThis ioctl is used to run a guest virtual cpu. While there are no
explicit parameters, there is an implicit parameter block that can be
obtained by mmap()ing the vcpu fd at offset 0, with the size given by
KVM_GET_VCPU_MMAP_SIZE. The parameter block is formatted as a 'struct
kvm_run' (see below).
I am struggling to understand if this ioctl blocks execution? In which cases it returns?
I would like to have a little context of what is happening. Given the line qemu_wait_io_event(cpu)
, it looks like at least, the ioctl would return every time an event was to be read/written to/from the CPU. I don't know, I'm confused.