I cannot add directly to the excellent answers given by David, templatetypedef etc. - if you want to avoid inter-thread comms latency and resource waste, don't do inter-thread comms with sleep() loops.
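As a rough illustration of what "not a sleep() loop" can look like, here is a minimal sketch in Python (my choice of language, not something the question specifies). The names `consumer` and `run_demo` are made up for the example. The consumer blocks inside `queue.Queue.get()` until the producer puts something, so it is made ready as soon as data arrives instead of waking up periodically to poll.

```python
import queue
import threading


def consumer(inbox: queue.Queue, results: list) -> None:
    """Process items as they arrive; blocks in get() instead of polling."""
    while True:
        item = inbox.get()        # thread waits here, using no CPU
        if item is None:          # sentinel (an assumption of this sketch): no more work
            break
        results.append(item * 2)


def run_demo() -> list:
    inbox: queue.Queue = queue.Queue()
    results: list = []
    worker = threading.Thread(target=consumer, args=(inbox, results))
    worker.start()
    for n in (1, 2, 3):
        inbox.put(n)              # signals the waiting consumer
    inbox.put(None)               # tell the consumer to finish
    worker.join()
    return results


if __name__ == "__main__":
    print(run_demo())             # [2, 4, 6]
```

The same shape works with a condition variable or semaphore; the point is only that the waiting thread is woken by a signal rather than by a timer expiring.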
Preemptive scheduling/dispatching:
At CPU level, interrupts are the key. The OS does nothing until an interrupt occurs that causes its code to be entered. Note that, in OS terms, interrupts come in two flavours: 'real' hardware interrupts that cause drivers to be run, and 'software interrupts', which are OS system calls from already-running threads that can potentially cause the set of running threads to change. Keypresses, mouse movements, network cards, disks and page faults all generate hardware interrupts. The wait and signal functions, and sleep(), belong to the second category, software interrupts. When a hardware interrupt causes a driver to be run, the driver performs whatever hardware management it was designed to do. If the driver needs to signal the OS that some thread needs to be made running (perhaps a disk buffer is now full and needs to be processed), the OS provides an entry mechanism that the driver can call instead of directly performing an interrupt-return itself (important!).
Interrupts like the above examples can make threads that were waiting ready to run and/or can make a thread that is running enter a waiting state. After processing the code of the interrupt, the OS applies its scheduling algorithm(s) to decide if the set of threads that were running before the interrupt is the same as the set that should now be run. If they are the same, the OS just interrupt-returns; if not, the OS must preempt one or more of the running threads. If the OS needs to preempt a thread that is running on a CPU core that is not the one that handled the interrupt, it has to gain control of that CPU core. It does this by means of a 'real' hardware interrupt - the OS inter-processor driver sets a hardware signal that hard-interrupts the core running the thread that is to be preempted.
When a thread that is to be preempted enters the OS code, the OS can save a complete context for the thread. Some of the registers will have already been saved onto the stack of the thread by means of the interrupt entry, so saving the stack pointer of the thread will effectively 'save' all those registers. The OS will normally need to do more, though, e.g. caches may need to be flushed, the FPU state may need to be saved and, in the case where the new thread to be run belongs to a different process than the one to be preempted, memory-management protection registers will need to be swapped out. Usually, the OS switches from the interrupted-thread stack to a private OS stack as soon as possible to avoid inflicting OS stack requirements onto every thread stack.
Once the contexts are saved, the OS can 'swap in' the extended contexts for the new threads that are to be made running. Now, the OS can finally load the stack pointer for each new thread and perform interrupt-returns to make its new ready threads running.
The OS then does nothing at all. The running threads run until another interrupt, (hard or soft), occurs.
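The decision step described above can be sketched as a toy model. This is only an illustration, not how any real kernel is written: `ToyKernel`, its fields, and the "highest priority wins, running threads win ties" rule are all assumptions I have made up for the example. There is no context saving here, only the question "is the set that should run the same as the set that was running?".

```python
from dataclasses import dataclass, field


@dataclass
class ToyKernel:
    cores: int
    ready: dict = field(default_factory=dict)        # thread name -> priority
    running: set = field(default_factory=set)        # threads currently on a core
    preempted_log: list = field(default_factory=list)

    def _pick(self) -> set:
        # Highest priority first; on ties, prefer threads that are already
        # running so that we do not switch needlessly.
        ranked = sorted(
            self.ready,
            key=lambda t: (-self.ready[t], t not in self.running, t),
        )
        return set(ranked[: self.cores])

    def interrupt(self, make_ready=None, make_waiting=None) -> bool:
        """Handle one interrupt (hard or soft).

        Returns True if the set of running threads changed.
        """
        for name, prio in (make_ready or {}).items():
            self.ready[name] = prio                   # e.g. a driver signalled a thread
        for name in (make_waiting or ()):
            self.ready.pop(name, None)                # e.g. a thread called wait/sleep

        should_run = self._pick()
        if should_run == self.running:
            return False                              # plain interrupt-return
        # Threads that are still ready but lost their core were preempted.
        self.preempted_log.extend(
            sorted(t for t in self.running - should_run if t in self.ready)
        )
        self.running = should_run
        return True


if __name__ == "__main__":
    k = ToyKernel(cores=2)
    print(k.interrupt(make_ready={"A": 1, "B": 1}))   # True: A and B start running
    print(k.interrupt(make_ready={"C": 1}))           # False: same set, just return
    print(k.interrupt(make_ready={"D": 5}))           # True: D preempts B
    print(k.interrupt(make_waiting=["D"]))            # True: D blocks, B runs again
    print(sorted(k.running), k.preempted_log)         # ['A', 'B'] ['B']
```

Note that nothing in the model runs between calls to `interrupt()`, which matches the description: the kernel only acts when it is entered.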
Important points:
1) The OS kernel should be looked at as a big interrupt-handler that can decide to interrupt-return to a different set of threads than the ones interrupted.
2) The OS can get control of, and stop if necessary, any thread in any process, no matter what state it is in or what core it may be running on.
3) Preemptive scheduling and dispatching does generate all the synchronization problems, and the like, that are posted about on these forums. The big upside is fast response at thread level to hard interrupts. Without this, all those high-performance apps you run on your PC - video streaming, fast networks etc. - would be virtually impossible.
4) The OS timer is just one of a large set of interrupts that can change the set of running threads. 'Time-slicing' (ugh - I hate that term) between ready threads only occurs when the computer is overloaded, i.e. the set of ready threads is larger than the number of CPU cores available to run them. If any text purporting to explain OS scheduling mentions 'time-slicing' before 'interrupts', it is likely to cause more confusion than explanation. The timer interrupt is only 'special' in that many system calls have timeouts to back up their primary function (OK, for sleep(), the timeout IS the primary function :).
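To illustrate the last point, a timeout backing up a call's primary function, here is a small Python sketch. `wait_for_signal` and `demo` are names invented for the example; `threading.Event.wait(timeout=...)` is the standard-library call, which returns True if the event was set and False if the timeout expired first.

```python
import threading


def wait_for_signal(event: threading.Event, timeout_s: float) -> str:
    # Primary function: wait to be signalled. Backup: give up after timeout_s.
    return "signalled" if event.wait(timeout=timeout_s) else "timed out"


def demo() -> tuple:
    # Nobody ever sets this event, so only the timeout can end the wait.
    never_set = threading.Event()
    first = wait_for_signal(never_set, 0.05)

    # Another thread sets this one well before the (generous) timeout.
    soon_set = threading.Event()
    threading.Timer(0.01, soon_set.set).start()
    second = wait_for_signal(soon_set, 5.0)
    return first, second


if __name__ == "__main__":
    print(demo())   # ('timed out', 'signalled')
```

In the second wait the thread comes back as soon as it is signalled; the 5-second timeout never comes into play.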