I wrote a Win32 application (in Delphi-7 which is 32-bit using TThread class) to create 100 threads. Each thread when resumed will continuously (in a loop) increment a 64 bit counter associated with the thread object (so no locking or sharing of data).
If you let the system run for 10 to 15 seconds and stop after that, you should see roughly the same counts in each of the threads. But what I observed was that 81 threads ran under 400 million loops and the remaining ones looped more than 950 million times. Slowest thread got only 230 million compared to the fastest 2111 million.
According to MSDN, the preemptive multitasking is at the thread-level (not process level), so each of my thread should have gotten its time-slice in a round-robin fashion. What am I missing here and why is this discrepancy?
Edit1: Machine configuration: Intel i7 Quad Core 3.4GHz with hyper-threading turned on (8 active threads at a time). Running Windows-7 64 bit professional (and the test application is 32 bit)
Edit2 (thread code): The test application is built with optimization turned on and without any debug info. Run the test application outside of IDE.
type
TMyThread = class(TThread)
protected
FCount: Int64;
public
constructor Create;
procedure Execute; override;
property Count: Int64 read FCount;
end;
{ TMyThread }
constructor TMyThread.Create;
begin
inherited Create(True);
FCount := 0;
end;
procedure TMyThread.Execute;
begin
inherited;
while not Terminated do
begin
Inc(FCount);
end;
end;