IOCP threads - Clarification?
After reading this article, which states:

After a device finishes its job (I/O operation), it notifies the CPU via an interrupt.

... ... ...

However, that “completion” status only exists at the OS level; the process has its own memory space that must be notified

... ... ...

Since the library/BCL is using the standard P/Invoke overlapped I/O system, it has already registered the handle with the I/O Completion Port (IOCP), which is part of the thread pool.

... ... ...

So an I/O thread pool thread is borrowed briefly to execute the APC, which notifies the task that it’s complete.

I was interested in the bold part:

If I understood correctly, after the I/O operation is finished, the actual process which executed the I/O operation has to be notified.

Question #1:

Does it mean that it grabs a new thread-pool thread for each completed I/O operation? Or is there a dedicated number of threads for this?

Question #2:

Looking at :

for (int i = 0; i < 1000; i++)
{
    PingAsync_NOT_AWAITED(i); // notice: not awaited!
}

Does it mean that I'll have 1000 IOCP thread-pool threads (sort of) running simultaneously here, by the time all are finished?

Isolated answered 24/2, 2015 at 8:6 Comment(3)
Royi, you may want to check my little experiment here.Tramroad
@Noseratio Thank you! I'm surely going to look at it.Isolated
You may also want to read this, to understand how it works on the OS level: I/O Completion Ports.Tramroad

This is a bit broad, so let me just address the major points:

The IOCP threads are on a separate thread pool, so to speak - that's the I/O threads setting. So they do not clash with the user thread-pool threads (like the ones you use in normal await operations or ThreadPool.QueueUserWorkItem).

Just like the normal thread pool, it will only allocate new threads slowly over time. So even if there's a peak of async responses that happen all at once, you're not going to have 1000 I/O threads.

In a properly asynchronous application, you're not going to have more than the number of cores, give or take, just like with the worker threads. That's because you're either doing significant CPU work, which you should post on a normal worker thread, or you're doing I/O work, which you should do as an asynchronous operation.

The idea is that you spend very little time in the I/O callback - you don't block, and you don't do a lot of CPU work. If you violate this (say, add Thread.Sleep(10000) to your callback), then yes, .NET will create tons and tons of IO threads over time - but that's just improper usage.
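To illustrate that point, here is a small sketch (not the actual .NET internals): if a completion callback needs to do heavy CPU work, handing it off with Task.Run releases the I/O thread immediately. `CrunchNumbers` is a made-up stand-in for real CPU-bound work.

```csharp
using System;
using System.Threading.Tasks;

class CallbackOffloadSketch
{
    // Hypothetical stand-in for real CPU-bound work; not part of any API.
    public static int CrunchNumbers(byte[] data)
    {
        int sum = 0;
        foreach (var b in data) sum += b;
        return sum;
    }

    static async Task Main()
    {
        var buffer = new byte[4096]; // pretend an I/O completion just filled this

        // Calling CrunchNumbers directly in the completion callback would tie up
        // the I/O thread. Handing it to a worker thread releases it immediately:
        int result = await Task.Run(() => CrunchNumbers(buffer));
        Console.WriteLine(result); // 0 for an all-zero buffer
    }
}
```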

Now, how are I/O threads different from normal CPU threads? They're almost the same; they just wait for a different signal. Both are (simplification alert) just a while loop over a method that yields control when a new work item is queued by some other part of the application (or the OS). The main difference is that I/O threads use the IOCP queue (OS-managed), while normal worker threads have their own queue, completely .NET-managed and accessible to the application programmer.
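A caricature of that "while loop over a queue", using BlockingCollection purely for illustration (the real pools wait on the IOCP and the CLR's internal queues, and their threads idle and retire rather than exiting):

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

class ToyWorkerLoopSketch
{
    static void Main()
    {
        // One "pool thread": loop, take the next work item, run it.
        var queue = new BlockingCollection<Action>();
        var worker = new Thread(() =>
        {
            foreach (var workItem in queue.GetConsumingEnumerable())
                workItem(); // execute the queued callback
        });
        worker.Start();

        queue.Add(() => Console.WriteLine("work item ran"));
        queue.CompleteAdding(); // real pool threads don't exit like this; they idle
        worker.Join();
    }
}
```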

As a side note, don't forget that your request might have completed synchronously. Perhaps you're reading from a TCP stream in a while loop, 512 bytes at a time. If the socket buffer has enough data in it, multiple ReadAsyncs can return immediately without doing any thread switching at all. This isn't usually a problem because I/O tends to be the most time-intensive stuff you do in a typical application, so not having to wait for I/O is usually fine. However, bad code depending on some part happening asynchronously (even though that isn't guaranteed) can easily break your application.
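A minimal sketch of synchronous completion, using a MemoryStream (whose data is always already in memory) as a stand-in for a socket buffer that happens to have data available:

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

class SyncCompletionSketch
{
    static void Main()
    {
        // A MemoryStream's data is always already in memory, so ReadAsync
        // completes synchronously -- no I/O thread gets involved at all.
        var stream = new MemoryStream(new byte[512]);
        var buffer = new byte[512];

        Task<int> readTask = stream.ReadAsync(buffer, 0, buffer.Length);

        Console.WriteLine(readTask.IsCompleted); // True: done before ReadAsync even returned
        Console.WriteLine(readTask.Result);      // 512
    }
}
```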

Unbridled answered 24/2, 2015 at 8:15 Comment(4)
There is a separation but both types of threads are in the same ThreadPool. You can set how many you want with the same method: ThreadPool.SetMaxThreads(int workerThreads, int completionPortThreads)Splore
@Splore ThreadPool is not the pool, though. It's just a bunch of methods in a static class. There's separate work queues and thread pools and some of those are managed by the OS, and some are managed by CLR native code, and some are managed by managed CLR code... It's all a bit complicated. You interact with all of those through the ThreadPool class, but they don't even have the same interface (BindHandle versus QueueUserWorkItem, for example). Try digging through CLR code now that it's public, it's a lot of fun and interesting insights on multi-threading and asynchronous code.Unbridled
Well, i guess it depends on how you want to define the thread pool. I would stay with MSDN's "The thread pool provides new worker threads or I/O completion threads on demand until it reaches the minimum for each category. When a minimum is reached, the thread pool can create additional threads in that category or wait until some tasks complete"Splore
@Splore "Additional threads in that category" on its own means that there's different pools :) But that's really just going into the naming. As long as you understand that there's two separate pools of threads (worker vs. I/O), it's just a confusion in naming.Unbridled

Does it mean that it grabs a new thread pool thread for each completed IO operation ? Or is it a dedicated number of threads for this ?

It would be terribly inefficient to create a new thread for every single I/O request, to the point of defeating the purpose. Instead, the runtime starts off with a small number of threads (the exact number depends on your environment) and adds and removes worker threads as necessary (the exact algorithm for this likewise varies with your environment). Every major version of .NET has seen changes in this implementation, but the basic idea stays the same: the runtime does its best to create and maintain only as many threads as are necessary to service all I/O efficiently. On my system (Windows 8.1, .NET 4.5.2) a brand new console application has only 3 threads in the process on entering Main, and this number doesn't increase until actual work is requested.

Does it mean that I'll have 1000 IOCP threadpool thread simultaneously ( sort of) running here , when all are finished ?

No. When you issue an I/O request, a thread will be waiting on a completion port to get the result and call whatever callback was registered to handle the result (be it via a BeginXXX method or as the continuation of a task). If you use a task and don't await it, that task simply ends there and the thread is returned to the thread pool.

What if you did await it? The results of 1000 I/O requests won't really arrive all at the same time, since interrupts don't all arrive at the same time, but let's say the interval is much shorter than the time we need to process them. In that case, the thread pool will keep spinning up threads to handle the results until it reaches a maximum, and any further requests will end up queueing on the completion port. Depending on how you configure it, those threads may take some time to spin up.

Consider the following (deliberately awful) toy program:

using System;
using System.Diagnostics;
using System.IO;
using System.Threading;

static void Main(string[] args) {
    printThreadCounts();
    var buffer = new byte[1024];
    const int requestCount = 30;
    int pendingRequestCount = requestCount;
    for (int i = 0; i != requestCount; ++i) {
        var stream = new FileStream(
            @"C:\Windows\win.ini",
            FileMode.Open, FileAccess.Read, FileShare.ReadWrite, 
            buffer.Length, FileOptions.Asynchronous
        );
        stream.BeginRead(
            buffer, 0, buffer.Length,
            delegate {
                Interlocked.Decrement(ref pendingRequestCount);
                Thread.Sleep(Timeout.Infinite);
            }, null
        );
    }
    do {
        printThreadCounts();
        Thread.Sleep(1000);
    } while (Thread.VolatileRead(ref pendingRequestCount) != 0);
    Console.WriteLine(new String('=', 40));
    printThreadCounts();
}

private static void printThreadCounts() {
    int completionPortThreads, maxCompletionPortThreads;
    int workerThreads, maxWorkerThreads;
    ThreadPool.GetMaxThreads(out maxWorkerThreads, out maxCompletionPortThreads);
    ThreadPool.GetAvailableThreads(out workerThreads, out completionPortThreads);
    Console.WriteLine(
        "Worker threads: {0}, Completion port threads: {1}, Total threads: {2}", 
        maxWorkerThreads - workerThreads, 
        maxCompletionPortThreads - completionPortThreads, 
        Process.GetCurrentProcess().Threads.Count
    );
}

On my system (which has 8 logical processors), the output is as follows (results may vary on your system):

Worker threads: 0, Completion port threads: 0, Total threads: 3
Worker threads: 0, Completion port threads: 8, Total threads: 12
Worker threads: 0, Completion port threads: 9, Total threads: 13
Worker threads: 0, Completion port threads: 11, Total threads: 15
Worker threads: 0, Completion port threads: 13, Total threads: 17
Worker threads: 0, Completion port threads: 15, Total threads: 19
Worker threads: 0, Completion port threads: 17, Total threads: 21
Worker threads: 0, Completion port threads: 19, Total threads: 23
Worker threads: 0, Completion port threads: 21, Total threads: 25
Worker threads: 0, Completion port threads: 23, Total threads: 27
Worker threads: 0, Completion port threads: 25, Total threads: 29
Worker threads: 0, Completion port threads: 27, Total threads: 31
Worker threads: 0, Completion port threads: 29, Total threads: 33
========================================
Worker threads: 0, Completion port threads: 30, Total threads: 34

When we issue 30 asynchronous requests, the thread pool quickly makes 8 threads available to handle the results, but after that it only spins up new threads at a leisurely pace of about 2 per second. This demonstrates that if you want to properly utilize system resources, you'd better make sure that your I/O processing completes quickly. Indeed, let's change our delegate to the following, which represents "proper" processing of the request:

stream.BeginRead(
    buffer, 0, buffer.Length,
    ar => {
        stream.EndRead(ar);
        Interlocked.Decrement(ref pendingRequestCount);
    }, null
);

Result:

Worker threads: 0, Completion port threads: 0, Total threads: 3
Worker threads: 0, Completion port threads: 1, Total threads: 11
========================================
Worker threads: 0, Completion port threads: 0, Total threads: 11

Again, results may vary on your system and across runs. Here we barely glimpse the completion port threads in action while the 30 requests we issued are completed without spinning up new threads. You should find that you can change "30" to "100" or even "100000": our loop can't start requests faster than they complete. Note, however, that the results are skewed heavily in our favor because the "I/O" is reading the same bytes over and over and is going to be serviced from the operating system cache and not by reading from a disk. This isn't meant to demonstrate realistic throughput, of course, only the difference in overhead.

To repeat these results with worker threads rather than completion port threads, simply change FileOptions.Asynchronous to FileOptions.None. This makes file access synchronous and the asynchronous operations will be completed on worker threads rather than using the completion port:

Worker threads: 0, Completion port threads: 0, Total threads: 3
Worker threads: 8, Completion port threads: 0, Total threads: 15
Worker threads: 9, Completion port threads: 0, Total threads: 16
Worker threads: 10, Completion port threads: 0, Total threads: 17
Worker threads: 11, Completion port threads: 0, Total threads: 18
Worker threads: 12, Completion port threads: 0, Total threads: 19
Worker threads: 13, Completion port threads: 0, Total threads: 20
Worker threads: 14, Completion port threads: 0, Total threads: 21
Worker threads: 15, Completion port threads: 0, Total threads: 22
Worker threads: 16, Completion port threads: 0, Total threads: 23
Worker threads: 17, Completion port threads: 0, Total threads: 24
Worker threads: 18, Completion port threads: 0, Total threads: 25
Worker threads: 19, Completion port threads: 0, Total threads: 26
Worker threads: 20, Completion port threads: 0, Total threads: 27
Worker threads: 21, Completion port threads: 0, Total threads: 28
Worker threads: 22, Completion port threads: 0, Total threads: 29
Worker threads: 23, Completion port threads: 0, Total threads: 30
Worker threads: 24, Completion port threads: 0, Total threads: 31
Worker threads: 25, Completion port threads: 0, Total threads: 32
Worker threads: 26, Completion port threads: 0, Total threads: 33
Worker threads: 27, Completion port threads: 0, Total threads: 34
Worker threads: 28, Completion port threads: 0, Total threads: 35
Worker threads: 29, Completion port threads: 0, Total threads: 36
========================================
Worker threads: 30, Completion port threads: 0, Total threads: 37

The thread pool spins up roughly one worker thread per second, rather than the roughly two per second it created for completion port threads. Obviously these numbers are implementation-dependent and may change in new releases.

Finally, let's demonstrate the use of ThreadPool.SetMinThreads to ensure a minimum number of threads is available to complete requests. If we go back to FileOptions.Asynchronous and add ThreadPool.SetMinThreads(50, 50) to the Main of our toy program, the result is:
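For reference, the change amounts to one call at the top of Main; isolated below as a sketch (SetMinThreads takes workerThreads and completionPortThreads, in that order):

```csharp
using System;
using System.Threading;

class MinThreadsSketch
{
    static void Main()
    {
        // Raise the minimums so the pool injects threads eagerly instead of
        // throttling after the initial burst. Returns false if the values
        // are out of range (e.g. above the current maximums).
        bool ok = ThreadPool.SetMinThreads(50, 50);
        Console.WriteLine(ok);

        ThreadPool.GetMinThreads(out int minWorker, out int minIocp);
        Console.WriteLine($"{minWorker} {minIocp}"); // 50 50 if the call succeeded
    }
}
```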

Worker threads: 0, Completion port threads: 0, Total threads: 3
Worker threads: 0, Completion port threads: 31, Total threads: 35
========================================
Worker threads: 0, Completion port threads: 30, Total threads: 35

Now, instead of patiently adding a thread or two per second, the thread pool keeps spinning up threads until the minimum of 50 is reached (which doesn't happen in this case, since only 30 requests are pending, so the final count stays at 30). Of course, all 30 of these threads are stuck in infinite waits -- but if this had been a real system, those 30 threads would now presumably be doing useful, if not terribly efficient, work. I wouldn't try this with 100000 requests, though.

Manned answered 24/2, 2015 at 8:16 Comment(0)

Does it mean that I'll have 1000 IOCP threadpool thread simultaneously ( sort of) running here , when all are finished ?

No, not at all. Just like the worker threads available in the ThreadPool, there are also dedicated "completion port threads".

These threads are dedicated to async I/O. They are not created upfront; they are created on demand, the same way as worker threads, and they are eventually destroyed when the thread pool decides.

By "borrowed briefly" the author means that, to notify the process of the I/O completion, some arbitrary thread from the "completion port threads" (of the ThreadPool) is used. It will not execute any lengthy operation, just the I/O completion notification.
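You can watch that borrowed thread with a small sketch. Thread.CurrentThread.IsThreadPoolThread is a real BCL property; Task.Delay is only a stand-in for actual I/O (its completion comes from a timer rather than an IOCP, but the continuation is dispatched the same way: on a briefly borrowed pool thread, since a console app has no SynchronizationContext):

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class BorrowedThreadSketch
{
    static async Task Main()
    {
        // The first part of Main runs on the ordinary startup thread.
        Console.WriteLine($"before await, pool thread: {Thread.CurrentThread.IsThreadPoolThread}");

        await Task.Delay(100); // stand-in for a real async I/O operation

        // The continuation runs on a borrowed thread-pool thread.
        Console.WriteLine($"after await, pool thread: {Thread.CurrentThread.IsThreadPoolThread}");
    }
}
```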

Crystallization answered 24/2, 2015 at 8:18 Comment(2)
(relates a bit) If I downloaded an HTML page from a site and the download has finished, but the app hasn't read the data yet (though it was notified), where is this data stored?Isolated
@RoyiNamir It's in some buffer somewhere. There are many layers of buffering, so it's not easy to say where exactly. However, when you get the notification, it already has to be in your buffer - of course, if you're using something like HttpClient, it's its buffer, while if you're using e.g. TcpClient directly, it's the byte[] buffer you gave it when you did ReceiveAsync. Of course, that's one of the reasons you want to work with the highest available abstraction - networking (and any asynchronicity) is hard, let the smart guys handle the hardest parts :DUnbridled

As we've discussed before, IOCP threads and worker threads are separate resources inside the thread pool.

Regardless of whether you await an I/O operation or not, a registration with the IOCP for overlapped I/O will occur. await is a higher-level mechanism that has nothing to do with that IOCP registration.

By a simple test, you can see that although no await occurs, the IOCP threads are still being used by the application:

private static void Main(string[] args)
{
    Task.Run(() =>
    {
        int count = 0;
        while (count < 30)
        {
            int _;
            int iocpThreads;
            ThreadPool.GetAvailableThreads(out _, out iocpThreads);
            Console.WriteLine("Current number of IOCP threads available: {0}", iocpThreads);
            count++;
            Thread.Sleep(10);
        }
    });

    for (int i = 0; i < 30; i++)
    {
        GetUrl(@"http://www.ynet.co.il"); // deliberately not awaited
    }

    Console.ReadKey();
}

private static async Task<string> GetUrl(string url)
{
    var httpClient = new HttpClient();
    var response = await httpClient.GetAsync(url);
    return await response.Content.ReadAsStringAsync();
}

Depending on how long each request takes, you'll see the number of available IOCP threads drop while you're making requests. The more concurrent requests you make, the fewer threads will be available.

Dominique answered 24/2, 2015 at 8:25 Comment(11)
I would have changed the connection limit, since you're limited here to ~4 connections... System.Net.ServicePointManager.DefaultConnectionLimit = 1000 (imho)Isolated
It doesn't really matter if it's 4 or not. The point is to see that those IOCP are really put to use while you don't await any of the requests.Dominique
Oh, just wanted to point it out to get more accurate results :-) - for others who might wonder why....Isolated
Who limits you to ~4 connections?Dominique
#866850Isolated
Are you sure this applies to HttpClient?Dominique
I suspect the first comment here but I doubt the OP's testIsolated
Also, I think we're talking about different threads here. In your code we see the out iocpThreads value change, but that's because after the await is finished, the continuation runs on a different thread, hence the change in number. Also, other people here said that it is a DIFFERENT thread pool... (so which statement is true?). Also, I'm talking about the thread that is "borrowed briefly to notify the task that it's complete", not the one that serves the continuationIsolated
@RoyiNamir It's a different thread pool wrapped by the ThreadPool class :) That's why you get the amount of I/O threads as a separate value from "normal" worker threads. It's just some silliness in naming. It's just that all the threads from ThreadPool you directly work with are worker threads - that's just half of the picture. All the methods like ThreadPool.SetMaxThread take two parameters - amount of worker threads, and amount of I/O threads. There's actually a lot of different "physical" pools - each thread has one, for example. But they're just two "logical" pools - workers and I/O.Unbridled
@RoyiNamir Yeah, basically. Of course, if you're doing it right, it will never go over the amount your computer can process concurrently. This is now easier than ever with await and similar constructs. If you don't make your whole I/O async from start to end, you can still find yourself with tons of I/O threads, for example in legacy WCF services (the I/O thread is the one with the request context). On the client, though, there really is no excuse.Unbridled
@YuvalItzchakov You might be interested in thisIsolated

© 2022 - 2024 — McMap. All rights reserved.