When to use thread pool in C#? [closed]
B

14

127

I have been trying to learn multi-threaded programming in C# and I am confused about when it is best to use a thread pool vs. create my own threads. One book recommends using a thread pool for small tasks only (whatever that means), but I can't seem to find any real guidelines.

What are some pros and cons of thread pools vs creating my own threads? And what are some example use cases for each?

Bosh answered 28/9, 2008 at 6:0 Comment(0)
C
50

If you have lots of logical tasks that require constant processing and you want them done in parallel, use the pool plus a scheduler.

If you need to run I/O-related tasks concurrently, such as downloading from remote servers or accessing the disk, but only need to do so, say, once every few minutes, then create your own threads and kill them once you're finished.

Edit: As for use cases, I use thread pools for database access, physics/simulation, AI (games), and scripted tasks run on virtual machines that process lots of user-defined tasks.

Normally a pool consists of 2 threads per processor (so likely 4 nowadays); however, you can set the number of threads yourself if you know how many you need.
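Rather than assuming a fixed number of threads per processor, the limits of the built-in .NET pool can be read at run time. The following is a minimal sketch, not a tuning recommendation: the class name `PoolLimits` and the target minimum of 8 are assumptions made purely for illustration.

```csharp
using System;
using System.Threading;

public static class PoolLimits
{
    // Returns the pool's current minimum and maximum worker-thread counts.
    public static (int Min, int Max) GetWorkerLimits()
    {
        ThreadPool.GetMinThreads(out int minWorkers, out _);
        ThreadPool.GetMaxThreads(out int maxWorkers, out _);
        return (minWorkers, maxWorkers);
    }

    public static void Main()
    {
        var (min, max) = GetWorkerLimits();
        Console.WriteLine($"Processors: {Environment.ProcessorCount}");
        Console.WriteLine($"Worker threads: min {min}, max {max}");

        // Raising the minimum asks the pool to create threads on demand up to
        // that count before it starts throttling thread creation.
        // The value 8 is an arbitrary example, not a recommendation.
        ThreadPool.GetMinThreads(out _, out int minIo);
        bool accepted = ThreadPool.SetMinThreads(Math.Max(min, 8), minIo);
        Console.WriteLine($"SetMinThreads accepted: {accepted}");
    }
}
```

The actual numbers depend on the runtime version and the machine, so it is worth printing them before deciding whether to change anything.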

Edit: The reason to make your own threads is context switching (that's when threads need to swap in and out of the processor, along with their memory). Useless context switches, say when you aren't using your threads and just leave them sitting around, can easily halve the performance of your program (say you have 3 sleeping threads and 2 active threads). Thus, if those downloading threads are just waiting, they're eating up tons of CPU and cooling down the cache for your real application.

Cytology answered 28/9, 2008 at 6:9 Comment(8)
Ok, but can you explain why this is how you approach it? For example, what is the downside of using the thread pool to download from remote servers or do disk IO?Bosh
If a thread is waiting on a synchronization object (event, semaphore, mutex, etc) then the thread does not consume CPU.Walkway
As Brannon said, a common myth is that creating multiple threads impacts performance. Actually, unused threads consume very few resources. Context switches begin to be a problem only in very high demand servers (in this case, see I/O completion ports for an alternative).Alicea
OK, as you mention, in a small application the overhead may be small, but it is there. And in servers it's not negligible at all (I work on servers, BTW). If you want the real story, just ask the folks at Intel and AMD, as I've done before.Cytology
Do idle threads impact performance? It depends on how they wait. If well written and waiting on a synchronization object, then they should consume no CPU resources. If waiting in a loop that periodically wakes up to check results, then it is wasting CPU. As always it comes down to good coding.Storer
"2 threads per processor" is not true at all. The formula is more complicated than just x per processor, but it's much closer to 25 per processor.Goggler
Idle managed threads eat memory for their stack. By default it is 1 MiB per thread. So it is better to have all threads working.Melville
@Storer - if a thread 'wakes up to check results', then by definition it isn't idle.Ashcroft
P
50

I would suggest you use a thread pool in C# for the same reasons as any other language.

When you want to limit the number of threads running or don't want the overhead of creating and destroying them, use a thread pool.

By small tasks, the book you read means tasks with a short lifetime. If it takes ten seconds to create a thread which only runs for one second, that's one place where you should be using pools (ignore my actual figures, it's the ratio that counts).

Otherwise you spend the bulk of your time creating and destroying threads rather than simply doing the work they're intended to do.
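To make the short-task case concrete, here is a minimal sketch that hands many tiny work items to the shared pool instead of creating a thread for each one. The class name `ShortTasks` and the item count are assumptions for illustration; the increment stands in for whatever short job you actually have.

```csharp
using System;
using System.Threading;

public static class ShortTasks
{
    // Queues `count` tiny work items to the shared pool and waits for all of them.
    public static int RunOnPool(int count)
    {
        int completed = 0;
        using (var done = new CountdownEvent(count))
        {
            for (int i = 0; i < count; i++)
            {
                ThreadPool.QueueUserWorkItem(_ =>
                {
                    Interlocked.Increment(ref completed); // stand-in for a short job
                    done.Signal();                        // report this item as finished
                });
            }
            done.Wait(); // block until every item has signalled
        }
        return completed;
    }

    public static void Main()
    {
        Console.WriteLine(RunOnPool(1000)); // prints 1000
    }
}
```

The pool reuses a small set of threads for all 1000 items, so the cost of creating and destroying a thread is not paid per item.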

Phrixus answered 28/9, 2008 at 6:13 Comment(0)
S
14

I highly recommend reading this free e-book: Threading in C# by Joseph Albahari

At least read the "Getting Started" section. The e-book provides a great introduction and includes a wealth of advanced threading information as well.

Knowing whether or not to use the thread pool is just the beginning. Next you will need to determine which method of entering the thread pool best suits your needs:

  • Task Parallel Library (.NET Framework 4.0)
  • ThreadPool.QueueUserWorkItem
  • Asynchronous Delegates
  • BackgroundWorker

This e-book explains all of these and advises when to use them vs. creating your own thread.
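As an illustration of the first entry in the list above, the Task Parallel Library, here is a minimal sketch that runs two computations on thread pool threads and collects their results. The class and method names are hypothetical; `Task.Run` itself requires .NET Framework 4.5 or later.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class PoolEntryPoints
{
    // Task.Run queues the delegate to the thread pool and returns a Task<int>
    // that completes with the computed value.
    public static Task<int> SumSquaresAsync(int n)
    {
        return Task.Run(() => Enumerable.Range(1, n).Sum(x => x * x));
    }

    public static void Main()
    {
        Task<int> a = SumSquaresAsync(10);
        Task<int> b = SumSquaresAsync(20);
        Task.WaitAll(a, b);                       // both run concurrently on pool threads
        Console.WriteLine(a.Result + b.Result);   // 385 + 2870
    }
}
```

Compared with `ThreadPool.QueueUserWorkItem`, the Task version gives you a return value and exception propagation without extra plumbing.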

Stickney answered 29/9, 2010 at 14:9 Comment(0)
W
8

The thread pool is designed to reduce context switching among your threads. Consider a process that has several components running. Each of those components could be creating worker threads. The more threads in your process, the more time is wasted on context switching.

Now, if each of those components were queuing items to the thread pool, you would have a lot less context switching overhead.

The thread pool is designed to maximize the work being done across your CPUs (or CPU cores). That is why, by default, the thread pool spins up multiple threads per processor.

There are some situations where you would not want to use the thread pool. If you are waiting on I/O, waiting on an event, etc., then you tie up that thread pool thread and it can't be used by anyone else. The same idea applies to long-running tasks, though what constitutes a long-running task is subjective.

Pax Diablo makes a good point as well. Spinning up threads is not free. It takes time and they consume additional memory for their stack space. The thread pool will re-use threads to amortize this cost.

Note: you asked about using a thread pool thread to download data or perform disk I/O. You should not use a thread pool thread for this (for the reasons I outlined above). Instead use asynchronous I/O (aka the BeginXX and EndXX methods). For a FileStream that would be BeginRead and EndRead. For an HttpWebRequest that would be BeginGetResponse and EndGetResponse. They are more complicated to use, but they are the proper way to perform multi-threaded I/O.
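The BeginRead/EndRead pair mentioned above is the older callback-based pattern. The sketch below swaps in the newer Task-based equivalent, `FileStream.ReadAsync` with `await`, which expresses the same idea: no thread is blocked while the read is pending. The class name and the temporary file are assumptions made for the example.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading.Tasks;

public static class AsyncFileRead
{
    // Reads the whole file asynchronously and returns the number of bytes read.
    public static async Task<int> CountBytesAsync(string path)
    {
        var buffer = new byte[4096];
        int total = 0;
        // useAsync: true requests overlapped (asynchronous) I/O from the OS.
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read,
                                           FileShare.Read, buffer.Length, useAsync: true))
        {
            int read;
            while ((read = await stream.ReadAsync(buffer, 0, buffer.Length)) > 0)
                total += read;
        }
        return total;
    }

    public static void Main()
    {
        string path = Path.GetTempFileName();
        File.WriteAllText(path, "hello thread pool", Encoding.ASCII);
        Console.WriteLine(CountBytesAsync(path).GetAwaiter().GetResult()); // prints 17
        File.Delete(path);
    }
}
```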

Walkway answered 28/9, 2008 at 7:27 Comment(1)
ThreadPool is a clever automaton. "If its queue remains stationary for more than half a second, it responds by creating more threads — one every half-second — up to the capacity of the thread pool" (albahari.com/threading/#_Optimizing_the_Thread_Pool). Also, almost all asynchronous operations with BeginXXX-EndXXX run via the ThreadPool. So it is normal to use the ThreadPool to download data, and it is often used implicitly.Unconventionality
S
6

Beware of the .NET thread pool for operations that may block for any significant, variable or unknown part of their processing, as it is prone to thread starvation. Consider using the .NET parallel extensions, which provide a good number of logical abstractions over threaded operations. They also include a new scheduler, which should be an improvement on ThreadPool. See here

Sericin answered 30/9, 2008 at 10:59 Comment(1)
We discovered this the hard way! ASP.NET uses the thread pool, it appears, so we couldn't use it as aggressively as we'd like to.Camel
L
3

One reason to use the thread pool for small tasks only is that there are a limited number of thread pool threads. If one is used for a long time then it stops that thread from being used by other code. If this happens many times then the thread pool can become used up.

Using up the thread pool can have subtle effects - some .NET timers use thread pool threads and will not fire, for example.

Lisa answered 28/9, 2008 at 13:38 Comment(0)
S
3

If you have a background task that will live for a long time, like for the entire lifetime of your application, then creating your own thread is a reasonable thing. If you have short jobs that need to be done in a thread, then use thread pooling.
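For the long-lived case, a minimal sketch of a dedicated background thread might look like the following. The `HeartbeatWorker` class and its tick counter are hypothetical; the point is that the thread waits on a synchronization object between iterations instead of polling, and that it never occupies a pool thread.

```csharp
using System;
using System.Threading;

public sealed class HeartbeatWorker
{
    private readonly ManualResetEventSlim _stop = new ManualResetEventSlim(false);
    private readonly Thread _thread;
    private int _ticks;

    public int Ticks => Volatile.Read(ref _ticks);

    public HeartbeatWorker(TimeSpan interval)
    {
        _thread = new Thread(() =>
        {
            // Wait returns true when Stop() signals, false when the interval elapses.
            while (!_stop.Wait(interval))
                Interlocked.Increment(ref _ticks); // stand-in for periodic work
        });
        _thread.IsBackground = true; // do not keep the process alive on exit
        _thread.Start();
    }

    public void Stop()
    {
        _stop.Set();     // wake the thread and ask it to exit
        _thread.Join();  // wait until it has actually finished
    }

    public static void Main()
    {
        var worker = new HeartbeatWorker(TimeSpan.FromMilliseconds(20));
        Thread.Sleep(200);
        worker.Stop();
        Console.WriteLine($"Ticks: {worker.Ticks}");
    }
}
```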

In an application where you are creating many threads, the overhead of creating the threads becomes substantial. Using the thread pool creates the threads once and reuses them, thus avoiding the thread creation overhead.

In an application that I worked on, changing from creating threads to using the thread pool for the short-lived threads really helped the throughput of the application.

Storer answered 28/9, 2008 at 13:47 Comment(1)
Please clarify if you mean "a thread pool" or "the thread pool". These are very different things (at least in the MS CLR).Mikkimiko
R
2

For the highest performance with concurrently executing units, write your own thread pool, where a pool of Thread objects is created at startup and the threads go into a blocked state (formerly, suspended), waiting on a context to run (an object with a standard interface implemented by your code).

So many articles about Tasks vs. Threads vs. the .NET ThreadPool fail to really give you what you need to make a decision for performance. But when you compare them, Threads win out, especially a pool of Threads. They are distributed the best across CPUs and they start up faster.

What should be discussed is the fact that the main execution unit of Windows (including Windows 10) is a thread, and OS context switching overhead is usually negligible. Simply put, I have not been able to find convincing evidence for the claims in many of these articles, whether the article claims higher performance by saving context switching or better CPU usage.

Now for a bit of realism:

Most of us won’t need our application to be deterministic, and most of us do not have a hard-knocks background with threads, which for instance often comes with developing an operating system. What I wrote above is not for a beginner.

So what may be most important to discuss is what is easy to program.

If you create your own thread pool, you’ll have a bit of writing to do as you’ll need to be concerned with tracking execution status, how to simulate suspend and resume, and how to cancel execution – including in an application-wide shut down. You might also have to be concerned with whether you want to dynamically grow your pool and also what capacity limitation your pool will have. I can write such a framework in an hour but that is because I’ve done it so many times.

Perhaps the easiest way to write an execution unit is to use a Task. The beauty of a Task is that you can create one and kick it off in-line in your code (though caution may be warranted). You can pass a cancellation token to handle when you want to cancel the Task. Also, it uses the promise approach to chaining events, and you can have it return a specific type of value. Moreover, with async and await, more options exist and your code will be more portable.
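A minimal sketch of the Task features described above: a typed result, a cancellation token, and a chained continuation. The `CancellableTask` class and the counting loop are assumptions chosen to keep the example small.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class CancellableTask
{
    // Counts up to `limit`, checking for cancellation on every iteration.
    public static Task<int> CountAsync(int limit, CancellationToken token)
    {
        return Task.Run(() =>
        {
            int i = 0;
            while (i < limit)
            {
                token.ThrowIfCancellationRequested();
                i++;
            }
            return i;
        }, token);
    }

    public static void Main()
    {
        using (var cts = new CancellationTokenSource())
        {
            // Chain a continuation that transforms the result (promise style).
            Task<int> doubled = CountAsync(1000, cts.Token)
                .ContinueWith(t => t.Result * 2);
            Console.WriteLine(doubled.Result); // prints 2000

            // A token that is already cancelled yields a cancelled task.
            cts.Cancel();
            Task<int> cancelled = CountAsync(1000, cts.Token);
            try { cancelled.Wait(); }
            catch (AggregateException) { Console.WriteLine(cancelled.Status); } // prints Canceled
        }
    }
}
```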

In essence, it is important to understand the pros and cons with Tasks vs. Threads vs. the .NET ThreadPool. If I need high performance, I am going to use threads, and I prefer using my own pool.

An easy way to compare is to start up 512 Threads, 512 Tasks, and 512 ThreadPool threads. You’ll find a delay in the beginning with Threads (hence, why write a thread pool), but all 512 Threads will be running in a few seconds while Tasks and .NET ThreadPool threads take up to a few minutes to all start.

Below are the results of such a test (i5 quad core with 16 GB of RAM), giving each 30 seconds to run. The code executed performs simple file I/O on an SSD drive.

Test Results

Resolute answered 27/4, 2017 at 15:42 Comment(5)
FYI, forgot to mention that Tasks and .NET Threads are simulated concurrency within .NET and with management executing within .NET not the OS - the latter being much more efficient at managing concurrent executions. I use Tasks for many things but I use an OS Thread for heavy execution performance. MS claims Tasks and .NET Threads are better, but they are in general to balance concurrency among .NET apps. A server app however would perform best letting the OS handle concurrency.Resolute
Would love to see the implementation of your custom Threadpool. Nice write up!Holcomb
I don't understand your Test Results. What does "Units Ran" mean? You compare 34 tasks with 512 threads? Would you please explain this?Crispate
Unit is just a method to execute concurrently in a Task, Thread, or .NET ThreadPool worker thread, my test comparing startup/run performance. Each test has 30 seconds to spawn 512 Threads from scratch, 512 Tasks, 512 ThreadPool worker threads, or resuming a pool of 512 started Threads awaiting a context to execute. The Tasks and ThreadPool worker threads have a slow spin up so 30 seconds isn't enough time to spin them all up. However, if the ThreadPool min worker thread count is first set to 512, both Tasks and ThreadPool worker threads will spin up almost as fast as 512 Threads from scratch.Resolute
@Holcomb github.com/grabe/NativeWindowsThreadPoolResolute
C
1

Thread pools are great when you have more tasks to process than available threads.

You can add all the tasks to a thread pool and specify the maximum number of threads that can run at a certain time.

Check out this page on MSDN: http://msdn.microsoft.com/en-us/library/3dasc8as(VS.80).aspx

Concoct answered 28/9, 2008 at 6:7 Comment(2)
Ok I guess this ties into my other question. How do you know how many available threads you have at any given time?Bosh
Well, it's hard to tell. You'll have to do performance testing. After a point adding more threads will not give you more speed. Find out how many processors are on the machine, that'll be a good starting point. Then go up from there, if processing speed doesn't improve, don't add more threads.Concoct
I
1

Always use a thread pool if you can; work at the highest level of abstraction possible. Thread pools hide the creation and destruction of threads for you, and this is usually a good thing!

Invoke answered 28/9, 2008 at 6:8 Comment(0)
S
1

Most of the time you can use the pool, since it lets you avoid the expensive process of creating a thread.

However, in some scenarios you may want to create a thread: for example, if you are not the only one using the thread pool and the thread you create is long-lived (to avoid consuming shared resources), or if you want to control the stack size of the thread.
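Controlling the stack size is only possible with a thread you create yourself, through the `Thread` constructor overload that takes `maxStackSize`. A minimal sketch follows; the class name, the 16 MiB stack, and the recursion depth are arbitrary values picked for illustration.

```csharp
using System;
using System.Threading;

public static class BigStackThread
{
    // Deliberately deep, non-tail recursion to consume stack space.
    private static int Depth(int n) => n == 0 ? 0 : 1 + Depth(n - 1);

    // Runs the recursion on a dedicated thread with the requested stack size.
    public static int RunDeepRecursion(int depth, int stackBytes)
    {
        int result = 0;
        var thread = new Thread(() => result = Depth(depth), stackBytes);
        thread.Start();
        thread.Join(); // Join guarantees `result` is visible afterwards
        return result;
    }

    public static void Main()
    {
        // 16 MiB stack; the same depth might overflow a default-sized stack.
        Console.WriteLine(RunDeepRecursion(50_000, 16 * 1024 * 1024)); // prints 50000
    }
}
```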

Sascha answered 28/9, 2008 at 8:1 Comment(0)
D
1

Don't forget to investigate the BackgroundWorker.

I find that in a lot of situations it gives me just what I want without the heavy lifting.
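A minimal sketch of `BackgroundWorker` usage, assuming a console program: in a UI application the completion event would be raised on the UI thread, while here (with no synchronization context) it runs on a pool thread, so the example waits on an event. The class name and the doubling "work" are hypothetical.

```csharp
using System;
using System.ComponentModel;
using System.Threading;

public static class BackgroundWorkerDemo
{
    // Runs a small job on a BackgroundWorker and blocks until it completes.
    public static int RunAndWait(int input)
    {
        int result = 0;
        using (var finished = new ManualResetEventSlim(false))
        using (var worker = new BackgroundWorker())
        {
            // DoWork runs on a thread pool thread.
            worker.DoWork += (sender, e) => e.Result = (int)e.Argument * 2;

            // RunWorkerCompleted receives the result (or the error) afterwards.
            worker.RunWorkerCompleted += (sender, e) =>
            {
                result = (int)e.Result;
                finished.Set();
            };

            worker.RunWorkerAsync(input);
            finished.Wait();
        }
        return result;
    }

    public static void Main()
    {
        Console.WriteLine(RunAndWait(21)); // prints 42
    }
}
```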

Cheers.

Dossal answered 26/6, 2009 at 5:48 Comment(1)
When it's a simple app that stays running and you have one other task to do, this code is very easy to write. You didn't provide links, though: specification and tutorialBrouwer
C
0

I usually use the ThreadPool whenever I need to just do something on another thread and don't really care when it runs or ends. Something like logging, or maybe even downloading a file in the background (though there are better ways to do that async-style). I use my own thread when I need more control. I've also found that a thread-safe queue (hack your own) for storing "command objects" is nice when I have multiple commands that I need to work on in more than one thread. So you might split up an XML file, put each element in a queue, and then have multiple threads processing these elements. I wrote such a queue way back in uni (VB.NET!) that I've converted to C#. I've included it below for no particular reason (this code might contain some errors).

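using System.Collections.Generic;
using System.Threading;

namespace ThreadSafeQueue {
    // A minimal blocking queue: consumers wait in DequeueSafe() until a producer enqueues an item.
    public class ThreadSafeQueue<T> {
        private Queue<T> _queue;

        public ThreadSafeQueue() {
            _queue = new Queue<T>();
        }

        // Adds an item and wakes one waiting consumer, if any.
        public void EnqueueSafe(T item) {
            lock ( this ) {
                _queue.Enqueue(item);
                if ( _queue.Count >= 1 )
                    Monitor.Pulse(this);
            }
        }

        // Blocks until an item is available, then removes and returns it.
        public T DequeueSafe() {
            lock ( this ) {
                // Monitor.Wait releases the lock while waiting; re-check the count after every wake-up.
                while ( _queue.Count <= 0 )
                    Monitor.Wait(this);

                return this.DeEnqueueUnblock();
            }
        }

        // Must only be called while holding the lock.
        private T DeEnqueueUnblock() {
            return _queue.Dequeue();
        }
    }
}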
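For comparison, since .NET Framework 4.0 the framework ships `BlockingCollection<T>`, which covers the same producer/consumer pattern as the hand-written queue. The sketch below is an alternative, not a conversion of the code above; the class name, the item values, and the consumer count are assumptions.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public static class BlockingQueueDemo
{
    // Enqueues 0..itemCount-1 and lets several consumers sum them concurrently.
    public static int SumAll(int itemCount, int consumerCount)
    {
        int sum = 0;
        using (var queue = new BlockingCollection<int>())
        {
            var consumers = new Task[consumerCount];
            for (int c = 0; c < consumerCount; c++)
            {
                consumers[c] = Task.Run(() =>
                {
                    // Blocks while the queue is empty; ends after CompleteAdding().
                    foreach (int item in queue.GetConsumingEnumerable())
                        Interlocked.Add(ref sum, item);
                });
            }

            for (int i = 0; i < itemCount; i++)
                queue.Add(i);
            queue.CompleteAdding(); // tell consumers no more items are coming

            Task.WaitAll(consumers);
        }
        return sum;
    }

    public static void Main()
    {
        Console.WriteLine(SumAll(100, 4)); // prints 4950
    }
}
```

`CompleteAdding` gives consumers a clean way to finish, which the hand-written queue would need a timeout or a sentinel item to achieve.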
Camel answered 16/10, 2008 at 13:42 Comment(1)
Some problems with this approach: - Calls to DequeueSafe() will wait until an item is added with EnqueueSafe(). Consider using one of the Monitor.Wait() overloads that takes a timeout. - Locking on this is not best practice; create a readonly object field instead. - Even though Monitor.Pulse() is lightweight, calling it only when the queue contains exactly 1 item would be more efficient. - DeEnqueueUnblock() should preferably check that queue.Count > 0 (needed if Monitor.PulseAll or wait timeouts are used).Stope
C
0

I wanted a thread pool to distribute work across cores with as little latency as possible, and that didn't have to play well with other applications. I found that the .NET thread pool performance wasn't as good as it could be. I knew I wanted one thread per core, so I wrote my own thread pool substitute class. The code is provided as an answer to another StackOverflow question over here.

As to the original question, the thread pool is useful for breaking repetitive computations up into parts that can be executed in parallel (assuming they can be executed in parallel without changing the outcome). Manual thread management is useful for tasks like UI and IO.
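Breaking a repetitive computation into parallel parts, as described above, can be sketched with `Parallel.For`, which runs on the standard .NET thread pool (not the custom pool class mentioned in this answer). Capping `MaxDegreeOfParallelism` at the processor count is an assumption that mirrors the "one thread per core" goal; the class name and the sum-of-squares workload are hypothetical.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelSum
{
    // Computes 1^2 + 2^2 + ... + n^2 by splitting the range across workers.
    public static long SumOfSquares(int n)
    {
        long total = 0;
        var options = new ParallelOptions
        {
            MaxDegreeOfParallelism = Environment.ProcessorCount
        };

        Parallel.For<long>(1, n + 1, options,
            () => 0L,                                          // per-worker starting subtotal
            (i, state, subtotal) => subtotal + (long)i * i,    // accumulate locally, no locking
            subtotal => Interlocked.Add(ref total, subtotal)); // merge once per worker

        return total;
    }

    public static void Main()
    {
        Console.WriteLine(SumOfSquares(1000)); // prints 333833500
    }
}
```

This only works because each iteration is independent, so the order of execution does not change the outcome.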

Crimea answered 8/2, 2010 at 3:44 Comment(0)
