Using ThreadPool.QueueUserWorkItem in ASP.NET in a high traffic scenario
D

11

114

I've always been under the impression that using the ThreadPool for (let's say non-critical) short-lived background tasks was considered best practice, even in ASP.NET, but then I came across this article that seems to suggest otherwise - the argument being that you should leave the ThreadPool to deal with ASP.NET related requests.

So here's how I've been doing small asynchronous tasks so far:

ThreadPool.QueueUserWorkItem(s => PostLog(logEvent));

And the article is suggesting instead to create a thread explicitly, similar to:

new Thread(() => PostLog(logEvent)) { IsBackground = true }.Start();

The first method has the advantage of being managed and bounded, but there's the potential (if the article is correct) that the background tasks are then vying for threads with ASP.NET request-handlers. The second method frees up the ThreadPool, but at the cost of being unbounded and thus potentially using up too many resources.

So my question is, is the advice in the article correct?

If your site was getting so much traffic that your ThreadPool was getting full, then is it better to go out-of-band, or would a full ThreadPool imply that you're getting to the limit of your resources anyway, in which case you shouldn't be trying to start your own threads?

Clarification: I'm just asking in the scope of small non-critical asynchronous tasks (eg, remote logging), not expensive work items that would require a separate process (in these cases I agree you'll need a more robust solution).

Deanery answered 25/8, 2009 at 2:11 Comment(2)
The plot thickens - I found this article (blogs.msdn.com/nicd/archive/2007/04/16/…), which I can't quite decode. On the one hand, it seems to be saying that IIS 6.0+ always processes requests on thread pool worker threads (and that earlier versions might do so), but then there's this: "However if you are using new .NET 2.0 async pages (Async="true") or ThreadPool.QueueUserWorkItem(), then the asynchronous part of the processing will be done inside [a completion port thread]." The asynchronous part of the processing?Cordilleras
One other thing - this should be easy enough to test on an IIS 6.0+ installation (which I don't have right now) by inspecting whether the thread pool's available worker threads is lower than its max worker threads, then doing the same within queued work items.Cordilleras
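The check suggested in the comment above could be sketched roughly like this. ThreadPool.GetMaxThreads and ThreadPool.GetAvailableThreads are real framework calls; the ThreadPoolProbe class and its method names are hypothetical, made up for illustration. The numbers are only a momentary snapshot, so treat them as a hint rather than proof of which pool serves requests.

```csharp
using System;
using System.Threading;

// Hypothetical helper (not from the question): compares the pool's maximum
// and available thread counts to estimate how many threads are busy.
public static class ThreadPoolProbe
{
    // Returns the number of worker and I/O completion threads currently in use.
    public static (int BusyWorkers, int BusyIo) Snapshot()
    {
        ThreadPool.GetMaxThreads(out int maxWorkers, out int maxIo);
        ThreadPool.GetAvailableThreads(out int freeWorkers, out int freeIo);
        return (maxWorkers - freeWorkers, maxIo - freeIo);
    }

    // Takes the same snapshot from inside a queued work item and waits for it.
    public static (int BusyWorkers, int BusyIo) SnapshotFromWorkItem()
    {
        (int, int) result = default;
        using (var done = new ManualResetEventSlim(false))
        {
            ThreadPool.QueueUserWorkItem(_ =>
            {
                result = Snapshot();
                done.Set();
            });
            done.Wait();
        }
        return result;
    }
}
```

Calling Snapshot() from a page or controller and SnapshotFromWorkItem() from the same place, then comparing the two results, is one way to see whether request handling and queued work items draw on the same worker threads.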
A
106

Other answers here seem to be leaving out the most important point:

Unless you are trying to parallelize a CPU-intensive operation in order to get it done faster on a low-load site, there is no point in using a worker thread at all.

That goes for both free threads, created by new Thread(...), and worker threads in the ThreadPool that respond to QueueUserWorkItem requests.

Yes, it's true, you can starve the ThreadPool in an ASP.NET process by queuing too many work items. It will prevent ASP.NET from processing further requests. The information in the article is accurate in that respect; the same thread pool used for QueueUserWorkItem is also used to serve requests.

But if you are actually queuing enough work items to cause this starvation, then you should be starving the thread pool! If you are running literally hundreds of CPU-intensive operations at the same time, what good would it do to have another worker thread to serve an ASP.NET request, when the machine is already overloaded? If you're running into this situation, you need to redesign completely!

Most of the time I see or hear about multi-threaded code being inappropriately used in ASP.NET, it's not for queuing CPU-intensive work. It's for queuing I/O-bound work. And if you want to do I/O work, then you should be using an I/O thread (I/O Completion Port).

Specifically, you should be using the async callbacks supported by whatever library class you're using. These methods are always very clearly labeled; they start with the words Begin and End. As in Stream.BeginRead, Socket.BeginConnect, WebRequest.BeginGetResponse, and so on.
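As a minimal sketch of the Begin/End pattern, here is a file read using Stream.BeginRead and Stream.EndRead. The AsyncReadSketch class, the ReadStart method, and the 4096-byte buffer size are assumptions made for illustration; only the FileStream constructor and the BeginRead/EndRead calls come from the framework.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical example class: starts an asynchronous read and returns at once.
public static class AsyncReadSketch
{
    // Reads up to 4096 bytes from the file and passes the decoded text to
    // onDone once the read completes. The calling thread is not blocked.
    public static void ReadStart(string path, Action<string> onDone)
    {
        // useAsync: true asks the OS for overlapped (asynchronous) I/O.
        var stream = new FileStream(path, FileMode.Open, FileAccess.Read,
                                    FileShare.Read, 4096, useAsync: true);
        var buffer = new byte[4096];

        stream.BeginRead(buffer, 0, buffer.Length, ar =>
        {
            try
            {
                // EndRead completes the operation and returns the byte count.
                int read = stream.EndRead(ar);
                onDone(Encoding.UTF8.GetString(buffer, 0, read));
            }
            finally
            {
                stream.Dispose();
            }
        }, null);
    }
}
```

Note that an exception thrown by EndRead or by onDone surfaces on the callback thread, so real code would need its own error handling there.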

These methods do use the ThreadPool, but they use IOCPs, which do not interfere with ASP.NET requests. They are a special kind of lightweight thread that can be "woken up" by an interrupt signal from the I/O system. And in an ASP.NET application, you normally have one I/O thread for each worker thread, so every single request can have one async operation queued up. That's literally hundreds of async operations without any significant performance degradation (assuming the I/O subsystem can keep up). It's way more than you'll ever need.

Just keep in mind that async delegates do not work this way - they'll end up using a worker thread, just like ThreadPool.QueueUserWorkItem. It's only the built-in async methods of the .NET Framework library classes that are capable of doing this. You can do it yourself, but it's complicated and a little bit dangerous and probably beyond the scope of this discussion.

The best answer to this question, in my opinion, is don't use the ThreadPool or a background Thread instance in ASP.NET. It's not at all like spinning up a thread in a Windows Forms application, where you do it to keep the UI responsive and don't care about how efficient it is. In ASP.NET, your concern is throughput, and all that context switching on all those worker threads is absolutely going to kill your throughput whether you use the ThreadPool or not.

Please, if you find yourself writing threading code in ASP.NET - consider whether or not it could be rewritten to use pre-existing asynchronous methods, and if it can't, then please consider whether or not you really, truly need the code to run in a background thread at all. In the majority of cases, you will probably be adding complexity for no net benefit.

Arceliaarceneaux answered 15/4, 2010 at 4:44 Comment(8)
Thanks for that detailed response and you're right, I do try and use async methods when possible (coupled with async controllers in ASP.NET MVC). In the case of my example, with the remote logger, this is exactly what I can do. It's an interesting design problem though because it pushes the async handling all the way down to the lowest level of your code (ie, the logger implementation), instead of being able to decide it from, say, the controller level (in the latter case, you'd need, for example, two logger implementations to be able to choose from).Deanery
@Michael: Async callbacks are generally pretty easy to wrap if you want to push it up more levels; you could create a façade around the async methods and wrap them with a single method that uses an Action<T> as a callback, for example. If you mean that the choice of whether to use a worker thread or I/O thread happens at the lowest level, that's intentional; only that level can decide whether or not it needs an IOCP.Arceliaarceneaux
Although, as a point of interest, it's only the .NET ThreadPool that limits you this way, probably because they didn't trust developers to get it right. The unmanaged Windows Thread Pool has a very similar API but actually allows you to choose the thread type.Arceliaarceneaux
The acronym should be IOCP not IOPC, can someone with enough points correct the answerEstevan
I/O Completion Ports (IOCP). the description of IOCP is not quite correct. in IOCP, you have a static number of worker threads which take turns working on ALL pending tasks. not to be confused with thread pools which can be fixed or dynamic in size BUT have one thread per task - scales dreadfully. unlike ASYNC, you do not have one thread per task. a IOCP thread may work a bit on task 1, then switch to task 3, task 2 then back to task 1 again. task session states are saved and are passed between threads.Tincal
What about database inserts? Is there an ASYNC SQL command (like Execute)? Database inserts are about the slowest I/O operation around (because of locking) and having the main thread wait for the row(s) to be inserted is just a waste of CPU cycles.Pizarro
@IanThompson: I'd encourage you to read the documentation for whatever database driver/library you're using. There's not one single answer to that question, and it can vary over time. Oracle, for example, only started supporting async recently, and may still not support TPL-style async.Arceliaarceneaux
@Arceliaarceneaux thanks. ODBCCommand doesn't support it, but SQLCommand has an ASYNC BeginExecuteNonQuery which I have used successfully.Pizarro
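The façade idea mentioned in the comments above (wrapping a Begin/End pair in a single method that takes an Action<T> callback) might look something like this. AsyncFacade and Run are hypothetical names, and the separate error callback is an addition of this sketch, not something the comment specified.

```csharp
using System;

// Hypothetical façade: hides a Begin/End pair behind one call with callbacks.
public static class AsyncFacade
{
    public static void Run<T>(
        Func<AsyncCallback, object, IAsyncResult> begin, // e.g. a BeginXxx call
        Func<IAsyncResult, T> end,                       // the matching EndXxx
        Action<T> onSuccess,
        Action<Exception> onError)
    {
        begin(ar =>
        {
            T result;
            try
            {
                result = end(ar); // rethrows any failure from the operation
            }
            catch (Exception ex)
            {
                onError(ex);
                return;
            }
            onSuccess(result);
        }, null);
    }
}
```

A caller higher up (a controller, say) could pass a request's BeginGetResponse and EndGetResponse methods as the first two arguments without knowing anything about IAsyncResult itself, while the decision to use an IOCP-backed operation still sits at the lowest level.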
C
46

Per Thomas Marquardt of the ASP.NET team at Microsoft, it is safe to use the ASP.NET ThreadPool (QueueUserWorkItem).

From the article:

Q) If my ASP.NET Application uses CLR ThreadPool threads, won’t I starve ASP.NET, which also uses the CLR ThreadPool to execute requests? ..

A) To summarize, don’t worry about starving ASP.NET of threads, and if you think there’s a problem here let me know and we’ll take care of it.

Q) Should I create my own threads (new Thread)? Won’t this be better for ASP.NET, since it uses the CLR ThreadPool.

A) Please don’t. Or to put it a different way, no!!! If you’re really smart—much smarter than me—then you can create your own threads; otherwise, don’t even think about it. Here are some reasons why you should not frequently create new threads:

  1. It is very expensive, compared to QueueUserWorkItem...By the way, if you can write a better ThreadPool than the CLR’s, I encourage you to apply for a job at Microsoft, because we’re definitely looking for people like you!.
Clayberg answered 17/5, 2011 at 21:41 Comment(0)
C
4

Websites shouldn't go around spawning threads.

You typically move this functionality out into a Windows Service that you then communicate with (I use MSMQ to talk to them).

-- Edit

I described an implementation here: Queue-Based Background Processing in ASP.NET MVC Web Application

-- Edit

To expand why this is even better than just threads:

Using MSMQ, you can communicate to another server. You can write to a queue across machines, so if you determine, for some reason, that your background task is using up the resources of the main server too much, you can just shift it quite trivially.

It also allows you to batch-process whatever task you were trying to do (send emails/whatever).

Cromorne answered 25/8, 2009 at 2:14 Comment(6)
I wouldn't agree that this blanket statement is always true - especially for non-critical tasks. Creating a Windows Service, just for the purposes of asynchronous logging definitely seems overkill. Besides, that option is not always available (being able to deploy MSMQ and/or a Windows Service).Deanery
Sure, but it's the 'standard' way to implement asynchronous tasks from a website (queue theme against some other process).Cromorne
Not all asynchronous tasks are created equal, which is why for example asynchronous pages exist in ASP.NET. If I want to fetch a result from a remote web service to display, I'm not going to do that via MSMQ. In this case, I'm writing to a log using a remote post. It doesn't fit the problem to write a Windows Service, nor hook up MSMQ for that (and nor can I as this particular app is on Azure).Deanery
Consider: you are writing to a remote host? What if that host is down or un-reachable? Will you want to re-try your write? Maybe you will, maybe you won't. With your implementation, it's hard to retry. With the service, it becomes quite trivial. I appreciate that you may not be able to do it, and I'll let someone else answer to the specific problems with creating threads from websites [i.e. if your thread wasn't background, etc], but I am outlining the 'proper' way to do it. I am not familiar with azure, though I've used ec2 (you can install an OS on that, so anything is fine).Cromorne
@silky, thanks for the comments. I had said "non-critical" to avoid this more heavyweight (yet durable) solution. I've clarified the question so that it's clear I'm not asking for best-practice around queued work items. Azure does support this type of scenario (it has its own queue storage) - but the queuing operation is too expensive for synchronous logging, so I'd need an asynchronous solution anyway. In my case I'm aware of the pitfalls of failure, but I'm not going to add more infrastructure just in case this particular logging provider fails - I've got other logging providers too.Deanery
No worries, apologies if I sounded overly rude :)Cromorne
Z
4

I definitely think that general practice for quick, low-priority asynchronous work in ASP.NET would be to use the .NET thread pool, particularly for high-traffic scenarios as you want your resources to be bounded.

Also, the implementation of threading is hidden - if you start spawning your own threads, you have to manage them properly as well. Not saying you couldn't do it, but why reinvent that wheel?

If performance becomes an issue, and you can establish that the thread pool is the limiting factor (and not database connections, outgoing network connections, memory, page timeouts etc) then you tweak the thread pool configuration to allow more worker threads, higher queued requests, etc.
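For illustration, raising the worker-thread ceiling from code could look roughly like the sketch below. ThreadPool.GetMaxThreads and ThreadPool.SetMaxThreads are real calls; ThreadPoolTuning and RaiseMaxWorkers are hypothetical names. In classic ASP.NET these limits are more commonly adjusted in configuration (the processModel element) than in code, so take this as a sketch of the idea only.

```csharp
using System;
using System.Threading;

// Hypothetical helper: raises the worker-thread limit, never lowers it.
public static class ThreadPoolTuning
{
    // Returns true if the pool already allows `workers` worker threads or
    // the runtime accepted the new limit; false if the change was rejected.
    public static bool RaiseMaxWorkers(int workers)
    {
        ThreadPool.GetMaxThreads(out int maxWorkers, out int maxIo);
        if (workers <= maxWorkers)
        {
            return true; // nothing to do, the current limit is high enough
        }
        // Leave the I/O completion thread limit unchanged.
        return ThreadPool.SetMaxThreads(workers, maxIo);
    }
}
```

As the answer says, this is only worth doing after measuring that the thread pool, and not something else, is the bottleneck.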

If you don't have a performance problem then choosing to spawn new threads to reduce contention with the ASP.NET request queue is classic premature optimization.

Ideally you wouldn't need to use a separate thread to do a logging operation though - just enable the original thread to complete the operation as quickly as possible, which is where MSMQ and a separate consumer thread / process come into the picture. I agree that this is heavier and more work to implement, but you really need the durability here - the volatility of a shared, in-memory queue will quickly wear out its welcome.

Zaffer answered 25/8, 2009 at 5:32 Comment(0)
E
2

You should use QueueUserWorkItem, and avoid creating new threads like you would avoid the plague. For a visual that explains why you won't starve ASP.NET, since it uses the same ThreadPool, imagine a very skilled juggler using two hands to keep a half dozen bowling pins, swords, or whatever in flight. For a visual of why creating your own threads is bad, imagine what happens in Seattle at rush hour when heavily used entrance ramps to the highway allow vehicles to enter traffic immediately instead of using a light and limiting the number of entrances to one every few seconds. Finally, for a detailed explanation, please see this link:

http://blogs.msdn.com/tmarq/archive/2010/04/14/performing-asynchronous-work-or-tasks-in-asp-net-applications.aspx

Thanks, Thomas

Embryectomy answered 15/4, 2010 at 4:4 Comment(2)
That link is very useful, thanks for that Thomas. I'd be interested to hear what you think of @Aaronaught's response too.Deanery
I agree with Aaronaught, and said the same in my blog post. I put it this way, "In an attempt to simplify this decision, you should only switch [to another thread] if you would otherwise block the ASP.NET request thread while you do nothing. This is an oversimplification, but I'm trying to make the decision simple." In other words, don't do it for non-blocking computational work, but do do it if you're making async web service requests to a remote server. Listen to Aaronaught! :)Embryectomy
D
1

That article is not correct. ASP.NET has its own pool of threads, managed worker threads, for serving ASP.NET requests. This pool is usually a few hundred threads and is separate from the ThreadPool pool, which is some smaller multiple of processors.

Using ThreadPool in ASP.NET will not interfere with ASP.NET worker threads. Using ThreadPool is fine.

It would also be acceptable to set up a single thread dedicated to logging and use the producer/consumer pattern to pass log messages to that thread. In that case, since the thread is long-lived, you should create a single new thread to run the logging.
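A minimal sketch of that single logging thread, assuming a bounded in-memory queue built on BlockingCollection<T>. The BackgroundLogger class, the capacity of 1000, and the choice to drop messages when the queue is full are all assumptions of this sketch; the sink delegate stands in for something like PostLog.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical producer/consumer logger: request threads enqueue messages,
// one long-lived background thread drains the queue.
public sealed class BackgroundLogger : IDisposable
{
    private readonly BlockingCollection<string> _queue =
        new BlockingCollection<string>(boundedCapacity: 1000);
    private readonly Thread _worker;
    private readonly Action<string> _sink;

    public BackgroundLogger(Action<string> sink)
    {
        _sink = sink;
        _worker = new Thread(Consume) { IsBackground = true, Name = "logger" };
        _worker.Start();
    }

    // Producer side: never blocks the caller. Returns false if the queue
    // is full or the logger has been shut down, in which case the message
    // is dropped (acceptable only because logging here is non-critical).
    public bool TryLog(string message)
    {
        if (_queue.IsAddingCompleted) return false;
        try { return _queue.TryAdd(message); }
        catch (InvalidOperationException) { return false; } // closed concurrently
    }

    // Consumer side: runs on the dedicated thread until CompleteAdding
    // is called and the queue has been drained.
    private void Consume()
    {
        foreach (string message in _queue.GetConsumingEnumerable())
        {
            try { _sink(message); }
            catch (Exception) { /* non-critical: a failed write is ignored */ }
        }
    }

    // Stops accepting messages, flushes what is queued, then releases the queue.
    public void Dispose()
    {
        _queue.CompleteAdding();
        _worker.Join();
        _queue.Dispose();
    }
}
```

The bound keeps memory use predictable under load, at the cost of losing messages when the sink cannot keep up; anything queued is also lost if the process is recycled.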

Using a new thread for every message is definitely overkill.

Another alternative, if you're only talking about logging, is to use a library like log4net. It handles logging in a separate thread and takes care of all the context issues that could come up in that scenario.

Durance answered 14/10, 2009 at 15:44 Comment(1)
@Sam, I'm actually using log4net and not seeing logs being written in a separate thread - is there some sort of option that I need to enable?Deanery
C
1

I'd say the article is wrong. If you're running a large .NET shop you can safely use the pool across multiple apps and multiple websites (using separate app pools), simply based on one statement in the ThreadPool documentation:

There is one thread pool per process. The thread pool has a default size of 250 worker threads per available processor, and 1000 I/O completion threads. The number of threads in the thread pool can be changed by using the SetMaxThreads method. Each thread uses the default stack size and runs at the default priority.

Catoptrics answered 14/10, 2009 at 21:25 Comment(5)
One application running in a single process is fully capable of bringing itself down! (Or at least degrading its own performance enough to make the thread pool a losing proposition.)Cordilleras
So I'm guessing ASP.NET requests use the I/O completion threads (as opposed to the worker threads) - is that correct?Deanery
From Fritz Onion's article I linked in my answer: "This paradigm changes [from IIS 5.0 to IIS 6.0] the way requests are handled in ASP.NET. Instead of dispatching requests from inetinfo.exe to the ASP.NET worker process, http.sys directly queues each request in the appropriate process. Thus all requests are now serviced by worker threads drawn from the CLR thread pool and never on I/O threads." (my emphasis)Cordilleras
Hmmm, I'm still not entirely sure... That article is from June 2003. If you read this one from May 2004 (admittedly still quite old), it says "The Sleep.aspx test page can be used to keep an ASP.NET I/O thread busy", where Sleep.aspx just causes the current executing thread to sleep: msdn.microsoft.com/en-us/library/ms979194.aspx - When I've got a chance, I'll see if I can code up that example and test on IIS 7 and .NET 3.5Deanery
Yeah, the text of that paragraph is confusing. Farther along in that section it links to a support topic (support.microsoft.com/default.aspx?scid=kb;EN-US;816829) that clarifies things: running requests on I/O completion threads was a .NET Framework 1.0 problem that was fixed in the ASP.NET 1.1 June 2003 Hotfix Rollup Package (after which "ALL requests now run on Worker threads"). More importantly, that example shows quite clearly that the ASP.NET thread pool is the same thread pool exposed by System.Threading.ThreadPool.Cordilleras
M
1

I was asked a similar question at work last week and I'll give you the same answer. Why are you multithreading web applications per request? A web server is a fantastic system, heavily optimized to serve many requests in a timely fashion (i.e. multithreading). Think of what happens when you request almost any page on the web.

  1. A request is made for some page
  2. HTML is served back
  3. The HTML tells the client to make further requests (js, css, images, etc.)
  4. Further information is served back

You give the example of remote logging, but that should be a concern of your logger. An asynchronous process should be in place to receive messages in a timely fashion. Sam even points out that your logger (log4net) should already support this.

Sam is also correct in that using the Thread Pool on the CLR will not cause issues with the thread pool in IIS. The thing to be concerned with here though, is that you are not spawning threads from a process, you are spawning new threads off of IIS threadpool threads. There is a difference and the distinction is important.

Threads vs Process

Both threads and processes are methods of parallelizing an application. However, processes are independent execution units that contain their own state information, use their own address spaces, and only interact with each other via interprocess communication mechanisms (generally managed by the operating system). Applications are typically divided into processes during the design phase, and a master process explicitly spawns sub-processes when it makes sense to logically separate significant application functionality. Processes, in other words, are an architectural construct.

By contrast, a thread is a coding construct that doesn't affect the architecture of an application. A single process might contain multiple threads; all threads within a process share the same state and same memory space, and can communicate with each other directly, because they share the same variables.

Source

Missymist answered 15/10, 2009 at 16:8 Comment(1)
@Ty, thanks for the input, but I'm well aware of how a web server works and it's not really relevant to the question - again, as I said in the question, I'm not asking for guidance on this as an architectural issue. I'm asking for specific technical information. As for it being "the concern of the logger" that should already have an asynchronous process in place - how do you think that asynchronous process should be written by the logger implementation?Deanery
T
1

You can use Parallel.For or Parallel.ForEach and cap the number of threads they are allowed to use, so the work runs smoothly without starving the pool.

However, to run this in the background in an ASP.NET web application, you need to wrap the loop in a TPL task, as below.

// Requires: using System.Threading; using System.Threading.Tasks;
var ts = new CancellationTokenSource();

var po = new ParallelOptions
{
    CancellationToken = ts.Token,
    MaxDegreeOfParallelism = 6 // limit here
};

// Run the bounded loop on a background task so the request is not blocked.
Task.Factory.StartNew(() =>
{
    Parallel.ForEach(collectionList, po, collectionItem =>
    {
        // Code here, e.g. PostLog(logEvent);
    });
});
Televise answered 31/3, 2019 at 7:6 Comment(0)
B
0

I do not agree with the referenced article (C#feeds.com). Creating a new thread is easy but dangerous. The optimal number of active threads to run on a single core is actually surprisingly low - less than 10. It is way too easy to cause the machine to waste time switching threads if threads are created for minor tasks. Threads are a resource that REQUIRE management. The WorkItem abstraction is there to handle this.

There is a trade-off here between reducing the number of threads available for requests and creating so many threads that none of them can run efficiently. This is a very dynamic situation, but I think it is one that should be actively managed (in this case by the thread pool) rather than leaving it to the processor to stay ahead of the creation of threads.

Finally the article makes some pretty sweeping statements about the dangers of using the ThreadPool but it really needs something concrete to back them up.

Bastinado answered 3/2, 2010 at 19:0 Comment(0)
I
0

Whether or not IIS uses the same ThreadPool to handle incoming requests seems hard to get a definitive answer to, and also seems to have changed over versions. So it would seem like a good idea not to use ThreadPool threads excessively, so that IIS has a lot of them available. On the other hand, spawning your own thread for every little task seems like a bad idea. Presumably, you have some sort of locking in your logging, so only one thread could progress at a time, and the rest would just take turns getting scheduled and unscheduled (not to mention the overhead of spawning a new thread). Essentially, you run into the exact problems the ThreadPool was designed to avoid.

It seems that a reasonable compromise would be for your app to allocate a single logging thread that you could pass messages to. You would want to be careful that sending messages is as fast as possible so that you don't slow down your app.

Inverson answered 5/3, 2010 at 22:58 Comment(0)