Long-running ASP.NET tasks

I know there's a bunch of APIs out there that do this, but I also know that the hosting environment (being ASP.NET) puts restrictions on what you can reliably do in a separate thread.

I could be completely wrong, so please correct me if I am; this is, however, what I think I know.

  • A request typically times out after 110 seconds by default (the executionTimeout setting is configurable), but eventually the ASP.NET runtime will kill a request that takes too long to complete.
  • The hosting environment, typically IIS, employs process recycling and can decide to recycle your app at any point. When this happens, all threads are aborted and the app restarts. I'm not sure how aggressive it is; it would be unreasonable for it to abort an in-flight HTTP request, but I would expect it to abort a background thread, because it knows nothing about that thread's unit of work.

If you had to design a programming model that could easily and reliably run a long-running task, one that might have to run for days, how would you accomplish this from within an ASP.NET application?

The following are my thoughts on the issue:

I've been thinking along the lines of hosting a WCF service inside a Windows service and talking to it through WCF. This is not very practical, though, because the only reason I would choose to do so is to send tasks (units of work) from several different web apps. I'd then periodically ask the service for status updates and act accordingly. My biggest concern is that it would NOT be a particularly great experience if I had to deploy every task to the service for it to be able to execute its instructions. There's also the issue of input: how would I feed this service with data if I had a large data set that needed chewing through?
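A minimal sketch of what I have in mind, hosting a WCF endpoint inside a Windows service (the contract, implementation, and names here are all placeholders; endpoint and binding configuration would normally live in App.config):

using System;
using System.ServiceModel;
using System.ServiceProcess;

[ServiceContract]
public interface ITaskService
{
    [OperationContract]
    Guid SubmitTask(string payload);

    [OperationContract]
    string GetStatus(Guid taskId);
}

public class TaskService : ITaskService
{
    // Placeholder implementation: hand the payload to a durable queue
    // and return a ticket the caller can poll with.
    public Guid SubmitTask(string payload) { return Guid.NewGuid(); }
    public string GetStatus(Guid taskId) { return "Pending"; }
}

public class TaskHost : ServiceBase
{
    private ServiceHost _host;

    protected override void OnStart(string[] args)
    {
        _host = new ServiceHost(typeof(TaskService));
        _host.Open(); // endpoints come from App.config in this sketch
    }

    protected override void OnStop()
    {
        if (_host != null) { _host.Close(); _host = null; }
    }
}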

What I typically do right now is this:

SELECT TOP 10 *
FROM WorkItem WITH (ROWLOCK, UPDLOCK, READPAST) -- row locks, held for update, skipping rows other workers have locked
WHERE WorkCompleted IS NULL

It lets me use a SQL Server database as a work queue: I periodically poll the database with this query for work. If a work item completes successfully, I mark it as done and proceed until there's nothing more to do. What I don't like is that I could theoretically be interrupted at any point, and if that happens between finishing the work and marking it as done, I could end up processing the same work item twice. I might be a bit paranoid and this might all be fine, but as I understand it there's no guarantee it won't happen...
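One way I can think of to narrow that window is to claim a row and read it in a single atomic statement with UPDATE ... OUTPUT, so there is no gap between selecting the work and flagging it as taken. A rough sketch from .NET (the WorkStarted and Id columns are assumptions on top of the table above; a crashed worker would still need a watchdog to reset stale claims):

using System;
using System.Data.SqlClient;

class WorkQueue
{
    // Claim one unclaimed item and return its key in the same statement.
    const string ClaimSql = @"
        UPDATE TOP (1) WorkItem WITH (ROWLOCK, READPAST)
        SET WorkStarted = GETUTCDATE()
        OUTPUT inserted.Id
        WHERE WorkStarted IS NULL
          AND WorkCompleted IS NULL;";

    public static int? ClaimNext(string connectionString)
    {
        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(ClaimSql, conn))
        {
            conn.Open();
            object id = cmd.ExecuteScalar();          // null when the queue is empty
            return id == null ? (int?)null : (int)id; // assumes an int Id column
        }
    }
}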

I know there have been similar questions on SO before, but none with a definitive answer. This is a really common need, yet the ASP.NET hosting environment is ill-equipped to handle long-running work.

Please share your thoughts.

Swashbuckling answered 25/3, 2010 at 22:4 Comment(0)

John,

I agree that ASP.NET is not suitable for async tasks as you have described them, nor should it be. It is designed as a web-hosting platform, not a back-of-house processor.

We have had similar situations in the past and used a solution similar to what you describe. In summary: keep your WCF service under ASP.NET, and use a "Queue" table with a Windows service as the "QueueProcessor". The client should poll to see if work is done (or use messaging to notify the client).

We used a table that contained the process and its information (eg InvoicingRun). On that table was a status (Pending, Running, Completed, Failed). The client would submit a new InvoicingRun with a status of Pending. A Windows service (the processor) would poll the database for any runs in the Pending state (you could also use SQL Notification so you don't need to poll). If a pending run was found, it would move it to Running, do the processing, and then move it to Completed/Failed.
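To make that concrete, a rough sketch of the processor side (the data access is stubbed out; a real implementation would issue SQL like the sample further down):

using System;
using System.ServiceProcess;
using System.Threading;

public class QueueProcessorService : ServiceBase
{
    private Timer _timer;

    protected override void OnStart(string[] args)
    {
        // Poll every 30 seconds; notifications could replace the timer.
        _timer = new Timer(Poll, null, TimeSpan.Zero, TimeSpan.FromSeconds(30));
    }

    protected override void OnStop()
    {
        _timer.Dispose();
    }

    private void Poll(object state)
    {
        int? runId;
        while ((runId = TryClaimPendingRun()) != null)    // Pending -> Running
        {
            try
            {
                ProcessRun(runId.Value);                  // the long-running work
                SetStatus(runId.Value, completed: true);  // Running -> Completed
            }
            catch (Exception)
            {
                SetStatus(runId.Value, completed: false); // Running -> Failed
            }
        }
    }

    // Stubs standing in for the SQL shown below.
    private int? TryClaimPendingRun() { return null; }
    private void ProcessRun(int id) { }
    private void SetStatus(int id, bool completed) { }
}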

In the case where the process failed fatally (eg DB down, process killed), the run would be left in the Running state and human intervention was required. If the process failed in a non-fatal way (exception, error), the run would be moved to Failed, and you could choose to retry or have human intervention.

If there were multiple processors, the first one to move a run to the Running state got that job; you can use this method to prevent the job being run twice. An alternative is to do the select and then the update to Running inside a transaction. Make sure to do either of these outside of any larger transaction. Sample (rough) SQL:

UPDATE InvoicingRun
SET Status = 2 -- Running
WHERE ID = 1
    AND Status = 1 -- Pending

IF @@ROWCOUNT = 0
    SELECT CAST(0 AS bit) -- another processor claimed the run first
ELSE
    SELECT CAST(1 AS bit) -- this processor owns the run
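From .NET you can do the same compare-and-swap without the extra SELECT, since ExecuteNonQuery returns the number of rows affected (the connection string and parameter handling here are illustrative):

using System.Data.SqlClient;

static class RunClaimer
{
    // Returns true only for the one processor that moved the run
    // from Pending (1) to Running (2).
    public static bool TryClaimRun(string connectionString, int runId)
    {
        const string sql =
            "UPDATE InvoicingRun SET Status = 2 " +
            "WHERE ID = @id AND Status = 1";

        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(sql, conn))
        {
            cmd.Parameters.AddWithValue("@id", runId);
            conn.Open();
            return cmd.ExecuteNonQuery() == 1; // rows affected
        }
    }
}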

Rob

Mouthy answered 25/3, 2010 at 22:4 Comment(6)
That's sort of obvious now that you mention it; I didn't think of that. I could of course flag the row as claimed as soon as I begin processing, but before I do any actual work. But how do you prevent a row from getting stuck in that claimed state if e.g. the DB goes down during execution? (This is very unlikely, but I want to be thorough.)Swashbuckling
Well, it would possibly get stuck in a "running" state. I think you need to have a human administrator move it back (either through an interface or directly in the DB) once the problem that caused the failure has been solved, because it'll be hard to tell whether it is actually running or has failed. If this is not an option, add a "watchdog" that resets it after a certain time (when you are sure the process has failed); it can be hard to determine what that time should be. A last alternative is to keep a transaction open on that row, but then you run into concurrent-access issues.Mouthy
John, it's one of those things that's obvious once you think of it. The method described is essentially mutex-style concurrency control.Mouthy
A combination of our two SQL approaches should solve any concurrency issues, and you probably need to have someone or something periodically check to see if the service is operating as intended.Swashbuckling
Something to be aware of: SQL Notification (Notification Services) was removed in SQL Server 2008. Also, the I/O structure behind tables makes them poorly suited for implementing queues; this has to do with the clustered-index behavior of primary keys, which causes the DB to restructure the data layout as the "queue" expands and contracts.Subway
With the introduction of Azure WebJobs, you can now do this; see my answer for more details. An Azure worker role would be even more robust.Cryptogram

Have a look at NServiceBus

NServiceBus is an open-source communications framework for .NET with built-in support for publish/subscribe and long-running processes.

It is built upon MSMQ, which means your messages don't get lost, since they are persisted to disk. The framework nevertheless has impressive performance and an intuitive API.
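As a rough sketch of the kind of long-running process (saga) NServiceBus supports; the exact types and method names vary between NServiceBus versions, so treat everything here as indicative only:

using System;
using NServiceBus;
using NServiceBus.Saga;

public class StartInvoicingRun : IMessage { public Guid RunId { get; set; } }
public class BatchCompleted : IMessage { public Guid RunId { get; set; } }

public class InvoicingSagaData : ISagaEntity
{
    public Guid Id { get; set; }
    public string Originator { get; set; }
    public string OriginalMessageId { get; set; }
    public int BatchesRemaining { get; set; }
}

public class InvoicingSaga : Saga<InvoicingSagaData>,
                             IAmStartedByMessages<StartInvoicingRun>,
                             IHandleMessages<BatchCompleted>
{
    public void Handle(StartInvoicingRun message)
    {
        Data.BatchesRemaining = 10; // illustrative batch count
        // ... send batch messages to workers here ...
    }

    public void Handle(BatchCompleted message)
    {
        if (--Data.BatchesRemaining == 0)
            MarkAsComplete(); // the saga's durable state is then cleaned up
    }
}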

Anetta answered 25/3, 2010 at 22:4 Comment(3)
I didn't do it, but might as well have, because it's a bit off-topic. NServiceBus seems to be a message-passing framework for building distributed applications. It doesn't really have anything to do with long-running processes. It might be used to transport data (sending messages), but it doesn't really say anything about relaying behavior/instructions at the same time, and it kind of loses its meaningfulness because of that. What problem does NServiceBus solve that WCF does not?Swashbuckling
It supports long-running workflows via sagas: nservicebus.com/Sagas.aspx. You do not get this out of the box with WCF.Anetta
NServiceBus has built-in integration for ASP.NET async page tasks for implementing long-running pages, and in the coming version (2.1) supports MVC AsyncController integration as well. NServiceBus is very much about supporting long-running processes as well as facilitating communication with those processes in a reliable and fault-tolerant way. While you can configure WCF to do that as well, you need to know a lot about WCF to get it right, whereas with NServiceBus it all works that way by default. Another thing WCF doesn't give you is reliable load balancing for MSMQ - NServiceBus does.Subway

Use a simple background-tasks/jobs framework like Hangfire and apply these best-practice principles to the design of the rest of your solution:

  • Keep all actions as small as possible. To achieve this:
  • Divide long-running jobs into batches and queue them (in a Hangfire queue or on a bus of another sort).
  • Make sure your small jobs (the batched parts of long jobs) are idempotent and self-contained, carrying all the context they need to run in any order. That way you don't have to use a queue that maintains a sequence, and you can:
  • Parallelise the execution of jobs in your queue according to how many nodes you have in your web-server farm. You can even control how much load this puts on the farm (as a trade-off against servicing web requests). This ensures you complete the whole job (all batches) as fast and efficiently as possible, without preventing your cluster from servicing web clients. A rough sketch follows this list.
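A sketch of how that batching might look with Hangfire (InvoiceBatchProcessor, the id-range scheme, and the batch size are all illustrative):

using System;
using Hangfire;

public class InvoiceBatchProcessor
{
    // Each job carries the full id range it needs, so jobs are
    // self-contained, can run in any order, and are safe to retry.
    public void ProcessRange(int firstId, int lastId)
    {
        // ... load and process items firstId..lastId idempotently ...
    }
}

public static class LongRunEnqueuer
{
    public static void EnqueueRun(int totalItems, int batchSize)
    {
        for (int first = 1; first <= totalItems; first += batchSize)
        {
            int f = first; // copy to locals to avoid captured-loop-variable surprises
            int l = Math.Min(first + batchSize - 1, totalItems);
            // Hangfire persists the job (e.g. in SQL Server), so it survives
            // app-pool recycles; a worker on any farm node can pick it up.
            BackgroundJob.Enqueue<InvoiceBatchProcessor>(p => p.ProcessRange(f, l));
        }
    }
}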
Die answered 25/3, 2010 at 22:4 Comment(0)

Have you thought about using Workflow Foundation instead of a custom implementation? It also allows you to persist state; tasks could be defined as workflows in this case.
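For what it's worth, .NET 4's WorkflowApplication plus the SQL instance store gives you that persistence; a minimal sketch (the connection string assumes a database prepared with the standard SQL workflow persistence scripts):

using System;
using System.Activities;
using System.Activities.DurableInstancing;
using System.Activities.Statements;

class Host
{
    static void Main()
    {
        // The workflow definition stands in for the real long-running task.
        Activity workflow = new Sequence
        {
            Activities =
            {
                new WriteLine { Text = "Starting long-running task..." },
                // ... long-running activities, delays, compensation, etc. ...
                new WriteLine { Text = "Done." }
            }
        };

        var app = new WorkflowApplication(workflow)
        {
            // State is persisted to SQL Server, so the task can survive
            // process recycles and resume where it left off.
            InstanceStore = new SqlWorkflowInstanceStore(
                "Server=.;Database=WFPersistence;Integrated Security=True"),
            PersistableIdle = e => PersistableIdleAction.Unload
        };

        app.Run();
        Console.ReadLine(); // keep this sketch's host process alive
    }
}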

Just some thoughts...

Michael

Miculek answered 25/3, 2010 at 22:4 Comment(1)
I have not; WWF seems like this big conglomerate thing that solves other kinds of asynchronous, business-oriented tasks. This is really just about crunching numbers on a different thread, but doing so reliably. I appreciate the suggestion though.Swashbuckling
