Asynchronous background processes with web2py

3

8

I need to run a large (time- and memory-consuming) process asynchronously from a controller method in a web2py application.

My specific use case is to start a process with the standard library's subprocess module and wait for it to exit without blocking the web server, but I am open to alternative methods.

  • Hands-on examples would be a plus.
  • 3rd party library recommendations are welcome.
  • CRON scheduling is not required/wanted.
Hutto answered 29/12, 2011 at 13:20 Comment(0)
7

Assuming you'll need to start multiple, possibly simultaneous, instances of the background task, the solution is a task queue. I've heard good things about Celery with RabbitMQ as the message broker, if you're looking for 3rd-party options, and web2py includes its own task queue system that might be sufficient for your needs.

With either tool, you'll define a function that encapsulates the operation you want the background process to perform. Then bring the task queue workers online. The web2py manual and forums indicate this can be done with an @reboot statement in the web2py cron system, which is triggered whenever the web server starts. There are probably other ways to start the workers if this is unsatisfactory.

In your controller you'll insert a task into the task queue, passing any necessary parameters as inputs to the function (the background function will not run in the same environment as the controller, so it won't have access to the session, DB, etc. unless you explicitly pass the appropriate values into the task function).
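As a rough illustration of that pattern, here is a minimal sketch that uses only the standard library's queue and threading modules. It is not web2py's scheduler or Celery: the in-process worker thread merely stands in for a separate worker process, and the names `long_running`, `task_queue`, and `results` are hypothetical. The point it shows is that the task function receives everything it needs as explicit arguments.

```python
import queue
import threading

# Hypothetical stand-ins for a real task queue and its result store.
task_queue = queue.Queue()
results = {}


def long_running(n):
    """Hypothetical expensive operation; it only uses the values passed in."""
    return sum(i * i for i in range(n))


def worker():
    """Pull (task_id, func, args) tuples off the queue and run them."""
    while True:
        item = task_queue.get()
        if item is None:  # sentinel: shut the worker down
            task_queue.task_done()
            break
        task_id, func, args = item
        results[task_id] = func(*args)
        task_queue.task_done()


# Bring the worker online (a real deployment would start worker processes).
threading.Thread(target=worker, daemon=True).start()

# "Controller" side: enqueue the task with explicit parameters.
task_queue.put(("task-1", long_running, (1000,)))

# Block here only so the sketch finishes deterministically;
# a real controller would return immediately instead.
task_queue.join()
```

With a real task queue the controller would not call `join()`; it would return to the user right after enqueueing.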

The remaining step is getting the output of the background operation to the user. When you insert a task into the task queue, you should get back a unique ID for the task. You would then implement controller logic (either something that expects an AJAX call, or a page that keeps refreshing until the task completes) that calls the task queue's API to check the status of the specified task. If the task's status is "finished", return the data to the user. If not, keep waiting.
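The ID-and-polling idea can be sketched with the standard library's concurrent.futures, assuming an in-memory registry rather than any particular task queue's API. The helper names `submit_task` and `task_status` and the status strings are hypothetical; a real controller action would return the status dictionary as JSON to the AJAX caller.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory registry mapping task IDs to futures.
executor = ThreadPoolExecutor(max_workers=2)
tasks = {}


def submit_task(func, *args):
    """Start func(*args) in the background and return a unique task ID."""
    task_id = uuid.uuid4().hex
    tasks[task_id] = executor.submit(func, *args)
    return task_id


def task_status(task_id):
    """Report the state of a task without blocking the caller."""
    future = tasks.get(task_id)
    if future is None:
        return {"status": "unknown"}
    if not future.done():
        return {"status": "running"}
    error = future.exception()
    if error is not None:
        return {"status": "failed", "error": str(error)}
    return {"status": "finished", "result": future.result()}


# Usage: the first request submits, later requests poll with the ID.
task_id = submit_task(pow, 2, 10)
print(task_status(task_id))
```

The page or AJAX handler would call `task_status` repeatedly until the status is no longer "running".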

Sainthood answered 29/12, 2011 at 14:41 Comment(1)
I think the built-in task scheduler was just what I was looking for. Hutto
2

Maybe review the book section on running tasks in the background. You can use the new scheduler or create a homemade queue (email example). There's also a web2py-celery plugin, though I'm not sure what state that is in.

Unstressed answered 29/12, 2011 at 14:44 Comment(0)
1

This is more difficult than one might expect. Note the deadlock warnings in the documentation for the standard library's subprocess module. It's easy if you don't mind blocking: use Popen.communicate. To work around the blocking, you can manage the process with subprocess from a separate thread.
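A minimal sketch of the thread-based workaround, assuming the output is small enough to buffer in memory: the blocking `Popen.communicate` call runs inside a worker thread, so the caller is free to continue. The helper `run_in_background` and the callback `record` are hypothetical names, and the child command is just a placeholder.

```python
import subprocess
import sys
import threading


def run_in_background(cmd, on_done):
    """Run cmd in a thread; call on_done(returncode, stdout, stderr) at exit."""

    def target():
        proc = subprocess.Popen(
            cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE
        )
        # communicate() reads both pipes until EOF, which avoids the
        # deadlock that wait() can cause when a pipe buffer fills up.
        out, err = proc.communicate()
        on_done(proc.returncode, out, err)

    thread = threading.Thread(target=target, daemon=True)
    thread.start()
    return thread


outcome = {}


def record(returncode, out, err):
    """Hypothetical callback that stores the result for later lookup."""
    outcome.update(returncode=returncode, stdout=out, stderr=err)


# Placeholder child process: a Python one-liner that prints a word.
thread = run_in_background([sys.executable, "-c", "print('done')"], record)

# Joined here only so the sketch completes; a web request would return
# immediately and check `outcome` on a later request.
thread.join()
```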

My favorite way to deal with subprocesses is to use Twisted's spawnProcess. However, it is not easy to get Twisted to play nicely with other frameworks.

Senter answered 29/12, 2011 at 13:38 Comment(0)
