Running periodic task at time stored in database
Currently, I have periodic tasks set up with a Job Scheduler on my Azure instance. These trigger Django API endpoints at fixed times.

I want to make these times dynamic, which will not work with this solution. The plan is to fire these tasks directly from Django: the schedule times will be stored in my database (MySQL) and retrieved to create the scheduled jobs. When these values change, the scheduler should update accordingly.

After looking at Celery, it seems that periodic tasks with crontab schedules could work. Using these, is it possible to set my scheduled times based on values from my database?

It looks like I will also need a Redis instance. Since I would only be using Celery for periodic tasks, is it still the right approach?

Agamete answered 11/5, 2018 at 16:2 Comment(0)
Cron is meant for tasks that run periodically, as it has easy configuration options for every day, every hour, every 15 minutes, and so on.
Adding cron jobs is not really a good way to schedule dynamic tasks that must run at a specific date (or datetime).

You could use the schedule module, as explained in this answer.

There is also another library, APScheduler; check that the latest version works well with Python 3 (if you use it):

from datetime import date
from apscheduler.schedulers.background import BackgroundScheduler

# Start the scheduler (APScheduler 3.x API; the older 2.x API used
# apscheduler.scheduler.Scheduler and add_date_job)
sched = BackgroundScheduler()
sched.start()

# Define the function that is to be executed
def my_job(text):
    print(text)

# The job will be executed on November 6th, 2009
exec_date = date(2009, 11, 6)

# Store the job in a variable in case we want to cancel it
job = sched.add_job(my_job, 'date', run_date=exec_date, args=['hello'])
Evy answered 11/5, 2018 at 16:22 Comment(0)
I'm using Django, Celery, RabbitMQ and PostgreSQL.

I'm doing exactly what you want to do.

pip: install celery and flower

You need a Celery config file (in the same folder as your settings.py).

What you want to add is beat_schedule:

from celery.schedules import crontab

app.conf.beat_schedule = {
    'task-name': {
        'task': 'myapp.tasks.task_name',
        'schedule': crontab(minute=30, hour=5, day_of_week='mon-fri'),
    },
}

This will add an entry in your database to execute task_name (Monday to Friday at 5:30). You can change it directly in your settings (reload celery and celery beat afterwards).

What I love is that you can add a retry mechanism really easily, with safety limits:

@app.task(bind=True, max_retries=50)
def task_name(self, entry_pk):
    entry = Entry.objects.get(pk=entry_pk)
    try:
        entry.method()
    except ValueError as e:
        raise self.retry(exc=e, countdown=5 * 60, queue="punctual_queue")

When my method() raises ValueError, the task will be re-executed 5 minutes later, up to a maximum of 50 tries.
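The same retry semantics can be sketched without Celery. This is a minimal stand-in, not Celery's API: the names `run_with_retries` and `flaky` are illustrative, and the wait function is injected so the example runs instantly.

```python
def run_with_retries(func, max_retries=50, countdown=300, sleep=lambda s: None):
    """Call func(); on ValueError, wait `countdown` seconds and retry,
    giving up after `max_retries` additional attempts."""
    attempts = 0
    while True:
        try:
            return func()
        except ValueError:
            if attempts >= max_retries:
                raise
            attempts += 1
            sleep(countdown)  # stubbed out here so the example runs instantly

# A job that fails twice before succeeding
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ValueError("not ready")
    return "done"

print(run_with_retries(flaky))  # → done
```

Celery's `self.retry()` works the same way conceptually, but re-enqueues the task through the broker instead of looping in-process.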

The good part is that you have access to the scheduled tasks in the Django admin, and you can check with Flower whether a task was executed or not (with a traceback).

I have more than 1,000 tasks executed daily; what you need is to create queues and workers.

I use 10 workers for that (for future scaling purposes):

celery multi start 10 -A MYAPP -Q:1-3 recurring_queue,punctual_queue -Q:4,5 punctual_queue -Q recurring_queue --pidfile="%n.pid"

And the daemon that launches the tasks:

celery -A MYAPP beat -S django --detach
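The `-S django` flag above uses the django-celery-beat scheduler, which stores the schedule in the database. To make the crontab values dynamic (the asker's goal), you can edit them through its models. A hedged sketch, assuming `django_celery_beat` is installed and run inside a configured Django project; the task name `'task-name'` and `set_task_schedule` helper are illustrative:

```python
from django_celery_beat.models import CrontabSchedule, PeriodicTask

def set_task_schedule(minute, hour, day_of_week):
    # Look up or create a crontab row from values stored in the database
    schedule, _ = CrontabSchedule.objects.get_or_create(
        minute=minute, hour=hour, day_of_week=day_of_week,
    )
    # Point the periodic task at the (possibly new) schedule
    task, _ = PeriodicTask.objects.update_or_create(
        name='task-name',  # assumed task name, adjust to yours
        defaults={'task': 'myapp.tasks.task_name', 'crontab': schedule},
    )
    return task

# Changing the stored values reschedules the job; celery beat picks it up.
set_task_schedule(minute='30', hour='5', day_of_week='mon-fri')
```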

It's maybe overkill for you, but Celery can do much more:

- sending emails asynchronously (if it fails, you can correct and resend the email)
- uploading and post-processing asynchronously for the user
- any task that takes time but that you don't want to wait for (you can chain tasks: one finishes and returns a result, which is used by another task)

Shrapnel answered 11/5, 2018 at 16:26 Comment(2)
Thanks for the detailed answer. One key thing that needs to happen is being able to change the parameters in crontab to pull from the database (dynamic values). I would want to set the minute, hour and day_of_week values to ones pulled from the DB. Do you know if this is possible with this solution?Agamete
The beat schedule (celery beat) is written to the database when you launch the daemon, but the cron will be launched with the values stored in the database (which you can update dynamically)Shrapnel
Without external libraries, you can set up a daily cron script that fetches today's tasks from the database and uses threading to run each one at its stored time.

import datetime
import threading
import time

def take_a_background_nap(time_to_send):
    # Sleep in one-minute increments until the target time, then run
    while datetime.datetime.now() < time_to_send:
        time.sleep(60)
    print('finally running')
    return


threadObj = threading.Thread(
    target=take_a_background_nap,
    args=[datetime.datetime(2020, 5, 11, 12, 53, 0)],
)
threadObj.start()

You can have as many threads as you want, but pay attention to concurrency issues.
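A stdlib alternative to the sleep loop is `threading.Timer`, which computes the delay once and waits for you (the helper name `schedule_at` is my own, not from the answer):

```python
import datetime
import threading

def schedule_at(run_at, func, *args):
    # Delay in seconds until the stored run time (never negative)
    delay = max(0.0, (run_at - datetime.datetime.now()).total_seconds())
    timer = threading.Timer(delay, func, args=args)
    timer.start()
    return timer

# Usage: run almost immediately for demonstration
results = []
t = schedule_at(
    datetime.datetime.now() + datetime.timedelta(seconds=0.1),
    results.append, 'finally running',
)
t.join()            # wait for the timer thread to fire
print(results[0])   # → finally running
```

Unlike the minute-granularity sleep loop, the timer fires at (roughly) the exact second, and the returned object can be cancelled with `timer.cancel()` before it fires.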

Ronen answered 11/5, 2018 at 17:16 Comment(0)
Yes, Celery is perfect for periodic tasks. I even wrote an article on how to dynamically update periodic tasks in Celery and Django, and published a very simple project on GitHub showcasing how to update periodic tasks.

My project was very simple: I was using only one worker, with SQLite as the broker and results backend. If you plan to have more workers, you can use PostgreSQL as the broker and results backend instead. My configuration for Celery with SQLite:

# celery broker and results in sqlite
CELERY_BROKER_URL = "sqla+sqlite:///celery.sqlite"
CELERY_RESULT_BACKEND = "db+sqlite:///celery.sqlite"

I was using Celery version 5.2.7.

Kuykendall answered 18/10, 2022 at 11:52 Comment(0)
