Memory usage not released even after job completes successfully

I have a job added in APScheduler which loads some data into memory, and I delete all the objects once the job is complete. If I run the job as a plain Python script it works fine: the memory is released when the process exits. But when the same job runs under APScheduler (I am using BackgroundScheduler), the memory usage does not come down. Thanks in advance.
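
Roughly, the setup looks like this (a minimal sketch; load_and_release is a hypothetical stand-in for the real job):

import time
from apscheduler.schedulers.background import BackgroundScheduler

def load_and_release():
    # Hypothetical stand-in: allocate a large object, then drop every reference
    data = [0] * 10_000_000
    # ... work with data ...
    del data  # references are gone, yet the process's memory usage stays high

scheduler = BackgroundScheduler()
scheduler.add_job(load_and_release, 'interval', minutes=5)
scheduler.start()

try:
    while True:
        time.sleep(1)
except (KeyboardInterrupt, SystemExit):
    scheduler.shutdown()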

Payer answered 19/9, 2017 at 13:22 Comment(5)
Not sure if this is the case here, but if the application is still running, and you do not need the memory for other tasks, the garbage collection might not kick in immediately, but only when needed.Morse
@Morse Thanks for the reply... I have tried calling gc.collect() explicitly after the job completes, but even that doesn't bring the memory usage down...Payer
So, does the memory keep rising with repeated executions? Do you get some sort of memory error, or does it start swapping?Morse
@Morse yes, the memory keeps rising with repeated executions...Payer
Any updates regarding this issue?Grazing

I was running quite a few tasks via apscheduler. I suspected this setup led to R14 errors on Heroku, with dyno memory overload, crashes and restarts occurring daily. So I spun up another dyno and scheduled a few jobs to run very frequently.

Watching the metrics tab in Heroku, it immediately became clear that apscheduler was the culprit.

Removing jobs after they're run was recommended to me. But this is of course a bad idea when running cron and interval jobs as they won't run again.

What finally solved it was tweaking the ThreadPoolExecutor (lowering the max number of workers); see this answer on Stack Overflow and this and this post on GitHub. I definitely suggest you read the docs on this.

Other diagnostics resources: 1, 2.

Example code:

import logging
from apscheduler.executors.pool import ThreadPoolExecutor, ProcessPoolExecutor
from apscheduler.schedulers.blocking import BlockingScheduler
from tests import overloadcheck

# Surface APScheduler's internal logging so you can watch executor behaviour
logging.basicConfig()
logging.getLogger('apscheduler').setLevel(logging.DEBUG)

# Cap the worker pools: fewer concurrent workers means a smaller memory footprint
sched = BlockingScheduler(
    executors={
        'threadpool': ThreadPoolExecutor(max_workers=9),
        'processpool': ProcessPoolExecutor(max_workers=3)
    }
)

# Route this job to the capped thread pool by naming the executor
@sched.scheduled_job('interval', minutes=10, executor='threadpool')
def message_overloadcheck():
    overloadcheck()

sched.start()

Or, if you, like me, love to run heavy tasks: try the ProcessPoolExecutor as an alternative to, or in addition to, the ThreadPoolExecutor, but in that case make sure to assign it to the specific jobs that need it.

Update: you also need to import ProcessPoolExecutor if you wish to use it; I have added the import to the code above.
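
For example, a heavy job could be routed to the process pool by naming that executor; heavy_task below is a hypothetical stand-in for whatever memory-hungry work you run:

@sched.scheduled_job('interval', minutes=30, executor='processpool')
def run_heavy_task():
    # Executed in one of the 'processpool' worker processes defined above, so
    # memory allocated by the job stays out of the main scheduler process
    heavy_task()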

Tahoe answered 23/11, 2019 at 10:53 Comment(0)

Just in case anyone is using Flask-APScheduler and having memory leak issues: it took me a while to realize that it expects any configuration settings to be in your Flask config, not passed in when you instantiate the scheduler.

So if you (like me) did something like this:

from flask_apscheduler import APScheduler
from apscheduler.schedulers.background import BackgroundScheduler
from apscheduler.executors.pool import ThreadPoolExecutor

# The executor options passed here end up being ignored (see below)
bg_scheduler = BackgroundScheduler(executors={'threadpool': ThreadPoolExecutor(max_workers=1)})
scheduler = APScheduler(scheduler=bg_scheduler)

scheduler.init_app(app)
scheduler.start()

then for whatever reason, when jobs are run in the Flask request context, it will not recognize the 'threadpool' executor or any other configuration settings you may have set.

However, if you set these same options in the Flask Config class as:

class Config(object):
    #: Enable the Flask-APScheduler API
    SCHEDULER_API_ENABLED = True
    #: Cap the pool at one worker, which reduces the memory footprint
    SCHEDULER_EXECUTORS = {"default": {"type": "threadpool", "max_workers": 1}}

    # ... other Flask configuration options

and then do (back in the main script):

scheduler = APScheduler()
scheduler.init_app(app)
scheduler.start()

then the configuration settings actually do get set. My guess is that when I called scheduler.init_app in the original script, Flask-APScheduler saw that I hadn't set any of those settings in my Flask config and so overwrote them with default values, but I'm not 100% sure.

Regardless, hopefully this helps anyone who has tried the top-rated answer but is also using Flask-APScheduler as a wrapper and might still be seeing memory issues.

Valero answered 28/4, 2021 at 0:41 Comment(0)