APScheduler skipping job executions due to maximum number of instances

I am trying to use APScheduler to run periodic jobs with an IntervalTrigger. I've intentionally limited each job to a single running instance because I don't want executions to overlap.

The problem is that after some time the scheduler starts reporting that the maximum number of running instances for a job has been reached, even though it previously reported that the job finished successfully. I found this in the logs:

2015-10-28 22:17:42,137 INFO     Running job "ping (trigger: interval[0:01:00], next run at: 2015-10-28 22:18:42 VET)" (scheduled at 2015-10-28 22:17:42-04:30)
2015-10-28 22:17:44,157 INFO     Job "ping (trigger: interval[0:01:00], next run at: 2015-10-28 22:18:42 VET)" executed successfully

2015-10-28 22:18:42,335 WARNING  Execution of job "ping (trigger: interval[0:01:00], next run at: 2015-10-28 22:18:42 VET)" skipped: maximum number of running instances reached (1)

2015-10-28 22:19:42,171 WARNING  Execution of job "ping (trigger: interval[0:01:00], next run at: 2015-10-28 22:19:42 VET)" skipped: maximum number of running instances reached (1)

2015-10-28 22:20:42,181 WARNING  Execution of job "ping (trigger: interval[0:01:00], next run at: 2015-10-28 22:20:42 VET)" skipped: maximum number of running instances reached (1)

2015-10-28 22:21:42,175 WARNING  Execution of job "ping (trigger: interval[0:01:00], next run at: 2015-10-28 22:21:42 VET)" skipped: maximum number of running instances reached (1)

2015-10-28 22:22:42,205 WARNING  Execution of job "ping (trigger: interval[0:01:00], next run at: 2015-10-28 22:22:42 VET)" skipped: maximum number of running instances reached (1)

As you can see in the logs, the ping job was reported to have executed successfully, but shortly afterwards the next execution is skipped, and so is every execution from that point on.

This is the code I use to schedule jobs:

    from apscheduler.schedulers.background import BackgroundScheduler
    from apscheduler.executors.pool import ThreadPoolExecutor
    from apscheduler.jobstores.sqlalchemy import SQLAlchemyJobStore

    executors = {'default': ThreadPoolExecutor(10)}
    jobstores = {'default': SQLAlchemyJobStore(url='sqlite:///jobs.sqlite')}
    self.scheduler = BackgroundScheduler(executors=executors, jobstores=jobstores)
    ...
    self.scheduler.add_job(func=func,
                           trigger=trigger,
                           kwargs=kwargs,
                           id=plan_id,
                           name=name,
                           misfire_grace_time=misfire_grace_time,
                           replace_existing=True)

The function being run starts some threads that execute the ping command against several network nodes and save the results to a file:

    threads = []
    for link in links:
        thread = Thread(target=ping_test, args=(link, count, interval, timeout))
        threads.append(thread)
        thread.start()
    for thread in threads:
        thread.join()

Note that the ping timeout is set to a value much lower than the trigger interval, so the job should never still be executing when the next run fires.
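To illustrate why the job should finish well within the interval, here is a minimal, self-contained sketch of the fan-out/join pattern above. The `ping_test` stand-in, the link names, and the timings are assumptions for demonstration only, not the original code:

```python
import time
from threading import Thread

def ping_test(link, count, interval, timeout):
    # Dummy stand-in for the real ping worker: each "probe" finishes
    # well within the timeout, so the whole job is short-lived.
    for _ in range(count):
        time.sleep(interval)

links = ['node-a', 'node-b', 'node-c']
start = time.monotonic()

threads = []
for link in links:
    thread = Thread(target=ping_test, args=(link, 3, 0.01, 1.0))
    threads.append(thread)
    thread.start()
for thread in threads:
    thread.join()

# All workers run concurrently, so the total job runtime is roughly
# one worker's runtime, far below a 60-second trigger interval.
elapsed = time.monotonic() - start
```

With numbers like these, the callable returns in a fraction of a second, which is consistent with the "executed successfully" log line two seconds after the run started.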

Any insights on this problem are highly appreciated.

Gilburt answered 29/10, 2015 at 3:50

Comments (3):

Looks like the job does NOT finish before the next execution, hence the warning. Have you actually verified that the target callable really finishes before the next run? – Lp

That's a good question, but the log shows that the job finished. I even added the thread-join mechanism to make sure that no thread was alive before the callable returned, and got the same results. How can I check that the target callable actually finishes? – Gilburt

That log does indeed not make sense. Do you mind increasing the logging level to DEBUG? – Lp
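For reference, the DEBUG logging suggested in the last comment can be enabled with the standard `logging` module; `'apscheduler'` is the root logger name the library uses:

```python
import logging

# Send log records to the console and raise APScheduler's logger to
# DEBUG so executor submissions and job-store activity become visible.
logging.basicConfig(level=logging.DEBUG)
logging.getLogger('apscheduler').setLevel(logging.DEBUG)
```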

© 2022 - 2025 — McMap. All rights reserved.