I am using the Airflow CLI's backfill command to manually run some backfill jobs.
airflow backfill mydag -i -s 2018-01-11T16-00-00 -e 2018-01-31T23-00-00 --reset_dagruns --rerun_failed_tasks
The DAG's schedule interval is hourly and it has around 40 tasks, so this kind of backfill job takes more than a day to finish. I need it to run without supervision. However, I noticed that if even one task fails in any of the runs in the backfill interval, the entire backfill job stops with the following exception and I have to restart it manually.
Traceback (most recent call last):
File "/home/ubuntu/airflow/bin/airflow", line 4, in <module>
__import__('pkg_resources').run_script('apache-airflow==1.10.0', 'airflow')
File "/home/ubuntu/airflow/lib/python3.5/site-packages/pkg_resources/__init__.py", line 719, in run_script
self.require(requires)[0].run_script(script_name, ns)
File "/home/ubuntu/airflow/lib/python3.5/site-packages/pkg_resources/__init__.py", line 1504, in run_script
exec(code, namespace, namespace)
File "/home/ubuntu/airflow/lib/python3.5/site-packages/apache_airflow-1.10.0-py3.5.egg/EGG-INFO/scripts/airflow", line 32, in <module>
args.func(args)
File "/home/ubuntu/airflow/lib/python3.5/site-packages/apache_airflow-1.10.0-py3.5.egg/airflow/utils/cli.py", line 74, in wrapper
return f(*args, **kwargs)
File "/home/ubuntu/airflow/lib/python3.5/site-packages/apache_airflow-1.10.0-py3.5.egg/airflow/bin/cli.py", line 217, in backfill
rerun_failed_tasks=args.rerun_failed_tasks,
File "/home/ubuntu/airflow/lib/python3.5/site-packages/apache_airflow-1.10.0-py3.5.egg/airflow/models.py", line 4105, in run
job.run()
File "/home/ubuntu/airflow/lib/python3.5/site-packages/apache_airflow-1.10.0-py3.5.egg/airflow/jobs.py", line 202, in run
self._execute()
File "/home/ubuntu/airflow/lib/python3.5/site-packages/apache_airflow-1.10.0-py3.5.egg/airflow/utils/db.py", line 74, in wrapper
return func(*args, **kwargs)
File "/home/ubuntu/airflow/lib/python3.5/site-packages/apache_airflow-1.10.0-py3.5.egg/airflow/jobs.py", line 2533, in _execute
airflow.exceptions.AirflowException:
Some task instances failed:
{('mydag', 'a_task', datetime.datetime(2018, 1, 30, 17, 5, tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=0, name=None)))}
The task instances do not depend on their previous instances, so I don't mind if one or two tasks fail. I need the job to continue.
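For context, the manual restart I do today could probably be automated with a wrapper along these lines. This is only a minimal sketch: `run_until_success`, `max_attempts`, `delay_seconds`, and `runner` are names I made up for illustration, none of this is an Airflow feature, and I have not verified how `--reset_dagruns` behaves across repeated restarts. It relies only on the command exiting with a non-zero status when the exception above is raised.

```python
import subprocess
import time

# The same backfill command as above, split into an argument list.
BACKFILL_CMD = [
    "airflow", "backfill", "mydag", "-i",
    "-s", "2018-01-11T16-00-00",
    "-e", "2018-01-31T23-00-00",
    "--reset_dagruns", "--rerun_failed_tasks",
]


def run_until_success(cmd, max_attempts=5, delay_seconds=60,
                      runner=subprocess.call):
    """Re-run cmd until it exits with status 0 or the attempts run out.

    Hypothetical helper: returns the number of attempts used, or None
    if every attempt exited with a non-zero status.
    """
    for attempt in range(1, max_attempts + 1):
        # subprocess.call blocks until the command ends and returns its exit code.
        if runner(cmd) == 0:
            return attempt
        # Pause before the next restart, but not after the final attempt.
        if attempt < max_attempts:
            time.sleep(delay_seconds)
    return None


# Intended usage (left commented out so that importing this file has no side effects):
# run_until_success(BACKFILL_CMD)
```

A wrapper like this only restarts the whole command. It does not let the backfill move past a failed task within a single run, which is what I actually want.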
I could not find any option in the documentation of backfill which would allow me to specify this behaviour.
Is there a way to achieve what I am looking for?