Airflow kills my tasks after 1 minute
I have a very simple DAG with two tasks, like the following:

import datetime as dt

from airflow import DAG
from airflow.operators.python_operator import PythonOperator

# run_task_1, run_task_2, and their arguments are defined elsewhere.

default_args = {
    'owner': 'me',
    'start_date': dt.datetime.today(),
    'retries': 0,
    'retry_delay': dt.timedelta(minutes=1)
}

dag = DAG(
    'test_dag',
    default_args=default_args,
    schedule_interval=None
)

t0 = PythonOperator(
    task_id='task_1',
    python_callable=run_task_1,
    op_args=[arg_1, arg_2, arg_3],
    dag=dag,
    execution_timeout=dt.timedelta(minutes=60)
)

t1 = PythonOperator(
    task_id='task_2',
    python_callable=run_task_2,
    dag=dag,
    execution_timeout=dt.timedelta(minutes=60)
)

t1.set_upstream(t0)

However, when I run it, I see the following in the logs:

[2017-10-17 16:18:35,519] {jobs.py:2083} INFO - Task exited with return code -9

There are no other useful error logs. Has anyone seen this before? Did I define my DAG wrongly? Any help is appreciated!

Desiccate answered 17/10, 2017 at 16:24 Comment(2)
Any luck with solving this issue?Harleigh
Any luck? Getting this while querying a db then writing it locally.Marras
If the container running the task doesn't have enough memory, the task fails with return code -9: that is SIGKILL, typically sent by the kernel's OOM killer. https://www.astronomer.io/guides/dag-best-practices/
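If you can't give the worker more memory, reducing the task's peak memory usage avoids the kill. Here's a minimal sketch, assuming a task that queries a database and writes the result locally (as one commenter above describes); fetch_to_csv, db_path, query, and out_path are hypothetical names, and sqlite3 is just a stand-in for whatever database you use:

import csv
import sqlite3  # stand-in for the actual database driver

def fetch_to_csv(db_path, query, out_path, batch_size=10_000):
    # Stream the result set in fixed-size batches instead of loading
    # everything into memory at once, keeping peak memory bounded.
    conn = sqlite3.connect(db_path)
    try:
        cur = conn.execute(query)
        with open(out_path, "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(col[0] for col in cur.description)  # header row
            while True:
                rows = cur.fetchmany(batch_size)  # bounded memory per batch
                if not rows:
                    break
                writer.writerows(rows)
    finally:
        conn.close()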

Gabey answered 17/12, 2018 at 20:22 Comment(0)
Which version of Airflow are you using?
Since 1.8, the scheduler is less forgiving about a dynamic start_date: https://github.com/apache/incubator-airflow/blob/master/UPDATING.md#less-forgiving-scheduler-on-dynamic-start_date.
Try giving it a fixed date.
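A minimal sketch of that change, reusing the question's default_args (the specific date is just an example):

import datetime as dt

from airflow import DAG

default_args = {
    'owner': 'me',
    # Fixed date instead of dt.datetime.today(), which moves on every
    # scheduler parse and confuses scheduling from Airflow 1.8 on.
    'start_date': dt.datetime(2017, 10, 1),
    'retries': 0,
    'retry_delay': dt.timedelta(minutes=1)
}

dag = DAG(
    'test_dag',
    default_args=default_args,
    schedule_interval=None
)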

Euglena answered 17/10, 2017 at 16:47 Comment(3)
Tried it. Unfortunately, it didn't solve my problem. I think that for some reason the scheduler thinks it's a zombie task and kills it.Desiccate
@user1059968 how did you fix that?Bogy
I was having weird errors: Airflow was wrongly killing tasks as zombies ("WARNING - State of this instance has been externally set to None. Terminating instance."). I had triggered the DAG manually, running on "today's" instance (daily DAG). For some reason it would kill the tasks this way after a few minutes, but if I trigger past instances it runs as it should.Spilt
For us, the solution in GCP Cloud Composer was increasing the worker size, specifically allocating more memory. We saw this after upgrading to Airflow 2 / Composer 2. We weren't seeing any error codes, just tasks failing with no logs.

Enswathe answered 5/8 at 16:29 Comment(0)
