I'm using Airflow v1.8.1 and run all components (worker, webserver, flower, scheduler) on Kubernetes & Docker. I use the Celery Executor with Redis, and my tasks look like:
(start) -> (do_work_for_product1)
        ├ -> (do_work_for_product2)
        ├ -> (do_work_for_product3)
        ├ -> …
So the start task has multiple downstreams.
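To make the structure concrete, here is a minimal sketch of such a DAG (the dag_id, schedule, and callables are hypothetical placeholders, not my actual code):

from datetime import datetime
from airflow import DAG
from airflow.operators.dummy_operator import DummyOperator
from airflow.operators.python_operator import PythonOperator

# Placeholder for the real per-product work.
def do_work(product, **kwargs):
    print("working on %s" % product)

dag = DAG(
    dag_id='product_fanout',            # hypothetical name
    start_date=datetime(2017, 8, 1),
    schedule_interval=None,             # I trigger it manually
    max_active_runs=1,
)

start = DummyOperator(task_id='start', dag=dag)

# 'start' fans out to one task per product.
for i in range(1, 4):
    work = PythonOperator(
        task_id='do_work_for_product%d' % i,
        python_callable=do_work,
        op_kwargs={'product': 'product%d' % i},
        dag=dag,
    )
    start.set_downstream(work)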
And I set up the concurrency-related configuration as below:
parallelism = 3
dag_concurrency = 3
max_active_runs = 1
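For reference, the first two live under [core] in airflow.cfg, and the per-DAG run limit corresponds to max_active_runs_per_dag there (it can also be set per DAG, as in the sketch above):

[core]
parallelism = 3
dag_concurrency = 3
max_active_runs_per_dag = 1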
Then, when I trigger this DAG manually (I'm not sure whether it also happens on scheduled runs), some of the downstream tasks get executed, but the others get stuck in the "queued" state.
If I clear a stuck task from the Admin UI, it gets executed. There is no worker log for it (after processing the first few downstream tasks, the worker simply stops producing any output).
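The CLI equivalent of clearing those tasks, using the hypothetical dag_id from the sketch above, would be something like:

airflow clear product_fanout -t 'do_work_for_product.*' -s 2017-08-24 -e 2017-08-24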
The web server's log is below (I'm not sure whether the worker exiting is related):
/usr/local/lib/python2.7/dist-packages/flask/exthook.py:71: ExtDeprecationWarning: Importing flask.ext.cache is deprecated, use flask_cache instead.
.format(x=modname), ExtDeprecationWarning
[2017-08-24 04:20:56,496] [51] {models.py:168} INFO - Filling up the DagBag from /usr/local/airflow_dags
[2017-08-24 04:20:57 +0000] [27] [INFO] Handling signal: ttou
[2017-08-24 04:20:57 +0000] [37] [INFO] Worker exiting (pid: 37)
There is no error in the scheduler log either, and the number of tasks that get stuck changes every time I try this.
Because I also use Docker, I'm wondering if this is related: https://github.com/puckel/docker-airflow/issues/94. But so far, no clue.
Has anyone faced a similar issue, or does anyone have an idea of what I can investigate here?
In my case the tasks were put into the queue, but the scheduler seems to have forgotten some of them. When I run airflow scheduler again, they get picked up. I am also using 1.8.1, Kubernetes and Docker, but with LocalExecutor; same issue here. – Castanets
Making the logs directory owned by the airflow user brought the application back to normal: sudo chown -R airflow:airflow logs/ – Eck

Check the container with docker ps, and if it's unhealthy, stop it with docker stop <container_id>, remove it with docker rm <container_id>, then start it again with docker run <image_name>. – Cirone
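For anyone trying the suggestions from the comments, here are the same commands collected into one shell sequence (the container ID and image name are placeholders):

# Fix ownership of the log directory (per Eck)
sudo chown -R airflow:airflow logs/

# Restart the scheduler so forgotten queued tasks get picked up (per Castanets)
airflow scheduler

# Recreate an unhealthy container (per Cirone)
docker ps
docker stop <container_id>
docker rm <container_id>
docker run <image_name>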