airflow - how to do 'Filling up the DagBag' once only

My DAG takes about 50 seconds to parse, and I only use external triggers to start DAG runs, no schedules. I notice Airflow wants to fill the DagBag a lot: on every trigger_dag command, and also in the background, where it keeps re-scanning the dags folder and creating .pyc files seemingly instantly once a new .py is deployed.

Is there any way I can deploy my cluster and get the DagBag filled once, then for the next 2 weeks have dag runs start instantly on any trigger_dag? (Right now it takes 50 seconds just to fill the DagBag before a run starts.) I have no need to update DAG definitions within those 2 weeks.

Harlow answered 25/4, 2019 at 15:40 Comment(0)

50 seconds is an incredibly long time for DAG instantiation. It looks like you are running a big (or just slow) piece of code at the top level of your DAG file, which is very bad practice:

Note: This means all top level code (i.e. anything that isn't defining the DAG) in a DAG file will get run each scheduler heartbeat. Try to avoid top level code in your DAG file unless absolutely necessary.

Airflow works exactly as you described, which is why you should treat the Python files in your DAG folder mostly as configuration files (with some programmatic capabilities). You can't change this with any magic config key or the like; this behaviour is at the core of Airflow.
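
To make that concrete, here is a minimal sketch of the fix (not from the original post; `fetch_config_from_db` is a hypothetical stand-in for the slow code, and the imports assume Airflow 2-style paths). Moving the expensive call from module level into the task callable means it runs only at execution time, not on every parse:

```python
import time
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def fetch_config_from_db():
    """Hypothetical slow operation, e.g. a long database query."""
    time.sleep(50)
    return {"param": 1}


# BAD: top-level code runs on every parse, i.e. on each scheduler
# heartbeat and on each trigger_dag:
# config = fetch_config_from_db()


def run_job():
    # GOOD: the expensive work happens only when the task executes.
    config = fetch_config_from_db()
    print(config)


with DAG(
    dag_id="externally_triggered_job",
    start_date=datetime(2019, 1, 1),
    schedule_interval=None,  # external triggers only, as in the question
) as dag:
    PythonOperator(task_id="run_job", python_callable=run_job)
```

With the slow call kept out of module level, parsing this file takes milliseconds; the 50 seconds are paid only inside the running task.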

Exaggeration answered 25/4, 2019 at 15:57 Comment(4)
why does it need to parse the DAG if it has not changed? – Harlow
Because Airflow doesn't know whether it changed or not. Many DAGs in Airflow are created programmatically, and Airflow can't see changes until it re-creates the DAG (see the sketch below). Moreover, ALL DAGs are re-created each heartbeat, even static ones. That is a huge drawback of Airflow, IMO. – Exaggeration
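
For illustration, here is a sketch of the programmatic pattern meant above (the tenant list and DAG ids are hypothetical; imports assume Airflow 2-style paths). These DAGs only come into existence when the file is executed, so Airflow has no way to detect changes without re-parsing it:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

for tenant in ["alpha", "beta", "gamma"]:  # hypothetical tenant list
    with DAG(
        dag_id=f"export_{tenant}",
        start_date=datetime(2019, 1, 1),
        schedule_interval=None,
    ) as dag:
        BashOperator(task_id="export", bash_command=f"echo exporting {tenant}")
    # expose each DAG in module globals so the DagBag picks it up
    globals()[f"export_{tenant}"] = dag
```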
Yes, in particular when using it in a container: we re-deploy when DAGs change, and it still keeps using up CPU and log rows when idling. – Kukri
There are some options (see below), but they don't seem to be effective. – Kukri
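
For reference, the options usually meant here are the scheduler's file-processing intervals in airflow.cfg (a hedged sketch; exact defaults vary by Airflow version). Raising them throttles the background re-parsing but does not disable it:

```ini
[scheduler]
# minimum seconds before the same DAG file is re-parsed
min_file_process_interval = 300
# seconds between scans of the dags folder for new/deleted files
dag_dir_list_interval = 600
```

The .pyc files themselves are ordinary Python bytecode caching and can be suppressed with PYTHONDONTWRITEBYTECODE=1 in the environment, though that has no real effect on parse time.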
