We have a relatively complex dynamic DAG as part of our ETL. The DAG contains hundreds of transformations and is created programmatically from a set of YAML files. It changes over time: new tasks are added, the queries executed by tasks change, and even the relationships between tasks change.
I know that a new DAG should be created each time it changes in this way, and that DAG versioning is not supported by Airflow, but this is a real use case and I would like to hear any suggestions on how to handle it.
One of the most important requirements, and the reason we want to tackle this, is that we must be aware of DAG versions when we clear or backfill runs for some moment in the past. In practice, this means that when the DAG is executed for a past moment, it must use the version of the DAG from that moment, not the newest one.
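To make the requirement concrete, here is a minimal sketch, not an Airflow feature: keep each historical version of the YAML-derived spec together with the date from which it applies, and pick the version from the run's logical date. The names `DagVersion`, `VersionRegistry`, and `version_for` are hypothetical, and the specs are made-up examples.

```python
from bisect import bisect_right
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class DagVersion:
    """Hypothetical snapshot of the DAG spec, valid from `effective_from`."""
    effective_from: datetime
    label: str   # e.g. a git tag or commit hash of the YAML files
    spec: dict   # parsed YAML content for this version


class VersionRegistry:
    """Keeps versions sorted by start date and looks them up by run date."""

    def __init__(self, versions):
        self._versions = sorted(versions, key=lambda v: v.effective_from)
        self._starts = [v.effective_from for v in self._versions]

    def version_for(self, logical_date: datetime) -> DagVersion:
        # Index of the last version whose start date is <= logical_date.
        idx = bisect_right(self._starts, logical_date) - 1
        if idx < 0:
            raise LookupError(f"no DAG version covers {logical_date}")
        return self._versions[idx]


registry = VersionRegistry([
    DagVersion(datetime(2023, 1, 1), "v1", {"tasks": ["extract", "load"]}),
    DagVersion(datetime(2023, 6, 1), "v2", {"tasks": ["extract", "clean", "load"]}),
])

# A backfill for March 2023 resolves to the spec that was live at that time.
past = registry.version_for(datetime(2023, 3, 15))
print(past.label, past.spec["tasks"])
```

How the resolved spec is then turned into tasks is left open here; the sketch only covers the lookup by date.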
Any suggestions are more than welcome.
generate_dag_code >> trigger_this_new_dag_once >> wait >> disable_the_dag. If this DAG runs every minute, it can be noisy in terms of the number of DAGs, but otherwise I think it would serve all your needs. – Celerity
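Generating a new DAG per change, as the comment suggests, needs a stable way to name each generated DAG. A minimal sketch, assuming the YAML files are already parsed into a dict: derive the DAG id from a hash of the spec, so that any change to tasks, queries, or dependencies yields a new id while an unchanged spec keeps its id. `versioned_dag_id`, the `etl` prefix, and the example specs are hypothetical and not part of Airflow.

```python
import hashlib
import json


def versioned_dag_id(base_name: str, spec: dict, digest_len: int = 8) -> str:
    """Return an id of the form '<base_name>__v_<hash prefix>' for this spec."""
    # sort_keys makes the serialization independent of dict key order.
    canonical = json.dumps(spec, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"{base_name}__v_{digest[:digest_len]}"


spec_old = {"tasks": {"load": {"sql": "SELECT 1", "upstream": []}}}
spec_new = {"tasks": {"load": {"sql": "SELECT 2", "upstream": []}}}

# Changing a task's query changes the id; an identical spec keeps the same id.
print(versioned_dag_id("etl", spec_old))
print(versioned_dag_id("etl", spec_new))
```

Because past runs stay attached to the id they were created under, clearing them would reuse the old definition, provided the old generated DAG files are kept around.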