I have a list of DAGs hosted on Airflow. I want to get the names of the DAGs in an AWS Lambda function so that I can use the names to trigger the DAGs via the experimental API. I am stuck on getting the names of the DAGs. Any help would be appreciated.
from airflow.models import DagBag

# Collect every DAG id known to the local DagBag (skipping the bundled examples)
dag_ids = DagBag(include_examples=False).dag_ids
for dag_id in dag_ids:
    print(dag_id)
Since Airflow 2.0, the airflow list_dags
command is now:
airflow dags list [-h] [-o table, json, yaml, plain] [-S SUBDIR] [-v]
with the following named arguments:
-o, --output
- Output format. Allowed values: json, yaml, plain, table
- Default: "table"
-S, --subdir
- File location or directory from which to look for the DAG. Defaults to '[AIRFLOW_HOME]/dags', where [AIRFLOW_HOME] is the value you set for 'AIRFLOW_HOME' in 'airflow.cfg'
- Default: "[AIRFLOW_HOME]/dags"
-v, --verbose
- Make logging output more verbose
- Default: False
See:
All Airflow CLI commands are listed at https://airflow.apache.org/docs/apache-airflow/stable/usage-cli.html
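The JSON output format makes the CLI result easy to consume from other code. Below is a minimal sketch that parses the kind of array `airflow dags list -o json` prints; the exact record keys (`dag_id`, `filepath`, `owner`, `paused`) are an assumption based on Airflow 2.x output and may differ between versions:

```python
import json

def extract_dag_ids(cli_json: str) -> list[str]:
    """Pull the dag_id field out of each record emitted by
    `airflow dags list -o json`."""
    return [record["dag_id"] for record in json.loads(cli_json)]

# Sample payload in the shape the CLI is assumed to emit (keys are illustrative)
sample = '''
[
  {"dag_id": "etl_daily", "filepath": "etl_daily.py", "owner": "airflow", "paused": "False"},
  {"dag_id": "reporting", "filepath": "reporting.py", "owner": "airflow", "paused": "True"}
]
'''

print(extract_dag_ids(sample))  # → ['etl_daily', 'reporting']
```

Running the CLI via subprocess and feeding its stdout to this function would give a Lambda-friendly list of names without importing Airflow itself.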
In older (pre-2.0) versions you could do
airflow list_dags
You can also connect to the backend database; by default Airflow uses SQLite.
Then you can check the DAGs' status in the dag table
using the columns is_active
and is_paused,
e.g. airflow=# SELECT dag_id FROM dag WHERE is_active=TRUE AND is_paused=FALSE;
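As a self-contained illustration of the same query, the sketch below builds an in-memory SQLite database with a cut-down stand-in for the dag table (the real table has many more columns, and SQLite stores the booleans as 0/1):

```python
import sqlite3

# Toy stand-in for Airflow's `dag` metadata table:
# only the columns the query needs; the real schema is much wider.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dag (dag_id TEXT, is_active INTEGER, is_paused INTEGER)")
conn.executemany(
    "INSERT INTO dag VALUES (?, ?, ?)",
    [("etl_daily", 1, 0), ("old_report", 0, 0), ("paused_job", 1, 1)],
)

# Active, unpaused DAGs -- the same condition as the SQL above
rows = conn.execute(
    "SELECT dag_id FROM dag WHERE is_active = 1 AND is_paused = 0"
).fetchall()
active_dag_ids = [dag_id for (dag_id,) in rows]
print(active_dag_ids)  # → ['etl_daily']
```

Pointing the same query at the real backend (SQLite file, Postgres, or MySQL) returns the ids of every active, unpaused DAG without parsing any DAG files.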
This command shows all DAGs, including disabled DAGs as well:
airflow dags list
This way works slowly, because it needs to parse all Python files to add the DAGs to the DagBag:
from airflow.models import DagBag
dag_ids = DagBag(include_examples=False).dag_ids
for dag_id in dag_ids:
    print(dag_id)
In my case, on the PROD environment it takes about 20 seconds to parse all DAGs. Does anyone know a better way, maybe reading the DAGs from the database? I tried passing the parameter
read_dags_from_db=True
to DagBag, but it returns an empty list of DAGs.
You could also try the stable REST API endpoint
/api/v1/dags?<query_parameters_here>
(but I have not tried it myself). You might also have to configure appropriate authentication settings in airflow.cfg
before using the API. – Orebro
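From a Lambda handler, that endpoint can be reached with nothing but the standard library. The sketch below only builds the authenticated request; the host, credentials, and the basic-auth backend (auth_backend = airflow.api.auth.backend.basic_auth in airflow.cfg) are placeholder assumptions, and actually sending it requires a reachable Airflow webserver:

```python
import base64
import urllib.request

def build_list_dags_request(base_url: str, user: str, password: str) -> urllib.request.Request:
    """Prepare a GET request for Airflow's stable REST API DAG listing.

    Assumes the webserver accepts HTTP basic auth; adjust the header
    if your deployment uses a different auth backend.
    """
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        f"{base_url}/api/v1/dags",
        headers={"Authorization": f"Basic {token}"},
    )

# Placeholder values -- substitute your webserver URL and credentials.
req = build_list_dags_request("http://localhost:8080", "admin", "admin")
print(req.full_url)  # → http://localhost:8080/api/v1/dags
```

Passing the prepared request to urllib.request.urlopen and reading the JSON body would return the same list of DAG ids as the CLI, which fits the Lambda use case in the question.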