Airflow: how to get the env vars of each DAG from the code itself

I see the following info in the log:

[2019-02-28 16:33:14,766] {python_operator.py:95} INFO - Exporting the following env vars:

AIRFLOW_CTX_DAG_ID=email_operator_with_log_attachment_example
AIRFLOW_CTX_EXECUTION_DATE=2019-02-28T21:32:51.357255+00:00
AIRFLOW_CTX_TASK_ID=python_send_email
AIRFLOW_CTX_DAG_RUN_ID=manual__2019-02-28T21:32:51.357255+00:00

How do I get this info inside my code?

Thank you very much.

Alage asked 28/2, 2019 at 21:42

You can access these variables with os.environ["ENV_VAR_NAME"] (make sure to import os). For example:

import os
from datetime import datetime

from airflow import DAG
from airflow.operators.python_operator import PythonOperator

default_args = {"start_date": datetime(2019, 1, 1)}

dag = DAG(
    dag_id="demo",
    default_args=default_args,
    schedule_interval="0 0 * * *",
)

def print_env_var():
    # Only set while the task runs, when Airflow exports the
    # AIRFLOW_CTX_* variables into the task's environment
    print(os.environ["AIRFLOW_CTX_DAG_ID"])

print_env = PythonOperator(
    task_id="print_env",
    python_callable=print_env_var,
    dag=dag,
)
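
Note that these variables only exist in the process environment while the task is running. If the same code might also run outside a task (for example at DAG-parse time), a small defensive sketch with os.environ.get avoids a KeyError:

import os

# .get returns None instead of raising KeyError when the
# AIRFLOW_CTX_* variables have not been exported (i.e. when
# this code is not running inside an Airflow task)
dag_id = os.environ.get("AIRFLOW_CTX_DAG_ID")
if dag_id is None:
    print("Not running inside an Airflow task")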

However, the more common way to access this information in a task is through the task context, which you get by setting provide_context=True on your operator (in Airflow 2 the flag was removed and the context is passed automatically).

For example:

dag = DAG(
    dag_id="demo",
    default_args=default_args,
    schedule_interval="0 0 * * *",
)

def print_context(**context):
    print(context)

print_context_task = PythonOperator(
    task_id="print_context",
    python_callable=print_context,
    provide_context=True,  # <====
    dag=dag,
)

The context dict contains a number of entries with information about the task run, including the values from your question:

# {
# 'END_DATE': '2019-01-01',
# 'conf': <module 'airflow.configuration' from '/opt/conda/lib/python3.6/site-packages/airflow/configuration.py'>,
# 'dag': <DAG: context_demo>,
# 'dag_run': None,
# 'ds': '2019-01-01',
# 'ds_nodash': '20190101',
# 'end_date': '2019-01-01',
# 'execution_date': <Pendulum [2019-01-01T00:00:00+00:00]>,
# 'inlets': [],
# 'latest_date': '2019-01-01',
# 'macros': <module 'airflow.macros' from '/opt/conda/lib/python3.6/site-packages/airflow/macros/__init__.py'>,
# 'next_ds': '2019-01-02',
# 'next_ds_nodash': '20190102',
# 'next_execution_date': datetime.datetime(2019, 1, 2, 0, 0, tzinfo=<TimezoneInfo [UTC, GMT, +00:00:00, STD]>),
# 'outlets': [],
# 'params': {},
# 'prev_ds': '2018-12-31',
# 'prev_ds_nodash': '20181231',
# 'prev_execution_date': datetime.datetime(2018, 12, 31, 0, 0, tzinfo=<TimezoneInfo [UTC, GMT, +00:00:00, STD]>),
# 'run_id': None,
# 'tables': None,
# 'task': <Task(PythonOperator): print_exec_date>,
# 'task_instance': <TaskInstance: context_demo.print_exec_date 2019-01-01T00:00:00+00:00 [None]>,
# 'task_instance_key_str': 'context_demo__print_exec_date__20190101',
# 'templates_dict': None,
# 'test_mode': True,
# 'ti': <TaskInstance: context_demo.print_exec_date 2019-01-01T00:00:00+00:00 [None]>,
# 'tomorrow_ds': '2019-01-02',
# 'tomorrow_ds_nodash': '20190102',
# 'ts': '2019-01-01T00:00:00+00:00',
# 'ts_nodash': '20190101T000000',
# 'ts_nodash_with_tz': '20190101T000000+0000',
# 'var': {'json': None, 'value': None},
# 'yesterday_ds': '2018-12-31',
# 'yesterday_ds_nodash': '20181231'
# }
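
For instance, here is a minimal sketch of a callable that reads the same values the AIRFLOW_CTX_* env vars carry, using the context keys shown in the dump above (note that dag_run, and hence run_id, can be None in test mode):

def print_ctx_values(**context):
    # The same values Airflow exports as AIRFLOW_CTX_* env vars,
    # read from the task context instead
    print(context["dag"].dag_id)      # AIRFLOW_CTX_DAG_ID
    print(context["task"].task_id)    # AIRFLOW_CTX_TASK_ID
    print(context["execution_date"])  # AIRFLOW_CTX_EXECUTION_DATE
    print(context["run_id"])          # AIRFLOW_CTX_DAG_RUN_ID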

I explain how to handle the task context in a bit more detail in this blog post (see "3. Passing context to tasks").

Ruffled answered 28/2, 2019 at 23:00
Thank you very much for the detailed explanation. – Alage
Looks like you can only access os.environ["AIRFLOW_CTX_DAG_ID"] inside the operator; you can't access it outside. – Honeymoon
Do you know by any chance if there is something similar for a SnowflakeOperator? I can't find in the documentation how to print it properly. – Paymar

What about using the Variable model from Airflow:

from airflow.models import Variable

# Variable.get looks up an Airflow Variable (from the metadata DB
# or an AIRFLOW_VAR_AIRFLOW_CTX_DAG_ID environment variable)
Variable.get("AIRFLOW_CTX_DAG_ID")
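
A caveat: Variable.get reads Airflow Variables, not the AIRFLOW_CTX_* env vars from the question, and it raises KeyError when the Variable doesn't exist. A small sketch passing a default (the key "my_var" is just a placeholder):

from airflow.models import Variable

# default_var is returned instead of raising KeyError when the
# Variable is not defined; "my_var" is a hypothetical key
value = Variable.get("my_var", default_var=None)
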
Cormorant answered 5/4, 2023 at 13:06
