Log messages for DAG import errors in Airflow 2.x
Asked Answered
C

2

7

I'm running Apache Airflow 2.x locally, using the Docker Compose file that is provided in the documentation. In the .\dags directory on my local filesystem (which is mounted into the Airflow containers), I create a new Python script file, and implement the DAG using the TaskFlow API.

The changed to my DAG are sometimes invalid. For example, maybe I have an ImportError due to an invalid module name, or a syntax error. When Airflow attempts to import the DAG, I cannot find any log messages, from the web server, scheduler, or worker, that would indicate a problem, or what the specific problem is.

Instead, I have to read through my code line-by-line, and look for a problem. This problem is compounded by the fact that my local Python environment on Windows 10, and the Python environment for Airflow, are different versions and have different Python packages installed. Hence, I cannot reliably use my local development environment to detect package import failures, because the packages I expect to be installed in the Airflow environment are different than the ones I have locally. Additionally, the version of Python I'm using to write code locally, and the Python version being used by Airflow, are not matched up.

Thus, I am needing some kind of error logging to indicate that a DAG import failed.

Question: When a DAG fails to update / import, where are the logs to indicate if an import failure occurred, and what the exact error message was?

Carman answered 7/4, 2021 at 15:28 Comment(0)
P
7

Currently, the DAG parsing logs would be under $AIRFLOW_HOME/logs/EXECUTION_DATE/scheduler/DAG_FILE.py.log

Example:

Let's say my DAG file is example-dag.py which has the following contents, as you can notice there is a typo in datetime import:

from airflow import DAG
from airflow.operators.bash import BashOperator
from datetime import dattime   # <-- This Line has typo  


dag = DAG(
    dag_id='example_Dag',
    schedule_interval=None,
    start_date=datetime(2019, 2, 6),
)

t1 = BashOperator(
    task_id='print_date1',
    bash_command='sleep $[ ( $RANDOM % 30 )  + 1 ]s',
    dag=dag)

Now, if you check logs under $AIRFLOW_HOME/logs/scheduler/2021-04-07/example-dag.py.log where $AIRFLOW_HOME/logs is what I have set in $AIRFLOW__LOGGING__BASE_LOG_FOLDER or [logging] base_log_folder in airflow.cfg (https://airflow.apache.org/docs/apache-airflow/2.0.1/configurations-ref.html#base-log-folder)

That file should have logs as below:

[2021-04-07 21:39:02,222] {scheduler_job.py:182} INFO - Started process (PID=686) to work on /files/dags/example-dag.py
[2021-04-07 21:39:02,230] {scheduler_job.py:633} INFO - Processing file /files/dags/example-dag.py for tasks to queue
[2021-04-07 21:39:02,233] {logging_mixin.py:104} INFO - [2021-04-07 21:39:02,233] {dagbag.py:451} INFO - Filling up the DagBag from /files/dags/example-dag.py
[2021-04-07 21:39:02,368] {logging_mixin.py:104} INFO - [2021-04-07 21:39:02,357] {dagbag.py:308} ERROR - Failed to import: /files/dags/example-dag.py
Traceback (most recent call last):
  File "/opt/airflow/airflow/models/dagbag.py", line 305, in _load_modules_from_file
    loader.exec_module(new_module)
  File "<frozen importlib._bootstrap_external>", line 678, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/files/dags/example-dag.py", line 3, in <module>
    from datetime import dattime
ImportError: cannot import name 'dattime'
[2021-04-07 21:39:02,380] {scheduler_job.py:645} WARNING - No viable dags retrieved from /files/dags/example-dag.py
[2021-04-07 21:39:02,407] {scheduler_job.py:190} INFO - Processing /files/dags/example-dag.py took 0.189 seconds

and you will see the error in the Webserver as follow:

enter image description here

Poppyhead answered 7/4, 2021 at 21:48 Comment(0)
G
-1

in the above code you are giving the wrong module name. you should write like this:

from datetime import datetime

then it will work.

Getupandgo answered 22/8, 2023 at 15:34 Comment(1)
You're reading one of the answers to this question which uses import error for demonstration purposes. This does not answer the original questionGalloway

© 2022 - 2024 — McMap. All rights reserved.