No module named 'airflow' when initializing Apache airflow docker
I am trying to run Apache Airflow in Docker on a CentOS 7 machine. I followed all the instructions here: https://airflow.apache.org/docs/apache-airflow/stable/start/docker.html

When I try to initialize it by running docker-compose up airflow-init, I get this error:

[root@centos7 centos]# docker-compose up airflow-init
Creating network "centos_default" with the default driver
Creating volume "centos_postgres-db-volume" with default driver
Creating centos_redis_1    ... done
Creating centos_postgres_1 ... done
Creating centos_airflow-init_1 ... done
Attaching to centos_airflow-init_1
airflow-init_1       | BACKEND=postgresql+psycopg2
airflow-init_1       | DB_HOST=postgres
airflow-init_1       | DB_PORT=5432
airflow-init_1       |
airflow-init_1       | Traceback (most recent call last):
airflow-init_1       |   File "/home/airflow/.local/bin/airflow", line 5, in <module>
airflow-init_1       |     from airflow.__main__ import main
airflow-init_1       | ModuleNotFoundError: No module named 'airflow'
airflow-init_1       | Traceback (most recent call last):
airflow-init_1       |   File "/home/airflow/.local/bin/airflow", line 5, in <module>
airflow-init_1       |     from airflow.__main__ import main
airflow-init_1       | ModuleNotFoundError: No module named 'airflow'
airflow-init_1       | Traceback (most recent call last):
airflow-init_1       |   File "/home/airflow/.local/bin/airflow", line 5, in <module>
airflow-init_1       |     from airflow.__main__ import main
airflow-init_1       | ModuleNotFoundError: No module named 'airflow'
airflow-init_1       | Traceback (most recent call last):
airflow-init_1       |   File "/home/airflow/.local/bin/airflow", line 5, in <module>
airflow-init_1       |     from airflow.__main__ import main
airflow-init_1       | ModuleNotFoundError: No module named 'airflow'
airflow-init_1       | Traceback (most recent call last):
airflow-init_1       |   File "/home/airflow/.local/bin/airflow", line 5, in <module>
airflow-init_1       |     from airflow.__main__ import main
airflow-init_1       | ModuleNotFoundError: No module named 'airflow'
centos_airflow-init_1 exited with code 1

I used the standard YAML file from here: https://airflow.apache.org/docs/apache-airflow/2.0.1/docker-compose.yaml. I found that it is a known issue (https://github.com/apache/airflow/issues/14520), but I could not understand how to solve it. Any advice?

Melentha answered 25/3, 2021 at 1:23 Comment(0)
S
2

I solved this problem this way.

Log in as a non-root user.

Find your user ID:

echo $UID

Create a .env file and put these lines in it, replacing 4003 with your user ID:

AIRFLOW_UID=4003
AIRFLOW_GID=0

If you have not created these directories yet, create them first, then run docker-compose:

sudo mkdir ./dags ./logs ./plugins
sudo chmod 777 -R logs 
sudo docker-compose up airflow-init
sudo docker-compose up 
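
The preparation steps above can also be scripted. This is a minimal sketch, not part of the original answer: the helper name prepare_airflow_dir is hypothetical, and it assumes a Linux host where os.getuid() is available. It creates the three directories as the current user and writes the .env file with that user's ID.

```python
import os
import tempfile
from pathlib import Path


def prepare_airflow_dir(base: Path, uid: int) -> Path:
    """Create dags/, logs/, plugins/ under base and write a .env file.

    Hypothetical helper that mirrors the manual steps above.
    """
    for name in ("dags", "logs", "plugins"):
        (base / name).mkdir(parents=True, exist_ok=True)
    env_file = base / ".env"
    # Same two settings as in the answer: the host user ID and group 0.
    env_file.write_text(f"AIRFLOW_UID={uid}\nAIRFLOW_GID=0\n")
    return env_file


if __name__ == "__main__":
    # Demo in a throwaway directory; pass your compose directory instead.
    with tempfile.TemporaryDirectory() as tmp:
        env_file = prepare_airflow_dir(Path(tmp), os.getuid())
        print(env_file.read_text())
```

Directories created this way belong to the current user rather than root, which may make the chmod step unnecessary; that is an assumption worth checking on your own machine.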

Sherrillsherrington answered 21/4, 2021 at 11:1 Comment(0)
M
1

I found the problem. There is a bug in version 2.0.1 that prevents you from running the Airflow containers as root. You have to run the installation as another user (with sudo).

Melentha answered 31/3, 2021 at 7:18 Comment(1)
Can you expand on this? I'm seeing the bug and not running as root. - Ptolemaist
S
1

Running Airflow in docker-compose on Linux (Mint 20.2 Uma 64-bit) already required me to set AIRFLOW_UID and AIRFLOW_GID, and I still got the ModuleNotFoundError described above, which indicates that airflow is not on the PYTHONPATH. This might be because you are not logging in as the airflow user (UID=50000). Setting the PYTHONPATH resolved my issue. Here is how, in case it is useful for someone:

  • I started a single container manually under my own user ID (e.g. with a sleep infinity command) and connected to it using docker exec -it <CONTAINER_NAME> /bin/bash to investigate further:
    • Running airflow and running import airflow (from Python) both failed with this error, which confirmed my assumption.
    • Check where airflow is located and whether that location is on the PYTHONPATH:
      $ which airflow
      /home/airflow/.local/bin/airflow
      
      $ echo $PYTHONPATH
      /opt/airflow:
      
    • Since /home/airflow/.local/bin/airflow is simply a Python script that imports airflow from /home/airflow/.local/lib/python3.9/site-packages, add this path to PYTHONPATH and run airflow to test:
      $ PYTHONPATH="/home/airflow/.local/lib/python3.9/site-packages:$PYTHONPATH"
      $ airflow
      # output omitted
      
  • Finally, I added this PYTHONPATH setting to my docker-compose.yaml file and ran docker-compose up to start Airflow.
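
The investigation in the bullets above can also be done from Python inside the container. This is a minimal sketch under stated assumptions: dirs_containing_package is a hypothetical helper, not an Airflow function, and the site-packages path is the one quoted in this answer, which may differ in other image versions. It reports which directories actually hold the package, so you can compare them against sys.path.

```python
import sys
from pathlib import Path


def dirs_containing_package(package: str, search_dirs: list[str]) -> list[str]:
    """Return the entries of search_dirs that contain the given package.

    A package counts as present if the directory holds either a
    <package>/ subdirectory or a <package>.py file.
    """
    found = []
    for entry in search_dirs:
        base = Path(entry)
        if (base / package).is_dir() or (base / f"{package}.py").is_file():
            found.append(entry)
    return found


if __name__ == "__main__":
    # Path taken from the answer above; adjust for your image.
    site_packages = "/home/airflow/.local/lib/python3.9/site-packages"
    print("on sys.path:", dirs_containing_package("airflow", sys.path))
    print("in user site-packages:", dirs_containing_package("airflow", [site_packages]))
```

If the first list is empty and the second is not, the package is installed but not on the import path, which matches the situation described in this answer.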
Spathic answered 11/7, 2023 at 8:40 Comment(0)
P
0

This can happen if AIRFLOW_GID is not set properly in the .env file.

The instructions include running the command echo -e "AIRFLOW_UID=$(id -u)\nAIRFLOW_GID=0" > .env. To check this worked as expected, look at the contents of the .env file by running cat .env. You should see something that looks like this:

AIRFLOW_UID=1000
AIRFLOW_GID=0

If you do not, you may need to edit the .env file manually to set the Airflow UID and GID.
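
The check described above can be automated. This is a minimal sketch, assuming the simple KEY=VALUE format shown above; parse_env and check_airflow_env are hypothetical helper names, not part of Airflow or docker-compose.

```python
import os
from pathlib import Path


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def check_airflow_env(text: str, expected_uid: int) -> list[str]:
    """Return the problems found; an empty list means the file looks right."""
    values = parse_env(text)
    problems = []
    uid = values.get("AIRFLOW_UID")
    gid = values.get("AIRFLOW_GID")
    if uid != str(expected_uid):
        problems.append(f"AIRFLOW_UID should be {expected_uid}, found {uid!r}")
    if gid != "0":
        problems.append(f"AIRFLOW_GID should be 0, found {gid!r}")
    return problems


if __name__ == "__main__":
    env_path = Path(".env")
    if env_path.is_file():
        # Compare against the current host user, as the instructions do with id -u.
        for problem in check_airflow_env(env_path.read_text(), os.getuid()):
            print(problem)
    else:
        print(".env not found in the current directory")
```

Run it from the directory that holds docker-compose.yaml; no output means both values match what the instructions expect.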

Ptolemaist answered 9/4, 2021 at 18:37 Comment(0)
