Background
I am currently deploying Apache Airflow with Helm (using this chart). I am using a git-sync sidecar to mount the SQL and Python files that Airflow needs access to in order to execute scripts.
What seems not to work
Once my container is deployed, the Airflow user seems unable to use the files mounted by the git-sync sidecar, and it exits with the following error (this happens for all mounted files, not only target):
[Errno 13] Permission denied: 'target'
What I have tried
The Dockerfile for my deployment looks like this:
FROM apache/airflow:1.10.14-python3.8

USER root

# apt deps
RUN apt-get update \
    && apt-get install -y --no-install-recommends \
        nano \
        gcc \
        python3-dev \
    && apt-get autoremove -yqq --purge \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

# Use airflow user for pip installs and other things.
USER airflow

# Additional requirements for Airflow
COPY requirements.txt /tmp/requirements.txt
RUN pip install -r /tmp/requirements.txt

# Creating folder for logs
RUN mkdir -p /opt/airflow/dbt_logs
RUN pip install dbt==0.18.1

EXPOSE 8081
Running ls -ld */ (in /opt/airflow) in the scheduler container, I get:
airflow@airflow-scheduler-58c7cb87b8-8nk6f:/opt/airflow$ ls -ld */
drwxrwsrwx 5 1000 1000 41 Dec 26 16:02 dags/
drwxr-xr-x 2 airflow airflow 6 Dec 26 14:10 dbt_logs/
drwxrwxr-x 1 airflow root 52 Dec 26 16:02 logs/
And running ls -ld */ (in /opt/airflow) in the web server container, I get:
airflow@airflow-web-7f8df9457-7n2dp:/opt/airflow$ ls -ld */
drwxrwsrwx 5 root nogroup 41 Dec 26 14:20 dags/
drwxr-xr-x 1 airflow airflow 21 Dec 26 14:41 dbt_logs/
drwxrwxr-x 1 airflow root 23 Dec 26 14:20 logs/
This is what the structure of the dbt folder looks like within my dags dir:
airflow@airflow-scheduler-6cdc985b9b-ssmhx:/opt/airflow/dags/dbt/dw$ ls -l
total 16
-rw-r--r-- 1 root 1000 3061 Dec 27 09:00 README.md
drwxr-sr-x 2 root 1000 22 Dec 27 09:00 analysis
drwxr-sr-x 2 root 1000 22 Dec 27 09:00 data
-rw-r--r-- 1 root 1000 1852 Dec 27 09:00 dbt_project.yml
drwxr-sr-x 2 root 1000 214 Dec 27 09:00 macros
drwxr-sr-x 3 root 1000 21 Dec 27 09:00 models
-rw-r--r-- 1 root 1000 141 Dec 27 09:00 packages.yml
-rw-r--r-- 1 root 1000 842 Dec 27 09:00 profiles.yml
drwxr-sr-x 2 root 1000 22 Dec 27 09:00 snapshots
drwxr-sr-x 2 root 1000 22 Dec 27 09:00 tests
It is worth mentioning that I cannot create files within the dbt dir as the airflow user (permission denied).
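The listing above can be read against the classic owner/group/other rules to see why creating a file fails. The following is a minimal sketch, not a diagnosis of this cluster: the modes are transcribed from the ls output, while `may_write`, `AIRFLOW_UID` and the group sets are hypothetical names and values chosen for illustration (the actual uid and groups of the airflow user are not shown in the question). It ignores ACLs, capabilities and read-only mounts.

```python
import stat

# Modes transcribed from the ls output (setgid directories).
DAGS_DIR_MODE = stat.S_IFDIR | 0o2777  # drwxrwsrwx, /opt/airflow/dags
DBT_DIR_MODE = stat.S_IFDIR | 0o2755   # drwxr-sr-x, e.g. dags/dbt/dw/models


def may_write(mode, owner_uid, owner_gid, uid, gids):
    """Classic owner/group/other write check (no ACLs, no capabilities)."""
    if uid == 0:
        return True  # root bypasses the permission bits
    if uid == owner_uid:
        return bool(mode & stat.S_IWUSR)
    if owner_gid in gids:
        return bool(mode & stat.S_IWGRP)
    return bool(mode & stat.S_IWOTH)


# Assumption: a non-root uid for the airflow user; the real value is not
# given in the question.
AIRFLOW_UID = 50000

if __name__ == "__main__":
    # dbt directories are owned by root (0) with group 1000.
    for gids in ({0}, {1000}):
        allowed = may_write(DBT_DIR_MODE, 0, 1000, AIRFLOW_UID, gids)
        print(stat.filemode(DBT_DIR_MODE), "groups", sorted(gids), "->", allowed)
    # The top-level dags directory is world-writable.
    print(stat.filemode(DAGS_DIR_MODE), "->",
          may_write(DAGS_DIR_MODE, 1000, 1000, AIRFLOW_UID, {0}))
```

Under these assumptions, neither the group bits nor the other bits of the dbt directories grant write access, so only root (or the owning uid, which is also root here) could create a directory such as target inside them, even though the top-level dags directory is writable by everyone.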
It seems to me that once the volume is mounted its owner becomes the root user. How can I provide the Airflow user with the ability to access the mounted git repository?
Happy to provide additional details if needed
emptyDir volume? – Urbano
git in your airflow image if you're using git-sync in a separate container? – Urbano
/usr/local/airflow? what's the pwd where you run ls -ld? – Urbano
["sh", "-c", "chown -R 1000:1000 /opt/airflow/dags/"] command. That changes the dag folder's ownership to 1000 (now trying to figure out if that will help me) – Citarella
drwxrwsrwx state that all can read/write/execute the dag folder. I still don't understand what target is? directory? file? – Urbano
target, and it fails to do so. In the dbt yaml file definition it reads: target-path: "target" # directory which will store compiled SQL files – Citarella
dbt but sounds like this is your problem. Can you create a simple DAG example without the dbt to isolate the problem you're facing? (One simple python file) – Urbano
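One way to follow the suggestion of isolating the problem without dbt is a single Python file that only tries to create and delete a file in a given directory. This is a hedged sketch: `probe_write` is a hypothetical helper, not part of Airflow or dbt, and the path in the usage part is the dbt project directory from the question. The function could be run directly in the scheduler container or wrapped in a task.

```python
import errno
import os
import tempfile


def probe_write(directory):
    """Try to create and remove a temp file in `directory`.

    Returns (True, None) on success, or (False, symbolic errno name)
    such as "EACCES" (Errno 13) or "ENOENT" on failure.
    """
    try:
        fd, path = tempfile.mkstemp(prefix="probe_", dir=directory)
    except OSError as exc:
        return False, errno.errorcode.get(exc.errno, str(exc.errno))
    os.close(fd)
    os.remove(path)
    return True, None


if __name__ == "__main__":
    # Path taken from the question; adjust to the directory under test.
    target_dir = "/opt/airflow/dags/dbt/dw"
    print(os.getuid(), os.getgroups(), probe_write(target_dir))
```

If the probe reports EACCES for the dbt directory while succeeding for /opt/airflow/dags, the failure is a plain filesystem permission issue rather than something specific to dbt.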