Is it possible to run an R script as an airflow dag? I have tried looking online for documentation on this and am unable to do so. Thanks
Another option is to containerize your R script and run it using the DockerOperator, which ships in the apache-airflow-providers-docker package in Airflow 2 and later (in Airflow 1.x it was part of the core distribution). This removes the need to configure your worker nodes with the correct version of R and any required R libraries.
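To make the DockerOperator route a bit more concrete, here is a rough sketch rather than a tested DAG. It assumes the apache-airflow-providers-docker package is installed on the scheduler and workers, and that the R script has already been baked into an image; the image name my-r-project:1.0, the script path /app/my_r_script.R and both function names are made up for illustration.

```python
def r_container_task_kwargs(image, script_path, task_id="run_r_script"):
    """Collect the DockerOperator arguments needed to run one R script.

    image and script_path are placeholders: the image is assumed to
    already contain R, the required packages and the script itself.
    """
    return {
        "task_id": task_id,
        "image": image,
        # DockerOperator accepts the command as a string or a list of strings.
        "command": ["Rscript", script_path],
    }


def make_r_container_task(dag, image, script_path):
    """Build the task; needs apache-airflow-providers-docker to be installed."""
    # Imported lazily so the helper above can be used without Airflow present.
    from airflow.providers.docker.operators.docker import DockerOperator

    return DockerOperator(dag=dag, **r_container_task_kwargs(image, script_path))


# Hypothetical values, shown only to illustrate the shape of the arguments.
example_kwargs = r_container_task_kwargs("my-r-project:1.0", "/app/my_r_script.R")
```

Inside a DAG file you would call make_r_container_task(dag, "my-r-project:1.0", "/app/my_r_script.R") with your own image and path. Pinning the image tag keeps the R version and library set fixed between runs.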
There doesn't seem to be an R operator right now.
You could either write your own and contribute it to the community, or simply run your task with a BashOperator that calls Rscript.
Use the BashOperator to execute R scripts. For example: opr_hello = BashOperator(task_id='xyz', bash_command='Rscript Pathtofile/file.r')
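If the script path or its arguments can contain spaces or other shell-special characters, it may help to build the bash_command string with Python's shlex module instead of concatenating by hand. This is only a small sketch; the function name, the path and the --date argument are invented for the example.

```python
import shlex


def rscript_command(script_path, args=()):
    """Return a shell-safe 'Rscript <script> <args...>' string.

    shlex.join quotes each piece only when it needs quoting, so the
    result can be passed as bash_command to a BashOperator.
    """
    return shlex.join(["Rscript", script_path, *args])


# Hypothetical path and argument, chosen to show the quoting.
command = rscript_command("/opt/airflow/dags/r/my script.R", ["--date", "2024-01-01"])
print(command)
```

The returned string would then go where the literal 'Rscript Pathtofile/file.r' appears in the example above, i.e. bash_command=rscript_command(...).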
There is a pull request open for an R operator, still waiting for it to be incorporated.
Combining the approaches from other answers, you can run an R script in a Docker container without bundling it into the container by calling it in a DAG using the BashOperator like this task:
process_data = BashOperator(
    task_id='process_data',
    bash_command='docker run --rm=true -v /home/myuser/airflow/dags/source/my_r_project_folder:/source rocker/verse:4.3 Rscript /source/my_r_script.R'
)
process_data
This mounts the directory /home/myuser/airflow/dags/source/my_r_project_folder as a volume so that the R script and any data files are present in the container, then executes the file my_r_script.R from that /source directory using the environment defined by the rocker/verse image. For reproducibility, you would ideally pin a specific tag for that image, in this case 4.3.
My Airflow instance is running in Docker itself and is starting sibling containers using the host machine's socket. A different setup might need to handle the mounting of a volume somewhat differently.
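Since the docker run line gets long, one way to keep it readable is to assemble it from its parts. The sketch below reuses the paths and image tag from the task above; the function name and its parameters are my own invention, and it assumes the mount is a plain host-directory bind mount as in my setup.

```python
import shlex


def docker_rscript_command(host_dir, script_name,
                           image="rocker/verse:4.3", mount_point="/source"):
    """Build a 'docker run' command that mounts host_dir and runs one R script."""
    return shlex.join([
        "docker", "run", "--rm",
        "-v", f"{host_dir}:{mount_point}",       # bind-mount the project folder
        image,                                    # pinned tag for reproducibility
        "Rscript", f"{mount_point}/{script_name}",
    ])


command = docker_rscript_command(
    "/home/myuser/airflow/dags/source/my_r_project_folder",
    "my_r_script.R",
)
print(command)
```

The result can be passed as bash_command to the BashOperator. When Airflow itself runs in a container and starts sibling containers, host_dir has to be a path on the host machine, not a path inside the Airflow container.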
I managed to run R scripts in Airflow in Docker.
Step 1: Set up Airflow in Docker using:
https://airflow.apache.org/docs/apache-airflow/stable/howto/docker-compose/index.html
Step 2: Open a command line in the airflow-worker Docker container and set up a proxy:
https://operavps.com/docs/setup-proxy-on-ubuntu/
Step 3: Install r-base in the airflow-worker container (up to step 4 in the link below):
https://gcore.com/learning/how-to-install-r-on-ubuntu/
Step 4: Create your DAG and call the R script using the code below:
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator  # Airflow 2.x import path

# Define the DAG
dag = DAG(
    'hello_world',
    description='A simple DAG tutorial',
    schedule_interval=None,  # no schedule; trigger the DAG manually
    start_date=datetime(2023, 3, 22),
)

# Define the BashOperator task
run_R = BashOperator(
    task_id='run_R',  # task IDs may not contain spaces
    # Replace the placeholder below with the script's path inside the worker
    bash_command='Rscript --no-save /path to R script within airflow',
    dag=dag,
)
run_R
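Step 3 only works if Rscript ends up on the worker's PATH, so a quick check from Python can save a failed DAG run. This is a small sketch, not part of Airflow; the function name is made up, and the which parameter exists only so the lookup can be swapped out.

```python
import shutil


def require_rscript(which=shutil.which):
    """Return the full path to Rscript, or raise if it is not on PATH."""
    path = which("Rscript")
    if path is None:
        raise RuntimeError(
            "Rscript not found on PATH; install r-base on the Airflow worker"
        )
    return path
```

Calling require_rscript() inside the airflow-worker container (for example from a Python shell) should return the path to the Rscript binary once r-base is installed.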
bash my_shell_script.sh in your Airflow DAG, and then in that bash script use Rscript to execute your R file. – Hoiden