Running R scripts in Airflow?
O

6

19

Is it possible to run an R script as an Airflow DAG? I have looked online for documentation on this and have been unable to find any. Thanks

Ovary answered 22/8, 2017 at 19:4 Comment(1)
You should include some code you'd like to execute. IMO Pierre's answer is the best so far and, in the absence of code, should be accepted. You can just include bash my_shell_script.sh in your Airflow DAG and then, in that bash script, use Rscript to execute your R file. Hoiden
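
The wrapper-script approach from this comment could be sketched as follows. This is only an illustration, assuming Airflow 2.x; the script path and the function names `bash_script_command` and `make_wrapper_task` are hypothetical. The wrapper script itself would contain a line such as `Rscript /path/to/file.R`.

```python
import shlex


def bash_script_command(script_path: str) -> str:
    """Return a bash_command string that runs a wrapper shell script."""
    # The trailing space keeps Airflow from treating a value that ends in
    # ".sh" as the name of a Jinja template file to load.
    return f"bash {shlex.quote(script_path)} "


def make_wrapper_task(dag=None):
    # Imported lazily so the helper above works without Airflow installed.
    # Assumption: Airflow 2.x import path.
    from airflow.operators.bash import BashOperator

    return BashOperator(
        task_id="run_r_via_shell_script",
        # Hypothetical location of the wrapper script on the worker
        bash_command=bash_script_command("/opt/airflow/scripts/my_shell_script.sh"),
        dag=dag,
    )
```
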
O
9

Another option is to containerize your R script and run it using the DockerOperator, which is included in the standard distribution. This removes the need to have your worker nodes configured with the correct version of R and any needed R libraries.
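
As a rough sketch of this approach, assuming Airflow 2.x, where the operator ships in the separate `apache-airflow-providers-docker` package: the image name `my-registry/my-r-job:1.0`, the in-image script path, and both function names are hypothetical placeholders, not anything from the answer.

```python
def r_container_task_kwargs(image: str, script_in_image: str) -> dict:
    """Build keyword arguments for a DockerOperator that runs an R script
    already baked into the image."""
    return {
        "task_id": "run_r_in_container",
        "image": image,
        # A list is passed as separate arguments rather than one shell string
        "command": ["Rscript", script_in_image],
    }


def make_docker_task(dag=None):
    # Imported lazily; requires apache-airflow-providers-docker.
    # Assumption: Airflow 2.x import path.
    from airflow.providers.docker.operators.docker import DockerOperator

    # Hypothetical image and script location
    kwargs = r_container_task_kwargs("my-registry/my-r-job:1.0", "/app/my_r_script.R")
    return DockerOperator(dag=dag, **kwargs)
```

Pinning the image tag keeps the R version and package set fixed across runs.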

Ossuary answered 3/9, 2017 at 15:34 Comment(0)
A
8

There doesn't seem to be an R operator right now.

You could either write your own and contribute it to the community, or simply run your task with a BashOperator that calls Rscript.

Aday answered 28/8, 2017 at 15:51 Comment(0)
W
4

Use BashOperator to execute R scripts. For example: opr_hello = BashOperator(task_id='xyz', bash_command='Rscript Pathtofile/file.r')
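
If the script path or its arguments may contain spaces, one option is to build the command with `shlex.quote` before handing it to the operator. A minimal sketch, assuming Airflow 2.x; the function names, script path, and argument are hypothetical:

```python
import shlex


def rscript_command(script_path, *args):
    """Build a shell-safe Rscript command line for use as bash_command."""
    parts = ["Rscript", script_path, *map(str, args)]
    return " ".join(shlex.quote(part) for part in parts)


def make_rscript_task(dag=None):
    # Imported lazily so rscript_command works without Airflow installed.
    # Assumption: Airflow 2.x import path.
    from airflow.operators.bash import BashOperator

    return BashOperator(
        task_id="run_r_script",
        # Hypothetical script location and argument
        bash_command=rscript_command("/opt/scripts/file.R", "2017-08-22"),
        dag=dag,
    )
```

Note that `bash_command` is a templated field, so any `{{ ... }}` in the string would still be rendered by Jinja.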

Waybill answered 5/3, 2019 at 11:17 Comment(0)
G
2

There is an open pull request for an R operator; it is still waiting to be merged.

https://github.com/apache/incubator-airflow/pull/3115/files

Geerts answered 17/7, 2018 at 15:58 Comment(0)
C
0

Combining the approaches from other answers, you can run an R script in a Docker container without bundling it into the image by calling it from a DAG with the BashOperator, as in this task:

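    from airflow.operators.bash import BashOperator

    # Run the R script in a throwaway rocker/verse container, with the project folder mounted at /source
    process_data = BashOperator(
        task_id='process_data',
        bash_command='docker run --rm=true -v /home/myuser/airflow/dags/source/my_r_project_folder:/source rocker/verse:4.3 Rscript /source/my_r_script.R',
    )
    process_data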

This mounts the directory /home/myuser/airflow/dags/source/my_r_project_folder as a volume so that the R script and any data files are present in the container, then executes my_r_script.R from that /source directory using the environment defined by the rocker/verse image. For reproducibility, you should ideally pin a specific tag for that image, in this case 4.3.

My Airflow instance is running in Docker itself and is starting sibling containers using the host machine's socket. A different setup might need to handle the mounting of a volume somewhat differently.
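
To reuse this pattern for several scripts, the command string could be assembled by a small helper. This is only a sketch; the function name `docker_rscript_command` and its parameters are hypothetical, and the values in the usage lines are the ones from the task above.

```python
import shlex


def docker_rscript_command(host_dir, image, script_name, mount_point="/source"):
    """Build a `docker run` command that mounts host_dir and runs an R script from it."""
    parts = [
        "docker", "run", "--rm",
        "-v", f"{host_dir}:{mount_point}",    # bind-mount the project folder
        image,                                # pinned image tag for reproducibility
        "Rscript", f"{mount_point}/{script_name}",
    ]
    return " ".join(shlex.quote(part) for part in parts)


command = docker_rscript_command(
    "/home/myuser/airflow/dags/source/my_r_project_folder",
    "rocker/verse:4.3",
    "my_r_script.R",
)
print(command)
```

The resulting string can be passed as `bash_command` to a BashOperator. When Airflow itself runs in Docker and starts sibling containers, `host_dir` has to be a path on the host machine, not a path inside the Airflow container.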

Chauncey answered 7/7, 2023 at 18:14 Comment(0)
L
0

I managed to run R scripts in Airflow running in Docker.
Step 1: Set up Airflow in Docker using:
https://airflow.apache.org/docs/apache-airflow/stable/howto/docker-compose/index.html

Step 2: Open a command line in the airflow-worker Docker container and set up a proxy:
https://operavps.com/docs/setup-proxy-on-ubuntu/

Step 3: Install r-base in the airflow-worker container (up to step 4 in the link below):
https://gcore.com/learning/how-to-install-r-on-ubuntu/

Step 4: Create your DAG and call the R script using the code below:

    from airflow import DAG
    from airflow.operators.bash import BashOperator
    from datetime import datetime

    # Define a DAG
    dag = DAG(
        'hello_world',
        description='A simple DAG tutorial',
        schedule_interval=None,
        start_date=datetime(2023, 3, 22),
    )

    # Define the BashOperator task
    # (task_id may contain only letters, digits, dashes, dots, and underscores)
    run_R = BashOperator(
        task_id='run_R',
        bash_command='Rscript --no-save /path to R script within airflow',
        dag=dag,
    )

    run_R
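
Before triggering the DAG, it may help to confirm that Step 3 actually put Rscript on the PATH inside the worker. A minimal check that could be run from a Python shell in the airflow-worker container; the function name `rscript_version` is hypothetical:

```python
import shutil
import subprocess


def rscript_version(executable="Rscript"):
    """Return the version banner of Rscript, or None if it is not on PATH."""
    path = shutil.which(executable)
    if path is None:
        return None
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    # Depending on the R version, the banner goes to stdout or stderr.
    return (result.stdout or result.stderr).strip()


version = rscript_version()
print(version if version is not None else "Rscript not found on PATH")
```
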
Lauro answered 27/6, 2024 at 13:44 Comment(1)
Please do not add code as an image for these reasons. Instead, add the code in a code block. Thanks. Viperous

© 2022 - 2025 — McMap. All rights reserved.