How to invoke an oozie workflow via shell script and block/wait till workflow completion
I have created an Oozie workflow made up of multiple action nodes and have successfully run it via a coordinator.

I want to invoke the Oozie workflow via a wrapper shell script.

The wrapper script should invoke the Oozie command, wait until the Oozie job completes (success or error), and return either the Oozie success status code (0) or the error code of the failed Oozie action node (if any node of the workflow has failed).

From what I have seen so far, as soon as I invoke the oozie command to run a workflow, the command exits and prints the job ID on the Linux console, while the Oozie job keeps running asynchronously in the background.

I want my wrapper script to block until the Oozie coordinator job completes and then return the success/error code.

Can you please let me know how/if I can achieve this using any of the oozie features?

I am using Oozie version 3.3.2 and bash shell in Linux.

Note: In case anyone is curious about why I need such a feature, the requirement is that my wrapper shell script should know how long an Oozie job has been running and when it has completed, and return an exit code accordingly, so that the parent process calling the wrapper script knows whether the job completed successfully and, if it failed, can raise an alert/ticket for the support team.

Gaffe answered 20/6, 2015 at 10:58 Comment(0)
A
5

You can do that by capturing the job ID, then looping and parsing the output of oozie job -info. Below is the shell code for this.

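Start the Oozie job:

# -run submits and starts the job, then prints "job: <job-id>"
oozie_job_id=$(oozie job -oozie http://<oozie-server>/oozie -config job.properties -run)
echo "$oozie_job_id"
# give the job some time to leave its initial state
sleep 30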

Parse the job ID from the output. The output has the form "job: <job-id>":

job_id=$(echo "$oozie_job_id" | sed -n 's/job: \(.*\)/\1/p')
echo "$job_id"
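If the submission fails, the output does not start with "job: " and the sed expression above produces an empty job ID. A minimal sketch of a stricter parse, assuming the "job: <job-id>" format described above; extract_job_id is a hypothetical helper name, and the ID in the usage comment is made up for illustration:

```shell
# extract_job_id OUTPUT
# Prints the job ID if OUTPUT has the form "job: <job-id>".
# Returns 1 (and prints a message to stderr) for anything else.
extract_job_id() {
    case "$1" in
        "job: "*)
            # strip the literal "job: " prefix
            printf '%s\n' "${1#job: }"
            ;;
        *)
            echo "unexpected oozie output: $1" >&2
            return 1
            ;;
    esac
}

# Usage (sample ID is hypothetical):
#   job_id=$(extract_job_id "job: 0000001-150620105800000-oozie-oozi-W") || exit 1
```

Failing early here keeps the polling loop from being started with an empty job ID.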

Check the job status at regular intervals to see whether it is still RUNNING:

while true
do
    # extract the value of the "Status : <value>" line from the -info output
    job_status=$(oozie job -oozie http://<oozie-server>/oozie -info "$job_id" | sed -n 's/Status\(.*\): \(.*\)/\2/p')
    if [ "$job_status" != "RUNNING" ]
    then
        echo "Job is completed with status $job_status"
        break
    fi
    # This sleep depends on your job; change the value accordingly.
    echo "sleeping for 5 minutes"
    sleep 5m
done

This is a basic way to do it; you can modify it to suit your use case.
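One possible modification, since the question asks for an exit code: the loop above stops on any status other than RUNNING (including PREP and SUSPENDED) and does not set an exit status. The sketch below maps the final workflow status to a return code. It assumes the "Status : <value>" line format that the sed expression above relies on, and the workflow job statuses PREP, RUNNING, SUSPENDED, SUCCEEDED, KILLED and FAILED; coordinator jobs have other statuses and are not handled. The names OOZIE_URL, POLL_INTERVAL, parse_status and wait_for_workflow, as well as the default URL, are assumptions for illustration. It returns 0 or non-zero only and does not try to recover the error code of an individual action node.

```shell
OOZIE_URL="${OOZIE_URL:-http://localhost:11000/oozie}"  # assumed server URL
POLL_INTERVAL="${POLL_INTERVAL:-300}"                   # seconds between polls

# Reads "oozie job -info" output on stdin and prints the value of the
# first "Status : <value>" line.
parse_status() {
    sed -n 's/^Status[[:space:]]*:[[:space:]]*\([A-Z_]*\).*/\1/p' | head -n 1
}

# wait_for_workflow JOB_ID
# Blocks until the workflow reaches a final status.
# Returns 0 on SUCCEEDED, 1 on KILLED or FAILED, 2 if no known status was read.
wait_for_workflow() {
    local job_id status
    job_id="$1"
    while true; do
        status=$(oozie job -oozie "$OOZIE_URL" -info "$job_id" | parse_status)
        case "$status" in
            SUCCEEDED)
                echo "Job $job_id succeeded"
                return 0
                ;;
            KILLED|FAILED)
                echo "Job $job_id ended with status $status" >&2
                return 1
                ;;
            PREP|RUNNING|SUSPENDED)
                # not finished yet; wait and poll again
                sleep "$POLL_INTERVAL"
                ;;
            *)
                echo "Could not read a known status for $job_id (got '$status')" >&2
                return 2
                ;;
        esac
    done
}
```

A wrapper script could then end with wait_for_workflow "$job_id" followed by exit $?, so that the parent process sees the result as the script's exit code.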

Alenealenson answered 2/6, 2016 at 13:1 Comment(0)
A
3

To upload the workflow definition to HDFS, use the following command:

hdfs dfs -copyFromLocal -f workflow.xml /user/hdfs/workflows/workflow.xml

To start the Oozie job you need the two commands below. Note that each must be written on a single line.

JOB_ID=$(oozie job -oozie http://<oozie-server>/oozie -config job.properties -submit)

oozie job -oozie http://<oozie-server>/oozie -start ${JOB_ID#*:} -config job.properties

Then parse the output of the command below. If its exit status is 0 the command ran successfully; otherwise it failed. Simply loop, sleeping for some amount of time after each attempt.

oozie job -oozie http://<oozie-server>/oozie -info ${JOB_ID#*:}

echo $? # shows whether the command executed successfully or not
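As far as I can tell, the exit status of the -info call only says whether the oozie command itself worked, so the workflow status still has to be read from its output. A minimal sketch of such a loop with an upper bound on the number of attempts, assuming the -info output contains a "Status : <value>" line and the workflow statuses SUCCEEDED, KILLED and FAILED are final. The names OOZIE_URL, MAX_TRIES, SLEEP_SECS, workflow_status and poll_workflow, and the default values, are hypothetical:

```shell
OOZIE_URL="${OOZIE_URL:-http://localhost:11000/oozie}"  # assumed server URL
MAX_TRIES="${MAX_TRIES:-60}"      # hypothetical limit on the number of polls
SLEEP_SECS="${SLEEP_SECS:-60}"    # hypothetical delay between polls

# workflow_status JOB_ID
# Prints the workflow status; returns non-zero if the oozie call itself failed.
workflow_status() {
    local info
    info=$(oozie job -oozie "$OOZIE_URL" -info "$1") || return 1
    printf '%s\n' "$info" |
        sed -n 's/^Status[[:space:]]*:[[:space:]]*\([A-Z_]*\).*/\1/p' |
        head -n 1
}

# poll_workflow JOB_ID
# Returns 0 on SUCCEEDED, 1 on KILLED or FAILED,
# 3 if MAX_TRIES polls pass without a final status.
poll_workflow() {
    local tries state
    tries=0
    while [ "$tries" -lt "$MAX_TRIES" ]; do
        tries=$((tries + 1))
        if state=$(workflow_status "$1"); then
            case "$state" in
                SUCCEEDED) return 0 ;;
                KILLED|FAILED) return 1 ;;
            esac
        else
            # the CLI call failed (for example, server unreachable); try again
            echo "oozie -info failed (attempt $tries); retrying" >&2
        fi
        sleep "$SLEEP_SECS"
    done
    return 3
}
```

With the JOB_ID variable from the commands above, this could be called as poll_workflow ${JOB_ID#*:} and its return value passed on with exit $?.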

Antofagasta answered 1/9, 2015 at 22:20 Comment(2)
Thanks for the answer. If you write something between two backticks (`), the text in between is highlighted as code, like: example. Code is usually written this way; click edit on your answer and you will see how I have beautified and clarified it. Also, if you take a look at other posts you will see that nobody uses greetings; posts with greetings are usually frowned upon by community members. Please get the Informed badge by clicking the tour button at the top (or the help drop-down and then the tour) and reading it carefully. - Idiophone
I am writing this comment because your answer was sent to me for community review. - Idiophone
