I'm new to Oozie. I've read several Oozie shell action examples, but they left me confused about a few things.
Some examples I've seen have no <file> tag at all.
Other examples, like this one from Cloudera, repeat the shell script name in the <file> tag:
<shell xmlns="uri:oozie:shell-action:0.2">
<exec>check-hour.sh</exec>
<argument>${earthquakeMinThreshold}</argument>
<file>check-hour.sh</file>
</shell>
The example on Oozie's website, on the other hand, writes the shell script (the ${EXEC} reference from job.properties, which points to the script.sh file) twice, separated by #:
<shell xmlns="uri:oozie:shell-action:0.1">
...
<exec>${EXEC}</exec>
<argument>A</argument>
<argument>B</argument>
<file>${EXEC}#${EXEC}</file>
</shell>
I've also seen examples where a path (HDFS or local?) is prepended to script.sh#script.sh within the <file> tag:
<shell xmlns="uri:oozie:shell-action:0.1">
...
<exec>script.sh</exec>
<argument>A</argument>
<argument>B</argument>
<file>/path/script.sh#script.sh</file>
</shell>
As I understand it, any shell script file can be placed in the workflow's HDFS path (the same directory where workflow.xml resides).
Can someone explain the differences between these examples and how <exec>, <file>, script.sh#script.sh, and /path/script.sh#script.sh are used?
# syntax for a <file> element. In the examples above, it has no value. But think about <file>/apps/bling/bling-hardcore-1.2.3.4-6-unplugged.jar#bling.jar</file> or <archive>/apps/stuff/spark-1.6.0-with hive-1.2-dependencies.zip#spark</archive> so that your Java or Shell action can expect a pre-defined name to build a CLASSPATH... – Periscope
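To illustrate the point in the comment, here is a minimal sketch of a shell action where the part after # actually differs from the file name. As far as I understand, the text before # is the file's location on HDFS (absolute, or relative to the workflow application directory), and the text after # is the name the file gets in the task's working directory, which is the name <exec> should then use. The path /apps/scripts/script-v2.sh, the alias run.sh, and the ${jobTracker} and ${nameNode} properties are assumptions made up for this example; they do not come from the examples above.

```xml
<shell xmlns="uri:oozie:shell-action:0.2">
    <!-- Assumed properties, expected to be defined in job.properties -->
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <!-- exec uses the alias (the part after '#'), not the HDFS file name -->
    <exec>run.sh</exec>
    <argument>A</argument>
    <argument>B</argument>
    <!-- Hypothetical HDFS path before '#', working-directory name after it -->
    <file>/apps/scripts/script-v2.sh#run.sh</file>
</shell>
```

Read this way, <file>script.sh#script.sh</file> and <file>script.sh</file> should behave the same, since the alias just repeats the file's own name.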