Can Apache Oozie run docker containers?

I'm currently comparing DAG-based workflow tools such as Airflow and Luigi for scheduling generic Docker containers as well as Spark jobs.

Can Apache Oozie run generic Docker containers through its shell action? Or is Oozie strictly meant for Hadoop tools like Pig and Hive?

Oozie is integrated with the rest of the Hadoop stack supporting several types of Hadoop jobs out of the box (such as Java map-reduce, Streaming map-reduce, Pig, Hive, Sqoop and Distcp) as well as system specific jobs (such as Java programs and shell scripts).

Slier answered 28/1, 2019 at 1:2 Comment(1)
Oozie fetches job executables and resources from HDFS and launches YARN jobs (even for a plain bash shell). Full stop. End of story. - Metric

I've tried running Docker containers through the Shell action, and it works. Since a Shell action can be executed on any node of the cluster, Docker must be installed on every node.
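Because the action can land on any node, a missing Docker install only shows up when the job happens to be scheduled there. A minimal sketch of a preflight check that could go at the top of the script is below; `have_cmd` and `preflight` are hypothetical helper names of my own, not part of Oozie or Docker.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the named command is on PATH for the current user.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Hypothetical helper: return 0 only if every named command is available.
# Missing commands are reported on stderr together with the node name.
preflight() {
    _missing=0
    for _cmd in "$@"; do
        if ! have_cmd "$_cmd"; then
            echo "missing on ${HOSTNAME:-this node}: $_cmd" >&2
            _missing=1
        fi
    done
    return "$_missing"
}
```

Calling `preflight docker hdfs || exit 1` before the real work would make the action fail early and leave the node name in the log. Note that this only checks that the binaries exist; the user the action runs as still needs permission to talk to the Docker daemon.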

workflow.xml created from Hue

<workflow-app name="Test docker" xmlns="uri:oozie:workflow:0.5">
    <start to="shell-5c29"/>
    <kill name="Kill">
        <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <action name="shell-5c29">
        <shell xmlns="uri:oozie:shell-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <exec>test_docker.sh</exec>
            <file>/test_docker.sh#test_docker.sh</file>
        </shell>
        <ok to="End"/>
        <error to="Kill"/>
    </action>
    <end name="End"/>
</workflow-app>
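I submitted the workflow from Hue, but it should also be possible from the command line. A rough sketch, assuming the workflow.xml has been uploaded to an HDFS application directory: every host, port, and path below is a placeholder, not a value from my cluster.

```shell
#!/bin/bash
# All hosts, ports, and paths here are placeholders for illustration only.
NAME_NODE="hdfs://namenode.example.com:8020"
JOB_TRACKER="resourcemanager.example.com:8032"
APP_PATH="${NAME_NODE}/user/${USER:-oozie}/apps/test-docker"

# job.properties supplies the ${nameNode} and ${jobTracker} variables
# referenced in workflow.xml, plus the HDFS directory that holds workflow.xml.
cat > job.properties <<EOF
nameNode=${NAME_NODE}
jobTracker=${JOB_TRACKER}
oozie.wf.application.path=${APP_PATH}
EOF

# Submit and start the workflow (uncomment on a machine with the Oozie client):
# oozie job -oozie http://oozie.example.com:11000/oozie -config job.properties -run
```

The `<file>` element in the workflow expects test_docker.sh at the root of HDFS, so the script has to be uploaded there as well before submitting.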

test_docker.sh

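#!/bin/bash
set -e  # fail the action if any command below fails
# Capture the container's stdout locally, then copy it to HDFS
# (the local file only exists in the working directory on the node that ran the action).
docker run hello-world > output.txt
hdfs dfs -put -f output.txt /output.txt
echo 'done'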

Content of output.txt generated

Hello from Docker!
This message shows that your installation appears to be working correctly.

To generate this message, Docker took the following steps:
 1. The Docker client contacted the Docker daemon.
 2. The Docker daemon pulled the "hello-world" image from the Docker Hub (amd64)
 3. The Docker daemon created a new container from that image which runs the executable that produces the output you are currently reading.
 4. The Docker daemon streamed that output to the Docker client, which sent it to your terminal.

To try something more ambitious, you can run an Ubuntu container with:
 $ docker run -it ubuntu bash

Share images, automate workflows, and more with a free Docker ID:
 https://hub.docker.com/

For more examples and ideas, visit:
 https://docs.docker.com/get-started/
Dona answered 31/10, 2019 at 11:40 Comment(1)
Hey, thanks for sharing. I tried running this on my big-data-europe setup and it was successful, but I couldn't find the output.txt file. Any reason why that is the case? @Dona - Pygidium
