Building Oozie 4.2.0 with Spark on YARN support

What I am trying to achieve is to build and install Oozie 4.2.0 that will enable me to submit Spark jobs to a YARN cluster.

I built the distro by executing: oozie-4.2.0/bin/mkdistro.sh -Puber -Phadoop-2 -DskipTests. That created the oozie-4.2.0-distro.tar.gz package, and inside it I can find oozie-4.2.0-sharelib.tar.gz. However, many tutorials online state that I should use oozie-4.2.0-sharelib-yarn.tar.gz in order to use YARN. No such file is contained in the distro package. How can I make the build process output the YARN version of the sharelib?

I tried to continue with the non-YARN version. After adjusting the HDFS and YARN addresses in job.properties and changing the master property from local[*] to yarn, I submitted the example Spark job and got an error:

Error: Could not load YARN classes. This copy of Spark may not have been compiled with YARN support.
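For readers reproducing the setup, the job.properties edits described above could be scripted roughly as follows. This is only a sketch: set_prop is a hypothetical helper, not part of Oozie, and the host names in the usage comment are placeholders rather than values from the question. It assumes simple keys and values that contain no '|', '&' or backslash characters.

```shell
# Hypothetical helper (not part of Oozie): set key=value in a properties file.
# Replaces an existing "key=..." line, or appends the pair if the key is absent.
# Assumption: key and value contain no '|', '&' or backslash characters.
set_prop() {
  file=$1
  key=$2
  value=$3
  tmp="${file}.tmp"
  if grep -q "^${key}=" "$file"; then
    # '|' is the sed delimiter so that values containing '/' (URLs) are safe.
    sed "s|^${key}=.*|${key}=${value}|" "$file" > "$tmp" && mv "$tmp" "$file"
  else
    printf '%s=%s\n' "$key" "$value" >> "$file"
  fi
}

# Example usage (host names are placeholders):
#   set_prop job.properties nameNode hdfs://namenode.example.com:8020
#   set_prop job.properties jobTracker resourcemanager.example.com:8032
#   set_prop job.properties master yarn
```

Changing these properties only affects how the job is submitted; it does not add YARN classes to the sharelib, which is why the error above still appears.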

Debbidebbie answered 29/10, 2015 at 9:40 Comment(1)
An interesting question, but I voted to close as not reproducible due to the combination of several small issues: it is not clearly mentioned which resources are used (guide, Spark version, source of the Oozie distribution). On top of that, Oozie has moved beyond the listed version. - Preach

Oozie 4.2 does not include OOZIE-2271, which added the spark-yarn dependency to the sharelib when compiling against the hadoop-2 profile. Try building the distro from Oozie 4.3 instead. Alternatively, you can backport OOZIE-2271 to 4.2.0 and build Oozie yourself.

See spark-yarn_2.10 in this commit: https://github.com/apache/oozie/commit/e6b5c95efb492a70087377db45524e06f803459e
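One way to check whether a given build picked up the dependency is to list the sharelib tarball and look for a spark-yarn jar. The following is a minimal sketch: has_spark_yarn is a hypothetical helper, and the jar name pattern is an assumption based on the spark-yarn_2.10 artifact named in the commit. The actual file name and its path inside the archive may differ.

```shell
# Hypothetical helper: succeeds (exit status 0) if the given .tar.gz archive
# lists a file whose name looks like a spark-yarn jar.
# Assumption: the jar is named after the spark-yarn_2.10 Maven artifact.
has_spark_yarn() {
  tar -tzf "$1" | grep 'spark-yarn.*\.jar$' > /dev/null
}

# Example usage, after extracting the distro package:
#   if has_spark_yarn oozie-4.2.0-sharelib.tar.gz; then
#     echo "spark-yarn jar present"
#   else
#     echo "spark-yarn jar missing"
#   fi
```

If the jar is missing, the "Could not load YARN classes" error from the question is the expected result when the Spark action runs with master set to yarn.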

Neoptolemus answered 18/6, 2019 at 8:58
