Spark YARN mode: how to get the applicationId from spark-submit
When I submit a Spark job using spark-submit with --master yarn and --deploy-mode cluster, it doesn't print or return any applicationId, and once the job is completed I have to manually check the MapReduce JobHistory or the Spark HistoryServer to get the job details.
My cluster is used by many users, and it takes a lot of time to spot my job in the JobHistory/HistoryServer.

Is there any way to configure spark-submit to return the applicationId?

Note: I found many similar questions, but their solutions retrieve the applicationId within the driver code using sparkContext.applicationId. With --master yarn and --deploy-mode cluster the driver itself runs on a cluster node as part of the YARN application, so anything it logs or prints goes to that remote host's logs.
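One client-side workaround (not a spark-submit configuration option): the spark-submit process in yarn-cluster mode usually logs a `yarn.Client: Submitted application application_...` line to its own output, which can be captured and parsed. The exact log format is an assumption here and may vary across Spark versions; a minimal sketch of the parsing step:

```python
import re
from typing import Optional

# In yarn-cluster mode the spark-submit client typically logs a line like:
#   "INFO yarn.Client: Submitted application application_1496272394851_0001"
# (assumed format; verify against your Spark version's client output)
APP_ID_RE = re.compile(r"application_\d+_\d+")

def extract_application_id(submit_output: str) -> Optional[str]:
    """Return the first YARN applicationId found in spark-submit output, or None."""
    match = APP_ID_RE.search(submit_output)
    return match.group(0) if match else None
```

In practice you would run spark-submit with `subprocess`, capture its combined stdout/stderr, and feed that text to `extract_application_id`.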

Fahrenheit answered 26/5, 2017 at 20:10 Comment(4)
I'm not sure I get your note. The applicationId from the SparkContext is the way to go. Evacuation
As my driver is launched on one of the cluster nodes, how do I send the applicationId from that node to the client? Is there any out-of-the-box feature Spark provides? Fahrenheit
You can save the applicationId to a file on HDFS. Many programs use this approach to keep a processing id. Campy
Thanks. Yeah, it makes sense to persist the applicationId on HDFS and let the client read it when required. Another solution I implemented is notifying the user of the applicationId by email. @zhangtong Please post your comment as an answer. Fahrenheit
Here are the approaches that I used to achieve this:

  1. Save the applicationId to an HDFS file (suggested by @zhangtong in a comment).
  2. Send an email alert with the applicationId from the driver.
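A minimal sketch of approach 1. In the driver you would obtain the id with `sc.applicationId`; the helper names are hypothetical, and a local path stands in for the HDFS location (on a real cluster the file would be written to HDFS, e.g. with `hdfs dfs -put` or Hadoop's FileSystem API):

```python
from pathlib import Path

def persist_application_id(app_id: str, path: str) -> None:
    """Driver side: write the YARN applicationId to a well-known file.

    app_id would come from sc.applicationId in the running driver.
    A local path is used here as a stand-in for an HDFS location.
    """
    Path(path).write_text(app_id + "\n")

def read_application_id(path: str) -> str:
    """Client side: read the id back once the file appears."""
    return Path(path).read_text().strip()
```

The client then polls the agreed-upon path after submitting the job and reads the id as soon as the driver has written it.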
Fahrenheit answered 1/6, 2017 at 16:20 Comment(1)
A concise code snippet would really make it an answer. Brake

© 2022 - 2024 — McMap. All rights reserved.