I'm running an Oozie job with multiple actions, and there's one part I can't get to work. While troubleshooting, I've been overwhelmed by the number of different logs.
In the YARN UI (yarn.resourcemanager.webapp.address in yarn-site.xml, normally on port 8088), there are the application_<app_id> logs.
In the Job History Server (yarn.log.server.url in yarn-site.xml, ours is on port 19888), there are the job_<job_id> logs. (These job logs should also show up in Hue's Job Browser, right?)
In Hue's Oozie workflow editor, there are the task and task_attempt entries (I'm not sure whether they're the same thing; it's all a mixed-up soup to me by now), which redirect to the Job Browser when you click through them.
Can someone explain the difference between these things from a Hadoop/Oozie architectural standpoint?
P.S.
I've seen container_<container_id> in the logs as well. Please include it in your explanation and say how it relates to the things above.
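
To make the question concrete, here is a rough Python sketch of how these identifiers seem to line up. The sample IDs are made up, the function names are hypothetical, and the grouping rule (same cluster timestamp and same sequence number means the same job) is an assumption from reading the logs, not something confirmed. Task attempt IDs appear in the raw logs with an attempt_ prefix, and container IDs on some clusters carry an extra e<epoch>_ segment, so the pattern allows for both.

```python
import re

# Hypothetical sample identifiers, invented for illustration only.
SAMPLE_IDS = [
    "application_1500000000000_0042",
    "job_1500000000000_0042",
    "task_1500000000000_0042_m_000000",
    "attempt_1500000000000_0042_m_000000_0",
    "container_1500000000000_0042_01_000002",
]

# Each kind starts with <kind>_<cluster timestamp>_<sequence number>.
# Container IDs may have an optional "e<epoch>_" segment before the timestamp.
ID_PATTERN = re.compile(
    r"^(application|job|task|attempt|container)_(?:e\d+_)?(\d+)_(\d+)(?:_(.+))?$"
)


def parse_id(identifier):
    """Split an identifier into its kind, shared key, and remaining suffix."""
    match = ID_PATTERN.match(identifier)
    if match is None:
        raise ValueError("unrecognised identifier: " + identifier)
    kind, timestamp, sequence, rest = match.groups()
    return {"kind": kind, "key": (timestamp, sequence), "rest": rest or ""}


def group_by_key(identifiers):
    """Group identifiers that share a cluster timestamp and sequence number."""
    groups = {}
    for identifier in identifiers:
        parsed = parse_id(identifier)
        groups.setdefault(parsed["key"], []).append(identifier)
    return groups


if __name__ == "__main__":
    for key, members in group_by_key(SAMPLE_IDS).items():
        print(key, members)
```

If that assumption holds, all five sample IDs would belong to one unit of work seen at different levels, which is the relationship I'd like explained.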