Difference between job, application, task, and task attempt logs in Hadoop and Oozie

I'm running an Oozie job with multiple actions, and there's one part that I could not get to work. While troubleshooting, I'm overwhelmed by the number of logs.

In the YARN UI (yarn.resourcemanager.webapp.address in yarn-site.xml, normally on port 8088), there are the application_<app_id> logs.

In the Job History Server (yarn.log.server.url in yarn-site.xml, ours on port 19888), there are the job_<job_id> logs. (These job logs should also show up in Hue's Job Browser, right?)

In Hue's Oozie workflow editor, there are the task and task_attempt logs (I'm not sure whether they're the same; everything is a mixed-up soup to me already), which redirect to the Job Browser if you click around.

Can someone explain the difference between these things from a Hadoop/Oozie architectural standpoint?

P.S. I've also seen container_<container_id> in the logs. Please include this in your explanation in relation to the things above.
