Oozie: Does Oozie generate output-events?

In Oozie, input-events are pretty straightforward: if the specified file/folder is not present, the coordinator job is kept in the WAITING state. But I could not understand what output-events does.

As per my understanding, the files/folders specified in the output-events tag should be created by Oozie if all specified actions are successful. But that does not happen. I cannot find any relevant logs either, nor is the documentation clear about this.

So, the question is, does Oozie really create files/folders specified in output-events? Or does it just mention that these particular files/folders are created during the workflow and the responsibility of creation is on jobs, not on Oozie?

Relevant piece of code can be found at https://gist.github.com/venkateshshukla/de0dc395797a7ffba153

Inveigh answered 20/10, 2015 at 10:27 Comment(2)
If the job was submitted to YARN, you can trace the logs in the application master's "STDERR" and "STDOUT". For Oozie, you can look at the web UI. - Futtock
As I mentioned, the logs do not show anything. There is no mention of what happens after the workflow job exits successfully, but I can see that no files or directories are being created. - Inveigh

The actions always generate the data; these settings are just for control. You'll find some examples here

Grapeshot answered 29/1, 2016 at 16:35 Comment(0)

The official Oozie documentation for Oozie Coordinator is not very clear on the exact purpose of the output-events element. However, the book "Apache Oozie: The Workflow Scheduler for Hadoop" mentions the following:

During reprocessing of a coordinator, Oozie tries to help the retry attempt by cleaning up the output directories by default. For this, it uses the <output-events> specification in the coordinator XML to remove the old output before running the new attempt. Users can override this default behavior using the –noCleanup option.

So, in summary:

  • No, files specified in output-events are not automatically created by Oozie; you need to create those files in your Oozie workflow actions.
  • The output-events configuration tells Oozie which files your workflow actions will create, so that Oozie can clean them up when rerunning/reprocessing a coordinator.
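
To make the division of responsibility concrete, here is a minimal sketch of a coordinator that declares an output dataset and passes its resolved path to the workflow. The names `out_ds`, `output`, `outputDir` and `workflowAppPath`, the HDFS path, and the dates are placeholders chosen for illustration, not taken from the gist in the question.

```xml
<coordinator-app name="example-coord" frequency="${coord:days(1)}"
                 start="2015-10-20T00:00Z" end="2015-10-27T00:00Z" timezone="UTC"
                 xmlns="uri:oozie:coordinator:0.4">
  <datasets>
    <!-- Hypothetical dataset: one directory per day -->
    <dataset name="out_ds" frequency="${coord:days(1)}"
             initial-instance="2015-10-20T00:00Z" timezone="UTC">
      <uri-template>hdfs://namenode/data/out/${YEAR}/${MONTH}/${DAY}</uri-template>
    </dataset>
  </datasets>
  <!-- Declares what the workflow is expected to produce; Oozie does not create it -->
  <output-events>
    <data-out name="output" dataset="out_ds">
      <instance>${coord:current(0)}</instance>
    </data-out>
  </output-events>
  <action>
    <workflow>
      <app-path>${workflowAppPath}</app-path>
      <configuration>
        <!-- The resolved output path is handed to the workflow as a property -->
        <property>
          <name>outputDir</name>
          <value>${coord:dataOut('output')}</value>
        </property>
      </configuration>
    </workflow>
  </action>
</coordinator-app>
```

With a setup like this, an action in the workflow still has to write to `${outputDir}` itself. The declaration mainly matters on a coordinator rerun (`oozie job -rerun`), where Oozie removes the declared output first unless cleanup is disabled (the Oozie CLI spells this option `-nocleanup`).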
Adaadabel answered 23/3, 2017 at 3:58 Comment(0)