What is the difference between an Oozie workflow, coordinator and bundle?

What is the difference between an Oozie workflow, a coordinator and a bundle?

An Oozie workflow defines a sequence of actions, and we need to invoke it manually every time we want it to run, whereas the same workflow can be scheduled through a coordinator. Is this understanding correct?

Then what does a bundle add?

I guess it is used to schedule a set of coordinators. But then why can't one coordinator be used to schedule another coordinator, the way one workflow can contain a sub-workflow?

Intercollegiate answered 23/10, 2015 at 10:50 Comment(1)
If coordinator One is scheduled at 7 am and coordinator Two is scheduled at 10 am, and we bundle these two together: 1) Do we need to schedule the bundle as well? 2) If coordinator One fails or is delayed beyond 10 am, will the bundle stop coordinator Two from executing? Could you please clarify. Gober

Workflow:

A workflow is a sequence of actions (more precisely, a directed acyclic graph of actions). It is written in XML, and the actions can be MapReduce, Hive, Pig, etc.
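To make this concrete, here is a minimal sketch of a workflow.xml with a single action. The application name, the node names and the `${nameNode}` property are hypothetical, and the schema version is an assumption that may need adjusting for your Oozie release.

```xml
<!-- Minimal sketch only: names and the ${nameNode} property are hypothetical -->
<workflow-app name="example-wf" xmlns="uri:oozie:workflow:0.5">
    <start to="make-dir"/>

    <!-- One action; a real workflow would chain several (MapReduce, Hive, Pig, ...) -->
    <action name="make-dir">
        <fs>
            <mkdir path="${nameNode}/tmp/example-output"/>
        </fs>
        <ok to="end"/>
        <error to="fail"/>
    </action>

    <kill name="fail">
        <message>Action failed: [${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
```

Note that nothing in this file says when to run; the workflow only starts when someone (or a coordinator) submits it.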

Coordinator:

A coordinator is a program that triggers actions (commonly workflow jobs) when a set of conditions is met. The conditions can be a time frequency, data availability, or other external events.
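As an illustration, a time-based coordinator.xml might look roughly like the sketch below. The name, the start and end dates and the `${workflowAppPath}` property are hypothetical placeholders, and the schema version is an assumption.

```xml
<!-- Minimal sketch only: name, dates and ${workflowAppPath} are hypothetical -->
<coordinator-app name="example-coord"
                 frequency="${coord:days(1)}"
                 start="2015-10-23T07:00Z" end="2015-12-31T07:00Z"
                 timezone="UTC"
                 xmlns="uri:oozie:coordinator:0.4">
    <action>
        <!-- The workflow application that is launched on every daily trigger -->
        <workflow>
            <app-path>${workflowAppPath}</app-path>
        </workflow>
    </action>
</coordinator-app>
```

Here the schedule lives entirely in the coordinator attributes (`frequency`, `start`, `end`), while the workflow it points to stays unchanged.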

Bundle:

A bundle is a higher-level Oozie abstraction that batches a set of coordinator jobs. We can also specify the time at which the bundle job should start.
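A rough sketch of a bundle.xml that groups two coordinators is shown below. The bundle and coordinator names, the kick-off time and the `${coordOnePath}` / `${coordTwoPath}` properties are all hypothetical, and the schema version is an assumption.

```xml
<!-- Minimal sketch only: names, time and path properties are hypothetical -->
<bundle-app name="example-bundle" xmlns="uri:oozie:bundle:0.2">
    <controls>
        <!-- Optional: when the bundle should start submitting its coordinators -->
        <kick-off-time>2015-10-23T06:00Z</kick-off-time>
    </controls>

    <coordinator name="coord-one">
        <app-path>${coordOnePath}</app-path>
    </coordinator>

    <coordinator name="coord-two">
        <app-path>${coordTwoPath}</app-path>
    </coordinator>
</bundle-app>
```

As far as I know, each coordinator listed here still runs on its own schedule; the bundle mainly gives you a single handle to start, stop, suspend or resume the whole group, and it does not by itself add dependencies between the coordinators.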

Earthstar answered 23/10, 2015 at 12:25 Comment(2)
Thanks for the answer. But these definitions still do not clarify the difference, or the need for a bundle compared to a coordinator. Intercollegiate
It is just a higher level of abstraction: a coordinator groups workflows, and a bundle groups coordinators. Earthstar

A workflow has no time specification for running a Hadoop job. A coordinator job carries the time specification in coordinator.xml, using the frequency attribute. A collection of coordinator jobs is treated as a bundle job. In a bundle job, individual users can configure their own jobs through their own job.properties files.

Lactose answered 24/11, 2015 at 10:10 Comment(0)

As I understand it, a bundle groups several coordinators so that they are easier to manage, view, and start or stop together.

For example, suppose we have two data pipelines: one for log handling (collect/parse/ETL) and one for business logic.

Then I create two bundles, one to group each kind of coordinator.

Milliner answered 15/3, 2017 at 3:16 Comment(0)
