Workflow tool comparison: Oozie vs Cascading

I am looking for a workflow tool to run complex MapReduce jobs. I have Oozie in mind but also want to explore Cascading. Is there any sample code or example that chains existing M/R jobs using the Cascading API? Also, can you provide a comparison of Oozie vs Cascading?

Addlebrained answered 3/7, 2012 at 18:36

Cascading and Oozie are not in the same category.

Oozie is a workflow scheduler.

Cascading is an API for creating workflows. It is agnostic about schedulers, i.e., it should run with whatever scheduler you use.

There is perhaps some confusion because the Oozie docs mention a "DAG", and both run atop Hadoop.

Also, Cascading has a notion of "data availability" in its checkpoint support, which Oozie supports as well, albeit differently.
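
To make the "DAG" idea concrete, here is a rough plain-Java sketch of what chaining jobs amounts to: each job runs only after the job that writes its input. This is not the Cascading or Oozie API. `JobChain` and `Job` are hypothetical names, and the sketch assumes each job has exactly one input path and one output path. In Cascading itself, as far as I know, existing jobs can be wrapped in `MapReduceFlow` and linked with `CascadeConnector`, but check the documentation for your version.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: none of these types belong to Cascading or Oozie.
class JobChain {

    /** A job described only by its name, the path it reads, and the path it writes. */
    static final class Job {
        final String name;
        final String input;
        final String output;

        Job(String name, String input, String output) {
            this.name = name;
            this.input = input;
            this.output = output;
        }
    }

    /** Returns job names so that every producer precedes its consumers (Kahn's algorithm). */
    static List<String> order(List<Job> jobs) {
        // Map each output path to the job that produces it.
        Map<String, Job> producerOf = new HashMap<>();
        for (Job job : jobs) {
            producerOf.put(job.output, job);
        }

        // Count unmet dependencies per job and record each producer's consumers.
        Map<String, Integer> pending = new HashMap<>();
        Map<String, List<Job>> consumersOf = new HashMap<>();
        for (Job job : jobs) {
            Job producer = producerOf.get(job.input);
            pending.put(job.name, producer == null ? 0 : 1);
            if (producer != null) {
                consumersOf.computeIfAbsent(producer.name, k -> new ArrayList<>()).add(job);
            }
        }

        // Start with jobs whose input is not produced by any other job.
        Deque<Job> ready = new ArrayDeque<>();
        for (Job job : jobs) {
            if (pending.get(job.name) == 0) {
                ready.add(job);
            }
        }

        List<String> ordered = new ArrayList<>();
        while (!ready.isEmpty()) {
            Job job = ready.remove();
            ordered.add(job.name);
            for (Job consumer : consumersOf.getOrDefault(job.name, new ArrayList<>())) {
                if (pending.merge(consumer.name, -1, Integer::sum) == 0) {
                    ready.add(consumer);
                }
            }
        }
        if (ordered.size() != jobs.size()) {
            throw new IllegalStateException("cycle detected: the jobs do not form a DAG");
        }
        return ordered;
    }

    public static void main(String[] args) {
        List<Job> jobs = Arrays.asList(
            new Job("report", "/data/aggregated", "/data/report"),
            new Job("parse", "/data/raw", "/data/parsed"),
            new Job("aggregate", "/data/parsed", "/data/aggregated"));
        System.out.println(order(jobs)); // prints [parse, aggregate, report]
    }
}
```

The jobs are declared out of order on purpose: the ordering comes from the data dependencies, not from the order in which they are listed. Deciding when such a chain is triggered is the scheduler's side of the picture.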

Infect answered 3/1, 2013 at 6:6

Personally, I have played around with both to some extent. What I found interesting about Cascading:

1) It is concise and expressive, built around a few simple concepts such as flow, tap, and pipe.

2) It has a great TDD-based approach for local development and research.

3) It offers a nice planner view (a .dot file), which becomes useful as the project grows and eases maintenance.

4) There are DSLs for Groovy, Scala, and Clojure, so you do not need to learn a new language or, for that matter, much Hadoop.

5) Cloud deployment is simple (e.g., Amazon supports deploying it as a plain jar).

6) You can call anything, such as existing Pig, Hive, or other pure MR jars, as long as they expose a Java API.

7) It works very well for ML and NLP work.

Becka answered 31/10, 2013 at 7:35
