(Even more basic than "Difference between Pig and Hive? Why have both?")
I have a data processing pipeline written as several Java map-reduce jobs over Hadoop (my own custom code, derived from Hadoop's Mapper and Reducer). It's a series of basic operations such as join, inverse, sort, and group by. My code is involved and not very generic.
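For concreteness, each of these operations is currently a hand-written job along these lines (a simplified sketch; the record layout and class names are hypothetical):

```java
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

// Group records by a key field and count them -- one of many such jobs,
// each with its own custom Mapper/Reducer pair.
public class GroupByCount {
    public static class GroupMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
        private static final LongWritable ONE = new LongWritable(1);
        private final Text groupKey = new Text();

        @Override
        protected void map(LongWritable offset, Text line, Context ctx)
                throws IOException, InterruptedException {
            // First tab-separated field is the grouping key (hypothetical record layout).
            groupKey.set(line.toString().split("\t", 2)[0]);
            ctx.write(groupKey, ONE);
        }
    }

    public static class CountReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
        @Override
        protected void reduce(Text key, Iterable<LongWritable> counts, Context ctx)
                throws IOException, InterruptedException {
            long total = 0;
            for (LongWritable c : counts) total += c.get();
            ctx.write(key, new LongWritable(total));
        }
    }
}
```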
What are the pros and cons of continuing this admittedly development-intensive approach versus migrating everything to Pig/Hive with several UDFs? Which jobs won't I be able to express? Will I suffer a performance degradation (I'm working with hundreds of TB)? Will I lose the ability to tweak and debug my code during maintenance? Will I be able to keep part of the pipeline as Java map-reduce jobs and chain their input/output with my Pig/Hive jobs?
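On the UDF side, this is the kind of thing I imagine writing to carry my custom logic over (a sketch only; the Invert name and the domain-reversal logic are placeholders for my actual transformations):

```java
import java.io.IOException;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

// Hypothetical Pig UDF wrapping one of my existing Java transformations,
// so the non-generic logic survives the migration. It would be used from
// Pig Latin roughly like:
//   REGISTER myudfs.jar;
//   inverted = FOREACH records GENERATE myudfs.Invert(url);
public class Invert extends EvalFunc<String> {
    @Override
    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0 || input.get(0) == null) return null;
        // Placeholder logic: reverse the dot-separated components,
        // e.g. "www.example.com" -> "com.example.www".
        String[] parts = ((String) input.get(0)).split("\\.");
        StringBuilder out = new StringBuilder();
        for (int i = parts.length - 1; i >= 0; i--) {
            out.append(parts[i]);
            if (i > 0) out.append('.');
        }
        return out.toString();
    }
}
```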