Since Pig's SPLIT operation has no else or default clause, what would be the most elegant way to do the following? I'm not a big fan of having to copy and paste code.
SPLIT rawish_data
INTO good_rawish_data IF (
(uid > 0L) AND
(value1 > 0) AND
(value1 < 100) AND
(value1 IS NOT NULL) AND
(value2 > 0L) AND
(value2 < 200L) AND
(value3 >= 0) AND
(value3 <= 300)),
bad_rawish_data IF (NOT (
(uid > 0L) AND
(value1 > 0) AND
(value1 < 100) AND
(value1 IS NOT NULL) AND
(value2 > 0L) AND
(value2 < 200L) AND
(value3 >= 0) AND
(value3 <= 300)));
I would like to do something like:
SPLIT data
INTO good_data IF (
(value > 0)),
good_data_big_values IF (
(value > 100)),
bad_data DEFAULT;
Is anything like this possible in any way?
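For reference, newer Pig releases (0.10 onward, as far as I know) accept an OTHERWISE branch in SPLIT, which is close to the DEFAULT sketched above. The snippet below is only a sketch under that version assumption, reusing the relation and field names from the example. How OTHERWISE treats rows whose conditions evaluate to null may depend on the Pig version, so it is worth checking against your own data.

```pig
-- Sketch: assumes a Pig version whose SPLIT supports OTHERWISE.
-- Relation and field names are the illustrative ones from the example above.
SPLIT data INTO
    good_data            IF (value > 0),
    good_data_big_values IF (value > 100),
    bad_data             OTHERWISE;  -- rows that matched none of the IF branches
```

Note that a row with value > 100 satisfies both conditions and so lands in both good_data and good_data_big_values, because SPLIT branches are not mutually exclusive.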
bad_data will NOT contain rows where value is null! You need to specifically check for null or those rows will be dropped in this expression. – Chorography
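To illustrate the null point: a comparison against a null field evaluates to null, and so does NOT of that result, so such a row matches neither branch. A minimal sketch of one way to keep those rows, assuming the fields from the question and that null rows should count as bad, is to test for null explicitly before the negated condition:

```pig
-- Sketch only: relation and field names come from the question.
-- Assumption: rows with a null in any checked field should go to bad_rawish_data.
SPLIT rawish_data INTO
    good_rawish_data IF (
        (uid > 0L) AND
        (value1 > 0) AND (value1 < 100) AND
        (value2 > 0L) AND (value2 < 200L) AND
        (value3 >= 0) AND (value3 <= 300)),
    bad_rawish_data IF (
        -- explicit null checks first, so null rows are not silently dropped
        (uid IS NULL) OR (value1 IS NULL) OR
        (value2 IS NULL) OR (value3 IS NULL) OR
        NOT (
            (uid > 0L) AND
            (value1 > 0) AND (value1 < 100) AND
            (value2 > 0L) AND (value2 < 200L) AND
            (value3 >= 0) AND (value3 <= 300)));
```

This still repeats the condition, so it does not solve the copy-paste complaint. It only shows where the null checks would have to go.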