How can I check in Pig Latin whether a bag contains an element?
Example: in a bag of chararrays, how can I check whether a token is present?
In Apache Pig you can nest statements inside FOREACH (see Pig Basics). Here is an example adapted from the documentation, where A is a bag in B:
X = FOREACH B {
    -- keep only the tuples whose first field equals the token
    S = FILTER A BY $0 == 'xyz';
    GENERATE COUNT(S.$0);
}
Instead of COUNT, you can use IsEmpty together with the ?: (bincond) operator:
X = FOREACH B {
    S = FILTER A BY $0 == 'xyz';
    -- B stands for whichever fields of B you want to carry through
    GENERATE (IsEmpty(S.$0) ? 'xyz NOT PRESENT' : 'xyz PRESENT') AS present, B;
}
Or, to keep only the rows whose bag contains the element:
X = FOREACH B {
    S = FILTER A BY $0 == 'xyz';
    -- B stands for whichever fields of B you want to carry through
    GENERATE B, S;
}
F = FILTER X BY NOT IsEmpty(S);
R = FOREACH F GENERATE B;
This avoids a costly self-join, since every extra join adds another MapReduce job.
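For reference, here is a minimal end-to-end sketch of the same idea with a concrete schema. The input file `tokens.tsv`, the field names `id` and `token`, and the aliases `raw`, `grouped`, `checked`, `hits` and `with_token` are all assumptions made for illustration; they are not part of the question. The bag is produced by GROUP, so it takes the name of the grouped relation.

```pig
-- Assumed input: tab-separated (id, token) pairs; file name and schema are hypothetical.
raw = LOAD 'tokens.tsv' AS (id:chararray, token:chararray);

-- After GROUP, each row holds the key ("group") and a bag named "raw".
grouped = GROUP raw BY id;

checked = FOREACH grouped {
    -- Tuples in the bag whose token matches the one being searched for.
    hits = FILTER raw BY token == 'xyz';
    GENERATE group AS id,
             raw.token AS tokens,
             (IsEmpty(hits) ? 'xyz NOT PRESENT' : 'xyz PRESENT') AS present,
             COUNT(hits) AS n_hits;
}

-- Keep only the ids whose bag contains the token.
with_token = FILTER checked BY n_hits > 0;

DUMP with_token;
```

Generating the count alongside the flag lets the later FILTER work on a plain numeric column, which is one way to sidestep carrying the filtered bag itself into the output.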