The Pig documentation uses the terms 'inner bag' and 'outer bag' extensively (see, for example, http://pig.apache.org/docs/r0.11.1/basic.html ), yet I haven't been able to pin down a precise definition that separates the two.
Specifically, I have several closely related questions:
- If I give you a bag 'foo', what would you need to know to label foo as an 'inner bag' vs. an 'outer bag'?
- Is any bag that is not the outermost bag an 'inner bag'?
- Are the labels 'inner' and 'outer' always mutually exclusive?
- In Pig Latin, are all bags relations, or is only the outermost bag a relation (so that inner bags are not relations)?
To give a concrete example for discussion:
grunt> dump A;
(1,2,3)
(4,2,1)
(8,3,4)
(4,3,3)
grunt> W1 = GROUP A ALL;
grunt> W2 = GROUP W1 ALL;
grunt> W3 = GROUP W2 ALL;
grunt> W4 = GROUP W3 ALL;
grunt> describe W4;
W4: {group: chararray,W3: {(group: chararray,W2: {(group: chararray,W1: {(group: chararray,A: {(f1: int,f2: int,f3: int)})})})}}
grunt> illustrate W4;
(1,2,3)
---------------------------------------------------
| A | f1:int | f2:int | f3:int |
---------------------------------------------------
| | 1 | 2 | 3 |
| | 8 | 3 | 4 |
---------------------------------------------------
------------------------------------------------------------------------------------------------
| W1 | group:chararray | A:bag{:tuple(f1:int,f2:int,f3:int)} |
------------------------------------------------------------------------------------------------
| | all | {(1, 2, 3), (8, 3, 4)} |
------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------------------------------
| W2 | group:chararray | W1:bag{:tuple(group:chararray,A:bag{:tuple(f1:int,f2:int,f3:int)})} |
-----------------------------------------------------------------------------------------------------------------------------------------------
| | all | {(all, {(1, 2, 3), (8, 3, 4)})} |
-----------------------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| W3 | group:chararray | W2:bag{:tuple(group:chararray,W1:bag{:tuple(group:chararray,A:bag{:tuple(f1:int,f2:int,f3:int)})})} |
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| | all | {(all, {(all, {(1, 2, 3), (8, 3, 4)})})} |
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| W4 | group:chararray | W3:bag{:tuple(group:chararray,W2:bag{:tuple(group:chararray,W1:bag{:tuple(group:chararray,A:bag{:tuple(f1:int,f2:int,f3:int)})})})} |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| | all | {(all, {(all, {(all, {(1, 2, 3), (8, 3, 4)})})})} |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
grunt> dump W4;
(all,{(all,{(all,{(all,{(1,2,3),(4,2,1),(8,3,4),(4,3,3)})})})})
Among the bags W1, W2, W3, and W4, which are inner and which are outer?
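To make the distinction testable, here is a minimal sketch based on my current reading of the manual, which may well be wrong: a top-level alias such as W1 can be the operand of a relational operator like FOREACH, whereas the bag A nested inside W1 is only reachable as a field of W1's tuples. The alias X is hypothetical and introduced only for this illustration.
grunt> X = FOREACH W1 GENERATE group, COUNT(A);
grunt> dump X;
(all,4)
If that reading is right, W1 would be an outer bag here and A an inner bag, but I don't know whether that is the whole distinction.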