I am trying to do a star schema type of join in pig and below is my code. When I join multiple relations with different columns, I have to prefix the name of the previous join every time to get it working. I am sure there should be some better way, I am not able to find it through googling. Any pointers will be very helpful.
i.e prefixing a column like this "H864::H86::hs_8_d::hs_8_desc" is what I want to avoid.
hs_8 = LOAD 'hs_8_distinct' USING PigStorage('^') as (hs_8:chararray,hs_8_desc:chararray);
hs_8_d = FOREACH hs_8 GENERATE SUBSTRING(hs_8,0,2) as hs_2,SUBSTRING(hs_8,0,4) as hs_4,SUBSTRING(hs_8,0,6) as hs_6,hs_8,hs_8_desc;
hs_6_d = LOAD 'hs_6_distinct' USING PigStorage('^') as (hs_6:chararray,hs_6_desc:chararray);
hs_4_d = LOAD 'hs_4_distinct' USING PigStorage('^') as (hs_4:chararray,hs_4_desc:chararray);
hs_2_d = LOAD 'hs_2_distinct' USING PigStorage('^') as (hs_2:chararray,hs_2_desc:chararray);
H86 = JOIN hs_8_d BY hs_6, hs_6_d BY hs_6 USING 'replicated' ;
H864 = JOIN H86 BY hs_8_d::hs_4, hs_4_d BY hs_4 USING 'replicated' ;
H8642 = JOIN H864 BY H86::hs_8_d::hs_2, hs_2_d BY hs_2 USING 'replicated' ;
hs_dim = FOREACH H8642 GENERATE hs_2_d::hs_2,hs_2_d::hs_2_desc,H864::hs_4_d::hs_4,H864::hs_4_d::hs_4_desc,H864::H86::hs_6_d::hs_6,H864::H86::hs_6_d::hs_6_desc,H864::H86::hs_8_d::hs_8,H864::H86::hs_8_d::hs_8_desc;