Concatenate a string to a field in Pig

How can I concatenate a string to every value in a field?

For example, a dataset mydata contains the fields (id, name, email), and I would like to add the string test as a prefix to every value in the name field.

I tried:

a = load 'mydata.csv' as (id, name, email);
b = foreach a generate id, concat('test', chararray(name)); 

I'm getting empty results from this.

Any thoughts?

Magnitogorsk answered 30/1, 2015 at 0:47 Comment(0)
  1. Pig's built-in function names are case-sensitive, so the function must be written in capital letters. Change concat to CONCAT.
  2. You are loading a CSV file with the default delimiter (tab). Are you sure the fields in your CSV file are tab-separated? If they are not, each line is read as a single field, the remaining fields come back null, and you get empty results. If your CSV file is comma-separated, specify the comma explicitly as the delimiter in PigStorage.
  3. It is always safer to specify the schema, including field types, during load; this avoids unnecessary explicit typecasts.

Example:

input.csv

1,aaa,[email protected]
2,bbb,[email protected]
3,ccc,[email protected]

Pig script:

a = load 'input.csv' using PigStorage(',') as (id:int, name:chararray, email:chararray);
b = foreach a generate id, CONCAT('test', name);
DUMP b;

Output:

(1,testaaa)
(2,testbbb)
(3,testccc)
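
If you would rather keep the schema untyped, as in the question, note that chararray(name) is not Pig's cast syntax; the type goes in parentheses before the field. A minimal sketch of that variant, assuming the same comma-separated input.csv as above:

```pig
-- Untyped schema: without declared types, fields are loaded as bytearray.
a = LOAD 'input.csv' USING PigStorage(',') AS (id, name, email);
-- Cast syntax is (type)field, not type(field).
b = FOREACH a GENERATE id, CONCAT('test', (chararray)name);
DUMP b;
```

Declaring the types in the LOAD statement, as in the script above, makes the explicit cast unnecessary.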

If your CSV file is already tab-separated, then you only need to fix the CONCAT issue.
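
For the tab-separated case, a sketch might look like the following. The file name input.tsv is hypothetical; the USING clause is omitted because PigStorage splits on tabs by default.

```pig
-- Hypothetical tab-separated input; the default PigStorage delimiter is tab.
a = LOAD 'input.tsv' AS (id:int, name:chararray, email:chararray);
b = FOREACH a GENERATE id, CONCAT('test', name);
DUMP b;
```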

Peeress answered 30/1, 2015 at 2:15 Comment(1)
Thanks Shiva, your solution worked. My mistake was that I hadn't specified the comma as an explicit delimiter in PigStorage. Magnitogorsk
