properly loading datetime in pig

I'm loading a TSV file with a datetime column and a long column using:

A = LOAD 'tweets-clean.txt' USING PigStorage('\t') AS (date:datetime, userid:long);
DUMP A;

An example line of input:

Tue Feb 11 05:02:10 +0000 2014  205291417

The corresponding line of output:

, 205291417

How do I do this properly?

Ainslee answered 26/2, 2014 at 20:31 Comment(0)

Load date as a chararray (date:chararray), then convert it to a datetime in a FOREACH ... GENERATE statement using Pig's built-in ToDate function.

The format string follows the pattern syntax of Java's SimpleDateFormat.

A = LOAD 'tweets-clean.txt' USING PigStorage('\t') AS (date:chararray, userid:long);
B = FOREACH A GENERATE ToDate(date, '<some format string>') AS date, userid;
DUMP B;
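
For the sample line in the question, the placeholder could be filled in roughly as follows. This is only a sketch: the pattern 'EEE MMM dd HH:mm:ss Z yyyy' is an assumption based on the one example line shown, not something given in the original answer, and it assumes English day and month names.

```pig
-- Load the timestamp as plain text first; the conversion happens in the next step.
A = LOAD 'tweets-clean.txt' USING PigStorage('\t') AS (date:chararray, userid:long);

-- Assumed pattern for lines like "Tue Feb 11 05:02:10 +0000 2014":
--   EEE      abbreviated day name (Tue)
--   MMM      abbreviated month name (Feb)
--   dd       day of month (11)
--   HH:mm:ss 24-hour time (05:02:10)
--   Z        UTC offset without a colon (+0000)
--   yyyy     four-digit year (2014)
B = FOREACH A GENERATE ToDate(date, 'EEE MMM dd HH:mm:ss Z yyyy') AS date, userid;
DUMP B;
```

If some rows use a different layout, the pattern will need adjusting for them.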
Kristofer answered 26/2, 2014 at 20:51 Comment(2)
@kskp Please ask it by clicking the Ask Question button. Comments are for clarification on the existing answer. (November)
Sorry. Will do it. (Aurelioaurelius)