S3 to Redshift input data format
Asked Answered
S

1

6

I'm trying to run a simple chain s3-pipeline-redshift, but I've got completely stucked with input data format. Here's my file:

1,Toyota Park,Bridgeview,IL
2,Columbus Crew Stadium,Columbus,OH
3,RFK Stadium,Washington,DC
4,CommunityAmerica Ballpark,Kansas City,KS
5,Gillette Stadium,Foxborough,MA
6,New York Giants Stadium,East Rutherford,NJ
7,BMO Field,Toronto,ON
8,The Home Depot Center,Carson,CA
9,Dick's Sporting Goods Park,Commerce City,CO
10,Pizza Hut Park,Frisco,TX

and here's the table I'm using:

    create table venue_new(
    venueid smallint not null,
    venuename varchar(100) not null,
    venuecity varchar(30),
    venuestate char(2),
    venueseats integer not null default '1000');

When I use | as a delimiter, I'm getting error 1214 - Delimiter not found , when I use comma - same thing, when I converted file to utf-8, I'm getting "Invalid digit, Value '.', Pos 0, Type: Short'. I ran out of ideas. What a heck is wrong with that thing? Can somebody please give me the example of the input file or tell what I'm doing wrong? Thanks in advance. P.S. I also found that sample files are available in bucket awssampledb, but I have no idea how to get them.

Sardonic answered 19/2, 2014 at 19:58 Comment(0)
L
8

Based on the data in the file example. You need to remember that you have 5 fields in your table, and there is no 5th field in any of your data - yet it is a not null field. Your Copy command needs to reference the 4 columns you are providing at the start of the statement.

copy venue_new(venueid, venuename, venuecity, venuestate) 
from 's3://mybucket/data/venue_noseats.txt' 
credentials 'aws_access_key_id=<access-key-id>;aws_secret_access_key=<secret-access-key>'
delimiter ',';

I found the above command (from AWS Docs COPY examples worked successfully for me, leaving me with the default 1000 in the 'venueseats' column.

Lizliza answered 20/2, 2014 at 11:37 Comment(1)
Thanks a ton, you saved my day. I defined the last parameter explicitly and it started to work. But as for the solution you provided, how can I do the same thing in pipeline?Sardonic

© 2022 - 2024 — McMap. All rights reserved.