Amazon Redshift how to copy from s3 and set a job_id
Asked Answered
A

2

9

Amazon Redshift provides the ability to load table data from s3 objects using the "Copy" command. Is their a way to use the copy command, but also set additional "col=CONSTANT" for each inserted row.

I want to set a job_id (which is not in the source data) on each copied row, and I think it would be a shame to have to execute a few million inserts just so each row has a job attribute, when "copy" gets me 99% of the way there with much better performance.

Maybe there is a more clever solution?

Avouch answered 13/5, 2013 at 14:59 Comment(0)
P
14

If you want all your rows added in a single COPY command to have the same value of job_id, then you may COPY data into staging table, then add job_id column into that table, then insert all data from the staging table into final table like:

CREATE TABLE destination_staging (LIKE destination);
ALTER TABLE destination_staging DROP COLUMN job_id;
COPY destination_staging FROM 's3://data/destination/(...)' (...)
ALTER TABLE destination_staging ADD COLUM job_id INT DEFAULT 42;
INSERT INTO destination SELECT * FROM destination_staging ORDER BY sortkey_column;
DROP TABLE destination_staging;
ANALYZE TABLE destination;
VACUUM destination;

ANALYZE and VACUUM are not necessary, but highly recommended in order to update query analyzer and put all new data into correct positions.

Publican answered 30/5, 2013 at 12:31 Comment(2)
This would cause inconsistent behaviour if you tried to load more than one job at the same time, which is necessary. I suppose we could use "destination_staging_${jobId}" as the table, and that would keep things straight.Avouch
@gbegley, that is correct. You will have to create a separate staging table for each job_id.Publican
M
0

It seems there is no option to do post/pre processing with the COPY command itself. Therefore your best option seems to be to do preprocessing to the files you intend to COPY into Redshift, add the jobid and then load them into Redshift.

Mallarme answered 7/7, 2013 at 6:51 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.