Unload data from PostgreSQL to S3

I'm trying to unload a table's data from a PostgreSQL database into Amazon S3.

I'm aware that Redshift has an UNLOAD command for exactly this. Since Redshift is based on PostgreSQL, I tried the same command in my PostgreSQL database, but it was unsuccessful.

Can someone help me with unloading table data from PostgreSQL into S3 periodically?

Coleman answered 10/3, 2017 at 6:51 Comment(2)
COPY ... TO PROGRAM – Carmencita
and if you are not a superuser, just split it into two steps: copy to a file, then aws s3 cp the file to the bucket – Gal
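As a rough sketch of the first comment's superuser route (assuming the database server host has the AWS CLI installed and configured; my_table and my-bucket are placeholders):

COPY my_table TO PROGRAM 'gzip | aws s3 cp - s3://my-bucket/my_table.csv.gz' WITH (FORMAT csv, HEADER);

COPY ... TO PROGRAM runs the command on the database server and requires superuser (or membership in pg_execute_server_program), which is why the second comment suggests splitting the job into two steps.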

Redshift is based on a PostgreSQL fork, but there is no 1-to-1 feature correspondence. If you want to load data from a PostgreSQL DB into Redshift through S3, you should:

  1. Unload your data from PostgreSQL to a CSV file. To do that, use the \copy command of psql. See also this Question here.
  2. Copy the CSV file to S3. There are different ways to do that, but check the documentation here.
  3. Use the COPY command to load the data from S3 into Redshift (a sketch of all three steps follows this list).
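A minimal sketch of the three steps, assuming the psql and aws CLIs are configured, and that the connection strings, table, file, bucket, and IAM role names below are all placeholders:

# 1. Unload from PostgreSQL to a CSV file (client-side \copy, no superuser needed)
psql "$PG_CONN" -c "\copy my_table TO 'my_table.csv' WITH (FORMAT csv, HEADER)"

# 2. Copy the CSV file to S3
aws s3 cp my_table.csv s3://my-bucket/my_table.csv

# 3. Load the data from S3 into Redshift
psql "$REDSHIFT_CONN" -c "COPY redshift_table FROM 's3://my-bucket/my_table.csv' IAM_ROLE 'arn:aws:iam::123456789012:role/my-redshift-role' CSV IGNOREHEADER 1;"

Since these are plain shell commands, running them periodically is just a matter of a cron entry or any other scheduler.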
Frankenstein answered 12/3, 2017 at 18:43 Comment(0)

On Redshift you can create a table to receive the data:

CREATE TABLE redshift_schema.redshift_table (...);

Then, in the PostgreSQL RDS instance, create a foreign data wrapper, a server, and a foreign table that mirrors the Redshift table:

CREATE EXTENSION postgres_fdw;

-- define a named wrapper on top of postgres_fdw's handler
CREATE FOREIGN DATA WRAPPER redshift_fdw
HANDLER postgres_fdw_handler
VALIDATOR postgres_fdw_validator;

CREATE SERVER redshift_server_mydb
FOREIGN DATA WRAPPER redshift_fdw
OPTIONS (dbname 'mydb', port '5439', connect_timeout '200000', host 'myhost.redshift.amazonaws.com');

CREATE USER MAPPING FOR mypguser
SERVER redshift_server_mydb
OPTIONS (user 'myrsuser', password 'mypassword');

IMPORT FOREIGN SCHEMA redshift_schema 
LIMIT TO (redshift_table) 
FROM SERVER redshift_server_mydb
INTO postgresql_schema;

Now, in PostgreSQL, you can (in a function if you like) load the Redshift table from a PostgreSQL table with plain SELECT, INSERT, UPDATE, and DELETE statements, without using dblink:

INSERT INTO postgresql_schema.redshift_table
SELECT *
FROM postgresql_schema.postgresql_table;

Now, when you look at the Redshift table, all the data is there, and you can UNLOAD it to S3 as required.

Reptilian answered 14/9, 2018 at 15:22 Comment(0)

To export data from a PostgreSQL RDS instance to S3, use the aws_s3 extension: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/postgresql-s3-export.html

CREATE EXTENSION aws_s3 CASCADE;

SELECT * FROM aws_s3.query_export_to_s3(
   'SELECT * FROM sample_table',
   aws_commons.create_s3_uri('sample-bucket', 'sample-filepath', 'us-west-2')
);
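The function also takes an options argument that is passed through to PostgreSQL's COPY, so a CSV export with a header row might look like the following (the bucket, key, and region are placeholders, and the exact options string is an assumption based on COPY's option syntax):

SELECT * FROM aws_s3.query_export_to_s3(
   'SELECT * FROM sample_table',                                           -- query to export
   aws_commons.create_s3_uri('sample-bucket', 'sample-filepath.csv', 'us-west-2'),
   options := 'format csv, header true'                                    -- passed to COPY
);

For the "periodically" part of the question, RDS also supports the pg_cron extension, which can schedule a call like this inside the database on a fixed interval.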
Michaelamichaele answered 3/4, 2023 at 14:57 Comment(0)

I used the following snippet for that (note this is Redshift's UNLOAD, so it runs against a Redshift cluster rather than plain PostgreSQL):

UNLOAD ('SELECT * FROM my_table')
TO 's3://my-bucket/path'
CREDENTIALS 'aws_access_key_id=*****;aws_secret_access_key=*****'
DELIMITER ','
ADDQUOTES
HEADER
ESCAPE
GZIP
ALLOWOVERWRITE
PARALLEL OFF;
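If you would rather not embed access keys, the CREDENTIALS clause can be swapped for an IAM role (the ARN below is a placeholder):

UNLOAD ('SELECT * FROM my_table')
TO 's3://my-bucket/path'
IAM_ROLE 'arn:aws:iam::123456789012:role/my-redshift-role'
CSV GZIP ALLOWOVERWRITE;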

You can also check the following links:

https://docs.aws.amazon.com/redshift/latest/dg/t_Unloading_tables.html
https://docs.aws.amazon.com/redshift/latest/dg/r_UNLOAD_command_examples.html

Bit answered 19/8, 2021 at 12:6 Comment(0)
