Importing zipped CSV file into PostgreSQL
Asked Answered
F

3

20

I have a big compressed csv file (25gb) and I want to import it into PostgreSQL 9.5 version. Is there any fast way to import zip or qzip file into postgres without extracting the file?

Fusion answered 19/1, 2017 at 10:0 Comment(3)
There is nothing built-in, but copy can get the input from a program so you could run unzip and pipe the output into the copy (or \copy) command. Some SQL clients also offer that featureJuliannejuliano
I am using pgAdmin III client which does not have option for zip file. using this copy command that you mentioned would be faster or extracting the file first then import into postgresql?Fusion
COPY is the command used to IMPORT and EXPORT Data from/to CSV and which doesn't have capability to extract a zip file.If you want to import using pgAdmin what you've to do is extract the zip file then use COPY to import extracted CSV fileItalian
R
25

There is an old trick to use a named pipe (works on Unix, don't know about Windows)

  • create a named pipe: mkfifo /tmp/omyfifo
  • write the file contents to it: zcat mycsv.csv.z > /tmp/omyfifo &
  • [from psql] copy mytable(col1,...) from '/tmp/omyfifo'
  • [when finished] : rm /tmp/omyfifo

The zcat in the backgound will block until a reader (here: the COPY command) will start reading, and it will finish at EOF. (or if the reader closes the pipe)

You could even start multiple pipes+zcat pairs, which will be picked up by multiple COPY statements in your sql script.


This will work from pgadmin, but the fifo (+zcat process) should be present on the machine where the DBMS server runs.


BTW: a similar trick using netcat can be used to read a file from a remote machine (which of course should write the file to the network socket)

Relator answered 19/1, 2017 at 12:18 Comment(0)
T
18

example how to do it with zcat and pipe:

-bash-4.2$ psql -p 5555 t -c "copy tp to '/tmp/tp.csv';"
COPY 1
-bash-4.2$ gzip /tmp/tp.csv
-bash-4.2$ zcat /tmp/tp.csv.gz | psql -p 5555 t -c "copy tp from stdin;"
COPY 1
-bash-4.2$ psql -p 5555 t -c "select count(*) from tp"
 count
-------
     2
(1 row)

also from 9.3 release you can:

psql -p 5555 t -c "copy tp from program 'zcat /tmp/tp.csv.gz';"

without pipe at all

Transitive answered 19/1, 2017 at 10:17 Comment(2)
With the program attribute, you can move the zcat inside the copy command itself. psql -p 5555 t -c "copy tp from program 'zcat /tmp/tp.csv.gz';" The ability to run programs that generate the output is independently useful. (postgresql.org/docs/9.5/static/sql-copy.html)Demibastion
true - copy from stdin requires SU anyway - so could be copy form programTransitive
I
3

If you have a ZIP (.zip) instead of a GZIP (.gz) archive, you can use unzip -p to pipe the zipped file.

psql -p 5555 -t -c "copy tp from program 'unzip -p /tmp/tp.csv.zip';"
Incipient answered 1/9, 2020 at 15:23 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.