How to COPY into one PostgreSQL table in parallel without lock?
Asked Answered
T

0

10

As I see in pg_stat_activiry, only one of COPY command executes at once. Other queries are in Lock state as I see in wait_event_type column.

There are only one active query at once

How can I run several COPY mytable FROM STDIN in parallel without locking table?

ps. mytable is hypertable of TimescaleDB 2.5.0.

UPD

CREATE TABLE "public"."mytable" (
    "q_time" timestamp,
    "symbol_id" int,
    "o" decimal(24,12),
    "c" decimal(24,12),
    "h" decimal(24,12),
    "l" decimal(24,12),
    "v" bigint,
    CONSTRAINT mytable_ts_pkey PRIMARY KEY (symbol_id, "q_time")
);

SELECT create_hypertable('mytable', 'q_time', 'symbol_id', 1,
  create_default_indexes => false, 
  chunk_time_interval => '7 days'::interval);

UPD2

I run in parallel next commands:

out, err := exec.Command("bash", "-c", "cat file01.gz | gunzip | psql -d db -U user -c "\copy mytable from stdin HEADER DELIMITER ';' CSV\"").Output()

TimescaleDB 2.5.0

PostgreSQL 13

max_connections = 200

max_worker_processes = 21

max_parallel_workers = 10

Thence answered 6/11, 2021 at 21:45 Comment(7)
Plain PostgreSQL doesn't do this automatically. Must be something you are doing, or something timescaledb is doing. Can you show how to set up the table with timescaledb?Joiner
@Joiner schema updatedThence
Seems copy will do a full lock by default. I know Timescale also offers a tool for a parallel copy that maybe can be useful. As it's for faster ingestion, probably it overcome this issue: github.com/timescale/timescaledb-parallel-copyIchthyo
As I see this tool just paralleled COPY to several connections like in my case. And as I see this is not worked way in my case... github.com/timescale/timescaledb-parallel-copy/blob/master/cmd/…Thence
If it is possible in your case to use file_fdw then why not attach the file as a foreign table (or a temporary foreign table in pg_temp schema) and then insert into mytable ... select ... from the_foreign_table ...? I think that this is more powerful and flexible than COPY as (almost) all SQL features are available.Skirting
@Thence curious to know what was the issue in the timescaledb-parallel-copy, was it not doing the COPY commands in parallel? Also, how many cores on your PG server?Sire
@Anand Sowmithiran 10 cores. I will try later way without psql, by advice from timescaledb's slack - psql may be reason of locking.Thence

© 2022 - 2024 — McMap. All rights reserved.