Bulk INSERT in Postgres in GO using pgx
I am trying to bulk insert keys into the database in Go. Here is the code.

Key struct:

type tempKey struct {
    keyVal  string
    lastKey int
}

Test Keys

data := []tempKey{
    {keyVal: "abc", lastKey: 10},
    {keyVal: "dns", lastKey: 11},
    {keyVal: "qwe", lastKey: 12},
    {keyVal: "dss", lastKey: 13},
    {keyVal: "xcmk", lastKey: 14},
}

Insertion part

dbUrl := "db url...."
conn, err := pgx.Connect(context.Background(), dbUrl)
if err != nil {
    println("Errrorr...")
}
defer conn.Close(context.Background())
sqlStr := "INSERT INTO keys (keyval,lastval) VALUES "
dollars := ""
vals := []interface{}{}
count := 1
for _, row := range data {
    dollars = fmt.Sprintf("%s($%d, $%d),", dollars, count, count+1)
    vals = append(vals, row.keyVal, row.lastKey)
    count += 2
}
sqlStr += dollars
sqlStr = sqlStr[0 : len(sqlStr)-1]
fmt.Printf("%s \n", sqlStr)

_, erro := conn.Exec(context.Background(), sqlStr, vals)
if erro != nil {
    fmt.Fprint(os.Stderr, "Error : \n", erro)
}

On running, it throws the error: expected 10 arguments, got 1

What is the correct way of bulk inserting?

Blok answered 23/1, 2022 at 14:30 Comment(0)

You are crafting the SQL statement by hand, which is fine, but you are not leveraging pgx, which can help with this (see below).

The immediate error comes from the Exec call: vals is passed as a single argument instead of being expanded, so the query receives 1 argument where it expects 10. Spread the slice:

_, erro := conn.Exec(context.Background(), sqlStr, vals...)

Appending to the SQL string like this can also be inefficient for large inputs:

dollars = fmt.Sprintf("%s($%d, $%d),", dollars, count, count+1)

Note that the truncation line is what strips the trailing , from the placeholder list (a terminating ; is optional when using Exec):

sqlStr = sqlStr[0 : len(sqlStr)-1] // drops the trailing comma

Either way, it is better to use something more performant like strings.Builder when crafting long strings.
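If you stay with the hand-built statement, here is a minimal sketch of assembling it with strings.Builder and without the trailing-comma fixup. The helper name buildInsert is hypothetical; the table and column names are taken from the question, and the resulting query is meant to be passed to conn.Exec with the values spread as vals...:

```go
package main

import (
	"fmt"
	"strings"
)

type tempKey struct {
	keyVal  string
	lastKey int
}

// buildInsert assembles a multi-row INSERT and its flattened argument
// slice, for use as conn.Exec(ctx, sql, vals...) with pgx.
func buildInsert(data []tempKey) (string, []interface{}) {
	var sb strings.Builder
	sb.WriteString("INSERT INTO keys (keyval, lastval) VALUES ")
	vals := make([]interface{}, 0, len(data)*2)
	for i, row := range data {
		if i > 0 {
			sb.WriteString(", ")
		}
		// placeholders are numbered $1, $2, $3, ... two per row
		fmt.Fprintf(&sb, "($%d, $%d)", i*2+1, i*2+2)
		vals = append(vals, row.keyVal, row.lastKey)
	}
	return sb.String(), vals
}

func main() {
	data := []tempKey{
		{keyVal: "abc", lastKey: 10},
		{keyVal: "dns", lastKey: 11},
	}
	sql, vals := buildInsert(data)
	fmt.Println(sql)       // INSERT INTO keys (keyval, lastval) VALUES ($1, $2), ($3, $4)
	fmt.Println(len(vals)) // 4
}
```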


From the pgx docs, use pgx.Conn.CopyFrom:

func (c *Conn) CopyFrom(ctx context.Context, tableName Identifier, columnNames []string, rowSrc CopyFromSource) (int64, error)

CopyFrom uses the PostgreSQL copy protocol to perform bulk data insertion. It returns the number of rows copied and an error.

Example usage of CopyFrom:

rows := [][]interface{}{
    {"John", "Smith", int32(36)},
    {"Jane", "Doe", int32(29)},
}

copyCount, err := conn.CopyFrom(
    context.Background(),
    pgx.Identifier{"people"},
    []string{"first_name", "last_name", "age"},
    pgx.CopyFromRows(rows),
)
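Applied to the keys from the question, the data only needs flattening into [][]interface{} before handing it to pgx.CopyFromRows. A sketch (the helper name toRows is hypothetical, and the int32 cast assumes an integer column; the CopyFrom call itself needs a live connection, so it is shown as a comment):

```go
package main

import "fmt"

type tempKey struct {
	keyVal  string
	lastKey int
}

// toRows converts the question's key structs into the row format
// expected by pgx.CopyFromRows: one []interface{} per table row.
func toRows(data []tempKey) [][]interface{} {
	rows := make([][]interface{}, 0, len(data))
	for _, k := range data {
		rows = append(rows, []interface{}{k.keyVal, int32(k.lastKey)})
	}
	return rows
}

func main() {
	data := []tempKey{
		{keyVal: "abc", lastKey: 10},
		{keyVal: "dns", lastKey: 11},
	}
	rows := toRows(data)
	fmt.Println(len(rows)) // 2

	// With a live *pgx.Conn:
	// copyCount, err := conn.CopyFrom(
	//     context.Background(),
	//     pgx.Identifier{"keys"},
	//     []string{"keyval", "lastval"},
	//     pgx.CopyFromRows(rows),
	// )
}
```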
Nicholson answered 23/1, 2022 at 15:4 Comment(2)
Thanks for the additional info also; I wasn't aware of CopyFrom until now. Made it work. — Blok
@Blok CopyFrom can hit a primary-key violation depending on the table structure. If that is your case, create a temp table for the session, do the bulk insert there, and then insert from the temp table into the main table with ON CONFLICT DO NOTHING. See details in: #13947827 — Allin

Use pgx.Batch (https://github.com/jackc/pgx/blob/master/batch_test.go):

batch := &pgx.Batch{}
batch.Queue("insert into ledger(description, amount) values($1, $2)", "q1", 1)
batch.Queue("insert into ledger(description, amount) values($1, $2)", "q2", 2)
br := conn.SendBatch(context.Background(), batch)
defer br.Close() // results must be closed before the connection is used again
Allin answered 23/1, 2022 at 14:54 Comment(3)
Thanks, it's simple and worked like a charm, but I have a doubt: does this have any performance hit compared with the answer from @colm.anseo? — Blok
Yes, it is slower than CopyFrom. — Allin
Batch is slower than CopyFrom, but the choice depends on the business case. If you have different inserts into multiple tables, then it probably makes sense to use a batch. If you need maximum insertion rate into a single table, then of course use CopyFrom. — Allin

© 2022 - 2024 — McMap. All rights reserved.