Perl DBI insert multiple rows using mysql native multiple insert ability
Asked Answered
M

3

12

Has anyone seen a DBI-type module for Perl which capitalizes, easily, on MySQL's multi-insert syntax

insert into TBL (col1, col2, col3) values (1,2,3),(4,5,6),...?

I've not yet found an interface which allows me to do that. The only thing I HAVE found is looping through my array. This method seems a lot less optimal vs throwing everything into a single line and letting MySQL handle it. I've not found any documentation out there IE google which sheds light on this short of rolling my own code to do it.

TIA

Marmolada answered 7/12, 2011 at 19:59 Comment(1)
So what I ended up doing was using DBIx::Simple with SQL::Abstract. That is why I haven't decided which of these to use as the accepted answer, yet :)... Just letting you both know. I do appreciate all of your assistance, though.Marmolada
P
12

There are two approaches. You can insert (?, ?, ?) a number of times based on the size of the array. The text manipulation would be something like:

my $sql_values = join( ' ', ('(?, ?, ?)') x scalar(@array) );

Then flatten the array for calling execute(). I would avoid this way because of the thorny string and array manipulation that needs to be done.

The other way is to begin a transaction, then run a single insert statement multiple times.

my $sql = 'INSERT INTO tbl (col1, col2, col3)';
$dbh->{AutoCommit} = 0;
my $sth = $dbh->prepare_cached( $sql );
$sth->execute( @$_ ) for @array;
$sth->finish;
$dbh->{AutoCommit} = 1;

This is a bit slower than the first method, but it still avoids reparsing the statement. It also avoids the subtle manipulations of the first solution, while still being atomic and allowing disk I/O to be optimized.

Peraea answered 7/12, 2011 at 20:16 Comment(4)
This is what execute_array in DBI is for. Also using prepare_cached here makes no difference as you've only prepared once so this example might as well be a simple prepare.Nonparous
@Nonparous - I consider prepare_cached() to be good standard practice. In a standalone script, it doesn't particularly matter. In a persistent environment, it will help in future invocations. Might as well do it out of habit.Peraea
Only problem with this is, that if you have lots and lots of data to insert, my testing shows that the (?,?,?),(?,?,?).... solution is many times (>10 times) faster than multiple executes. :-(Innkeeper
The second variant didn't work for me in MariaDB until I added " VALUES (?, ?, ?)" at the end of SQL.Postlude
N
9

If DBD::mysql supported DBI's execute_for_fetch (see DBI's execute_array and execute_for_fetch) this is the typical usage scenario i.e., you have multiple rows of inserts/updates/deletes available now and want to send them in one go (or in batches). I've no idea if the mysql client libs support sending multiple rows of bound parameters in one go but most other database client libs do and can take advantage of DBI's execute_array/execute_for_fetch. Unfortunately few DBDs actually implement execute_array/execute_for_fetch and rely on DBI implementing it one row at a time.

Nonparous answered 7/12, 2011 at 20:26 Comment(0)
M
1

Jim, Frezik has it. That is probably the most optimal:

my $sth = $dbh->prepare( 'INSERT INTO tbl (?, ?, ?)' );
foreach(@array) { $sth->execute( @{$_} ); }
$sth->finish;
Muttonhead answered 7/12, 2011 at 20:34 Comment(4)
I disagree. The most optimal would be to send batches of rows in one go like ODBC and Oracle can do. However, I don't know mysql/DBD::mysql can do this as yet. If DBD::mysql implemented execute_for_fetch/execute_array and the mysql client libs supported sending batches or rows of parameters in one go this would be undoubtably faster.Nonparous
and BTW, that call to finish should not be required and is bad practise - read up on the pod for the finish method.Nonparous
@Nonparous - From the DBI pod: "When all the data has been fetched from a SELECT statement, the driver will automatically call finish for you." We're doing an INSERT rather than SELECT. It's not clear to me from the docs if it's still needed in this case. Thanks for bringing this up, though, because most DBI tutorials get you to always use finish(), and it's clear from the docs that this isn't correct.Peraea
Also from the DBI pod "You almost certainly do not need to call this method." and "The finish method should have been called discard_pending_rows."Nonparous

© 2022 - 2024 — McMap. All rights reserved.