Inserting rows one at a time, and autocommitting each statement, has two overheads.
Each transaction has overhead, probably more than the cost of one insert, so inserting multiple rows in one transaction is the trick. This requires a code change, not a configuration change.
Each INSERT statement also has overhead; a single-row insert is roughly 90% overhead and 10% actual insert.
The optimum is 100-1000 rows inserted per transaction.
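A minimal sketch of the difference, assuming a placeholder table t(a, b) (the table and column names are for illustration only, not from the question):

```sql
-- Autocommit on: every single-row INSERT pays the full transaction cost.
INSERT INTO t (a,b) VALUES (1,2);
INSERT INTO t (a,b) VALUES (2,3);

-- Batched: a few hundred rows share one transaction, so the
-- per-transaction overhead is paid once per batch, not once per row.
START TRANSACTION;
INSERT INTO t (a,b) VALUES (1,2);
INSERT INTO t (a,b) VALUES (2,3);
-- ... a few hundred more single-row INSERTs ...
COMMIT;
```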
For rapid inserts:
- Best is LOAD DATA -- if you are starting with a .csv file. If you must build the .csv file first, it is debatable whether that overhead makes this approach a net loss. (A sketch follows this list.)
- Second best is multi-row INSERT statements: INSERT INTO t (a,b) VALUES (1,2), (2,3), (44,55), ... . I recommend 1000 rows per statement, with a COMMIT after each statement. This is likely to get you past 1000 rows per second being inserted. (Also sketched after this list.)
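A sketch of the LOAD DATA route; the file path, table name, and CSV layout are assumptions for illustration, and LOCAL requires local_infile to be enabled on both client and server:

```sql
LOAD DATA LOCAL INFILE '/tmp/rows.csv'    -- assumed path
INTO TABLE t                              -- assumed table
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES                            -- skip a header row, if the file has one
(a, b);
```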
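And the multi-row INSERT form, again with the placeholder table; in real code the VALUES list would be built up to roughly 1000 row tuples before each statement is sent and committed:

```sql
INSERT INTO t (a,b) VALUES
  (1,2),
  (2,3),
  (44,55);    -- ... continue up to ~1000 row tuples per statement
COMMIT;       -- one commit per multi-row statement (implicit if autocommit is on)
```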
Another problem... Since each index is updated as the row is inserted, you may run into I/O thrashing while trying to reach those rates. InnoDB automatically "delays" updates to non-unique secondary indexes via its change buffer (no need for INSERT DELAYED), but the work is eventually done. (So RAM size and innodb_buffer_pool_size come into play.)
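A quick way to see where those settings stand on a given server (the 8G value below is only an example, not a recommendation from this answer; the usual rule of thumb is about 70% of RAM on a dedicated database server):

```sql
-- Current buffer pool size (bytes) and the change-buffering setting
-- that controls delayed secondary-index updates.
SELECT @@innodb_buffer_pool_size, @@innodb_change_buffering;

-- MySQL 5.7+ can resize the buffer pool online; adjust to the machine.
SET GLOBAL innodb_buffer_pool_size = 8 * 1024 * 1024 * 1024;
```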
If the "thousands" of rows/second is a one time task, then you can stop reading here. If you expect to do this continually 'forever', there are other issues to contend with. See High speed ingestion .
Comments:
– Edmonds: ON DUPLICATE KEY UPDATE could work for the updates, unless something better is needed. Note: fewer transactions is also generally good if you're using transactions.
– Edmonds: … insert transactions, and just modify the insert code to do 2 inserts every execute call; that might give you an idea of whether batching helps. Of course, you're right that if it's a tuning issue it'll be a lot more difficult for us to help from a generic standpoint. Also, are you 100% sure autocommit is off?
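Following up on the ON DUPLICATE KEY UPDATE suggestion in the comments, a sketch of the multi-row form, assuming t has a PRIMARY or UNIQUE key on column a (the schema is assumed, not given in the question):

```sql
INSERT INTO t (a,b) VALUES
  (1,2),
  (2,3)
ON DUPLICATE KEY UPDATE
  b = VALUES(b);   -- rows whose key already exists get the new b; the rest are inserted
```

(MySQL 8.0.20+ prefers the row-alias form over VALUES() here, but both currently work.)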