Why is MySQL 'insert into ... select ...' so much slower than a select alone?
Asked Answered
F

3

13

I'm trying to store a query result in a temporary table for further processing.

create temporary table tmpTest
(
    a FLOAT,
    b FLOAT,
    c FLOAT
)
engine = memory;

insert into tmpTest
(
    select a,b,c from someTable
    where ...
);

But for some reason the insert takes up to a minute, whereas the subselect alone just takes a few seconds. Why would it take so much longer to write the data to a temporary table instead of printing it to my SQL management tool's output???

UPDATE My Setup: MySQL 7.3.2 Cluster with 8 Debian Linux ndb data nodes 1 SQL Node (Windows Server 2012)

The table I'm running the select on is a ndb table.

I tried to find out, if the execution plan would differ when using 'insert into..', but they look the same: (sorry for the formatting, stackoverflow doesn't have tables)

id  select_type     table       type    possible_keys   key     key_len ref                 rows        Extra
1   PRIMARY         <subquery3> ALL     \N              \N      \N      \N                  \N          \N
1   PRIMARY         foo         ref     PRIMARY         PRIMARY 3       <subquery3>.fooId   9747434     Using where
2   SUBQUERY        someTable   range   PRIMARY         PRIMARY 3       \N                  136933000   Using where with pushed condition; Using MRR; Using temporary; Using filesort
3   MATERIALIZED    tmpBar      ALL     \N              \N      \N      \N                  1000        \N

CREATE TABLE ... SELECT is slow, too. 47 seconds vs. 5 seconds without table insert/create.

Frond answered 9/10, 2013 at 10:12 Comment(5)
You should be more specific and provide some data you're writing. Also INSERT .. SELECT syntax is different. Your sample will result in error.Franconia
good question. I don't really know how mysql reserves memory for itself. If it goes through the OS api, it might just asks for memory, which would be then served as either ram or harddrive space, according to system's status. See windows' virtual bytes management.Rem
Have you tried a CREATE TABLE ... SELECT-type query for comparison? dev.mysql.com/doc/refman/5.0/en/create-table-select.htmlTheology
@AlmaDoMundo: why do you say my syntax is wrong? I don't get any errors.Frond
@xiankai: I did :( Same result as INSERT INTO ... SELECTFrond
C
6

I wrote a comment above, then stumbled across this as a workaround.

This will accomplish what you want to do.

SELECT * FROM aTable INTO OUTFILE '/tmp/atable.txt';
LOAD DATA INFILE '/tmp/atable.txt' INTO TABLE anotherTable;

Note that doing this means managing the /tmp tables in some way. If you try to SELECT data into an OUTFILE that already exists, you get an error. So you need to generate unique temporary file names. And then run a cron job of some sort to go clean them up.

I guess INFILE and OUTFILE behave differently. If someone can shed some light on what is going on here to explain mysql behavior, I would appreciate it.

D

Here is a better way than using INFILE / OUTFILE.

SET TRANSACTION ISOLATION LEVEL READ COMMITTED; INSERT INTO aTable SELECT ... FROM ...

Here is a relevant post to read :

How to improve INSERT INTO ... SELECT locking behavior

Camphor answered 1/11, 2013 at 9:29 Comment(1)
Default isolation level for InnoDB is REPEATABLE READ. RefGrassi
J
1

I experiences the same issue and was playing around with subqueries that actually solved it. If the select has a huge amount of rows, its taking very long to insert the data. Example:

INSERT INTO b2b_customers (b2b_name, b2b_address, b2b_language)
SELECT customer_name, customer_address, customer_language
FROM customers
WHERE customer_name LIKE "%john%"
ORDER BY customer_created_date DESC
LIMIT 1

using LIMIT in combination to INSERT data is not a good option. So you could use 2 seperate queries for getting data and inserting, or you can use a subquery. Example:

INSERT INTO b2b_customers (b2b_name, b2b_address, b2b_language)
SELECT * FROM (
SELECT customer_name, customer_address, customer_language
FROM customers
WHERE customer_name LIKE "%john%"
ORDER BY customer_created_date DESC
LIMIT 1
) sub1

that would be a fast solution without changing your script.

So im not sure why its taking 0.01 seconds to run the subquery and 60 seconds to run the insert. I get 1000+ results without the Limit. In my case the subquery improved performance from 60 seconds to 0.01 seconds.

Jentoft answered 3/11, 2016 at 9:28 Comment(6)
You're not sure why it takes a millisecond to read the data an 60 seconds to write 1000+ records? Maybe because reading is faster than writing?Telephonist
yes, thats true, but im writing only 1 row while using LIMIT 1.Jentoft
WHERE customer_name LIKE "%john%" - this is a full table scan. Once every single record is scanned and reduced to retrieve what contains john. This takes a while. Then you write 1 record. It's inefficient.Telephonist
so mysql is writing all data into a temporary table and at the end it recognizes that I put a LIMIT 1 and writes only the first record into the table ? that would also explain why my capsulation with the subquery would run much faster.Jentoft
It buffers the records it found (we can say it writes to temp table, yes) and then it uses only one record, correct.Telephonist
The results of the SELECT are buffered, so none of the records are inserted until the SELECT finishes.Renvoi
E
0

The reason has to do with how the computer reads and writes and the way a temp file works. Select is reading data that is in an indexed file on the hard drive whereas insert is using a temporary file and is writing to that file. More RAM is required and doing so is more difficult. As for why it takes a minute, I am not sure exactly but I think the code might be a little incorrect so that would contribute.

Ezechiel answered 9/10, 2013 at 10:36 Comment(2)
My temporary table has its engine explicitly set to 'memory'. Doesn't this mean it should stay in RAM? My machine has 32GB RAM of which 39% are used. And my temporary table just has 1000 rows...Frond
Did you ever get an answer to this? Or figure out how to improve the performance? I am getting the same behavior, where the SELECT takes sub-second, but the INSERT takes 30 seconds. It's completely baffling. I wonder if it is a bug? To get around this, I tried creating a cursor on the data I was SELECTing and INSERTED one row at a time and it was faster. Which is crazy. I will try a question similar to yours, but it appears that there is no answer to this.Camphor

© 2022 - 2024 — McMap. All rights reserved.