Optimize MySQL for faster ALTER TABLE ADD COLUMN

I have a table that has 170,002,225 rows with about 35 columns and two indexes. I want to add a column. The ALTER TABLE command took about 10 hours. The processor did not seem busy during that time, nor were there excessive IO waits. This is on a 4-way high-performance box with tons of memory.

Is this the best I can do? Is there something I can tune in the database to make the ADD COLUMN faster?

Delicatessen asked 15/4, 2011 at 14:4 Comment(7)
The HDD system is what caused the slowdown; all you can do is either get a faster HDD system (FusionIO if you can afford it) or just wait. Alternatively, don't add columns :)Risner
What MySQL version do you use?Pelf
@Michael he said "nor were there excessive io waits."Pelf
@FractalizeR - since it wasn't said HOW the IO info was obtained, I'm inclined to believe that it's the memory/HDD subsystem (or RAID controller, if any) that holds up the altering of the table. From my experience (and I work with large datasets), it is almost always the storage system that's to blame when it comes to performance issues such as these.Risner
@Michael, yes, that's so. But anyway, a thorough investigation is needed before you can say anything for certain.Pelf
Sometimes things actually are safe to assume. If CPU usage isn't high, that indicates that another part of the system is busy. With larger tables, it's usually either a misconfiguration or the disk subsystem being used extensively. Sometimes, RAID controllers might also delay the write operation to the disk, which can falsely indicate that IO isn't to blame.Risner
We're using InnoDB tables. Michael - What is the best thing to look at to determine if the HDD is the problem?Delicatessen

I faced a very similar situation in the past, and I improved the performance of the operation this way (a SQL sketch follows the list):

  1. Create a new table (using the structure of the current table) with the new column(s) included.
  2. Execute an INSERT INTO new_table (column1, ..., columnN) SELECT column1, ..., columnN FROM current_table;
  3. Rename the current table.
  4. Rename the new table using the name of the current table.
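
A minimal sketch of those four steps, assuming a hypothetical table current_table, placeholder columns column1 and column2, and a hypothetical new column new_col:

mysql> CREATE TABLE new_table LIKE current_table;                                 -- step 1: copy the table structure
mysql> ALTER TABLE new_table ADD COLUMN new_col VARCHAR(12) NOT NULL DEFAULT '';  -- add the new column while the copy is still empty
mysql> INSERT INTO new_table (column1, column2) SELECT column1, column2 FROM current_table;  -- step 2: copy the data
mysql> RENAME TABLE current_table TO old_table, new_table TO current_table;       -- steps 3 and 4 as one atomic rename

The final RENAME TABLE swaps both names in a single statement, which is the pseudo-atomic rename mentioned in the comments below.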
Janeth answered 15/4, 2011 at 14:15 Comment(11)
Some notes... 1. CREATE TABLE new_table LIKE table; 2. INSERT INTO new_table SELECT * FROM table; 3&4. You can get a pseudo-atomic rename if you combine steps 3 & 4 into this: RENAME TABLE table TO old_table, new_table TO table;Abalone
This is the procedure I use as well. Often, we'll script the INSERT ... SELECT to run slowly (drip feed) in order to allow normal database operations to continue.Tews
How do you handle writes that happen to the old table while you are copying it over (i.e. before the rename)?Agglutinative
We put our site into maintenance mode for this kind of heavy operation. Preventing IO during the migration will also improve performance.Pentheus
How do you handle FK's in this case (InnoDB)?Countryandwestern
If you need to keep your database online while making a schema change, this post explains how to do it safely with triggers and delta tables: m.facebook.com/note.php?note_id=430801045932Mislike
You can handle foreign keys using SET FOREIGN_KEY_CHECKS = 0Mislike
To balance the list of success stories and keep the post balanced: this doesn't always improve performance. In our case, the INSERT INTO NewTable SELECT * FROM OldTable was running at about 600m/second while the ALTER TABLE was doing about 1g/second.Strati
percona.com/doc/percona-toolkit/2.2/… is a nifty tool if you want to be able to make these changes while still being online and limit the impact on production performance.Strati
If the new table contains N+1 columns, you should write INSERT INTO new_table (column1, ..., columnN) SELECT column1, ..., columnN FROM current_table;. Otherwise you'll get a #1136 - Column count doesn't match value count at row 1 error.Cid
mysql4webdev.blogspot.in/2014/05/… This is the process I follow, and I have tried a Python script for my use case.Granada

ALTER TABLE in MySQL actually creates a new table with the new schema, re-inserts all the data, and then deletes the old table. You might save some time by creating the new table, loading the data, and then renaming the table.

From "High Performance MySQL book" (the percona guys):

The usual trick for loading a MyISAM table efficiently is to disable keys, load the data, and re-enable the keys:

mysql> ALTER TABLE test.load_data DISABLE KEYS;
-- load data
mysql> ALTER TABLE test.load_data ENABLE KEYS;
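
For illustration, the "-- load data" placeholder might be a bulk load like the one below; the CSV path is a hypothetical example, not from the book (LOAD DATA INFILE also needs the FILE privilege and a permissive secure_file_priv setting):

mysql> LOAD DATA INFILE '/tmp/load_data.csv' INTO TABLE test.load_data FIELDS TERMINATED BY ',';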
Amplexicaul answered 15/4, 2011 at 17:25 Comment(0)

Well, I would recommend using the latest Percona MySQL builds. In addition, there is the following note in the MySQL manual:

In other cases, MySQL creates a temporary table, even if the data wouldn't strictly need to be copied. For MyISAM tables, you can speed up the index re-creation operation (which is the slowest part of the alteration process) by setting the myisam_sort_buffer_size system variable to a high value.

So you can do ALTER TABLE ... DISABLE KEYS first, then add the column, and then ALTER TABLE ... ENABLE KEYS (a sketch follows below). Other than that, I don't see anything that can be done here.
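
A minimal sketch of that sequence, assuming a hypothetical MyISAM table mytable and a new column new_col; the 1 GB buffer size is just an example value:

mysql> SET SESSION myisam_sort_buffer_size = 1024 * 1024 * 1024;  -- larger sort buffer speeds up MyISAM index re-creation
mysql> ALTER TABLE mytable DISABLE KEYS;                          -- stop maintaining non-unique indexes
mysql> ALTER TABLE mytable ADD COLUMN new_col VARCHAR(12) NOT NULL DEFAULT '';
mysql> ALTER TABLE mytable ENABLE KEYS;                           -- rebuild the non-unique indexes in one pass

Note that DISABLE KEYS / ENABLE KEYS only affects non-unique indexes on MyISAM tables.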

BTW, can't you move to MongoDB? It doesn't rebuild anything when you add a column.

Pelf answered 15/4, 2011 at 15:4 Comment(6)
DISABLE KEYS and ENABLE KEYS only work for the MyISAM engine; there are 170,002,225 rows, so I think it's using InnoDB!Brabble
@Brabble the maximum number of rows for MyISAM is ~2^32, which is significantly more than 170 million. With the --with-big-tables option this limit is raised to ~2^64, which is even more.Pelf
MyISAM tables have problems with concurrency due to read locking. InnoDB is clearly the way to go here.Amplexicaul
@Amplexicaul I doubt switching to InnoDB will speed up ALTER TABLEPelf
Agreed. InnoDB is almost always slower for ALTER TABLE for the reasons mentioned here. The comment was in reference to 170M rows being within MyISAM's maximum row count.Amplexicaul
MyISAM's limit has not been 2^32 for more than a decade.Angelitaangell

Maybe you can remove the indexes before altering the table, because building the index is what takes most of the time? (A sketch follows below.)
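
For illustration only, a minimal sketch of that idea, assuming a hypothetical table mytable with one secondary index idx_col2 on column col2:

mysql> ALTER TABLE mytable DROP INDEX idx_col2;                                  -- drop the secondary index first
mysql> ALTER TABLE mytable ADD COLUMN new_col VARCHAR(12) NOT NULL DEFAULT '';   -- add the column without the index in place
mysql> ALTER TABLE mytable ADD INDEX idx_col2 (col2);                            -- recreate the index afterwards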

Platelet answered 15/4, 2011 at 14:10 Comment(5)
The index will either not be rebuilt or only be rebuilt once depending on the ALTER TABLE. Removing it and adding it back would not save any time.Abalone
I was pointing at the binary search and the fact that it is expensive to build an index. I've never tried this on 170 billion rows.Platelet
there are only 170 million rows, not billions ;)Pelf
@Phpdna, when indexes are built, it's doing a sort, which is O(n log n), not a binary search, which is O(log n).Pokpoke
This answer is the most proper one; in practice, I've tested it on a table that has >22,000,000 rows. What I did: dropped all indexes (there were 3), then added the field (that took 4.23 on an HDD), and recreated the indexes, which took less than 2 minutes each. Darius Jahandarie isn't right; he does not know, in practice, what he is talking about. The answer of RRUZ also seems to work, but in practice INSERT INTO ... SELECT * FROM takes hours, whereas drop indexes, add column, recreate indexes seems like a really fast solution. PS: I would rather lock the table beforehand to prevent slowed-down queries.Sclerotic

Combining some of the comments on the other answers, this was the solution that worked for me (MySQL 5.6):

  1. create table mytablenew like mytable;
  2. alter table mytablenew add column col4a varchar(12) not null after col4;
  3. alter table mytablenew drop index index1, drop index index2,...drop index indexN;
  4. insert into mytablenew (col1,col2,...colN) select col1,col2,...colN from mytable;
  5. alter table mytablenew add index index1 (col1), add index index2 (col2),...add index indexN (colN);
  6. rename table mytable to mytableold, mytablenew to mytable;

On a 75M row table, dropping the indexes before the insert caused the query to complete in 24 minutes rather than 43 minutes.

Other answers/comments have insert into mytablenew (col1) select (col1) from mytable, but this results in ERROR 1241 (21000): Operand should contain 1 column(s) if you keep the parentheses around the column list in the SELECT query.

Other answers/comments have insert into mytablenew select * from mytable;, but this results in ERROR 1136 (21S01): Column count doesn't match value count at row 1 if you've already added a column.

Goss answered 29/1, 2020 at 23:39 Comment(0)
