(I disagree with some of the other Answers/Comments. I will try to answer all the questions, plus address all the points that I disagree with.)
MEDIUMINT
is 3 bytes, saving 1 byte per row over INT
.
TINYINT
is 1 bytes, saving 3 bytes per row over INT
.
In both cases, there is another 1 or 3 bytes saved per occurrence in any INDEX
other than the PRIMARY KEY
.
If you are likely to have more data+index than space in RAM, then it is wise to shrink the datatypes but be conservative.
Use MEDIUMINT UNSIGNED
(etc) if the value is non-negative, such as for AUTO_INCREMENT
. That gives you a limit of 16M instead of 8M. (Yeah, yeah, that's a tiny improvement.)
Beware of "burning" AUTO_INCREMENT
ids -- INSERT IGNORE
(and several other commands) will allocate the next auto_inc before checking whether it will be used.
Even if data+index exceeds RAM size (actually innodb_buffer_pool_size
), it may not slow down to disk speed -- it depends on access patterns of the data. Beware of UUIDs, they are terribly random. Using UUIDs when you can't cache the entire index is deadly. The buffer_pool is a cache. (I have seen a 1TB dataset run fast enough with only 32GB of RAM and a spinning disk.)
Using ALTER TABLE
to change a datatype probably (I am not sure) rebuilds the table, thereby performing the equivalent of OPTIMIZE TABLE
.
If the table was created with innodb_file_per_table = OFF
and you turn it ON
before doing the ALTER
, you get a separate file for the table, but ibdata1
will not shrink (instead it will have lots more free space).
Alignment of 3-byte numbers -- not an issue. Powers of 2 is not relevant here. MySQL assumes all columns are at poor boundaries, and of poor sizes. All numbers are converted to a generic format (64-bit numbers) for operating on. This conversion is an insignificant part of the total time -- fetching the row (even if cached) is the most costly part.
When I/O-bound, shrinking datatypes leads to more rows per block, which leads to fewer disk hits (except in the UUID case). When I/O-bound, hitting the disk is overwhelming the biggest performance cost.
"NULLS take no space" -- https://dev.mysql.com/doc/internals/en/innodb-field-contents.html . So, again, less I/O. But, beware, if this leads to an extra check for NULL
in a SELECT
, that could lead to a table scan instead of using an index. Hitting 10M rows is a lot worse than hitting just a few.
As for how many clients you can fit into 32GB -- Maybe 6 or more. Remember, the buffer_pool is a cache; data and indexes are cached on a block-by-block basis. (An InnoDB block is 16KB.)
One more thing... It is a lot easier to shrink the datatypes before going into production. So, do what you can safely do now.
INT
datatype for all my numbers because it will make a better performance ? – GlottochronologyALTER TABLE
and adjust the schema the table has to be rebuilt so if you're using "file per table" in InnoDB the space will be reclaimed. – Loera