Which is faster: char(1) or tinyint(1)? Why?
P

8

35

MY PLATFORM:

PHP & MySQL

MY SITUATION:

I need to store a user's selection in one column of a table. My options are to:

  1. Declare the column as char(1) and store the value as 'y' or 'n'
  2. Declare the column as tinyint(1) and store the value as 1 or 0

Whichever type is used, the column may also be indexed for use within the application.

MY QUESTIONS:

So I wanted to know, which of the above two types:

  1. Leads to faster query speed when that column is accessed (for the sake of simplicity, let's leave out mixing other queries or accessing other columns, please).

  2. Is the most efficient way of storing and accessing data and why?

  3. How does the access speed vary if the columns are indexed and when they are not?

My understanding is that since char(1) and tinyint(1) each take up only 1 byte, storage space will not be an issue in this case. What remains is access speed. As far as I know, numeric indexing is faster and more efficient than anything else, but I think this case is a tough one to decide. I would definitely like to hear your experience on this one.

Thank you in advance.

Propraetor answered 7/1, 2010 at 20:40 Comment(6)
Profile it and let us know the result.Ladin
A false dichotomy, there's also enum('1','0') (for example).Expedition
the question has nothing to do with php so i removed the php tagCavazos
Indexing a field with two possible values is pretty worthless.Tisiphone
@Tisiphone The type of the column has little bearing on its suitability for indexing. If you put the column in a WHERE clause and there's no index, it's going to have to do a full table scan regardless of the type.Trichromatism
Wow! I didn't expect this question to be viewed and answered so many times in such a short time. @erenon: Pardon my ignorance, but how do I do that? Does phpMyAdmin have a tool for that (one I may not know about yet)? @ricebowl Good one, but there's a catch, as mentioned by Langdon; kindly refer to his comment. @streetparade: That's OK, but it looks like it helped other users like Matchu and Zombat make a valid point in their comments. @Tisiphone I agree with Schwern totally on this one. On a table with a million rows, a scan without an index would be overkill, so the index is good to use.Propraetor
R
38

I think you should create the column as ENUM('n','y'). MySQL stores this type compactly, and it also restricts the field to the allowed values.

You can also make it more human-friendly with ENUM('no','yes') without affecting performance, because the strings 'no' and 'yes' are stored only once, in the ENUM definition. MySQL stores only the index of the value in each row.

Also note how sorting by an ENUM column works:

ENUM values are sorted according to the order in which the enumeration members were listed in the column specification. (In other words, ENUM values are sorted according to their index numbers.) For example, 'a' sorts before 'b' for ENUM('a', 'b'), but 'b' sorts before 'a' for ENUM('b', 'a').
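
As a rough illustration of the storage and sort-order points above, here is a minimal sketch. The table name `enum_demo` and column `answer` are made up for this example, and it assumes a MySQL server.

```sql
-- Hypothetical table; the names are illustrative only.
CREATE TABLE enum_demo (
    id     INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    answer ENUM('no', 'yes') NOT NULL
);

INSERT INTO enum_demo (answer) VALUES ('yes'), ('no'), ('yes');

-- "answer + 0" forces numeric context and exposes the per-row index
-- that is stored internally: 1 for 'no', 2 for 'yes'.
SELECT id, answer, answer + 0 AS stored_index
FROM enum_demo
ORDER BY answer;   -- sorts by index, so 'no' rows come before 'yes' rows
```

Had the column been declared as ENUM('yes', 'no'), the same ORDER BY would list the 'yes' rows first. If alphabetical order is wanted regardless of declaration order, `ORDER BY CAST(answer AS CHAR)` sorts on the string value instead.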

Reagan answered 7/1, 2010 at 20:43 Comment(6)
Way back when, I had the same question as the OP, and I benchmarked it to find enum the quickest and most efficient of the three options. Just make sure you don't use enum('0', '1') like I did -- you'll end up wondering why UPDATE X SET Y = 0; doesn't work (you need single quotes).Staggers
+1 for Langdon. That's a very unique point you specified. I never knew about it until now. So that means if we use enum('0', '1'), our query must have UPDATE X SET Y = '0'; Is that correct? @Ivan If I am right, ENUM('n','y') takes the same space as ENUM('no','yes'). Am I right?Propraetor
@Propraetor Yes, space usage is the same because you can't add any values other than '', 'no' and 'yes'. MySQL stores only the index of the value per row, not the string. The strings 'no' and 'yes' are stored only once, in the table definition.Reagan
@Devner: All enum values have numerical indexes, beginning with 1 (0 is a special value to indicate the empty string). You can use these indexes to query and set values, but as the manual says: "For these reasons, it is not advisable to define an ENUM column with enumeration values that look like numbers, because this can easily become confusing." [ dev.mysql.com/doc/refman/5.1/en/enum.html ] (Do not confuse these numerical indexes with real column indexes, there is just no better word to differentiate between them)Cankerworm
Enums are the work of the devil!Weighin
Why not use NULL and yes, with NULL meaning no? It would not take up additional space, NULL being totally empty. It is still indexable.Sidsida
M
44
                       Rate insert tinyint(1) insert char(1) insert enum('y', 'n')
insert tinyint(1)     207/s                --            -1%                  -20%
insert char(1)        210/s                1%             --                  -19%
insert enum('y', 'n') 259/s               25%            23%                    --
                       Rate insert char(1) insert tinyint(1) insert enum('y', 'n')
insert char(1)        221/s             --               -1%                  -13%
insert tinyint(1)     222/s             1%                --                  -13%
insert enum('y', 'n') 254/s            15%               14%                    --
                       Rate insert tinyint(1) insert char(1) insert enum('y', 'n')
insert tinyint(1)     234/s                --            -3%                   -5%
insert char(1)        242/s                3%             --                   -2%
insert enum('y', 'n') 248/s                6%             2%                    --
                       Rate insert enum('y', 'n') insert tinyint(1) insert char(1)
insert enum('y', 'n') 189/s                    --               -6%           -19%
insert tinyint(1)     201/s                    7%                --           -14%
insert char(1)        234/s                   24%               16%             --
                       Rate insert char(1) insert enum('y', 'n') insert tinyint(1)
insert char(1)        204/s             --                   -4%               -8%
insert enum('y', 'n') 213/s             4%                    --               -4%
insert tinyint(1)     222/s             9%                    4%                --

It seems that, for the most part (three of the five runs above), enum('y', 'n') is the fastest to insert into.

                       Rate select char(1) select tinyint(1) select enum('y', 'n')
select char(1)        188/s             --               -7%                   -8%
select tinyint(1)     203/s             8%                --                   -1%
select enum('y', 'n') 204/s             9%                1%                    --
                       Rate select char(1) select tinyint(1) select enum('y', 'n')
select char(1)        178/s             --              -25%                  -27%
select tinyint(1)     236/s            33%                --                   -3%
select enum('y', 'n') 244/s            37%                3%                    --
                       Rate select char(1) select tinyint(1) select enum('y', 'n')
select char(1)        183/s             --              -16%                  -21%
select tinyint(1)     219/s            20%                --                   -6%
select enum('y', 'n') 233/s            27%                6%                    --
                       Rate select tinyint(1) select char(1) select enum('y', 'n')
select tinyint(1)     217/s                --            -1%                   -4%
select char(1)        221/s                1%             --                   -2%
select enum('y', 'n') 226/s                4%             2%                    --
                       Rate select char(1) select tinyint(1) select enum('y', 'n')
select char(1)        179/s             --              -14%                  -20%
select tinyint(1)     208/s            17%                --                   -7%
select enum('y', 'n') 224/s            25%                7%                    --

Selecting also seems to be fastest with the enum, in all five runs. Code can be found here
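
The linked benchmark script is not reproduced in this thread. For anyone wanting to repeat a comparison of this kind, one possible setup (an assumption, not the author's actual code) is three tables that differ only in the type of the flag column, with the same statements timed against each from a client-side harness. All table and column names below are hypothetical.

```sql
-- Three otherwise identical tables, one per candidate type.
CREATE TABLE bench_tinyint (
    id   INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    flag TINYINT(1) NOT NULL
);

CREATE TABLE bench_char (
    id   INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    flag CHAR(1) NOT NULL
);

CREATE TABLE bench_enum (
    id   INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    flag ENUM('y', 'n') NOT NULL
);

-- Statements to run repeatedly and time from the client:
INSERT INTO bench_tinyint (flag) VALUES (1);
INSERT INTO bench_char    (flag) VALUES ('y');
INSERT INTO bench_enum    (flag) VALUES ('y');

SELECT COUNT(*) FROM bench_tinyint WHERE flag = 1;
SELECT COUNT(*) FROM bench_char    WHERE flag = 'y';
SELECT COUNT(*) FROM bench_enum    WHERE flag = 'y';
```

As the run-to-run variation in the tables above suggests, several repetitions in varying order are needed before reading much into the differences.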

Morisco answered 8/1, 2010 at 14:1 Comment(6)
+1 @gms8994 Thank you very much for the stats; they give more insight into the speed. Would it be possible for you to let us know if there's any other tool that produces results like the above? Thanks again.Propraetor
@Propraetor There're none that I know of. I wrote this one specifically for use with this question, but you can check the GitHub page linked in the response for it.Morisco
What version of mysql did you use?Elkins
@DaviMenezes based on when this was posted, likely either 5.1 or 5.5 - I wouldn't expect a significant change in the percentages with a newer version, though it's entirely possible that it has.Morisco
Curious to see performance using enum against 'y' and null instead of enum('y', 'n')Sidsida
@JoelKarunungan the code still exists at the link provided; you should be able to pull it down, make whatever modifications necessary, and run it.Morisco
S
11

Using tinyint is more standard practice, and will allow you to more easily check the value of the field.

// Using tinyint 0 and 1, you can do this:
if($row['admin']) {
    // user is admin
}

// Using char y and n, you will have to do this:
if($row['admin'] == 'y') {
    // user is admin
}

I'm not an expert in the inner workings of MySQL, but intuitively it feels that retrieving and sorting integer fields is faster than character fields (I just get a feeling that 'a' > 'z' is more work than 0 > 1). It also feels much more familiar from a computing perspective, in which 0s and 1s are the standard on/off flags. So the storage for integers seems to be better, it feels nicer, and it is easier to use in code logic. 0/1 is the clear winner for me.

You may also note that, to an extent, this is MySQL's official position, as well, from their documentation:

BOOL, BOOLEAN: These types are synonyms for TINYINT(1). A value of zero is considered false. Nonzero values are considered true.

If MySQL goes so far as to equate TINYINT(1) with BOOLEAN, it seems like the way to go.
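
The synonym can be checked directly. A small sketch, assuming a scratch table named `bool_demo` (a hypothetical name):

```sql
CREATE TABLE bool_demo (
    id       INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    is_admin BOOLEAN NOT NULL DEFAULT FALSE   -- created as TINYINT(1)
);

-- The reported definition shows tinyint(1), not a distinct boolean type.
SHOW CREATE TABLE bool_demo;

-- TRUE and FALSE are aliases for 1 and 0.
INSERT INTO bool_demo (is_admin) VALUES (TRUE), (FALSE), (1), (0);

-- A bare column works as a condition: nonzero is true, zero is false.
SELECT id FROM bool_demo WHERE is_admin;
```

Because the column is an ordinary TINYINT, it also accepts other values in the TINYINT range, such as 2. `WHERE is_admin` treats any nonzero value as true, whereas `WHERE is_admin = TRUE` matches only the value 1.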

Sciurine answered 7/1, 2010 at 20:43 Comment(2)
Perhaps it's a good thing to have that sort of check? Let me explain with an example: require_once("./Permissions.php"); ... if( $row['permissions'] === Permissions::ADMIN ) { // user is admin } Not only is this good for readability, but using a static property to reference a value also gives a good compile-time check against typos, and a predictive IDE will help you code quickly. This example gives you multi-level permissions, and I think readability and maintainability are key to developing large-scale projects, so I'm all for that.Endometriosis
@Gary Thanks for your comment, but I am unable to tell if you are advocating the use of 0 and 1 or the non-usage of it. I just feel that your programming practice is different from mine, so please bear with me as I might take a little more time to understand what you are implying.Propraetor
C
4

To know it for sure, you should benchmark it. Or know that it probably will not matter that much in the grander view of the whole project.

Char columns have encodings and collations, and comparing them could involve unnecessary switches between encodings, so my guess is that an int will be faster. For the same reason, I think that updating an index on an int column is also faster. But again, it won't matter much.

CHAR can take up more than one byte, depending on the character set and table options you choose. Some characters can take three bytes to encode, so MySQL sometimes reserves that space, even if you only use y and n.
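
One way to keep a CHAR(1) flag at a single byte, sketched here under the assumption that only plain ASCII letters such as y and n are ever stored, is to pin the column to a single-byte character set. The table name `char_demo` is hypothetical.

```sql
CREATE TABLE char_demo (
    id   INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    -- ascii is a single-byte character set; ascii_bin compares raw bytes,
    -- so no case-insensitive collation rules apply to this column.
    flag CHAR(1) CHARACTER SET ascii COLLATE ascii_bin NOT NULL
);

INSERT INTO char_demo (flag) VALUES ('y'), ('n');

-- MAXLEN is the maximum number of bytes one character can need
-- in each character set known to the server.
SELECT CHARACTER_SET_NAME, MAXLEN
FROM information_schema.CHARACTER_SETS
WHERE CHARACTER_SET_NAME IN ('ascii', 'latin1', 'utf8mb4');
```

With a binary collation, 'y' and 'Y' compare as different values, so the application has to be consistent about case.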

Cankerworm answered 7/1, 2010 at 20:46 Comment(3)
+1 for "But again, it won't matter much." I'm thinking the same thing. The difference is likely negligible.Kofu
@Jan What you say makes sense to me. So if I use enum('n', 'y'), do the encoding switches and the comparison lag still apply? How would it differ when using InnoDB vs MyISAM?Propraetor
@Devner: Yes, since enum columns are defined with an encoding and a collation, I assume this can have a performance impact. I don't know about differences between InnoDB and MyISAM; there is just a note that describes an InnoDB option that can affect char storage [ dev.mysql.com/doc/refman/5.1/en/data-size.html ]Cankerworm
G
3

They're both going to be so close that it doesn't matter. If you feel you have to ask this question on SO, you're over-optimizing. Use whichever one makes the most logical sense.

Gerita answered 7/1, 2010 at 20:48 Comment(0)
H
1

If you specify the types BOOL or BOOLEAN as a column type when creating a table in MySQL, it creates the column type as TINYINT(1). Presumably this is the faster of the two.

Documentation

Also:

We intend to implement full boolean type handling, in accordance with standard SQL, in a future MySQL release.

Homogeny answered 7/1, 2010 at 20:50 Comment(0)
V
1

While my hunch is that an index on a TINYINT would be faster than an index on a CHAR(1) due to the fact that there is no string-handling overhead (collation, whitespace, etc), I don't have any facts to back this up. My guess is that there isn't a significant performance difference that is worth worrying about.

However, because you're using PHP, storing as a TINYINT makes much more sense. Using the 1/0 values is equivalent to using true and false, even when they are returned as strings to PHP, and they can be handled as such. You can simply do an if ($record['field']) with your results as a boolean check, instead of converting between 'y' and 'n' all the time.

Viscera answered 7/1, 2010 at 20:50 Comment(1)
+1 @Zombat That makes sense. I think using numbers would really ease up the processing with PHP code within the app.Propraetor
C
1
TINYINT     1 byte
CHAR(M)     M bytes, 0 <= M <= 255

Is there any difference?

Cavazos answered 7/1, 2010 at 20:54 Comment(0)
