MySQL: Fields length. Does it really matter?

Asked 25/4, 2011 at 19:12 Answered 25/4, 2011 at 19:42

Solved mysql sql database database-abstraction

I'm working with some database abstraction layers and most of them are using attributes like "String" which is VARCHAR 250 or INTEGER which has length of 11 digits. But for example I have something that will be less than 250 characters long. Should I go and make it less? Does it really makes any valuable difference?

Thanks in advance!

Larainelarboard answered 25/4, 2011 at 19:12 Comment(0)

INT length does nothing. All INTs are 4 bytes. The number you can set, is only used for zerofill (and who uses that!?).

VARCHAR length does more. It's the maxlength of the field. VARCHAR is saved so that only the actual data is stored, so the length doesn't mattter. These days, you can have bigger VARCHARs than 255 bytes (being 256^2-1). The difference is the bytes that are used for the field length. VARCHAR(100) and VARCHAR(8) and VARCHAR(255) use 1 byte to save the field length. VARCHAR(1000) uses 2.

Hope that helps =)

edit
I almost always make my VARCHARs 250 long. Actual length should be checked in the app anyway. For bigger fields I use TEXT (and those are stored differently, so can be much much longer).

edit
I don't know how current this is, but it used to help me (understand): http://help.scibit.com/Mascon/masconMySQL_Field_Types.html

Categorize answered 25/4, 2011 at 19:17 Comment(7)

Is that really the only difference? This would imply that VARCHAR(255) is the same performance-wise as VARCHAR(8), is that correct? – Slovenia 25/4, 2011 at 19:19

Yes, that's correct. The difference is the amount you can store: the more you store, the less efficient the table. If it's real data, I often use TEXT for shorter fields too (60+ chars or so), because those aren't stored in the InnoDB blob. – Categorize 25/4, 2011 at 19:22

depens on the values, a VARCHAR(8) containing lots of rows where the value is 8 characters long is slower than a VARCHAR(255) where all are 1 char long ;-) Rudie's answer is correct, though. – Colonialism 25/4, 2011 at 19:23

-1 Actual length should be checked in the app anyway. Actual length should also be checked in the app. The database should strive to enforce correctness and protect itself against bad data. Why not make every column varchar(8000)? Because if you do, someone will try to stuff a novel into that field. – Highpriced 25/4, 2011 at 19:36

@Categorize - Will recind the -1 because you did not imply that the database shouldn't check the length. However, that the field is sized to the actual meaningful values is critical. – Highpriced 25/4, 2011 at 19:37

@Rudie, Actually the first 768 characters are stored in the table, regardless of what type of field is used. For more information read (mysqlperformanceblog.com/2010/02/09/blob-storage-in-innodb). Also, why wouldn't you want the data to be stored in the row. It would make more sense to have there data there, then have to go off to another page to fetch it. – Ecospecies 25/4, 2011 at 19:43

@Ecospecies Because (I thought) InnoDB saves ALL data in 1 file and when that file becomes 2 gig, it's no fun. TEXT's and BLOB's don't belong in that InnoDB file IMO, so they're TEXT/BLOB. Very useful link > thanks – Categorize 25/4, 2011 at 20:42

First, remember that the database is meant to store facts and is designed to protect itself against bad data. Thus, the reason you do not want to allow a user to enter 250 characters for a first name is that a user will put all kinds of data in there that is not a first name. They'll put their whole name, their underwear size, a novel about what they did last summer and so on. Thus, you want to strive to enforce that the data is as correct as possible. It is a mistake to assume that the application is the sole protector against bad data. You want users to tell you that they had a problem stuffing War in Peace into a given column.

Thus, the most important question is, "What is the most appropriate value for the data being stored?" Ideally, you would use an int and a check constraint to ensure that the values have an appropriate range (e.g. greater than zero, less than a billion etc.). Unfortunately, this is one of MySQL's greatest weakness: it does not honor check constraints. That simply means you must implement those integrity checks in triggers which admittedly is more cumbersome.

Will the difference between an int (4 bytes) make an appreciable difference to a tinyint (1 byte)? Obviously, it depends on the amount of data. If you will have no more than 10 rows, the answer is obviously no. If you will have 10 billion rows, the answer is obviously "Yes". However, IMO, this is premature optimization. It is far better to focus on ensuring correctness first.

For text, you should ask whether your data should support Chinese, Japanese or non-ANSI values (i.e., should you use nvarchar or varchar)? Does this value represent a real world code like a currency code, or bank code which has a specific specification?

Highpriced answered 25/4, 2011 at 19:35 Comment(0)

Not so sure in MySQL, but in MS SQL it only makes a difference for sufficiently large databases. Typically, I like to use smaller fields for a) the space saving (it never hurts to practice good habits) and b) for the implied validation (if you know a certain field should never be more than 10 characters, why allow eleven, let alone 250?).

Carreno answered 25/4, 2011 at 19:14 Comment(3)

That is interesting. Could you post a link to the SQL Server manual where it is documented that a difference in the declared max_length makes a difference in the overall size? – Transpolar 25/4, 2011 at 19:49

I'd have to look around to find it, but I'm pretty sure it's in the official MS documentation somewhere... I'm working off memory here, and it's been awhile since I've played intimately with SQL Server. – Carreno 25/4, 2011 at 19:58

I searched the manual, but couldn't find anything related to that. That's why I asked. But thanks anyway – Transpolar 25/4, 2011 at 20:7

I thinks Rudie is wrong, not all INTs are 4 bytes... in MySQL you have:

tinyint = 1 byte, smallint = 2 bytes, mediumint = 3 bytes, int = 4 bytes, bigint = 8 bytes.

I think Rudie refers to the "display with" that is the number you put between parenthesis when you are creating a column, e.g.:

age INT(3)

You're telling to the RDBMS just to SHOW no more than 3 numbers.

And VARCHARs are (variable length charcter string) so if you declare let's say name varchar(5000) and you store a name like "Mario" you only are using 7 bytes (5 for the data and 2 for the length of the value).

Symmetrize answered 25/4, 2011 at 19:25 Comment(4)

TINYITN is not an INT then... INT = 4 bytes. Period. In every language (?), in every database (?) is an INT 4 bytes. Some are signed and others aren't, but an INT is 4 bytes. And a TINYINT is not an INT. Also: if you put the number 800000 in a INT(3) field, all 80000 will be shown. 4 bytes will be available no matter the 'size' of the INT(n). – Categorize 25/4, 2011 at 20:48

tinyINT it is an int, but tiny. the number you put in parenthesis it is not the size, it is just the "display width". – Symmetrize 25/4, 2011 at 21:15

Do not believe me, just read the documentation: dev.mysql.com/doc/refman/5.0/en/numeric-types.html – Symmetrize 25/4, 2011 at 21:18

INTEGER is a word that states the size of the byteset. INT means 4 bytes. A TINYINT is also a non floating number, like INT, but with a different size. It is therefore not an INT. Pf who cares, I give up – Categorize 25/4, 2011 at 22:12

The correct field size serves to limit the bad data that can be put in. For instance suppose you have a phone number field. If you allow 250 characters, you will often end up with things like the following in the phone field (an example not taken at random):

Call the good-looking blonde secretary instead.

So first limiting the length is part of how we enforce data integrity rules. As such it is critical.

Second, there is only so much space on a datapage and while some databases will allow you to create tables where the potential record is longer than the width of the data page, they often will not allow you to actually exceed it when storing the data. This can lead to some very hard to find bugs when suddenly one record can't be saved. I don't know about MySql and whether it does this but I know SQL Server does and it is very hard to figure out what is wrong. So making data the correct size can be critical to preventing bugs.

Corticosterone answered 25/4, 2011 at 19:42 Comment(0)

Recommended topics

Hot tags