MySQL: Why use VARCHAR(20) instead of VARCHAR(255)? [duplicate]
G

6

38

Possible Duplicate:
Are there disadvantages to using a generic varchar(255) for all text-based fields?

In MySQL you can choose a length for the VARCHAR field type. Possible values are 1-255.

But what difference does it make whether you use VARCHAR(255), the maximum, or VARCHAR(20)? As far as I know, the size of an entry depends only on the actual length of the inserted string.

size (bytes) = length+1

So if you have the word "Example" in a VARCHAR(255) field, it would have 8 bytes. If you have it in a VARCHAR(20) field, it would have 8 bytes, too. What is the difference?

I hope you can help me. Thanks in advance!

Groyne answered 11/8, 2009 at 18:34 Comment(0)
F
36

Check out: Reference for Varchar

In short, there isn't much difference unless you go over a size of 255 in your VARCHAR, which will require another byte for the length prefix.
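
The length-prefix rule can be sketched in a few lines of Python. This is only an illustration of the formula in the MySQL manual (data bytes plus a 1-byte prefix when the column can hold at most 255 bytes, otherwise a 2-byte prefix). The function name and parameters are made up for this example, and individual storage engines may differ in detail.

```python
def varchar_storage_bytes(value: str, declared_chars: int,
                          codec: str = "latin1", max_bytes_per_char: int = 1) -> int:
    """Approximate bytes needed to store `value` in VARCHAR(declared_chars).

    `codec` is a Python codec name standing in for the column character set
    (an assumption of this sketch), and `max_bytes_per_char` is the widest
    character that character set can produce.
    """
    if len(value) > declared_chars:
        raise ValueError("value is longer than the declared column length")
    data_bytes = len(value.encode(codec))
    # The prefix size depends on the maximum bytes the column could ever hold,
    # not on the value actually stored.
    max_column_bytes = declared_chars * max_bytes_per_char
    prefix_bytes = 1 if max_column_bytes <= 255 else 2
    return data_bytes + prefix_bytes


print(varchar_storage_bytes("Example", 20))    # 8
print(varchar_storage_bytes("Example", 255))   # 8
print(varchar_storage_bytes("Example", 256))   # 9
# A 4-byte character set pushes VARCHAR(255) past 255 bytes, so the prefix grows:
print(varchar_storage_bytes("Example", 255, codec="utf-8", max_bytes_per_char=4))  # 9
```

So for a single-byte character set, "Example" costs the same 8 bytes in VARCHAR(20) and VARCHAR(255), which matches the question's own arithmetic.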

The length indicates more of a constraint on the data stored in the column than anything else. This inherently constrains the MAXIMUM storage size for the column as well. IMHO, the length should make sense with respect to the data. If you're storing a Social Security number, it makes no sense to set the length to 128, even though it doesn't cost you anything in storage if all you actually store is an SSN.

Ferraro answered 11/8, 2009 at 18:38 Comment(3)
what does the (20) mean? obviously, it is the length, but the length of what?Simulcast
@Simulcast (20) is the length of the string you will be inserting into the table. Suppose a social security number has 9 digits; its length is therefore 9, so it fits easily in a column of datatype VARCHAR(20), but it will cause an error in VARCHAR(2)Fimbria
@ShubhamShaw to be more precise, it's not "the length of string" but rather "the max allowed length of string"Quite
S
23

There are many valid reasons for choosing a value smaller than the maximum that are not related to performance. Setting a size helps indicate the type of data you are storing and can also act as a last-gasp form of validation.

For instance, if you are storing a UK postcode then you only need 8 characters. Setting this limit helps make clear the type of data you are storing. If you chose 255 characters it would just confuse matters.

Samoyed answered 11/8, 2009 at 18:50 Comment(6)
+1 for emphasizing clarity and intelligibility. I would certainly question the use of a buf[1024], a new DynamicBuf(1024) or a VARCHAR(255) to store, for example, a single, dotted-quad IP address. Did the coder know what he was doing?Aeneid
+1 And if you decide on VARCHAR(8) for UK postcodes use CHAR(8) for performance :)Embry
If you want to store an 8-character postcode, you should use CHAR(8) instead. It is better to avoid VARCHAR columns in a table, since they force a variable row length and slower seeks in the table. However, if you cannot make all columns fixed-length, it does not matter.Randa
@JosefKufner UK postcodes aren't always 8 characters long; they vary in length. 8 is the CURRENT longest, but it can vary between 6 and 8 characters. Thus it should probably use VARCHARVins
CHAR(8) will pad shorter strings with spaces and then strip these spaces when retrieving the value. CHAR does not need to store the length of the string: VARCHAR(8) requires up to 9 bytes, but CHAR(8) only 8 bytes. Therefore, CHAR(8) is still more efficient if most of the postcodes are 8 characters long.Randa
CHARS also usually perform better than VARCHARs, too - dba.stackexchange.com/questions/424/…Samoyed
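
To put numbers on the CHAR(8) versus VARCHAR(8) comparison in the comments above, here is a rough Python sketch. It assumes a single-byte character set and a 1-byte VARCHAR length prefix. The helper names and the sample postcodes are illustrative only.

```python
def char_bytes(declared_chars: int) -> int:
    """CHAR(n) is space-padded to n characters: n bytes in a single-byte charset."""
    return declared_chars


def varchar_bytes(value: str) -> int:
    """VARCHAR stores the actual bytes plus a 1-byte length prefix
    (valid while the column can hold at most 255 bytes)."""
    return len(value.encode("latin1")) + 1


# Sample postcodes chosen for illustration only (6, 7, 7 and 8 characters).
postcodes = ["M1 1AE", "B33 8TH", "CR2 6XH", "DN55 1PT"]

char_total = sum(char_bytes(8) for _ in postcodes)
varchar_total = sum(varchar_bytes(p) for p in postcodes)
print(char_total, varchar_total)  # 32 32
```

With this particular mix the two come out equal. Every full-length value costs VARCHAR one extra byte, and every shorter value saves it at least one, so which type is smaller depends on the length distribution of the real data.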
P
5

I don't know about MySQL, but SQL Server will let you define fields such that the total number of bytes declared is greater than the total number of bytes that can actually be stored in a record. This is a bad thing. Sooner or later you will get a row where the limit is reached and you cannot insert the data.

It is far better to design your database structure to consider row size limits.
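
As a rough illustration of checking declared widths against a row-size limit, here is a Python sketch using MySQL's documented 65,535-byte row limit. It is a simplification: it only counts VARCHAR columns and ignores NULL flags, other column types, and any lower limit a storage engine may impose. The function names are hypothetical.

```python
MYSQL_MAX_ROW_BYTES = 65_535  # row-size limit stated in the MySQL manual


def declared_row_bytes(varchar_lengths, bytes_per_char=1):
    """Worst-case bytes for a row made only of VARCHAR columns."""
    total = 0
    for max_chars in varchar_lengths:
        max_bytes = max_chars * bytes_per_char
        # 1-byte length prefix up to 255 bytes, 2-byte prefix beyond that.
        total += max_bytes + (1 if max_bytes <= 255 else 2)
    return total


def fits_in_row(varchar_lengths, bytes_per_char=1, limit=MYSQL_MAX_ROW_BYTES):
    """True if the worst-case row stays within `limit` bytes."""
    return declared_row_bytes(varchar_lengths, bytes_per_char) <= limit


print(declared_row_bytes([255] * 10), fits_in_row([255] * 10))
# 2560 True
print(declared_row_bytes([10000] * 6 + [6000]), fits_in_row([10000] * 6 + [6000]))
# 66014 False
```

Running a check like this at design time is one way to notice that generous declared lengths add up, even if the values actually stored are short.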

Additionally, yes, you do not want people to put 200 characters in a field where the maximum length should be 10. If they do, it is almost always bad data.

You say, well I can limit that at the application level. But data does not get into the database just from one application. Sometimes multiple applications use it, sometimes data is imported and sometimes it is fixed manually from the query window (update all the records to add 10% to the price for instance). If any of these other sources of data don't know about the rules you put in your application, you will have bad, useless data in your database. Data integrity must be enforced at the database level (which doesn't stop you from also checking before you try to enter data) or you have no integrity. Plus it has been my experience that people who are too lazy to design their database are often also too lazy to actually put the limits into the application and there is no data integrity check at all.

They have a word for databases with no data integrity - useless.

Patagonia answered 12/8, 2009 at 20:10 Comment(0)
H
2

There is a semantic difference (and I believe that's the only difference): if you try to put 30 non-space characters into varchar(20), it will produce an error, whereas it will succeed for varchar(255). So it is primarily an additional constraint.

Halsted answered 11/8, 2009 at 18:50 Comment(4)
But isn't it clear that 30 characters don't fit into a 20-byte field?Groyne
As you say, the storage representation really doesn't care about the length: a field declared varchar(20) could just fine store 30 characters, as far as disk representation goes - I thought this observation was the core of your question (why not always use varchar(255)?) Now, I'm telling you the reason why you set varchar(20): because you want the error that you get if you accidentally try to put in more than 20 characters.Nesline
So it's only for validation issues, correct?Groyne
It also depends on your MySQL settings. A string longer than the maximum length of a varchar column doesn't produce an error if strict SQL mode is not enabled. In that case it just produces a warning and the string is truncated to the maximum length. If you turn strict mode on, you actually get an error message when trying this.Enchiridion
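
The strict versus non-strict behaviour described in the last comment can be modelled with a small Python function. This is a toy model of the documented rules, not MySQL itself. The function name, return shape, and message strings are invented for the example.

```python
def store_in_varchar(value: str, max_chars: int, strict: bool):
    """Return (stored_value, warning) or raise ValueError.

    Models the documented rules: over-long values are truncated with a
    warning in non-strict mode and rejected in strict mode, except that
    excess trailing spaces are truncated with a warning in either mode.
    """
    if len(value) <= max_chars:
        return value, None
    excess = value[max_chars:]
    if strict and excess.strip(" ") != "":
        raise ValueError(f"Data too long for column (max {max_chars} characters)")
    return value[:max_chars], "Data truncated"


# Non-strict mode: silently shortened to 20 characters, with a warning.
print(store_in_varchar("x" * 30, 20, strict=False))

# Strict mode: the same insert is rejected.
try:
    store_in_varchar("x" * 30, 20, strict=True)
except ValueError as err:
    print("rejected:", err)
```

This is why the "30 non-space characters" wording in the answer matters: only the truncation of non-space characters turns into an error under strict mode.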
G
1

Well, if you want to allow for a larger entry, or limit the entry size perhaps.

For example, you may have first_name as a VARCHAR(20), but perhaps street_address as a VARCHAR(50), since 20 may not be enough space. At the same time, you may want to control how large that value can get.

In other words, you have set a ceiling of how large a particular value can be, in theory to prevent the table (and potentially the index/index entries) from getting too large.

You could also just use CHAR, which is fixed-width, but unlike VARCHAR, whose stored values can be shorter than the declared length, CHAR pads the values (although this makes for quicker SQL access).

Gorget answered 11/8, 2009 at 18:39 Comment(5)
But why should I set the length to 20 at all? If there's no difference, I can simply set it to 255 for all fields, can't I?Groyne
see my edit - i elaborate a bit more.Gorget
Thank you. So it doesn't make any difference concerning storage and performance, right? But it can be useful, though, e.g. if you want to cut strings to prevent overly long ones. But you could also achieve this beforehand by using substr() in PHP or a similar function in another language!?Groyne
The difference has to do with data integrity. Real-world example: storing computer names of Windows machines in your database. A computer running Windows cannot have a computer name longer than 63 bytes. If you define a field that contains computer names as varchar(255), you can input names that are invalid and may cause errors if you try to use these names to access computers. Defining varchar(63) makes MySQL reject (or truncate) inserts or updates that exceed 63 characters (technet.microsoft.com/en-us/library/cc757496(WS.10).aspx for more on computer names in Windows 2003)Legpull
You can't always ensure other programmers will know not to enter more than 63 characters. By setting this on the field, you proactively prevent this from ever being an issue in the future.Legpull
S
1

From a database perspective, performance-wise, I do not believe there is going to be a difference.

However, I think a lot of the decision on the length to use comes down to what you are trying to accomplish and documenting the system to accept just the data that it needs.

Smaragdite answered 11/8, 2009 at 18:39 Comment(2)
"Note that using CHAR will only speed up your access if the whole record is fixed size. That is, if you use any variable size object, you might as well make all of them variable size. You gain no speed by using a CHAR in a table that also contains a VARCHAR." linkEnchiridion
However, you do get an increase in size, as a varchar uses 1 more byte than the char equivalent if both are always going to be filled to their maximum length, because the varchar uses 1 byte to declare the size of the following char sequenceBrazier
