Why historically do people use 255 not 256 for database field magnitudes?
Asked Answered
B

12

243

You often see database fields set to have a magnitude of 255 characters, what is the traditional / historic reason why? I assume it's something to do with paging / memory limits, and performance but the distinction between 255 and 256 has always confused me.

varchar(255)

Considering this is a capacity or magnitude, not an indexer, why is 255 preferred over 256? Is a byte reserved for some purpose (terminator or null or something)?

Presumably varchar(0) is a nonsense (has zero capacity)? In which case 2^8 of space should be 256 surely?

Are there other magnitudes that provide performance benefits? For example is varchar(512) less performant than varchar(511) or varchar(510)?

Is this value the same for all relations databases, old and new?

disclaimer - I'm a developer not a DBA, I use field sizes and types that suit my business logic where that is known, but I'd like to know the historic reason for this preference, even if it's no longer relevant (but even more if it still is relevant).

Edit:

Thanks for the answers, there seems to be some concensus that a byte is used to store size, but this doesn't settle the matter definitively in my mind.

If the meta data (string length) is stored in the same contiguous memory/disk, it makes some sense. 1 byte of metadata and 255 bytes of string data, would suit each other very nicely, and fit into 256 contiguous bytes of storage, which presumably is neat and tidy.

But...If the metadata (string length) is stored separately from the actual string data (in a master table perhaps), then to constrain the length of string's data by one byte, just because it's easier to store only a 1 byte integer of metadata seems a bit odd.

In both cases, it would seem to be a subtlety that probably depends on the DB implementation. The practice of using 255 seems pretty widespread, so someone somewhere must have argued a good case for it in the beginning, can anyone remember what that case was/is? Programmers won't adopt any new practice without a reason, and this must have been new once.

Baseless answered 26/2, 2010 at 9:40 Comment(6)
Because character count starts from 0 to N-1. So 256 characters will be declared varchar(255). Unless I'm mistaken.Selfless
Maybe because IT people start to count with 0, not 1 ;) ?Mash
I think it has to do with old school programmers, cant even remember why we did it.Amelia
@Elite Gentleman: nope the number in brackets is the true length... Like in C array declarations: x[256] gives x[0]...x[255].Glindaglinka
@romaintaz - but consider an array which can store 1 item. You declare it something[1] and access it something[0]. The question is why in SQL do we declare the capacity to be 1 byte less than seems logical at first glance.Baseless
Possible duplicate of Is there a good reason I see VARCHAR(255) used so often (as opposed to another length)?Gaughan
L
205

With a maximum length of 255 characters, the DBMS can choose to use a single byte to indicate the length of the data in the field. If the limit were 256 or greater, two bytes would be needed.

A value of length zero is certainly valid for varchar data (unless constrained otherwise). Most systems treat such an empty string as distinct from NULL, but some systems (notably Oracle) treat an empty string identically to NULL. For systems where an empty string is not NULL, an additional bit somewhere in the row would be needed to indicate whether the value should be considered NULL or not.

As you note, this is a historical optimisation and is probably not relevant to most systems today.

Lance answered 26/2, 2010 at 9:44 Comment(5)
Reserving a byte for the length makes sense, but WRT your second paragrph, presumably a /value/ of length zero is valid, but is a /capacity/ of length zero valid?Baseless
@Andrew: I just tried and PostgreSQL rejects varchar(0). It's probably not that useful because the value could only be two things, the empty string or NULL, and so you might as well just use a bit for that.Lance
So is it true to assume that the capacity metadata is stored in the same contiguous block as the data itself, and therefore there is an advantage to the DB to keep the total of those two things (data and metadata) within one page (presumably 256 bytes)?Baseless
@Andrew: That's an assumption that may or may not be true, depending on the implementation details of the DBMS in question. Page sizes are typically much larger than 256 bytes. As I mentioned, this sort of optimisation is sometimes important (eg. if you're storing billions of small rows), but most of the time it's not worth worrying about.Lance
The importance in disk space (and index space) is not because 256 may fit in a page but because 1 byte vs 2 bytes (for millions/billions/trillions rows) makes a big difference.Harts
G
41

255 was the varchar limit in mySQL4 and earlier.

Also 255 chars + Null terminator = 256

Or 1 byte length descriptor gives a possible range 0-255 chars

Glindaglinka answered 26/2, 2010 at 9:47 Comment(2)
And reading into char foo[256] is important because memory management likes powers of 2. see: #3190646 Allocating char foo[257] will either fragment memory or take up 512 bytes.Coelenteron
Doesn't varchar store the length of the string, and therefore doesn't need a null terminator?Baskin
E
23

255 is the largest numerical value that can be stored in a single-byte unsigned integer (assuming 8-bit bytes) - hence, applications which store the length of a string for some purpose would prefer 255 over 256 because it means they only have to allocate 1 byte for the "size" variable.

Epexegesis answered 26/2, 2010 at 9:43 Comment(0)
R
20

From MySQL Manual:

Data Type:
VARCHAR(M), VARBINARY(M)

Storage Required:
L + 1 bytes if column values require 0 – 255 bytes, L + 2 bytes if values may require more than 255 bytes

Understand and make choice.

Roxi answered 5/7, 2012 at 4:6 Comment(1)
Yes, but M represents the declared column length in characters for nonbinary string types and bytes for binary string types. L represents the actual length in bytes of a given string value. dev.mysql.com/doc/refman/5.7/en/storage-requirements.htmlMcmahan
L
17

255 is the maximum value of a 8 bit integer : 11111111 = 255.

Limoli answered 26/2, 2010 at 9:43 Comment(0)
P
14

Are there other magnitudes that provide performance benefits? For example is varchar(512) less performant than varchar(511) or varchar(510)?

Recollected the fundamentals of the bits/bytes storage, it requires one byte to store integers below 256 and two bytes for any integer between 256 and 65536.
Hence, it requires same space (two bytes) to store 511 or 512 or for that matter 65535....
Thus it is clear that the this argument mentioned in the discussion above is N/A for varchar(512) or varchar(511).

Piscary answered 3/7, 2014 at 9:9 Comment(0)
T
7

A maximum length of 255 allows the database engine to use only 1 byte to store the length of each field. You are correct that 1 byte of space allows you to store 2^8=256 distinct values for the length of the string.

But if you allow the field to store zero-length text strings, you need to be able to store zero in the length. So you can allow 256 distinct length values, starting at zero: 0-255.

Transcendence answered 26/2, 2010 at 10:11 Comment(0)
P
7

It used to be that all strings required a NUL terminator, or "backslash-zero". Updated databases don't have that. It was "255 characters of text" with a "\0" added automatically at the end so the system knew where the string ended. If you said VARCHAR(256), it would end up being 257 and then you'd be in the next register for one character. Wasteful. That's why everything was VARCHAR(255) and VARCHAR(31). Out of habit the 255 seems to have stuck around but the 31's became 32's and the 511's became 512's. That part is weird. It's hard to make myself write VARCHAR(256).

Phyfe answered 16/7, 2015 at 19:37 Comment(1)
Not sure why this answer speaks in the past tense. Even almost 9 years later, MySQL still uses an extra byte if the value of a varchar column exceeds 255 bytes: dev.mysql.com/doc/refman/8.0/en/storage-requirements.htmlCultus
O
6

Often varchars are implemented as pascal strings: holding the actual length in the byte #0. The length was therefore bound to 255. (Value of a byte varies from 0 to 255.)

Orelee answered 26/2, 2010 at 9:45 Comment(0)
F
6

8 bits unsigned = 256 bytes

255 characters + byte 0 for length

Fierce answered 26/2, 2010 at 9:53 Comment(0)
P
0

I think this might answer your question. Looks like it was the max limit of varchar in earlier systems. I took it off another stackoverflow question.

It's hard to know what the longest postal address is, of course, which is why many people choose a long VARCHAR that is certainly longer than any address. And 255 is customary because it may have been the maximum length of a VARCHAR in some databases in the dawn of time (as well as PostgreSQL until more recently).

Are there disadvantages to using a generic varchar(255) for all text-based fields?

Pigmy answered 11/3, 2013 at 16:19 Comment(0)
R
0

Data is saved in memory in binary system and 0 and 1 are binary digits. Largest binary number that can fit in 1 byte (8-bits) is 11111111 which converts to decimal 255.

Rectus answered 26/9, 2013 at 19:49 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.