Difference between BYTE and CHAR in column datatypes
Asked Answered
I

5

188

In Oracle, what is the difference between :

CREATE TABLE CLIENT
(
 NAME VARCHAR2(11 BYTE),
 ID_CLIENT NUMBER
)

and

CREATE TABLE CLIENT
(
 NAME VARCHAR2(11 CHAR), -- or even VARCHAR2(11)
 ID_CLIENT NUMBER
)
Impression answered 17/9, 2008 at 9:27 Comment(0)
T
299

Let us assume the database character set is UTF-8, which is the recommended setting in recent versions of Oracle. In this case, some characters take more than 1 byte to store in the database.

If you define the field as VARCHAR2(11 BYTE), Oracle can use up to 11 bytes for storage, but you may not actually be able to store 11 characters in the field, because some of them take more than one byte to store, e.g. non-English characters.

By defining the field as VARCHAR2(11 CHAR) you tell Oracle it can use enough space to store 11 characters, no matter how many bytes it takes to store each one. A single character may require up to 4 bytes.

Tinishatinker answered 17/9, 2008 at 9:34 Comment(9)
Note that character length semantics do not affect the 4000 byte maximum length for a VARCHAR2. Declaring a VARCHAR2(4000 CHAR) will allow fewer than 4000 characters if some of the characters require multiple bytes of storage.Streptothricin
@David Sykes Is this semantically the same with NVARCHAR(11) ?Fruitless
@Fruitless Not as far as I know. I believe that the size parameter in the NVARCHAR type declaration has the meaning as in VARCHAR2. i.e. to ensure enough storage space for 11 characters (not bytes) in the NVARCHAR character set you would say NVARCHAR(11 CHAR). NOTE: I have not actually checked this. I have never used NVARCHAR.Tinishatinker
To further illustrate the difference between the two: four characters of a hex-encoded value (i.e. "0xFF") or three decimal characters (i.e. "255") could be "compressed" when represented as a single byte: 11111111. This could then be useful for bit flags (up to 8 settings), bitwise operations, etc.Tejeda
Note that 1 as an ASCII character (dec. 49) is 1001001 whereas 1 as a bit is 00000001.Tejeda
If you define a variable in PLSQL as VARCHAR2(11 CHAR) how much space will be reserved? Given utf8 It should be 44 bytes because each character can have up to 4 bytes and Oracle has no way of knowing beforehand which characters will be stored in the variable.Ridenhour
@Ridenhour I have not actually tested this, but fully expect it to be the case. It would be weird if it wasn't.Tinishatinker
@DavidSykes shall i know which char takes more than one byte?Penny
@151291 As far as I know this depends on the database character set. If the database is set to UTF-8 then anything but the first 127 characters of the ASCII table takes more than 1 byte.Tinishatinker
P
23

One has exactly space for 11 bytes, the other for exactly 11 characters. Some charsets such as Unicode variants may use more than one byte per char, therefore the 11 byte field might have space for less than 11 chars depending on the encoding.

See also http://www.joelonsoftware.com/articles/Unicode.html

Phantasmagoria answered 17/9, 2008 at 9:31 Comment(0)
C
17

Depending on the system configuration, size of CHAR mesured in BYTES can vary. In your examples:

  1. Limits field to 11 BYTE
  2. Limits field to 11 CHARacters


Conclusion: 1 CHAR is not equal to 1 BYTE.

Calculable answered 17/9, 2008 at 10:13 Comment(0)
V
4

I am not sure since I am not an Oracle user, but I assume that the difference lies when you use multi-byte character sets such as Unicode (UTF-16/32). In this case, 11 Bytes could account for less than 11 characters.

Also those field types might be treated differently in regard to accented characters or case, for example 'binaryField(ete) = "été"' will not match while 'charField(ete) = "été"' might (again not sure about Oracle).

Verney answered 17/9, 2008 at 9:31 Comment(0)
M
1

In simple words when you write NAME VARCHAR2(11 BYTE) then only 11 Byte can be accommodated in that variable.

No matter which characters set you are using, for example, if you are using Unicode (UTF-16) then only half of the size of Name can be accommodated in NAME.

On the other hand, if you write NAME VARCHAR2(11 CHAR) then NAME can accommodate 11 CHAR regardless of their character encoding.

BYTE is the default if you do not specify BYTE or CHAR

So if you write NAME VARCHAR2(4000 BYTE) and use Unicode(UTF-16) character encoding then only 2000 characters can be accommodated in NAME

That means the size limit on the variable is applied in BYTES and it depends on the character encoding that how many characters can be accommodated in that vraible.

Magocsi answered 5/5, 2021 at 9:20 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.