Why does a byte only have 0 to 255?
Z

9

41

Why does a byte only range from 0 to 255?

Zwinglian answered 13/2, 2011 at 19:55 Comment(2)
I assume the core of this question is why does it go to 255 instead of 256, with the answer being that it does hold 256 values, it just starts at zeroWootten
If each bit can only be 1 or 0, then eight bits can take on 256 unique patterns. For example: [0,0,0,0,0,0,0,1] represents 1, [1,1,1,1,1,1,1,1] represents 255, and [0,0,0,0,0,0,0,0] represents 0. So the values run from 0 to 255, which is 256 values in total.Leslileslie
A
54

Strictly speaking, the term "byte" can actually refer to a unit with other than 256 values. It's just that that's the almost universal size. From Wikipedia:

Historically, a byte was the number of bits used to encode a single character of text in a computer and it is for this reason the basic addressable element in many computer architectures.

The size of the byte has historically been hardware dependent and no definitive standards exist that mandate the size. The de facto standard of eight bits is a convenient power of two permitting the values 0 through 255 for one byte. Many types of applications use variables representable in eight or fewer bits, and processor designers optimize for this common usage. The popularity of major commercial computing architectures have aided in the ubiquitous acceptance of the 8-bit size. The term octet was defined to explicitly denote a sequence of 8 bits because of the ambiguity associated with the term byte.

Ironically, these days the size of "a single character" is no longer considered a single byte in most cases... most commonly, the idea of a "character" is associated with Unicode, where characters can be represented in a number of different formats, but are typically either 16 bits or 32.

It would be amusing for a system which used UCS-4/UTF-32 (the direct 32-bit representation of Unicode) to designate 32 bits as a byte. The confusion caused would be spectacular.

However, assuming we take "byte" as synonymous with "octet", there are eight independent bits, each of which can be either on or off, true or false, 1 or 0, however you wish to think of it. That leads to 256 possible values, which are typically numbered 0 to 255. (That's not always the case though. For example, the designers of Java unfortunately decided to treat bytes as signed integers in the range -128 to 127.)
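To make the counting concrete, here is a minimal sketch that enumerates every way of setting eight independent bits and reads each pattern as an unsigned number. The variable names are only illustrative.

```python
from itertools import product

# Every way of setting eight independent bits, each either 0 or 1.
patterns = list(product((0, 1), repeat=8))

# Read each pattern as an unsigned binary number, most significant bit first.
values = sorted(int("".join(map(str, bits)), 2) for bits in patterns)

print(len(patterns))          # how many distinct patterns exist
print(values[0], values[-1])  # smallest and largest unsigned value
```

The count is 256 either way; whether you label those patterns 0 to 255 or -128 to 127 is purely a matter of interpretation.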

Aarika answered 13/2, 2011 at 19:59 Comment(12)
Too bad C chose to use char for the byte type, which now means that a char is not a character.Clink
@Jon: I should hardly say that Unicode (a 21‐bit character set) is typically represented by 16 or 32 bits! That’s an extremely Java/Microsoft‐centric point of view! First of all, nothing but stupid old UCS‐2 is only 16 bits. And while it is true that UTF‐16 serializes to either 16 or 32 bits, far and beyond the most common encoding scheme bar none for Unicode text is certainly UTF‐8. Anyone thinking about the ‘size of a character’ has stopped thinking about abstract size‐independent characters, and is on a perilous path at best.Gagne
Also, it’s not “a unit with more than 256 values”, but rather one with other than 256 values. That’s because there were (and sometimes still are) a lot of machines whose bytes held fewer than 8 bits, not more.Gagne
@dan04: That’s no worse than Java, whose char and even Character data type cannot hold a character. That’s because they screwed up the notion of an abstract character, confusing high-level characters with low-level serialization schemes. Then to add insult to injury, Java also cursed people with the ugliest of all possible serialized representations to be forever conscious of or be plagued by error. It’s a real mess!Gagne
@tchrist: Changed "more" to "other". As for the UTF-16/UTF-32/UTF-8 issue, my point is that even UTF-8 is still representing up to a 32-bit (31-bit?) number (albeit only 21 bits being used at the moment). I was thinking of Unicode as a coded character set (initially 16-bit, now 21 or 31 bit) rather than in terms of a character encoding form.Aarika
@tchrist: Indeed. As of Java 5, int is the new char. There was recently an extensive discussion about this on one of the Scala mailinglists, where someone complained about the fact that Scala's String is identical to Java's String, and thus still maintains all those mistakes in a language that was specifically designed as a "better Java". The whole thread is a great read, even if you don't care about Scala and/or Java.Deuteronomy
@Jörg: If you could please post a link to that discussion thread, or simply mail me it, I’d indeed be interested in reading through it. Thanks.Gagne
@tchrist: Support for Ropes in Scala. Watch out for posts by a guy named Jim Balter, he's the one with the strongest opinions but also the most knowledge.Deuteronomy
Jörg: Thanks very much; it was a good read! Jim Balter’s saying stuff I’ve been ranting on for a good while. Java’s terrible Unicode support, especially but not only in its seriously deficient regexes, has made me abandon Java for text processing. Too many weeks lost debugging buggy internals that aren’t my fault. I agree that Python has messed this up, too. I’ve gone back to Perl, which has a clean abstract character model with excellent UCD integration: names, graphemes, properties, normalization, collation, etc. Your mailing list seems universally unaware of how easy this stuff is in Perl.Gagne
@Jörg "where someone complained about the fact that Scala's String is identical to Java's String" -- that was not the complaint, although I agree that it's a great read. :-) Scala is actually considerably worse in re Unicode than Java, because Java now has a Unicode API that allows correct processing, as horrible as it is to use ... whereas Scala's functional/iterable approach to Strings guarantees that all String handling that deals with individual characters is broken. "strongest opinions but also the most knowledge" -- if you know 1+1=2, your "opinion" that 1+1=2 will be pretty strong. :-)Thermomotor
@Jim: No problem. There aren’t enough of us who recognize, understand, and appreciate the deep troubles that derive from fixating on physical bytewise encodings instead of logical integer code points, and UTF‑16 just makes all these worse. I’ve been working with the JDK7 folks on squaring up the j.u.regex stuff here in this thread. Send me mail if you’re interested in discussing this. Oh drat! Looks like Groovy has the same bug as Scala WRT “characters ≠ characters”.Gagne
not sure why you're on a tangent about character sets. completely irrelevant.Forceful
B
30

Because a byte, by its standard definition, is 8 bits which can represent 256 values (0 through 255).

Buddybuderus answered 13/2, 2011 at 19:56 Comment(4)
Oh! Wait. Jon Skeet is here. May be not. :-)Indictment
I've previously gotten downvoted for making that assumption: https://mcmap.net/q/365961/-how-can-i-perform-multiplication-without-the-39-39-operator/…Kuibyshev
@Andrew: Such is the fickleness of SO.Metathesis
It's commonly accepted that a byte is 8 bits, it most certainly is not the "standard definition". Architectures have existed where bytes are otherwise.Forceful
G
23

Byte ≠ Octet

Why does a byte only range from 0 to 255?

It doesn’t.

An octet has 8 bits, thus allowing for 2^8 = 256 possibilities. A byte is ill‐defined. One should not equate the two terms, as they are not completely interchangeable. Also, wicked programming languages that support only signed characters (ʏᴏᴜ ᴋɴᴏw ᴡʜᴏ ʏᴏᴜ ᴀʀᴇ﹗) can only represent the values −128 to 127, not 0 to 255.

Big Iron takes a long time to rust.

Most but not all modern machines have 8‑bit bytes, but that is a relatively recent phenomenon. It certainly has not always been that way. Many very early computers had 4‑bit bytes, and 6‑bit bytes were once common even comparatively recently. Both of those types of bytes hold rather fewer than 256 values.

Those 6‑bit bytes could be quite convenient, since with a word size of 36 bits, six such bytes fit cleanly into one of those 36‑bit words without any jiggering. That made it very useful for holding Fieldata, used by the very popular Sperry ᴜɴɪᴠᴀᴄ computers. You can only fit 4 ᴀsᴄɪɪ characters into a 36‑bit word, not 6 as with Fieldata. We had an 1100 series at the computing center when I was an undergraduate, but this remains true even with the modern 2200 series.
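As a rough illustration of why six 6‑bit bytes sit so neatly in a 36‑bit word, here is a small sketch. The function names are made up for this example, and the codes are arbitrary 6‑bit numbers, not real Fieldata codes.

```python
BYTE_BITS = 6
WORD_BITS = 36

def pack_word(codes):
    """Pack six 6-bit codes into one 36-bit integer, first code in the high bits."""
    if len(codes) != WORD_BITS // BYTE_BITS:
        raise ValueError("expected exactly six codes")
    word = 0
    for code in codes:
        if not 0 <= code < (1 << BYTE_BITS):
            raise ValueError("code does not fit in six bits")
        word = (word << BYTE_BITS) | code
    return word

def unpack_word(word):
    """Split a 36-bit integer back into its six 6-bit codes."""
    mask = (1 << BYTE_BITS) - 1
    shifts = range(WORD_BITS - BYTE_BITS, -1, -BYTE_BITS)  # 30, 24, ..., 0
    return [(word >> shift) & mask for shift in shifts]

codes = [1, 2, 3, 4, 5, 63]  # arbitrary sample values
word = pack_word(codes)
print(unpack_word(word) == codes)
```

Six times six is exactly 36, so no bits are wasted and no code straddles a word boundary.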

Enter ASCII

ᴀsᴄɪɪ — which was and is only a 7‑ not an 8‑bit code — paved the way for breaking out of that world. The importance of the ɪʙᴍ 360, which had 8‑bit bytes whether they held ᴀsᴄɪɪ or not, should not be understated.

Nevertheless, many machines long supported ᴅᴇᴄ’s Radix‑50. This was a 40‑character repertoire wherein three of its characters could be efficiently packed into a single 16‑bit word under two distinct encoding schemes. I used plenty of ᴅᴇᴄ ᴘᴅᴘ‑11s and Vaxen during my university days, and Rad‑50 was simply a fact of life, a reality that had to be accommodated.
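The packing works because 40^3 = 64000, which is less than 2^16 = 65536. A hedged sketch follows; the character table is my assumption of the PDP‑11 style ordering (space, A to Z, `$`, `.`, an unused slot shown here as `%`, then the digits), so treat the exact codes as illustrative. The function names are hypothetical.

```python
# Assumed 40-character table; a character's index is its code.
RAD50_CHARS = " ABCDEFGHIJKLMNOPQRSTUVWXYZ$.%0123456789"

def rad50_encode(text):
    """Pack exactly three characters into one integer below 40**3."""
    if len(text) != 3:
        raise ValueError("expected exactly three characters")
    value = 0
    for ch in text:
        # str.index raises ValueError for characters outside the table.
        value = value * 40 + RAD50_CHARS.index(ch)
    return value

def rad50_decode(value):
    """Recover the three characters from a packed value."""
    chars = []
    for _ in range(3):
        value, code = divmod(value, 40)
        chars.append(RAD50_CHARS[code])
    return "".join(reversed(chars))

packed = rad50_encode("ABC")
print(packed, packed < 2**16, rad50_decode(packed))
```

Treating the three characters as digits of a base‑40 number is what lets them fit where three 6‑bit codes (18 bits) would not.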

Gagne answered 13/2, 2011 at 20:29 Comment(11)
You can fit 5 ASCII characters into a 36-bit word, if you can figure out what to do with the 1 bit left over.Clink
@Dan04: Nice. :) I meant without packing, but of course you are correct. It really sucks on machine where you take a hit on non-word addressable stuff.Gagne
And while your answer is technically correct, nowadays 6-bit and 9-bit bytes are used more for "Byte ≠ Octet" pedantry than they are for actual programming.Clink
@dan04: Are you saying that technical correctness should count for nothing in a technical forum? What would you prefer in its stead?Gagne
"The $LANGUAGE standard doesn't precisely define $TERM, but nearly all implementations use $DE_FACTO_STANDARD, which you can safely assume unless you're writing for $OBSCURE_PLATFORM."Clink
@dan04: I wouldn't be surprised if you had a processor with non-8-bit bytes in your pocket right now. The DSPs that are used in the radio portions of modern mobile phones are often pretty weird.Deuteronomy
Personally, the way I see it is that a language isn't defined by an authority - it's defined by "the masses". If the majority of people say that a byte has eight bits, a byte has 8 bits. If the majority of people say that a hacker is someone who bypasses computer security systems, that's what a hacker is. Gay doesn't mean happy, etc etc etc. Language moves on, whatever the "thou shalt speak according to my rulebook" types say.Madelinemadella
@Steve314: Ever served on a standards committee? Understand the strict requirements of normative language in designing technical specifications?Gagne
@tchist - the whole of programming is not in your standards committee. And a standards committee can simply say "for the purposes of this standard, a byte is <whatever> bits". Many definitions in many standards differ from those in wider use, by being more specialised, or often due to the simple fact that the committee ran out of more appropriate words to use.Madelinemadella
@Steve314: I see that the answer to my questions is therefore “no”. If you have some credentials to present that you believe somehow give you the right to talk down to me about the English language, I’d certainly like to see those. I have served on standards committees, and have several books published under my own name. My current work revolves around natural language processing and computational linguistics, which are of course of the descriptive variety. So don’t teach your grandma to suck eggs.Gagne
Imo, this answer is better than Skeet's; it's more succinct and to the point.Lozenge
H
10

A byte has 8 bits (eight 1s or 0s), for example 01000111 = 71.

Each bit represents a place value: 1, 2, 4, 8, 16, 32, 64, 128, read from right to left.

Example:

128  64  32  16   8   4   2   1
  0   1   0   0   0   1   1   1  = 71
  1   1   1   1   1   1   1   1  = max 255
  0   0   0   0   0   0   0   0  = min 0

Using binary 1s and 0s and only 8 bits (1 byte), each place value can be counted at most once (1 × 128, 1 × 64, 1 × 32, etc.), giving a maximum total of 255 and a minimum of 0.
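The same place-value sum can be sketched in a few lines of Python. The helper name `byte_value` is just for illustration.

```python
def byte_value(bits):
    """Sum the place values of an 8-character string of '0'/'1' (leftmost is 128)."""
    if len(bits) != 8 or set(bits) - {"0", "1"}:
        raise ValueError("expected exactly eight binary digits")
    place_values = [128, 64, 32, 16, 8, 4, 2, 1]
    return sum(pv for pv, bit in zip(place_values, bits) if bit == "1")

for pattern in ("01000111", "11111111", "00000000"):
    print(pattern, byte_value(pattern))
```

Python's built-in `int(pattern, 2)` gives the same result; the explicit sum just mirrors the table.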

Headwater answered 15/12, 2014 at 6:58 Comment(0)
R
7

You are wrong! A byte ranges from 0 to 63 or from 0 to 99!

Do you believe in God? God said in the Holy Bible.

The basic unit of information is a byte. Each byte contains an unspecified amount of information, but it must be capable of holding at least 64 distinct values. That is, we know that any number between 0 and 63, inclusive, can be contained in one byte. Furthermore, each byte contains at most 100 distinct values. On a binary computer a byte must therefore be composed of six bits; on a decimal computer we have two digits per byte.* - The Art of Computer Programming, Volume 1, written by Donald Knuth.

And...

* Since 1975 or so, the word "byte" has come to mean a sequence of precisely eight binary digits, capable of representing the numbers 0 to 255. Real-world bytes are therefore larger than the bytes of the hypothetical MIX machine; indeed, MIX's old-style bytes are just barely bigger than nybbles. When we speak of bytes in connection with MIX we shall confine ourselves to the former sense of the word, harking back to the days when bytes were not yet standardized. - The Art of Computer Programming, Volume 1, written by Donald Knuth.

:-)

Resolute answered 15/2, 2011 at 2:53 Comment(4)
Knuth's first statement applies only to MIX machine bytes -- a MIX machine can be implemented on either a binary computer, in which case a byte holds 0 to 63, or a decimal computer, in which case a byte holds 0 to 99. His footnote makes clear that the term "byte" is not generally limited to that, so it is your statement that is wrong.Thermomotor
@Resolute I obviously did read your second quotation, since that's "His footnote". The point is that your first quote is ripped out of context -- it refers to MIX bytes, not bytes generally. Knuth isn't so silly as to make the claim that "A byte ranges from 0 to 63 or from 0 to 99!" as you did. The fact is that your first quote appears under "Description of MIX", which isn't what the OP asked about, so your answer is wrong, like I said.Thermomotor
@Jim Balter: Oh, I did read my second quotation, too. The point is that you have a dull sense of humor.Resolute
@RedPain: I don't believe god! :PCelenacelene
N
6

A byte has only 8 bits. A bit is a binary digit. So a byte can hold 2^8 = 256 different numbers, ranging from 0 to 2^8 - 1 = 255.

It's the same as asking why a 3 digit decimal number can represent values 0 through 999, which is answered in the same manner (10^3 - 1).
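As a quick check of that analogy, a one-line helper (the name is hypothetical) gives the largest value for any base and digit count:

```python
def largest_value(base, digits):
    """Largest number representable with `digits` digits in the given base."""
    return base ** digits - 1

print(largest_value(10, 3))  # three decimal digits
print(largest_value(2, 8))   # eight binary digits, i.e. an 8-bit byte
print(largest_value(2, 6))   # a 6-bit byte
```

The number of distinct values is always one more than the largest value, because counting starts at zero.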

Originally bytes weren't always 8 bits though. They represented 'a couple' of bits that could be 6, 7 or 9 bits as well. That was later standardized and it made sense to make those units a power of two, due to the binary nature of computing. Hence came the nibble (4 bits, or half a byte) and the 8 bit byte.

[edit] That is also why octal and hexadecimal numbering became popular. One octal digit represents 3 bits and one hexadecimal digit represents 4 bits. So a two-digit hexadecimal number can represent exactly one byte. It makes a lot more sense to have numbers from 0 to 0xFF than from 0 to 255. :)
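A small sketch of how the same byte values look in each notation, using Python's standard format specifiers (the helper name `notations` is illustrative):

```python
def notations(value):
    """Return the binary, octal and hexadecimal spellings of one 8-bit value."""
    if not 0 <= value <= 255:
        raise ValueError("value does not fit in an 8-bit byte")
    return format(value, "08b"), format(value, "03o"), format(value, "02X")

for value in (0, 71, 255):
    print(value, *notations(value))
```

Two hex digits cover the eight bits exactly. Octal needs three digits for a byte, and the leading digit only ever reaches 3, because eight is not a multiple of three.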

Nicoline answered 13/2, 2011 at 20:0 Comment(0)
S
3

I'll note that on the PDP-10 series of computers, a byte was a variable-length construct, defined by a "byte pointer" which defined the number of bits as well as the offset from the beginning of the storage area. There were then a set of machine instructions for dealing with a byte pointer, including:

  • LDB - Load Byte
  • DPB - Deposit Byte
  • ILDB - Increment pointer, then Load Byte
  • IDPB - Increment pointer, then Deposit Byte (hope I got this one right - it doesn't feel right)

In fact, a "byte" was what we today would call a bit field. Using a byte pointer to represent the next in a series of bytes of the same size was only one of its uses.
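To show the bit-field idea, here is a rough Python sketch of load-byte and deposit-byte operations on a 36-bit word. The argument convention (field size plus a position counted from the low-order end) is my own assumption for illustration, not the PDP-10's actual byte pointer layout.

```python
WORD_BITS = 36
WORD_MASK = (1 << WORD_BITS) - 1

def ldb(word, size, position):
    """Load byte: return the `size`-bit field whose lowest bit sits
    `position` bits from the low-order end of `word`."""
    return (word >> position) & ((1 << size) - 1)

def dpb(value, word, size, position):
    """Deposit byte: return `word` with that same field replaced by `value`."""
    mask = ((1 << size) - 1) << position
    return ((word & ~mask) | ((value << position) & mask)) & WORD_MASK

word = dpb(0b101, 0, 3, 4)   # store a 3-bit field starting at bit 4
print(bin(word), ldb(word, 3, 4))
```

Because the size is a parameter, the same two operations handle 6-bit, 7-bit or 8-bit "bytes" alike.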

Some of the character sets in use were "sixbit" (upper-case only, six bytes to a 36-bit word), ASCII (upper and lower-case, five bytes to a word, with a bit left over), and only rarely EBCDIC (the IBM character set, which used four eight-bit bytes per word, wastefully leaving four bits per word unused).

Stubble answered 13/2, 2011 at 19:55 Comment(3)
The Common Lisp language has functions called ldb and dpb, with the Hyperspec documentation attributing the names for both to the PDP-10 assembly language: lispworks.com/documentation/HyperSpec/Body/f_dpb.htm.Thais
DSPs still often have variable-length bytes, I think. Or even no bytes at all. (If you interpret "byte" as "the smallest efficiently individually addressable chunk of memory". There are DSPs which can address anything from a single bit to an entire word with the same performance, no mis-alignment penalties. Arguably, there is no such thing as a "byte" on such CPUs.)Deuteronomy
@Jorg: by that definition, there was no such thing as a byte on the PDP-10 CPUs. Only 36-bit words were addressable. The byte pointer held a bit width, bit offset from start of word, and the address of the word.Stubble
M
3

Strictly speaking, it doesn't.

On most modern systems a byte is 8 binary bits, but on some systems this was not always the case (many older computers used 7 bits to represent ASCII characters (aka bytes), and punched card systems were often based on 6-bit characters (aka bytes), for example).

If you're talking about an 8-bit byte, this can represent any range you wish. However, it can only represent 256 distinct values, so it is typically used to represent 0..255 ("unsigned byte") or -128..+127 ("signed byte").
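The point that the same 8 bits can be read either way can be sketched with Python's `int.from_bytes`, which takes a `signed` flag (two's complement when true). The helper name is illustrative.

```python
def interpretations(pattern):
    """Return (unsigned, signed) readings of one 8-bit pattern given as 0..255."""
    raw = bytes([pattern])
    unsigned = int.from_bytes(raw, "big", signed=False)
    signed = int.from_bytes(raw, "big", signed=True)  # two's complement
    return unsigned, signed

for pattern in (0x00, 0x7F, 0x80, 0xFF):
    print(f"{pattern:08b}", interpretations(pattern))
```

Either way there are exactly 256 distinct values; only the labels attached to the upper half of the patterns change.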

Mcdaniels answered 13/2, 2011 at 20:9 Comment(2)
Could you name a computer that had a 7 bit byte? I don't believe there were any, and I think this is a confused claim.Thermomotor
It depends whether you define "byte" in the sense of "a character" (which is how it originally came about: en.wikipedia.org/wiki/Byte), or in terms of "the native size of a CPU register", which is perhaps what you're thinking of. ASCII is a very common standard and it is based on a 7-bit "byte" (in the former sense). Many computers supported ASCII, hence many computers used/supported 7 bit "bytes". Before ASCII, punched card characters were usually represented as 6 bit bytes.Mcdaniels
L
0

If each bit can only be 1 or 0, then eight bits can take on 256 unique patterns. For example: [0,0,0,0,0,0,0,1] represents 1, [1,1,1,1,1,1,1,1] represents 255, and [0,0,0,0,0,0,0,0] represents 0. So the values run from 0 to 255, which is 256 values in total.

Leslileslie answered 17/12, 2023 at 17:34 Comment(0)
