Why are there no byte or short literals in Java?

Asked 25/11, 2008 at 15:57 Answered 26/5, 2024 at 5:58

I can create a literal long by appending an L to the value; why can't I create a literal short or byte in some similar way? Why do I need to use an int literal with a cast?

And if the answer is "Because there was no short literal in C", then why are there no short literals in C?

This doesn't actually affect my life in any meaningful way; it's easy enough to write (short) 0 instead of 0S or something. But the inconsistency makes me curious; it's one of those things that bother you when you're up late at night. Someone at some point made a design decision to make it possible to enter literals for some of the primitive types, but not for all of them. Why?

Windywindzer answered 25/11, 2008 at 15:57 Comment(0)

In C, int at least was meant to have the "natural" word size of the CPU and long was probably meant to be the "larger natural" word size (not sure about that last part, but it would also explain why int and long have the same size on x86).

Now, my guess is: for int and long, there's a natural representation that fits exactly into the machine's registers. On most CPUs, however, the smaller types byte and short would have to be padded to an int anyway before being used. If that's the case, you may as well have a cast.

Embonpoint answered 25/11, 2008 at 16:9 Comment(1)

This is just wild and inaccurate guesswork. There are valid syntactical and logical reasons, and this isn't one of them. The main reason is that they are largely redundant. d and f are needed to turn integer literals into double/float literals, and l is needed to allow an otherwise integer-looking literal to exceed the capacity of an int. None of those reasons apply to bytes or shorts. – Rhpositive 26/5, 2024 at 6:37

I suspect it's a case of "don't add anything to the language unless it really adds value" - and it was seen as adding sufficiently little value to not be worth it. As you've said, it's easy to get round, and frankly it's rarely necessary anyway (only for disambiguation).

The same is true in C#, and I've never particularly missed it in either language. What I do miss in Java is an unsigned byte type :)

Keitloa answered 25/11, 2008 at 16:2 Comment(3)

Yes, please add unsigned byte :( . For the issue at hand, I think it's worth mentioning that there is a long literal because an int literal cannot represent all values of a long. An int, on the other hand, can represent all values of a short. – Tribade 25/11, 2008 at 16:4

But I think it does add value - not having to cast can be a big advantage. short val = (short)(val + 10) is just annoying although Java would have to also allow addition of shorts for that to work. – Masoretic 26/4, 2013 at 9:34

@mjaggard: It's a matter of adding sufficiently little value to make it not worth it. It's slightly irksome, but far from the biggest problem... – Keitloa 26/4, 2013 at 10:39

Another reason might be that the JVM doesn't know about short and byte. All calculation and storage is done with ints, longs, floats and doubles inside the JVM.

Raber answered 25/11, 2008 at 16:4 Comment(11)

Okay, but is that true of every JVM? And even if it is, should the language definition really be dependent on implementation details of the VM it runs on? Shouldn't that be the other way around? – Windywindzer 25/11, 2008 at 16:8

It's part of the JVM spec, so yes: the JVM has no way of knowing what type the value it handles really is. (Except for some operations such as array access, where separate opcodes exist). – Raber 25/11, 2008 at 16:17

Storing isn't always done with ints etc. The obvious example is an array of bytes, which is certainly stored as an array of bytes rather than ints. See also #230386 – Keitloa 25/11, 2008 at 16:25

Right, arrays are a special case that do have per-type instructions (probably because byte[] that used 4 byte per entry would be a tiny bit too wasteful) – Raber 25/11, 2008 at 22:38

"The JVM doesn't know about short and byte.": does this mean that when I declare a member variable to be of type byte or short it actually takes up as much as an int? – Chiffon 27/11, 2011 at 19:55

If JVM doesn't know about bytes, why (int)(byte)100000001 == 1? – Royalroyalist 19/5, 2014 at 9:27

@kamczak: because the compiler (or "the Java language" if you want) has been designed to produce bytecode that produces that output. – Raber 19/5, 2014 at 13:2

@Royalroyalist Joachim is wrong. The JVM doesn't know about byte constants, but it does know about bytes. The JVM has 10 field types, and signed bytes are one of them. Bytes can be part of static method signatures, pushed on the stack, determine the types of arrays, converted to other data types, etc. The only real thing missing is byte constants. The major JVMs implement bytes using 32 bits, but that's an implementation detail: your example is part of the JVM's behavior, not something the compiler simulates through bytecode. Google for "JVM field descriptor" if you want more information. – Conversation 29/8, 2014 at 10:28

@Giorgio: Yes, byte fields in OpenJDK, Oracle, and IBM JVMs occupy 4 bytes. It's a chicken-and-egg issue: nobody uses short fields because they save no space, and JVMs don't try to save space on short fields because nobody uses them. Array elements are different, of course. – Hole 29/3, 2016 at 2:25

"the JVM doesn't know about short and byte" - Actually, it does know about them. Ref JVMS 2.3 ... and opcodes like i2b, baload etcetera. You are correct that arithmetic etc. is only done using 32 or 64 bit operations (bytecodes). But that is mandated by the Java spec rather than the JVM spec. – Evertor 15/8, 2018 at 13:17

(Unfortunately, people are citing this Answer ...) – Evertor 15/8, 2018 at 13:19

There are several things to consider.

1) As discussed above the JVM has no notion of byte or short types. Generally these types are not used in computation at the JVM level; so one can think there would be less use of these literals.

2) For initialization of byte and short variables, if the int expression is constant and in the allowed range of the type it is implicitly cast to the target type.

3) One can always cast the literal, e.g. (short) 10
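
Points 2 and 3 can be sketched as follows. This is only an illustration: the class and method names are hypothetical and not part of the answer above.

```java
// Hypothetical demo class; all names are illustrative only.
class NarrowingDemo {
    // Point 2: an int constant expression that fits the target type
    // is narrowed implicitly on assignment.
    static short implicitShort() {
        short s = 10;          // no cast needed: 10 is a constant in the short range
        return s;
    }

    // Point 3: an explicit cast always compiles, even for non-constant values.
    static short explicitShort(int value) {
        return (short) value;  // cast required: value is not a constant expression
    }

    public static void main(String[] args) {
        System.out.println(implicitShort());       // prints 10
        System.out.println(explicitShort(70000));  // prints 4464 (70000 - 65536)
    }
}
```

Note that the explicit cast silently discards the high bits when the value is out of range, which is why the implicit form is restricted to constants the compiler can check.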

Zoe answered 26/2, 2013 at 20:23 Comment(0)

As Jon Skeet implies in his answer, distinct literal types for byte and short (and char expressed as numbers) are rarely needed. (He says that he doesn't particularly miss them, ergo ...) For the rare cases where you do need them, you can use a type cast; see below.

There are >>two<< reasons why they are not needed from the perspective of the Java language:

As noted by previous answers, arithmetic and relational operations on byte, short and char values are specified to be performed using 32 bit integer operands. Thus, when you write this:
```
byte b = ...
if (b > 1) {
    ...
}
```
the value of b is promoted to an int before comparing it with the int value 1. This behavior is specified in JLS 5.6 and the relevant sections of JLS Chapter 15.

You don't need to specify that 1 is a byte to compare it with a byte value ... since the comparison is done as an int.

When you assign a constant expression of type int to a variable of types byte, short or char, there is a special rule that says (in effect) that there is an implicit type cast ... if the value being assigned is in the range of the variable. For example:
```
byte b = 42;         // OK
byte b2 = 1000;      // Compilation error - out of range
byte b3 = b + 1;     // Compilation error - not a constant expression
byte b4 = (byte) 42; // OK ... but redundant
```
This is specified in JLS 5.2
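
The two rules can be seen together in a small runnable sketch. The class and method names here are hypothetical, chosen only for illustration; the compound assignment case is an extra that follows the same language rules (JLS 15.26.2) rather than something stated above.

```java
// Hypothetical demo class; all names are illustrative only.
class PromotionDemo {
    // Arithmetic on a byte is performed on int operands.
    static int promoted() {
        byte b = 127;
        return b + 1;           // b is promoted to int, so the result is 128
    }

    // Narrowing the int result back to byte wraps around.
    static byte wrapped() {
        byte b = 127;
        return (byte) (b + 1);  // 128 does not fit in a byte, so this yields -128
    }

    // A final variable with a constant initializer counts as a constant,
    // so the implicit narrowing rule applies to expressions that use it.
    static byte fromFinal() {
        final byte base = 41;
        byte next = base + 1;   // compiles: constant expression, value 42 is in range
        return next;
    }

    // Compound assignment operators include an implicit cast to the
    // type of the left-hand side.
    static byte compound() {
        byte b = 41;
        b += 1;                 // compiles without an explicit cast
        return b;
    }

    public static void main(String[] args) {
        System.out.println(promoted());   // prints 128
        System.out.println(wrapped());    // prints -128
        System.out.println(fromFinal());  // prints 42
        System.out.println(compound());   // prints 42
    }
}
```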

To my knowledge, the only case where you need to explicitly cast an integer literal (other than for value conversion purposes) is when you are calling a method with overloads for (say) int and byte arguments; e.g.

```
public void foo(int i) {...}
public void foo(byte b) {...}

foo(42);         // binds to first overload
foo((byte) 42);  // binds to second overload
```
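
A self-contained version of the overload case, as a sketch: the class name and the string return values are assumptions added so that the chosen overload is observable.

```java
// Hypothetical demo class; the return values exist only to show which overload ran.
class OverloadDemo {
    static String foo(int i)  { return "int"; }
    static String foo(byte b) { return "byte"; }

    public static void main(String[] args) {
        byte b = 42;
        System.out.println(foo(42));         // prints int: the literal has type int
        System.out.println(foo((byte) 42));  // prints byte: the cast selects the byte overload
        System.out.println(foo(b));          // prints byte: the variable already has type byte
    }
}
```

The implicit narrowing of constants applies to assignment, not to method invocation, which is why the plain literal never selects the byte overload.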

As I commented on another answer, the fact that the JVM bytecodes don't support arithmetic operations on bytes, shorts and chars is an implementation detail. It is a consequence of the Java language spec's rules ... not the other way around.

How James Gosling et al reached these design decisions is not public knowledge. It happened back before Java 1.0, when the language was still called Oak. The indications are in the documentation for Oak, if you can track it down.

Evertor answered 26/5, 2024 at 5:58 Comment(0)
