Java's Virtual Machine's Endianness

2

37

What endianness does Java use in its virtual machine? I remember reading somewhere that it depends on the physical machine it's running on, but in other places I have read that it is always big-endian, I believe. Which is correct?

Copious answered 11/6, 2009 at 14:47 Comment(0)
35

Multibyte data in the class files are stored big-endian.

From The Java Virtual Machine Specification, Java SE 7 Edition, Chapter 4: The class File Format:

A class file consists of a stream of 8-bit bytes. All 16-bit, 32-bit, and 64-bit quantities are constructed by reading in two, four, and eight consecutive 8-bit bytes, respectively. Multibyte data items are always stored in big-endian order, where the high bytes come first.
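As a rough illustration of that rule, here is a minimal sketch that reads a 32-bit item high-order byte first, once by hand and once with `java.io.DataInputStream` (whose `readInt` is documented to read big-endian). The class and method names are made up for this example; the four sample bytes are the well-known class file magic number.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper, not part of any JDK API.
final class ClassFileHeader {
    /** Builds a 32-bit value from four consecutive bytes, high-order byte first. */
    static int readU4(byte[] bytes, int offset) {
        return ((bytes[offset] & 0xFF) << 24)
             | ((bytes[offset + 1] & 0xFF) << 16)
             | ((bytes[offset + 2] & 0xFF) << 8)
             |  (bytes[offset + 3] & 0xFF);
    }

    /** Same read through DataInputStream, which always reads big-endian. */
    static int readU4WithStream(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            return in.readInt();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // First four bytes of every class file: the magic number 0xCAFEBABE.
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE};
        System.out.println(Integer.toHexString(readU4(header, 0)));         // cafebabe
        System.out.println(Integer.toHexString(readU4WithStream(header)));  // cafebabe
    }
}
```

Both reads give the same result on any platform, because the byte order here is a property of the file format and not of the host CPU.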

Furthermore, an operand in a bytecode instruction is also big-endian if it spans multiple bytes.

From The Java Virtual Machine Specification, Java SE 7 Edition, Section 2.11: Instruction Set Summary:

If an operand is more than one byte in size, then it is stored in big-endian order, high-order byte first. For example, an unsigned 16-bit index into the local variables is stored as two unsigned bytes, byte1 and byte2, such that its value is (byte1 << 8) | byte2.
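The formula in that quote can be sketched in Java as follows. This is only an illustration: `OperandDecoder` and `readUnsignedShort` are hypothetical names, and the sample byte arrays are invented, not taken from a real method body. The `& 0xFF` masks are needed because Java's `byte` is signed.

```java
// Hypothetical helper for illustration only.
final class OperandDecoder {
    /** Combines two operand bytes into an unsigned 16-bit value, high-order byte first. */
    static int readUnsignedShort(byte[] code, int offset) {
        int byte1 = code[offset] & 0xFF;      // mask so the signed byte is treated as unsigned
        int byte2 = code[offset + 1] & 0xFF;
        return (byte1 << 8) | byte2;
    }

    public static void main(String[] args) {
        byte[] small = {0x01, 0x02};
        System.out.println(readUnsignedShort(small, 0));  // 258

        byte[] large = {(byte) 0xFF, (byte) 0xFE};
        System.out.println(readUnsignedShort(large, 0));  // 65534
    }
}
```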

So yes, I think it can be said that the Java Virtual Machine uses big-endian byte order, at least in its class file format and instruction encoding.

Ocieock answered 11/6, 2009 at 14:50 Comment(3)
This answer is highly misleading. All of the references explain how multi-byte values are stored in class files, and the class file does indeed use big-endian. However, at run time, all Java implementations that I know of store the data of variables and data structures in native byte order. This most likely also applies to instruction operands once the class file has been loaded into a better executable format. Anything else would be tremendously slow on little-endian architectures such as i386. - Partheniaparthenocarpy
A JVM can give the appearance of being big-endian from the POV of the bytecode it executes while still actually storing multi-byte values in native endianness. There's an "as-if" rule at work here: as long as the JVM behaves as it's supposed to from the POV of guest code running in it, the actual implementation details (e.g. interpreting vs. JIT-compiling to native code) are irrelevant. Since Java code can't easily cast an int to a byte[], this detail is generally not visible to Java code running in a JVM, so it's easy for the JVM to just use a native C int32_t. - Summerville
@Partheniaparthenocarpy Java is tremendously slow on little-endian architectures such as i386. If the instruction set specifies that operations are required to have big-endian parameters, then storing them as little-endian would result in even worse performance, because you would have to convert from little-endian to big-endian just to use the instruction set, which in turn would have to reverse to little-endian to do the operation, then turn the result back to big-endian to match the instruction set specs, then BACK AGAIN to little-endian, only to store the data as big-endian in the resulting files. - Durst
21

The working data held in a running JVM process will almost certainly match the native endianness of the machine it runs on. File formats (including class files) are generally in network byte order (big-endian).

It's generally difficult to tell what the machine is doing underneath, as it is abstracted away by the virtual machine. You can't cast a short[] to a byte[] as you can in C and C++. java.nio.ByteOrder.nativeOrder() should give you the underlying endianness. Matching the native endianness is useful when using non-byte NIO buffers.
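For what it's worth, here is a small sketch of how that looks with NIO. `NativeOrderDemo` and `encode` are names invented for this example; `ByteOrder.nativeOrder()`, `ByteBuffer.order(...)` and `putInt(...)` are the standard `java.nio` calls. Note that a freshly allocated `ByteBuffer` is big-endian on every platform until you change its order.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Hypothetical demo class for illustration only.
final class NativeOrderDemo {
    /** Encodes an int into four bytes using the requested byte order. */
    static byte[] encode(int value, ByteOrder order) {
        return ByteBuffer.allocate(4).order(order).putInt(value).array();
    }

    public static void main(String[] args) {
        // Platform dependent: LITTLE_ENDIAN on x86, for example.
        System.out.println("Native order: " + ByteOrder.nativeOrder());

        // New byte buffers start out big-endian regardless of the platform.
        System.out.println("Default buffer order: " + ByteBuffer.allocate(4).order());

        System.out.println(Arrays.toString(encode(0x01020304, ByteOrder.BIG_ENDIAN)));     // [1, 2, 3, 4]
        System.out.println(Arrays.toString(encode(0x01020304, ByteOrder.LITTLE_ENDIAN)));  // [4, 3, 2, 1]

        // Encoding with the native order avoids byte swapping when the data
        // is later viewed through an IntBuffer, LongBuffer, and so on.
        byte[] nativeBytes = encode(0x01020304, ByteOrder.nativeOrder());
        System.out.println(Arrays.toString(nativeBytes));
    }
}
```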

Selfsealing answered 11/6, 2009 at 15:8 Comment(0)
