Apart from using byte[] in streaming I don't really see byte and short used much. On the other hand, I have seen long used where the actual value stays within |100| and byte would be more appropriate. Is this a consequence of memory now being relatively inexpensive, or is this just minutiae that developers needn't worry about?
They are used when programming for embedded devices that are short on memory or disk space, such as appliances and other electronic devices.
Byte is also used in low-level web programming, where you send requests to web servers using headers, etc.
The byte datatype is frequently used when dealing with raw data from a file or network connection, though it is mostly used as byte[]. The short and short[] types are often used in connection with GUIs and image processing (for pixel locations and image sizes), and in sound processing.
The primary reason for using byte or short is one of clarity. The program code states categorically that only 8 or 16 bits are to be used, and when you accidentally use a larger type (without an appropriate typecast) you get a compilation error. (Admittedly, this could also be viewed as a nuisance when writing the code... but then again, the presence of the typecasts flags to the reader the fact that truncation is happening.)
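For example (a minimal sketch; the class and variable names are mine):

public class NarrowingDemo {
    public static void main(String[] args) {
        int large = 1000;

        // byte b = large;       // does not compile: "possible lossy conversion from int to byte"
        byte b = (byte) large;   // compiles, and the cast makes the truncation visible: b == -24
        short s = (short) large; // 1000 fits in short's -32768..32767 range, so s == 1000

        System.out.println("b = " + b + ", s = " + s);
    }
}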
You don't achieve any space saving by using byte or short in simple variables instead of int, because most Java implementations align stack variables and object members on word boundaries. However, primitive array types are handled differently; i.e. elements of boolean, byte, char and short arrays are byte aligned. But unless the arrays are large in size or large in number, they don't make any significant contribution to the app's overall memory usage.
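If you want to see the array packing for yourself, the jol tool used further down this page can print array layouts too. A minimal sketch, assuming jol-core is on the classpath (the class name is mine):

import org.openjdk.jol.info.ClassLayout;

public class ArrayLayoutDemo {
    public static void main(String[] args) {
        // A byte[1000] occupies roughly 1000 bytes of element data plus the array header,
        // while an int[1000] needs roughly 4000 bytes plus the header.
        System.out.println(ClassLayout.parseInstance(new byte[1000]).toPrintable());
        System.out.println(ClassLayout.parseInstance(new int[1000]).toPrintable());
    }
}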
So I guess that the main reason developers don't use byte or short as much as you (a C developer?) might expect is that it really doesn't make much (or often any) difference. Java developers tend not to obsess over memory usage the way old-school C developers did :-).
In a 64-bit processor, the registers are all 64 bits wide, so if your local variable is assigned to a register, it doesn't use memory and saves no resources whether it is a boolean, byte, short, char, int, float, double or long. Objects are 8-byte aligned, so they always take up a multiple of 8 bytes in memory. This means the wrapper types Boolean, Byte, Short, Character, Integer and Float (and AtomicBoolean, AtomicInteger and AtomicReference) all round up to the same instance size, with Long, Double and AtomicLong at most one 8-byte step larger, so a smaller wrapper type saves little or nothing.
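You can check this with jol (the same tool used in a later answer on this page). A minimal sketch, assuming jol-core is on the classpath; the class name is mine:

import org.openjdk.jol.info.ClassLayout;

public class WrapperSizeDemo {
    public static void main(String[] args) {
        // On 64-bit HotSpot with compressed oops, both print "Instance size: 16 bytes":
        // a 12-byte header plus 1 or 4 bytes of field data, padded up to the 8-byte boundary.
        System.out.println(ClassLayout.parseInstance(Byte.valueOf((byte) 1)).toPrintable());
        System.out.println(ClassLayout.parseInstance(Integer.valueOf(1)).toPrintable());
    }
}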
As has been noted, short types are used for arrays and for reading/writing data formats. Even then, short is not used very often, IMHO.
It's also worth noting that a GB costs about £80 in a server, so a MB is about 8 pence and a KB about 0.008 pence. The difference between a byte and a long is about 0.00006 pence. Your time is worth more than that, especially if you ever have a bug which results from having a data type that was too small.
Stephen C's answer above is incorrect. (Sorry, I don't have enough reputation points to comment, so I have to post an answer here.)
He stated:
"You don't achieve any space saving by using byte or short in simple variables instead of int, because most Java implementations align stack variables and object members on word boundaries"
It's not true. The following was run on Oracle JDK 1.8.0, with jol:
import org.openjdk.jol.info.ClassLayout;
import org.openjdk.jol.vm.VM;

public class CompareShorts {
    public static void main(String[] args) {
        // Print the VM's layout rules, then the field layout of each test class.
        System.out.println(VM.current().details());
        System.out.println(ClassLayout.parseInstance(new PersonalDetailA()).toPrintable());
        System.out.println(ClassLayout.parseInstance(new PersonalDetailB()).toPrintable());
    }
}

class PersonalDetailA {
    short height;
    byte color;
    byte gender;
}

class PersonalDetailB {
    int height;
    int color;
    int gender;
}
The output:
# Running 64-bit HotSpot VM.
# Using compressed oop with 3-bit shift.
# Using compressed klass with 3-bit shift.
# Objects are 8 bytes aligned.
# Field sizes by type: 4, 1, 1, 2, 2, 4, 4, 8, 8 [bytes]
# Array element sizes: 4, 1, 1, 2, 2, 4, 4, 8, 8 [bytes]
com.hunterstudy.springstudy.PersonalDetailA object internals:
OFFSET SIZE TYPE DESCRIPTION VALUE
0 4 (object header) 01 00 00 00 (00000001 00000000 00000000 00000000) (1)
4 4 (object header) 00 00 00 00 (00000000 00000000 00000000 00000000) (0)
8 4 (object header) 82 22 01 f8 (10000010 00100010 00000001 11111000) (-134143358)
12 2 short PersonalDetailA.height 0
14 1 byte PersonalDetailA.color 0
15 1 byte PersonalDetailA.gender 0
Instance size: 16 bytes
Space losses: 0 bytes internal + 0 bytes external = 0 bytes total
com.hunterstudy.springstudy.PersonalDetailB object internals:
OFFSET SIZE TYPE DESCRIPTION VALUE
0 4 (object header) 01 00 00 00 (00000001 00000000 00000000 00000000) (1)
4 4 (object header) 00 00 00 00 (00000000 00000000 00000000 00000000) (0)
8 4 (object header) e1 24 01 f8 (11100001 00100100 00000001 11111000) (-134142751)
12 4 int PersonalDetailB.height 0
16 4 int PersonalDetailB.color 0
20 4 int PersonalDetailB.gender 0
Instance size: 24 bytes
Space losses: 0 bytes internal + 0 bytes external = 0 bytes total
As you can see, the class instance using shorts and bytes takes 16 bytes, while the class instance using ints takes 24 bytes. So it does save eight bytes of memory per class instance.
Arithmetic on bytes and shorts is more awkward than with ints. For example, if b1 and b2 are two byte variables, you can't write byte b3 = b1 + b2 to add them. This is because Java never does arithmetic internally in anything smaller than an int, so the expression b1 + b2 has type int even though it is only adding two byte values. You'd have to write byte b3 = (byte) (b1 + b2) instead.
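A short sketch of that rule, including the overflow the cast can silently hide (names are mine):

public class BytePromotionDemo {
    public static void main(String[] args) {
        byte b1 = 60;
        byte b2 = 70;

        // byte b3 = b1 + b2;        // does not compile: the expression b1 + b2 is an int
        int sum = b1 + b2;           // fine: sum == 130
        byte b3 = (byte) (b1 + b2);  // compiles, but 130 doesn't fit in a byte: b3 == -126

        System.out.println("sum = " + sum + ", b3 = " + b3);
    }
}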
I would most often use the short and byte types when working with binary formats and DataInput/DataOutput instances. If the spec says the next value is an 8-bit or 16-bit value and there's no value in promoting them to int (perhaps they're bit flags), they are an obvious choice.
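For instance, reading a made-up record layout (a 1-byte flags field followed by a 16-bit length; the format here is purely hypothetical) might look like this sketch:

import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

public class BinaryRecordDemo {
    public static void main(String[] args) throws IOException {
        byte[] record = {0x05, 0x01, 0x00}; // flags = 5, length = 256 (big-endian)

        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(record))) {
            byte flags = in.readByte();    // the spec says 8 bits, so byte matches it exactly
            short length = in.readShort(); // 16-bit big-endian value
            System.out.println("flags=" + flags + " length=" + length);
        }
    }
}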
I used short extensively when creating an emulator based on a 16-bit architecture. I considered using char so I could have unsigned values, but the spirit of using a real integer type won out in the end.
Edit: regarding the inevitable question about what I did when I needed the most significant bit: with the thing I was emulating, it happened to almost never get used. In the few places it was used, I just used bitwise modifiers or math hackery.
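The usual bitwise trick, for reference, is masking with 0xFFFF to treat the short's 16 bits as unsigned (a sketch; the names are mine):

public class UnsignedShortDemo {
    public static void main(String[] args) {
        short reg = (short) 0xFFFF;          // all 16 bits set; as a signed short this reads as -1

        int unsigned = reg & 0xFFFF;         // mask to recover the unsigned value: 65535
        short next = (short) (unsigned + 1); // wraps to 0, just like 16-bit hardware would

        System.out.println(unsigned + " -> " + next);
    }
}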
I think in most applications short has no domain meaning, so it makes more sense to use Integer.
short and others are often used for storing image data. Note that it is the number of bits which is really important, not the arithmetic properties (which just cause promotion to int or better).
short is also used as array indexes in JavaCard (1.0 and 2.0, IIRC, but not 3.0, which also has an HTTP stack and web services).
byte[] happens all the time: buffers, specifically for networks, files, graphics, serialization, etc.
Most of the time, there's no real technical reason for a developer (Java, C#, BASIC, etc.) to choose between an int, short, or byte, as long as the capacity is enough. If the value will be under 2 billion, then int it will be.
Are you sure we'll have people older than 255? Well, you never know!
Aren't 32,767 possible countries enough? Don't think too small!
In your example, you can be perfectly happy with your byte var containing 100, if you are absolutely sure that it will NEVER overflow. Why do guys use int the most? Because.... because.
This is one of those things that most of us just do because we've seen it done that way most of the time, and never stopped to question it.
Of course, I have nothing against "all things int". I just prefer to use the right type for each kind of value, no stress involved.
The general information you come across is that byte, generally as byte[], is used for manipulating binary data, for example an image file, or when sending data over the network. I'll now mention other use cases:
Encoding Strings in Java
In Java, the String object uses UTF-16 internally and is immutable, i.e. it cannot be modified.
In order to encode strings, we convert them to UTF-8, which is backward-compatible with ASCII. One way to do it is to use core Java; you can find out about more ways here.
To perform the encoding, we copy the original string's bytes into a byte array and then create the desired String from it. Below, I'll give a simple example that shows why we need to encode strings, and how to do it:
- Why is encoding important?
Imagine you have this German word "Tschüss" and you're using US-ASCII:
String germanString = "Tschüss";
byte[] germanBytes = germanString.getBytes(StandardCharsets.US_ASCII); // 'ü' can't be encoded in ASCII
String asciiEncodedString = new String(germanBytes, StandardCharsets.US_ASCII);
assertNotEquals(asciiEncodedString, germanString);
and the output will simply be:
Tsch?ss
Because US_ASCII doesn't recognize the "ü".
Now here's the example that works:
String germanString = "Tschüss";
byte[] germanBytes = germanString.getBytes(StandardCharsets.UTF_8);
String utf8EncodedString = new String(germanBytes, StandardCharsets.UTF_8);
assertEquals(germanString, utf8EncodedString);
byte[] Better than String in Performance Sometimes
For the record, a Java String is an object that uses a char array under the hood, along with other data; you can find more in this answer.
Now imagine the case where you want to parse a huge amount of string data using, for instance, the split method. In this case you'll have different objects (different char arrays) spread across different locations in memory, which results in worse cache locality for the CPU, contrary to the case where you have a single byte array in one location from the beginning. You can find out more in this interesting post.
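As a toy illustration of the locality point, assuming newline-separated ASCII records, one can scan a single byte array in place instead of materializing a substring per record (the names here are mine):

import java.nio.charset.StandardCharsets;

public class ByteScanDemo {
    public static void main(String[] args) {
        byte[] data = "10,20,30\n40,50,60\n".getBytes(StandardCharsets.US_ASCII);

        // Count records by walking the one contiguous array, with no per-record
        // String or char[] allocations (unlike new String(data).split("\n")).
        int records = 0;
        for (byte b : data) {
            if (b == '\n') {
                records++;
            }
        }
        System.out.println("records = " + records);
    }
}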