In Java 8, is there a ByteStream class?
N

5

61

Java 8 provides Stream<T> specializations for double, int and long: DoubleStream, IntStream and LongStream respectively. However, I could not find an equivalent for byte in the documentation.

Does Java 8 provide a ByteStream class?

Northernmost answered 8/9, 2015 at 13:54 Comment(2)
#22919347Cyrilla
Does this answer your question? Why are new java.util.Arrays methods in Java 8 not overloaded for all the primitive types?Mindoro
B
52

No, it does not exist. Actually, it was explicitly not implemented so as not to clutter the Stream API with tons of classes for every primitive type.

Quoting a mail from Brian Goetz in the OpenJDK mailing list:  

Short answer: no.

It is not worth another 100K+ of JDK footprint each for these forms which are used almost never. And if we added those, someone would demand short, float, or boolean.

Put another way, if people insisted we had all the primitive specializations, we would have no primitive specializations. Which would be worse than the status quo.

Bickford answered 8/9, 2015 at 13:57 Comment(6)
Seriously? Byte streams are used "almost never"? I wonder what planet that guy is living on, because in the real world streams of bytes are ubiquitous.Schreiber
@Schreiber You'd have to ask that guy to know for sure :-) My impression is that the kind of byte streams most devs are familiar with are more along the lines of ByteArrayInputStream / ByteArrayOutputStream (used for I/O operations, bulk data processing, etc.). These objects are conceptually quite different from the Streams of the Java 8 Stream API, which are used in functional programming.Micron
I'm with @augurar. There is Arrays.stream(int[] array), Arrays.stream(long[] array) and Arrays.stream(double[] array) but not Arrays.stream(byte[] array) or the other primitive types. Actually, I find it rather ridiculous.Colonist
Ah yes, it's nice to see the thing I wanted was not implemented because they just didn't feel like it.Bushnell
Everyone - 1) You can implement it yourself. 2) You can find a 3rd-party implementation. 2a) If you can't find a 3rd-party implementation, that implies something about the degree to which ByteStream is actually needed.Spelling
Sometimes I'm very much... mmm... surprised with logic of these guys. Really.Tare
B
53

Most byte-related operations are automatically promoted to int. For example, consider a simple method that adds a byte constant to each element of a byte[] array and returns a new byte[] array (a potential candidate for ByteStream):

public static byte[] add(byte[] arr, byte addend) {
    byte[] result = new byte[arr.length];
    int i=0;
    for(byte b : arr) {
        result[i++] = (byte) (b+addend);
    }
    return result;
}

See, even though we add two byte variables, they are widened to int and you need to cast the result back to byte. In Java bytecode, most byte-related operations (except array load/store and cast to byte) are expressed with 32-bit integer instructions (iadd, ixor, if_icmple and so on). Thus, in practice, it is fine to process bytes as ints with IntStream. We just need two additional operations:

  • Create an IntStream from byte[] array (widening bytes to ints)
  • Collect an IntStream to byte[] array (using (byte) cast)

The first one is really easy and can be implemented like this:

public static IntStream intStream(byte[] array) {
    return IntStream.range(0, array.length).map(idx -> array[idx]);
}

So you can add a static method like this to your project and be happy.
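One detail worth keeping in mind with this widening: a Java byte is signed, so the conversion above yields values from -128 to 127. The following is a minimal sketch, not part of the original answer, that contrasts the signed view with an unsigned one obtained by masking with 0xFF. The class name ByteWidening and both method names are made up for illustration.

```java
import java.util.stream.IntStream;

// Hypothetical helper class; the names are illustrative assumptions.
class ByteWidening {
    // Signed view: each byte is sign-extended, so values range from -128 to 127.
    static IntStream signed(byte[] array) {
        return IntStream.range(0, array.length).map(idx -> array[idx]);
    }

    // Unsigned view: masking with 0xFF keeps only the low 8 bits (0 to 255).
    static IntStream unsigned(byte[] array) {
        return IntStream.range(0, array.length).map(idx -> array[idx] & 0xFF);
    }
}
```

Which view is appropriate depends on whether the bytes represent small signed numbers or raw octets such as file contents.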

Collecting the stream into a byte[] array is trickier. Using standard JDK classes, the simplest solution is ByteArrayOutputStream:

public static byte[] toByteArray(IntStream stream) {
    return stream.collect(ByteArrayOutputStream::new, (baos, i) -> baos.write((byte) i),
            (baos1, baos2) -> baos1.write(baos2.toByteArray(), 0, baos2.size()))
            .toByteArray();
}

However, it has unnecessary overhead due to synchronization. It would also be nice to handle streams of known length specially to reduce allocations and copying. Nevertheless, you can now use the Stream API for byte[] arrays:

public static byte[] addStream(byte[] arr, byte addend) {
    return toByteArray(intStream(arr).map(b -> b+addend));
}

My StreamEx library has both of these operations in the IntStreamEx class which enhances standard IntStream, so you can use it like this:

public static byte[] addStreamEx(byte[] arr, byte addend) {
    return IntStreamEx.of(arr).map(b -> b+addend).toByteArray();
}

Internally, the toByteArray() method uses a simple resizable byte buffer and specially handles the case when the stream is sequential and the target size is known in advance.
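If neither StreamEx nor the synchronized ByteArrayOutputStream is wanted, the collection step can also be sketched with plain JDK calls. The version below is an assumption-laden illustration, not the StreamEx implementation: it materializes the stream into a temporary int[] (four bytes per element) and then narrows each value. The class name ByteCollecting is hypothetical.

```java
import java.util.stream.IntStream;

// Hypothetical alternative to the ByteArrayOutputStream-based collector.
class ByteCollecting {
    static byte[] toByteArray(IntStream stream) {
        // Temporary buffer: trades extra memory for the absence of synchronization.
        int[] ints = stream.toArray();
        byte[] result = new byte[ints.length];
        for (int i = 0; i < ints.length; i++) {
            // The narrowing cast keeps only the low 8 bits of each int.
            result[i] = (byte) ints[i];
        }
        return result;
    }
}
```

This trades a larger temporary allocation for simpler code, so it is probably better suited to small or medium arrays than to bulk data.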

Brewer answered 9/9, 2015 at 4:18 Comment(6)
baos1.write(baos2.toByteArray(), 0, baos2.size()) is an unnecessarily complicated merger. First, toByteArray() always returns an appropriately sized array, so , 0, baos2.size() is not needed. The array is always appropriately sized because it is always newly allocated. If you want to avoid this overhead, consider using baos2.writeTo(baos1) instead; that’s shorter and more efficient.Brookebrooker
By the way, the cast from int to byte is unnecessary when writing a single byte to an OutputStream, hence ByteArrayOutputStream::write is sufficient as accumulator function.Brookebrooker
@Holger, both writeTo and write(byte[]) are declared to throw an IOException, so you would need an explicit try-catch. I just selected the shortest version (write(byte[], int, int) does not throw - crazy, I know). writeTo would indeed be more efficient. As for the explicit cast, I don't remember. I probably decided that such a version would be clearer.Brewer
Granted, writeTo requires a try…catch around it, so {try{baos2.writeTo(baos1);}catch(IOException x){} } is not shorter than baos1.write(baos2.toByteArray(), 0, baos2.size()), but it’s not significantly longer (and it is more efficient). writeTo had to declare IOException as you can pass an arbitrary OutputStream as argument. The write(byte[]) method has not been overridden, so unfortunately, it has the general OutputStream.write(byte[]) signature. Reminds me of this issueBrookebrooker
Quite a space requirement to store 8 bits each in a 32-bit location, isn't it?Knudsen
@Knudsen A stream is not a storage structure. It’s a tool for processing data and, as this answer already says, “most of the byte-related operations are automatically promoted to int” in Java anyway. Which doesn’t hurt considering that today’s CPUs have 64 bit wide data registers anyway. The storage still is a byte[] here.Brookebrooker
C
4

I like this solution since it streams directly from a byte[] at runtime, rather than building a collection and then streaming from that collection. I believe it feeds the stream one byte at a time.

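byte[] bytes = _io.readAllBytes(file);
AtomicInteger ai = new AtomicInteger(0);

// Stream.generate returns an unordered stream, so only rely on element order when it runs sequentially.
Stream.generate(() -> bytes[ai.getAndIncrement()]).limit(bytes.length);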

However, this is quite slow due to the synchronization bottleneck of the AtomicInteger, so back to imperative loops!
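A counter-free way to get the same one-element-at-a-time behavior is to stream over the indices instead. This is a minimal sketch under the assumption that a boxed Stream<Byte> is acceptable; the class name BoxedByteStream is made up for illustration.

```java
import java.util.stream.IntStream;
import java.util.stream.Stream;

// Hypothetical helper; streams the array by index, with no shared mutable counter.
class BoxedByteStream {
    static Stream<Byte> of(byte[] bytes) {
        // Each index is mapped to the (boxed) byte at that position, in order.
        return IntStream.range(0, bytes.length).mapToObj(i -> bytes[i]);
    }
}
```

Because the source is an ordered index range, the resulting stream keeps its encounter order even if it is later switched to parallel.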

Courtnay answered 2/10, 2020 at 22:42 Comment(1)
I would suggest always measuring the performance before reaching such conclusions (not saying you didn't). Such usages of atomics are often surprisingly fast especially if no actual runtime contention events actually occur.Dehydrogenate
M
3

Use com.google.common.primitives.Bytes.asList(byte[]).stream() instead.
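For projects that do not include Guava, a similar read-only list view can be sketched with java.util.AbstractList. This is only an illustration of the idea and is not Guava's implementation; the class name ByteListView is hypothetical.

```java
import java.util.AbstractList;
import java.util.List;

// Hypothetical stand-in for a list view over a byte[]; not Guava's code.
class ByteListView {
    // Read-only List<Byte> backed directly by the array, so nothing is copied up front.
    static List<Byte> asList(byte[] array) {
        return new AbstractList<Byte>() {
            @Override
            public Byte get(int index) {
                return array[index]; // boxed on access
            }

            @Override
            public int size() {
                return array.length;
            }
        };
    }
}
```

Calling ByteListView.asList(bytes).stream() then gives a Stream<Byte> whose elements are boxed lazily as the stream consumes them.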

Morehouse answered 10/2, 2021 at 16:13 Comment(0)
K
2

If you don't have a ByteStream, build one:

Stream.Builder<Byte> builder = Stream.builder();
for( int i = 0; i < array.length; i++ )
  builder.add( array[i] );
Stream<Byte> stream = builder.build();

...where array can be of type byte[] or Byte[]

Knudsen answered 23/10, 2019 at 10:23 Comment(2)
Involves copying of all the data though.Circuitous
Yep. This is always the case when creating a new stream.Knudsen

© 2022 - 2024 — McMap. All rights reserved.