In Java, the InputStream class has the methods read(byte[], int, int) and readNBytes(byte[], int, int). These two methods seem to have exactly the same functionality, so I wonder what the differences between them are.
Edited for better visibility of discussion in the comments:
read() says it attempts to read "up to len bytes ... but a smaller number may be read. This method blocks until input data is available, end of file is detected, or an exception is thrown."
readNBytes() says it "blocks until len bytes of input data have been read, end of stream is detected, or an exception is thrown."
Even though the JDK's implementation for InputStream is likely to give you identical results for both methods, the documented differences mean that other classes inheriting from it may behave differently.
E.g. given the stream '12345<end>', read(s,0,10) is allowed to return '123', whereas readNBytes(s,0,10) keeps reading until it hits end-of-stream and gives you the whole thing.
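A minimal sketch of that scenario, assuming Java 9+ (where readNBytes(byte[], int, int) was added). TrickleInputStream is a made-up class for illustration, not part of the JDK; it deliberately hands out at most 3 bytes per read(byte[], int, int) call, which the read() contract permits.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class ShortReadDemo {

    /** Hypothetical stream: each read(byte[], int, int) call yields at most 3 bytes. */
    static class TrickleInputStream extends InputStream {
        private final InputStream source;

        TrickleInputStream(byte[] data) {
            this.source = new ByteArrayInputStream(data);
        }

        @Override
        public int read() throws IOException {
            return source.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            // Allowed by the read() contract: return fewer bytes than requested.
            return source.read(b, off, Math.min(len, 3));
        }
    }

    /** Makes ONE call to read or readNBytes asking for 10 bytes; returns what came back. */
    static String firstCall(boolean useReadNBytes) {
        byte[] buf = new byte[10];
        byte[] data = "12345".getBytes(StandardCharsets.US_ASCII);
        try (InputStream in = new TrickleInputStream(data)) {
            int n = useReadNBytes ? in.readNBytes(buf, 0, 10) : in.read(buf, 0, 10);
            return new String(buf, 0, n, StandardCharsets.US_ASCII);
        } catch (IOException e) {
            throw new UncheckedIOException(e); // not expected for an in-memory source
        }
    }

    public static void main(String[] args) {
        System.out.println(firstCall(false)); // 123
        System.out.println(firstCall(true));  // 12345
    }
}
```

Here a single read() call comes back with '123', while a single readNBytes() call keeps calling read() internally until the stream ends and comes back with '12345'.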
Original answer:
You're right that the javadocs are very similar. When in doubt, always drop down to the source. Most IDEs make it easy to attach the OpenJDK source and let you drill down into it.
This is readNBytes from InputStream.java:
public int readNBytes(byte[] b, int off, int len) throws IOException {
    Objects.requireNonNull(b);
    if (off < 0 || len < 0 || len > b.length - off)
        throw new IndexOutOfBoundsException();
    int n = 0;
    while (n < len) {
        int count = read(b, off + n, len - n);
        if (count < 0)
            break;
        n += count;
    }
    return n;
}
As you can see, it actually calls read(byte[],int,int). The difference in this case is that if the number of bytes actually read is less than your specified len, it will call read() again, repeating until len bytes have been read or it is confirmed that there is nothing left to be read.
Edit: Note that
- This is OpenJDK's implementation of the base InputStream. Others may differ.
- Subclasses of InputStream may also have their own overridden implementation. Consult the doc/source for the relevant class.
read() just says it attempts to read up to len bytes, whereas readNBytes() actually says up to len bytes or end-of-stream is reached. Though in the source we can see that both behave very similarly, the way the doc is worded means that it allows overrides of read() to just give you whatever it can as long as it's less than len bytes. Even if there's an EOS within range, it might stop beforehand. – Anaplastic
read() method can be overridden, in which situation readNBytes() does differ. Thanks a lot. – Kubiak
readNBytes() does not say 'reads up to len bytes'. It says 'Reads the requested number of bytes from the input stream into the given byte array', and goes on to make it clear that exactly len bytes are read unless end of stream intervenes. Furthermore the two methods do not 'behave very similarly'. – Mcgough
readNBytes(,,) calls read(,,), which calls read(); and the doc for read() says "..blocks until input data is available, the end of the stream is detected, or an exception is thrown". Which causes the default implementations to behave very similarly if read() actually does what it should do. – Anaplastic
read() in a loop whose purpose is to ensure that it reads len bytes and no fewer unless end of stream intervenes. There is no 'very similarly' about it. – Mcgough
read(,,) in a loop. And in turn, read(,,) calls read() in a loop. The doc for read() says: "blocks until input data is available, the end of the stream is detected, or an exception is thrown". So if the implementation for read() follows its documented behaviour, you'd reach EOS with read(,,) too. – Anaplastic
readNBytes() would read len bytes where read() need not. – Mcgough
read(void). Hence the similarities, unless overridden. – Anaplastic
The difference between the two methods is that readNBytes(byte[] b, int off, int len) will actually attempt to read len bytes before returning, whereas read(byte[] b, int off, int len) will simply return whenever there's data available. Depending on how data arrives, read(...) may therefore require many more iterations than readNBytes(...) before the InputStream has been read in full.
Bottom line: You often want to process the stream in chunks of X bytes. If so, it is much better to use readNBytes(...).
Let's say the InputStream is 3500 bytes in total and you want to process it in chunks of 1024 bytes (meaning len = 1024).
With readNBytes(...) you are guaranteed the stream will be read in exactly 4 method invocations: the first 3 will read 1024 bytes each and the fourth will read 428 bytes.
With read(...) you don't really know how many method invocations it will require to read this stream. It might require only 4 invocations, but it may also take more.
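The 3500-byte example can be sketched roughly like this, assuming Java 9+. BurstInputStream is a made-up class for illustration (not a JDK type) that simulates data arriving in bursts of at most 300 bytes; the 300 is an arbitrary choice.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class ChunkedReadDemo {

    /** Hypothetical stream: each read(byte[], int, int) call yields at most 300 bytes. */
    static class BurstInputStream extends InputStream {
        private final InputStream source;

        BurstInputStream(byte[] data) {
            this.source = new ByteArrayInputStream(data);
        }

        @Override
        public int read() throws IOException {
            return source.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            return source.read(b, off, Math.min(len, 300));
        }
    }

    /** Sizes of the chunks returned by successive readNBytes calls. */
    static List<Integer> chunksViaReadNBytes(InputStream in, int chunkSize) {
        List<Integer> sizes = new ArrayList<>();
        byte[] buf = new byte[chunkSize];
        try {
            int n;
            do {
                n = in.readNBytes(buf, 0, chunkSize);
                if (n > 0) {
                    sizes.add(n);
                }
            } while (n == chunkSize); // a short chunk can only mean end of stream
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sizes;
    }

    /** Sizes of the chunks returned by successive read calls, until read returns -1. */
    static List<Integer> chunksViaRead(InputStream in, int chunkSize) {
        List<Integer> sizes = new ArrayList<>();
        byte[] buf = new byte[chunkSize];
        try {
            int n;
            while ((n = in.read(buf, 0, chunkSize)) != -1) {
                sizes.add(n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sizes;
    }

    public static void main(String[] args) {
        System.out.println(chunksViaReadNBytes(new BurstInputStream(new byte[3500]), 1024));
        // [1024, 1024, 1024, 428]
        System.out.println(chunksViaRead(new BurstInputStream(new byte[3500]), 1024).size());
        // 12 (eleven reads of 300 bytes, then one of 200)
    }
}
```

With readNBytes the chunk boundaries depend only on len and the stream length; with read they depend on how the underlying stream happens to deliver data.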
Difference: len is only the number of bytes you expect to read. read(byte[] b, int off, int len) may return, and let the subsequent code run, even if it has not read the full len bytes, whereas readNBytes(byte[] b, int off, int len) does not return until it has read the full len bytes or reached the end of the stream.
ByteArrayInputStream.read(byte[] b, int off, int len) is implemented as:
public synchronized int read(byte[] b, int off, int len) {
    Objects.checkFromIndexSize(off, len, b.length);
    if (this.pos >= this.count) {
        return -1;
    } else {
        int avail = this.count - this.pos;
        if (len > avail) {
            len = avail;
        }
        if (len <= 0) {
            return 0;
        } else {
            System.arraycopy(this.buf, this.pos, b, off, len);
            this.pos += len;
            return len;
        }
    }
}
It can be seen that when len > avail, len is truncated to avail, so the number of bytes actually read equals avail, the number of bytes still unread in the stream at that moment. For example, suppose 10 unread bytes remain but we want 20, so we call read(s, 0, 20). Inside read, avail = this.count - this.pos evaluates to 10, so only 10 bytes are returned. A ByteArrayInputStream has a fixed buffer, so nothing more will ever arrive, but the same kind of short read is what you should expect from a stream whose data arrives over time, such as a socket or pipe: read() hands back what is available when it is called and does not wait for more. If another thread or process delivers 100 more bytes an instant later, read() has still returned only 10, even though the 20 bytes we expected would soon have been readable.
Let's take a look at the implementation of InputStream.readNBytes(byte[] b, int off, int len):
public int readNBytes(byte[] b, int off, int len) throws IOException {
    Objects.checkFromIndexSize(off, len, b.length);
    int n;
    int count;
    for (n = 0; n < len; n += count) {
        count = this.read(b, off + n, len - n);
        if (count < 0) {
            break;
        }
    }
    return n;
}
The for loop here is the difference between read and readNBytes. readNBytes does not settle for whatever is available (it never truncates len to avail); instead, when there is not enough data yet, the for loop keeps it inside the method rather than returning. Using the previous example with len = 20 and only 10 bytes currently in the stream: the first iteration reads fewer than 20 bytes, so the loop calls read again and keeps going until 20 bytes have been read or read returns -1 at end of stream. So readNBytes returns fewer than 20 bytes only when the stream has ended.
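One way to see the loop at work using only JDK classes is a SequenceInputStream, whose read is documented to read from the current substream only. This is a rough sketch assuming Java 9+; the class and method names (ReadNBytesLoopDemo, twoSegments, results) are invented for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.io.UncheckedIOException;

class ReadNBytesLoopDemo {

    /** Two 10-byte segments; the second is reached only after the first is exhausted. */
    static InputStream twoSegments() {
        return new SequenceInputStream(
                new ByteArrayInputStream(new byte[10]),
                new ByteArrayInputStream(new byte[10]));
    }

    /** Returns {bytes from one read, bytes from one readNBytes, readNBytes at early EOF}. */
    static int[] results() {
        byte[] buf = new byte[20];
        try {
            // read stops at the end of the first segment.
            int viaRead = twoSegments().read(buf, 0, 20);
            // readNBytes loops, so it continues into the second segment.
            int viaReadNBytes = twoSegments().readNBytes(buf, 0, 20);
            // Only 10 bytes exist in total: readNBytes stops at end of stream.
            InputStream tenBytes = new ByteArrayInputStream(new byte[10]);
            int atEndOfStream = tenBytes.readNBytes(buf, 0, 20);
            return new int[] {viaRead, viaReadNBytes, atEndOfStream};
        } catch (IOException e) {
            throw new UncheckedIOException(e); // not expected for in-memory streams
        }
    }

    public static void main(String[] args) {
        int[] r = results();
        System.out.println(r[0]); // 10
        System.out.println(r[1]); // 20
        System.out.println(r[2]); // 10
    }
}
```

The last case shows the limit of the guarantee: if the stream ends first, readNBytes returns what it got rather than blocking forever.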