Java: read vs readNBytes of the InputStream instance

Asked 13/12, 2018 at 2:52 Answered 17/7 at 20:27

In Java, the InputStream class has the methods read(byte[], int, int) and readNBytes(byte[], int, int). The two seem to do exactly the same thing, so what are the differences between them?

Kubiak answered 13/12, 2018 at 2:52 Comment(3)

What answer do you expect beyond a copy and paste from the respective Javadocs? – Facesaving 13/12, 2018 at 2:56

@ElliottFrisch I just want to know the differences between them. Are there any subtle differences in functionality, implementation, or anything else? Their Javadocs are so alike that I cannot figure out the difference. – Kubiak 13/12, 2018 at 3:25

What part of the Javadoc didn't you understand? The difference is perfectly clear. – Mcgough 6/4, 2021 at 1:52

Edited for better visibility of discussion in the comments:

read() says it attempts to read "up to len bytes ... but a smaller number may be read. This method blocks until input data is available, end of file is detected, or an exception is thrown."
readNBytes() says "blocks until len bytes of input data have been read, end of stream is detected, or an exception is thrown."

Even though the JDK's implementation for InputStream is likely to give you identical results for both methods, the documented differences mean that other classes inheriting from it may behave differently.

E.g. given the stream '12345<end>', read(s, 0, 10) is allowed to return just '123', whereas readNBytes(s, 0, 10) keeps reading until it reaches end-of-stream and gives you the whole thing.
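
A minimal sketch of that scenario, assuming Java 9 or later (where readNBytes(byte[], int, int) was introduced). ShortReadStream is a hypothetical class written for this illustration, not part of the JDK; its read(byte[], int, int) deliberately returns at most 3 bytes per call, which the read contract permits.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class ShortReadDemo {
    // Hypothetical stream for illustration: never returns more than 3 bytes per read call.
    static class ShortReadStream extends InputStream {
        private final InputStream source;

        ShortReadStream(byte[] data) {
            this.source = new ByteArrayInputStream(data);
        }

        @Override
        public int read() throws IOException {
            return source.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            // A legal "short read": cap every request at 3 bytes.
            return source.read(b, off, Math.min(len, 3));
        }
    }

    private static InputStream newStream() {
        return new ShortReadStream("12345".getBytes(StandardCharsets.US_ASCII));
    }

    // Number of bytes delivered by a single read(buf, 0, 10) call.
    static int viaRead() {
        try (InputStream in = newStream()) {
            return in.read(new byte[10], 0, 10);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Number of bytes delivered by a single readNBytes(buf, 0, 10) call.
    static int viaReadNBytes() {
        try (InputStream in = newStream()) {
            return in.readNBytes(new byte[10], 0, 10);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("read:       " + viaRead());
        System.out.println("readNBytes: " + viaReadNBytes());
    }
}
```

With this particular stream, the single read call delivers 3 bytes, while readNBytes keeps calling read until end-of-stream and delivers all 5.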

Original answer:

You're right that the Javadocs are very similar. When in doubt, always drop down to the source. Most IDEs make it easy to attach the OpenJDK source and let you drill down into it.

This is readNBytes from InputStream.java:

public int readNBytes(byte[] b, int off, int len) throws IOException {
    Objects.requireNonNull(b);
    if (off < 0 || len < 0 || len > b.length - off)
        throw new IndexOutOfBoundsException();
    int n = 0;
    while (n < len) {
        int count = read(b, off + n, len - n);
        if (count < 0)
            break;
        n += count;
    }
    return n;
}

As you can see, it actually calls read(byte[], int, int). The difference in this case is that if fewer bytes than your specified len were read, it calls read() again, repeating until either len bytes have been read or read() reports that there is nothing left to be read.

Edit: Note that

This is OpenJDK's implementation of the base InputStream. Others may differ.
Subclasses of InputStream may also have their own overridden implementation. Consult the doc/source for the relevant class.
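
One consequence of that loop that is easy to overlook: at end of stream, read(byte[], int, int) returns -1, whereas readNBytes breaks out of its loop and returns the number of bytes read so far, which is 0. A small sketch, assuming Java 9+; the class name EndOfStreamDemo is made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class EndOfStreamDemo {
    // Return value of read(...) on a stream that has no data left.
    static int readAtEnd() {
        try (InputStream in = new ByteArrayInputStream(new byte[0])) {
            return in.read(new byte[8], 0, 8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Return value of readNBytes(...) on a stream that has no data left.
    static int readNBytesAtEnd() {
        try (InputStream in = new ByteArrayInputStream(new byte[0])) {
            return in.readNBytes(new byte[8], 0, 8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("read at end:       " + readAtEnd());
        System.out.println("readNBytes at end: " + readNBytesAtEnd());
    }
}
```

So a loop built on read should test for -1, while a loop built on readNBytes should test for a result smaller than the requested len.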

Anaplastic answered 13/12, 2018 at 6:37 Comment(9)

Right you are! But I wonder in which cases the results carried out by these two methods will be different. – Kubiak 13/12, 2018 at 9:7

Seems that the javadoc for read() just says it attempts to read up to len bytes, whereas readNBytes() actually says up to len bytes or end-of-stream is reached. Though in the source we can see that both behave very similarly, the way the doc is worded means that it allows overrides of read() to just give you whatever it can as long as it's less than len bytes. Even if there's an EOS within range, it might stop beforehand. – Anaplastic 13/12, 2018 at 22:24

Great insight! I had never considered that the read() method can be overridden, in which case readNBytes() does differ. Thanks a lot. – Kubiak 14/12, 2018 at 2:44

The Javadoc for readNBytes() does not say 'reads up to len bytes'. It says 'Reads the requested number of bytes from the input stream into the given byte array', and goes on to make it clear that exactly len bytes are read unless end of stream intervenes. Furthermore the two methods do not 'behave very similarly'. – Mcgough 6/4, 2021 at 1:51

@Mcgough Fair point. I'm almost certain the short description at the time did say the above. I'll update, thank you. That said, if you look at the OpenJDK source, readNBytes(,,) calls read(,,), which calls read(); and the doc for read() says "..blocks until input data is available, the end of the stream is detected, or an exception is thrown". Which causes the default implementations to behave very similarly if read() actually does what it should do. – Anaplastic 6/4, 2021 at 3:5

It calls read() in a loop whose purpose is to ensure that it reads len bytes and no fewer unless end of stream intervenes. There is no 'very similarly' about it. – Mcgough 6/4, 2021 at 3:40

It calls read(,,) in a loop. And in turn, read(,,) calls read() in a loop. The doc for read() says: "blocks until input data is available, the end of the stream is detected, or an exception is thrown". So if the implementation for read() follows its documented behaviour, you'd reach EOS with read(,,) too. – Anaplastic 6/4, 2021 at 3:56

Of course you would reach EOS, but if you didn't, readNBytes() would read len bytes where read() need not. – Mcgough 7/4, 2021 at 2:28

It need not, but it will in the OpenJDK implementation, because of the nature of read(void). Hence the similarities, unless overridden. – Anaplastic 7/4, 2021 at 3:2

The difference between the two methods is that readNBytes(byte[] b, int off, int len) will actually attempt to read len bytes before returning, whereas read(byte[] b, int off, int len) will simply return as soon as some data is available. Depending on how the data arrives, read(...) may therefore require many more invocations than readNBytes(...) before the InputStream has been read in full.

Bottom line: You often want to process the stream in chunks of X bytes. If so, it is much better to use readNBytes(...).

Let's say the InputStream is 3500 bytes in total and you want to process in chunks of 1024 bytes (meaning len = 1024).

With readNBytes(...) you are guaranteed the stream will be read in exactly 4 method invocations: The first 3 will read 1024 bytes each and the fourth will read 428 bytes.

With read(...) you don't really know how many method invocations it will require to read this stream. It might require only 4 invocations but it may also be more.
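
The 3500-byte scenario can be sketched as follows, assuming Java 9+. TrickleStream is a hypothetical class that hands out at most 300 bytes per read call to imitate data arriving in small pieces; the 300-byte limit is an arbitrary assumption chosen for the demonstration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class ChunkDemo {
    // Hypothetical stream: delivers at most maxPerCall bytes per read call.
    static class TrickleStream extends InputStream {
        private final InputStream source;
        private final int maxPerCall;

        TrickleStream(byte[] data, int maxPerCall) {
            this.source = new ByteArrayInputStream(data);
            this.maxPerCall = maxPerCall;
        }

        @Override
        public int read() throws IOException {
            return source.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            return source.read(b, off, Math.min(len, maxPerCall));
        }
    }

    // Counts readNBytes invocations needed to consume 3500 bytes in 1024-byte chunks.
    static int callsWithReadNBytes() {
        byte[] buf = new byte[1024];
        int calls = 0;
        try (InputStream in = new TrickleStream(new byte[3500], 300)) {
            int n;
            do {
                n = in.readNBytes(buf, 0, buf.length);
                calls++;
                // process buf[0..n) here
            } while (n == buf.length); // a short chunk means end of stream
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return calls;
    }

    // Counts read invocations (including the final one returning -1) for the same stream.
    static int callsWithRead() {
        byte[] buf = new byte[1024];
        int calls = 0;
        try (InputStream in = new TrickleStream(new byte[3500], 300)) {
            int n;
            do {
                n = in.read(buf, 0, buf.length);
                calls++;
                // process buf[0..n) here when n > 0
            } while (n != -1);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return calls;
    }

    public static void main(String[] args) {
        System.out.println("readNBytes calls: " + callsWithReadNBytes());
        System.out.println("read calls:       " + callsWithRead());
    }
}
```

Under these assumptions the readNBytes loop needs 4 invocations, while the read loop needs 13 (12 that deliver data plus the one that returns -1). The exact count for read depends entirely on how the underlying stream delivers its data.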

Melbourne answered 17/7 at 20:27 Comment(0)


Difference: len is only the number of bytes the caller expects to read. read(byte[] b, int off, int len) may return, letting the subsequent code run, even if it has not read the full len bytes, whereas readNBytes(byte[] b, int off, int len) does not return until it has read the full len bytes or reached the end of the stream.

ByteArrayInputStream.read(byte[] b, int off, int len) is implemented as:

public synchronized int read(byte[] b, int off, int len) {
  Objects.checkFromIndexSize(off, len, b.length);
  if (this.pos >= this.count) {
     return -1;
  } else {
     int avail = this.count - this.pos;
     if (len > avail) {
        len = avail;
     }

     if (len <= 0) {
        return 0;
     } else {
        System.arraycopy(this.buf, this.pos, b, off, len);
        this.pos += len;
        return len;
     }
  }

}

As the code shows, when len > avail, len is truncated to avail, so the number of bytes actually read equals avail, where avail = this.count - this.pos is the number of bytes still unread in the buffer. For example, if 10 bytes remain unread and we ask for 20 by calling read(s, 0, 20), avail is 10 and only 10 bytes are returned. ByteArrayInputStream itself wraps a fixed array and its read is synchronized, so no other thread can add data to it through its public API. For a stream that is fed by a producer (a socket or a pipe, say), however, another 100 bytes may arrive right after read has decided how much it can deliver. In that case read still returns only 10 bytes, even though the 20 requested bytes would have been available a moment later.

Let's take a look at the implementation of InputStream.readNBytes(byte[] b, int off, int len):

public int readNBytes(byte[] b, int off, int len) throws IOException {
  Objects.checkFromIndexSize(off, len, b.length);

  int n;
  int count;
  for(n = 0; n < len; n += count) {
     count = this.read(b, off + n, len - n);
     if (count < 0) {
        break;
     }
  }

  return n;

}

The for loop here is the difference between read and readNBytes. readNBytes does not settle for a short read (len = avail): when a call to read delivers fewer bytes than requested, the loop stays inside the method and calls read again instead of returning to the caller. In the previous example, with len = 20 and only 10 bytes delivered by the first read, the first iteration falls short of 20 bytes, so the loop continues until 20 bytes have been read or read returns -1 at the end of the stream.
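
The 10-versus-20-byte case can be reproduced with standard JDK classes only. This is a sketch assuming Java 9+, with TwoPartDemo as a made-up class name: a SequenceInputStream joins two 10-byte ByteArrayInputStream parts, and a single read call is served from the current part only, while readNBytes loops across both.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.io.UncheckedIOException;

class TwoPartDemo {
    // 20 bytes in total, delivered as two separate 10-byte parts.
    private static InputStream twoParts() {
        return new SequenceInputStream(
                new ByteArrayInputStream(new byte[10]),
                new ByteArrayInputStream(new byte[10]));
    }

    // Bytes returned by one read(buf, 0, 20) call.
    static int viaRead() {
        try (InputStream in = twoParts()) {
            return in.read(new byte[20], 0, 20);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Bytes returned by one readNBytes(buf, 0, 20) call.
    static int viaReadNBytes() {
        try (InputStream in = twoParts()) {
            return in.readNBytes(new byte[20], 0, 20);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("read:       " + viaRead());
        System.out.println("readNBytes: " + viaReadNBytes());
    }
}
```

Here read stops after the first part and returns 10, whereas readNBytes calls read a second time and returns the full 20.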

Inflatable answered 18/1 at 11:7 Comment(0)
