Why does BufferedInputStream copy a field to a local variable rather than use the field directly
Asked Answered
K

3

105

When I read the source code from java.io.BufferedInputStream.getInIfOpen(), I am confused about why it wrote code like this:

/**
 * Check to make sure that underlying input stream has not been
 * nulled out due to close; if not return it;
 */
private InputStream getInIfOpen() throws IOException {
    InputStream input = in;
    if (input == null)
        throw new IOException("Stream closed");
    return input;
}

Why does it use the alias instead of use the field variable in directly like below:

/**
 * Check to make sure that underlying input stream has not been
 * nulled out due to close; if not return it;
 */
private InputStream getInIfOpen() throws IOException {
    if (in == null)
        throw new IOException("Stream closed");
    return in;
}

Can someone give a reasonable explanation?

Krupp answered 26/3, 2016 at 2:42 Comment(3)
In Eclipse, you can't pause a debugger on an if statement. Might be a reason for that alias variable. Just wanted to throw that out there. I speculate, of course.Olivarez
@DebosmitRay : Really can't pause on if statement?Lucy
@Lucy On my version of Eclipse, the problem is similar to this one. Might not be a very common occurrence. And anyway, I didn't meant it on a light note (clearly a bad joke). :)Olivarez
L
117

If you look at this code out of context there is no good explanation for that "alias". It is simply redundant code or poor code style.

But the context is that BufferedInputStream is a class that can be subclassed, and that it needs to work in a multi-threaded context.

The clue is that in is declared in FilterInputStream is protected volatile. That means that there is a chance that a subclass could reach in and assign null to in. Given that possibility, the "alias" is actually there to prevent a race condition.

Consider the code without the "alias"

private InputStream getInIfOpen() throws IOException {
    if (in == null)
        throw new IOException("Stream closed");
    return in;
}
  1. Thread A calls getInIfOpen()
  2. Thread A evaluates in == null and sees that in is not null.
  3. Thread B assigns null to in.
  4. Thread A executes return in. Which returns null because a is a volatile.

The "alias" prevents this. Now in is read just once by thread A. If thread B assigns null after thread A has in it doesn't matter. Thread A will either throw an exception or return a (guaranteed) non-null value.

Luttrell answered 26/3, 2016 at 2:53 Comment(21)
Which shows why protected variables are evil in a multi-threaded context.Abrahamabrahams
It does indeed. However, AFAIK these classes go all the way back to Java 1.0. This is just another example of a poor design decision that could not be fixed for fear of breaking customer code.Luttrell
@StephenC Thanks for the detailed explanation +1. So does that mean, we shouldn't be using protected variables in our code if it is multi-threaded?Unrig
@MadhusudanaReddySunnapu - Well basically because it means that the superclass needs to do things like this to protect against race conditions that could be triggered when a subclass assigns to the protected variable. If the variable was private then the superclass would be able to "mediate" the assignment in some way.Luttrell
@StephenC But then, I don't think that it is a problem because the variable is declared protected in the super class. The same problem/race condition can occur even with a standalone class (i.e., no super class for it). I mean, variable is declared in this single class and there are two methods in the class on that does operation similar to close() method i.e., nullifying the variable and the other method that does similar to getInIfOpen method i.e., checks for null and returns it. Or did I get it wrong?Unrig
Well yes, that is correct. But in this case, it is down to the fact that the variable is protected and a subclass (which you can't see) might assign to it.Luttrell
@MadhusudanaReddySunnapu The overall lesson is that in an environment where multiple threads may access the same state, you need to control that access somehow. That may be a private variable only accessible via setter, it could be a local guard like this, it could be by making the variable write-once in a threadsafe way.Dis
That makes sense, Stephen & Chris.Unrig
I'm surprised... does Java's thread model really work like this? Usually you have to perform an acquire-read (or volatile read, etc. depending on the language) in most languages for a variable that can be written to by another thread... is this not the case in Java?Concinnous
@Mehrdad Which languages? I can't think of many that force you to acquire a lock in order to read a shared variable, though there are many where you'd want to do that for safety's sake. Perhaps I'm misunderstanding your comment.Dis
@ChrisHayes: Oh, by "have to" I meant "if you want desirable results", not "if you want your code to compile" =P For example, C++ has atomic::load, C# has VolatileRead, Rust has atomic.load, etc.Concinnous
@Mehdrad - There are other ways to implement shared state in Java. This Q&A is not intended to be a general tutorial on this topic.Luttrell
@Mehrdad Oh, gotcha. In this case the class member in question is declared volatile, so it is a volatile read. So while it may not actually be the true value at the time the function returns, it is a valid value which was present at the start of the function.Dis
@ChrisHayes: Ohh I didn't know it was volatile, that explains it, thanks.Concinnous
Quoting from my answer: "The clue is that in is declared in FilterInputStream is protected volatile. "Luttrell
This does not cover all race conditions and states available to streams. This answer is misleading and disrespectful of the original code.Crossarm
@sam - 1) It doesn't need to explain all race conditions and states. The aim of the answer is to point out why this seemingly inexplicable code is in fact necessary. 2) How so?Luttrell
@StephenC If you do not use input then after you check in for non-null it may be assigned to null and returned. It is not poor style (which is why I feel your post was disrespectful) but necessary to guarantee the contract in all cases. Volatile will not protect you from two separate reads. You must copy the reference so what you test is what you return.Crossarm
@sam - Did you read past the first two sentences of my answer? Where I explained all of the stuff? Do you understand what "If you look at this code out of context ..." means?Luttrell
@StephenC, I read your first response. You inverted its meaning. I agree with your current answer. That you've condescended to me, spoke insultingly of the original developer, and stand rewarded for this behavior can't be edited.Crossarm
But the fact remains that you did NOT read the version of the answer that you commented on. Your bad. The condescension was deserved.Luttrell
C
20

This is because the class BufferedInputStream is designed for multi-threaded usage.

Here, you see the declaration of in, which is placed in the parent class FilterInputStream:

protected volatile InputStream in;

Since it is protected, its value can be changed by any subclass of FilterInputStream, including BufferedInputStream and its subclasses. Also, it is declared volatile, which means that if any thread changes the value of the variable, this change will immediately be reflected in all other threads. This combination is bad, since it means the class BufferedInputStream has no way to control or know when in is changed. Thus, the value can even be changed between the check for null and the return statement in BufferedInputStream::getInIfOpen, which effectively makes the check for null useless. By reading the value of in only once to cache it in the local variable input, the method BufferedInputStream::getInIfOpen is safe against changes from other threads, since local variables are always owned by a single thread.

There is an example in BufferedInputStream::close, which sets in to null:

public void close() throws IOException {
    byte[] buffer;
    while ( (buffer = buf) != null) {
        if (bufUpdater.compareAndSet(this, buffer, null)) {
            InputStream input = in;
            in = null;
            if (input != null)
                input.close();
            return;
        }
        // Else retry in case a new buf was CASed in fill()
    }
}

If BufferedInputStream::close is called by another thread while BufferedInputStream::getInIfOpen is executed, this would result in the race condition described above.

Chile answered 26/3, 2016 at 2:59 Comment(2)
I agree, inasmuch as we're seeing things like compareAndSet(), CAS, etc. in the code and in the comments. I also searched the BufferedInputStream code and found numerous synchronized methods. So, it is intended for multi-threaded use, though I sure have never used it that way. Anyway, I think your answer is correct!Mileage
This probably makes sense because getInIfOpen() is only called from public synchronized methods of BufferedInputStream.Abrahamabrahams
F
6

This is such a short code, but, theoretically, in a multi-threaded environment, in may change right after the comparison, so the method could return something it didn't check (it could return null, thus doing the exact thing it was meant to prevent).

Farriery answered 26/3, 2016 at 2:59 Comment(2)
Am I correct if I say that the reference in might change between the time you call the method and return of the value (in a multi-threaded environment)?Olivarez
Yes, you can say that. Ultimately the likelihood will really depend on the concrete case (all we know for a fact is that in can change anytime).Farriery

© 2022 - 2024 — McMap. All rights reserved.