Why does BufferedInputStream copy a field to a local variable rather than use the field directly

Asked 26/3, 2016 at 2:42 Answered 26/3, 2016 at 2:59

105

When I read the source code from java.io.BufferedInputStream.getInIfOpen(), I am confused about why it wrote code like this:

/**
 * Check to make sure that underlying input stream has not been
 * nulled out due to close; if not return it;
 */
private InputStream getInIfOpen() throws IOException {
    InputStream input = in;
    if (input == null)
        throw new IOException("Stream closed");
    return input;
}

Why does it use the alias instead of use the field variable in directly like below:

/**
 * Check to make sure that underlying input stream has not been
 * nulled out due to close; if not return it;
 */
private InputStream getInIfOpen() throws IOException {
    if (in == null)
        throw new IOException("Stream closed");
    return in;
}

Can someone give a reasonable explanation?

Krupp answered 26/3, 2016 at 2:42 Comment(3)

In Eclipse, you can't pause a debugger on an if statement. Might be a reason for that alias variable. Just wanted to throw that out there. I speculate, of course. – Olivarez 26/3, 2016 at 3:7

@DebosmitRay : Really can't pause on if statement? – Lucy 26/3, 2016 at 15:39

@Lucy On my version of Eclipse, the problem is similar to this one. Might not be a very common occurrence. And anyway, I didn't meant it on a light note (clearly a bad joke). :) – Olivarez 27/3, 2016 at 2:43

117

If you look at this code out of context there is no good explanation for that "alias". It is simply redundant code or poor code style.

But the context is that BufferedInputStream is a class that can be subclassed, and that it needs to work in a multi-threaded context.

The clue is that in is declared in FilterInputStream is protected volatile. That means that there is a chance that a subclass could reach in and assign null to in. Given that possibility, the "alias" is actually there to prevent a race condition.

Consider the code without the "alias"

private InputStream getInIfOpen() throws IOException {
    if (in == null)
        throw new IOException("Stream closed");
    return in;
}

Thread A calls getInIfOpen()
Thread A evaluates in == null and sees that in is not null.
Thread B assigns null to in.
Thread A executes return in. Which returns null because a is a volatile.

The "alias" prevents this. Now in is read just once by thread A. If thread B assigns null after thread A has in it doesn't matter. Thread A will either throw an exception or return a (guaranteed) non-null value.

Luttrell answered 26/3, 2016 at 2:53 Comment(21)

Which shows why protected variables are evil in a multi-threaded context. – Abrahamabrahams 26/3, 2016 at 3:23

It does indeed. However, AFAIK these classes go all the way back to Java 1.0. This is just another example of a poor design decision that could not be fixed for fear of breaking customer code. – Luttrell 26/3, 2016 at 3:26

@StephenC Thanks for the detailed explanation +1. So does that mean, we shouldn't be using protected variables in our code if it is multi-threaded? – Unrig 26/3, 2016 at 3:29

@MadhusudanaReddySunnapu - Well basically because it means that the superclass needs to do things like this to protect against race conditions that could be triggered when a subclass assigns to the protected variable. If the variable was private then the superclass would be able to "mediate" the assignment in some way. – Luttrell 26/3, 2016 at 3:37

@StephenC But then, I don't think that it is a problem because the variable is declared protected in the super class. The same problem/race condition can occur even with a standalone class (i.e., no super class for it). I mean, variable is declared in this single class and there are two methods in the class on that does operation similar to close() method i.e., nullifying the variable and the other method that does similar to getInIfOpen method i.e., checks for null and returns it. Or did I get it wrong? – Unrig 26/3, 2016 at 3:55

Well yes, that is correct. But in this case, it is down to the fact that the variable is protected and a subclass (which you can't see) might assign to it. – Luttrell 26/3, 2016 at 4:5

@MadhusudanaReddySunnapu The overall lesson is that in an environment where multiple threads may access the same state, you need to control that access somehow. That may be a private variable only accessible via setter, it could be a local guard like this, it could be by making the variable write-once in a threadsafe way. – Dis 26/3, 2016 at 4:13

That makes sense, Stephen & Chris. – Unrig 26/3, 2016 at 4:25

I'm surprised... does Java's thread model really work like this? Usually you have to perform an acquire-read (or volatile read, etc. depending on the language) in most languages for a variable that can be written to by another thread... is this not the case in Java? – Concinnous 26/3, 2016 at 6:52

@Mehrdad Which languages? I can't think of many that force you to acquire a lock in order to read a shared variable, though there are many where you'd want to do that for safety's sake. Perhaps I'm misunderstanding your comment. – Dis 26/3, 2016 at 7:5

@ChrisHayes: Oh, by "have to" I meant "if you want desirable results", not "if you want your code to compile" =P For example, C++ has atomic::load, C# has VolatileRead, Rust has atomic.load, etc. – Concinnous 26/3, 2016 at 7:8

@Mehdrad - There are other ways to implement shared state in Java. This Q&A is not intended to be a general tutorial on this topic. – Luttrell 26/3, 2016 at 7:15

@Mehrdad Oh, gotcha. In this case the class member in question is declared volatile, so it is a volatile read. So while it may not actually be the true value at the time the function returns, it is a valid value which was present at the start of the function. – Dis 26/3, 2016 at 7:23

@ChrisHayes: Ohh I didn't know it was volatile, that explains it, thanks. – Concinnous 26/3, 2016 at 7:24

Quoting from my answer: "The clue is that in is declared in FilterInputStream is protected volatile. " – Luttrell 26/3, 2016 at 7:52

This does not cover all race conditions and states available to streams. This answer is misleading and disrespectful of the original code. – Crossarm 26/3, 2016 at 22:21

@sam - 1) It doesn't need to explain all race conditions and states. The aim of the answer is to point out why this seemingly inexplicable code is in fact necessary. 2) How so? – Luttrell 27/3, 2016 at 1:8

@StephenC If you do not use input then after you check in for non-null it may be assigned to null and returned. It is not poor style (which is why I feel your post was disrespectful) but necessary to guarantee the contract in all cases. Volatile will not protect you from two separate reads. You must copy the reference so what you test is what you return. – Crossarm 28/3, 2016 at 15:30

@sam - Did you read past the first two sentences of my answer? Where I explained all of the stuff? Do you understand what "If you look at this code out of context ..." means? – Luttrell 28/3, 2016 at 15:48

@StephenC, I read your first response. You inverted its meaning. I agree with your current answer. That you've condescended to me, spoke insultingly of the original developer, and stand rewarded for this behavior can't be edited. – Crossarm 28/3, 2016 at 20:47

But the fact remains that you did NOT read the version of the answer that you commented on. Your bad. The condescension was deserved. – Luttrell 30/3, 2016 at 3:10

This is because the class BufferedInputStream is designed for multi-threaded usage.

Here, you see the declaration of in, which is placed in the parent class FilterInputStream:

protected volatile InputStream in;

Since it is protected, its value can be changed by any subclass of FilterInputStream, including BufferedInputStream and its subclasses. Also, it is declared volatile, which means that if any thread changes the value of the variable, this change will immediately be reflected in all other threads. This combination is bad, since it means the class BufferedInputStream has no way to control or know when in is changed. Thus, the value can even be changed between the check for null and the return statement in BufferedInputStream::getInIfOpen, which effectively makes the check for null useless. By reading the value of in only once to cache it in the local variable input, the method BufferedInputStream::getInIfOpen is safe against changes from other threads, since local variables are always owned by a single thread.

There is an example in BufferedInputStream::close, which sets in to null:

public void close() throws IOException {
    byte[] buffer;
    while ( (buffer = buf) != null) {
        if (bufUpdater.compareAndSet(this, buffer, null)) {
            InputStream input = in;
            in = null;
            if (input != null)
                input.close();
            return;
        }
        // Else retry in case a new buf was CASed in fill()
    }
}

If BufferedInputStream::close is called by another thread while BufferedInputStream::getInIfOpen is executed, this would result in the race condition described above.

Chile answered 26/3, 2016 at 2:59 Comment(2)

I agree, inasmuch as we're seeing things like compareAndSet(), CAS, etc. in the code and in the comments. I also searched the BufferedInputStream code and found numerous synchronized methods. So, it is intended for multi-threaded use, though I sure have never used it that way. Anyway, I think your answer is correct! – Mileage 26/3, 2016 at 3:7

This probably makes sense because getInIfOpen() is only called from public synchronized methods of BufferedInputStream. – Abrahamabrahams 26/3, 2016 at 3:11

This is such a short code, but, theoretically, in a multi-threaded environment, in may change right after the comparison, so the method could return something it didn't check (it could return null, thus doing the exact thing it was meant to prevent).

Farriery answered 26/3, 2016 at 2:59 Comment(2)

Am I correct if I say that the reference in might change between the time you call the method and return of the value (in a multi-threaded environment)? – Olivarez 26/3, 2016 at 3:3

Yes, you can say that. Ultimately the likelihood will really depend on the concrete case (all we know for a fact is that in can change anytime). – Farriery 26/3, 2016 at 3:14

Hot tags

Godot Unity Godot Help Programming Godot 4.X GUI GDScript 3D 2D Physics CSharp Godot 3.X VR XR Projects C++

Recommended topics

Hot tags