do we need volatile when implementing singleton using double-check locking
Asked Answered
P

1

5

suppose we use double-check lock to implement singleton pattern:

    private static Singleton instance;

    private static Object lock = new Object();

    public static Singleton getInstance() {
        if(instance == null) {
            synchronized (lock) {
                if(instance == null) {
                    instance = new Singleton();
                }
            }
        }
        return instance;
    }

Do we need to set variable "instance" as "volatile"? I hear a saying that we need it to disable reordering:

When an object is created , reordering may happen:

address=alloc
instance=someAddress
init(someAddress)

They say that if the last two steps are reordered, we need a volatile instance to disable reordering otherwise other threads may get an object that is not completely initiaized.

However since we are in a synchronized code block, do we really need volatile? Or generally, can I say that synchronized block can guarantee that a shared variable is transparent to other threads and there is no reordering even if it is not volatiled variable?

Presentation answered 6/12, 2019 at 6:33 Comment(9)
Where is the source for such information?Dwaynedweck
Just a discussion from colleaguesPresentation
The real question is why are we using double check locking for our singleton (instead of an enum or other implementation approach)?Laurenlaurena
The real question is : Do we need to set this shared instance variable as "volitile"?Presentation
This also seems to be an issue with multiple threads. See docs.oracle.com/javase/specs/jls/se7/html/… If we have two actions x and y, we write hb(x, y) to indicate that x happens-before y. If x and y are actions of the same thread and x comes before y in program order, then hb(x, y). See also stackoverflow.com/questions/42079959/…Dwaynedweck
@ElliottFrisch I've tried to answer, but it is extremely verbose.Greed
@ScaryWombat correct, but it took quite a lot of words to show thatGreed
@Greed I see nothing in your answer about why we are using double check locking instead of a different implementation... so, why use your extremely verbose answer to do it?Laurenlaurena
@ElliottFrisch hmm. I've tagged you by accident in response to comment below yours The real question is : Do we need to set this shared instance variable as "volatile"? my bad.Greed
G
14

Before I go into this explanation, you need to understand one optimization that compilers do (my explanation is very simplified). Suppose that somewhere in your code you have such a sequence:

 int x = a;
 int y = a;

It is perfectly valid for a compiler to re-order these into:

 // reverse the order
 int y = a;
 int x = a;

No one writes to a here, there are only two reads of a, as such this type of re-ordering is allowed.

A slightly more complicated example would be:

// someone, somehow sets this
int a;

public int test() {

    int x = a;

    if(x == 4) {
       int y = a;
       return y;
    }

    int z = a;
    return z;
}

A compiler might look at this code and notice that if this is entered if(x == 4) { ... }, this : int z = a; never happens. But, at the same time, you can think about it slightly different: if that if statement is entered, we do not care if int z = a; is executed or not, it does not change the fact that:

 int y = a;
 return y;

would still happen. As such let's make that int z = a; to be eager:

public int test() {

   int x = a;
   int z = a; // < --- this jumped in here

   if(x == 4) {
       int y = a;
       return y;
   }

   return z;
}

And now a compiler can further re-order:

// < --- these two have switched places
int z = a;
int x = a;

if(x == 4) { ... }    

Armed with this knowledge, we can try to understand now what is going on.

Let's look at your example:

 private static Singleton instance; // non-volatile     

 public static Singleton getInstance() {
    if (instance == null) {  // < --- read (1)
        synchronized (lock) {
            if (instance == null) { // < --- read (2)
                instance = new Singleton(); // < --- write 
            }
        }
    }
    return instance; // < --- read (3)
}

There are 3 reads of instance (also called load) and a single write to it (also called store). As weird at it may sound, but if read (1) has seen an instance that is not null (meaning that if (instance == null) { ... } is not entered), it does not mean that read (3) will return a non-null instance, it is perfectly valid for read (3) to still return null. This should melt your brain (it did mine a few times). Fortunately, there is a way to prove this.

A compiler might add such a small optimization to your code:

public static Singleton getInstance() {
    if (instance == null) {
        synchronized (lock) {
            if (instance == null) {
                instance = new Singleton();
                // < --- we added this
                return instance;
            }
        }
    }
    return instance;
}

It inserted a return instance, semantically this does not change the logic of the code in any way.

Then, there is a certain optimization that compilers do that will help us here. I am not going to dive into the details, but it introduces some local fields (the benefit is in that link) to do all the reads and writes (stores and loads).

public static Singleton getInstance() {
    Singleton local1 = instance;   // < --- read (1)
    if (local1 == null) {
        synchronized (lock) {
            Singleton local2 = instance; // < --- read (2)
            if (local2 == null) {
                Singleton local3 = new Singleton();
                instance = local3; // < --- write (1)
                return local3;
            }
        }
    }

    Singleton local4 = instance; // < --- read (3)
    return local4;
}

Now a compiler might look at this and see that: if if (local2 == null) { ... } is entered, Singleton local4 = instance; never happens (or as said in the example I started this answer with: it does not really matter if Singleton local4 = instance; happens at all). But in order to enter if (local2 == null) {...}, we need to enter this if (local1 == null) { ... } first. And now let's reason about this as a whole:

if (local1 == null) { ... } NOT ENTERED => NEED to do : Singleton local4 = instance

if (local1 == null) { ... } ENTERED && if (local2 == null) { ... } NOT ENTERED 
=> MUST DO : Singleton local4 = instance. 

if (local1 == null) { ... } ENTERED && if (local2 == null) { ... } ENTERED
=> CAN DO : Singleton local4 = instance.  (remember it does not matter if I do it or not)

You can see that in all the cases, there is no harm in doing that : Singleton local4 = instance before any if checks.

After all this madness, your code could become:

 public static Singleton getInstance() {

    Singleton local4 = instance; // < --- read (3)
    Singleton local1 = instance;   // < --- read (1)

    if (local1 == null) {
        synchronized (lock) {
            Singleton local2 = instance; // < --- read (2)
            if (local2 == null) {
                Singleton local3 = new Singleton();
                instance = local3; // < --- write (1)
                return local3;
            }
        }
    }

    return local4;
}

There are two independent reads of instance here:

Singleton local4 = instance; // < --- read (3)
Singleton local1 = instance;   // < --- read (1)

if(local1 == null) {
   ....
}

return local4;

You read instance into local4 (let's suppose a null), then you read instance into local1 (let's assume that some thread already changed this into a non-null) and ... your getInstance will return a null, not a Singleton. q.e.d.


Conclusion: these optimizations are only possible when private static Singleton instance; is non-volatile, otherwise much of the optimization are prohibited and nothing like this would be even possible. So, yes, using volatile is a MUST for this pattern to work correctly.

Greed answered 12/4, 2020 at 4:0 Comment(9)
What is scarier, is if you see a not fully constructed object. After all, the compiler might reorder the stores in the constructor with the store to instance if it is not volatile.Shevat
@JohannesKuhn right, you are talking about safe publication and for that to work correctly with volatile, both reader and writer have to work together, as a pair, on the same volatile.Greed
While the problem explained in this answer can be solved by declaring the field volatile, it could also be solved by reading the field once and storing it into a local variable instead of reading the field twice. So, the real reason why the field should be volatile (unless the singleton is a truly immutable object), has not been explained yet. And, whenever someone feels the need to discuss the double-checked locking, the pointlessness of it should be mentioned, as the class initialization provides all thread safe lazy initialization for free.Venola
@Venola agreed. I did this only as an exercise and out of pure interest.Greed
@Venola can you please elaborate on the last sentence? How is DCL useless even with volatile?Gherardi
@Gherardi as said “the class initialization provides all thread safe lazy initialization for free”. So just declare the field as private static final Singleton instance = new Singleton(); and it will be initialized on the class’s first use, i.e. when getInstance() is invoked the first time. The method itself is as simple as public static Singleton getInstance() { return instance; } then. It’s lazy, it’s thread safe, and it’s even more efficient than any DCL approach, as after the class has been initialized, the JVM is free to read the field without any locking.Venola
@Venola If it is being initialized at the time of class loading i am not sure how can it be called lazyGherardi
It’s not initialized when the class is loaded. Class loading is a different thing than class initialization. The time of initialization is precisely specified. In this specific case, it’s immediately before the static method getInstance() is invoked for the first time. Which is as lazy as the double checked locking tries to achieve. If you invoke it three days after the application start, the initialization will happen three days after start; if don’t invoke getInstance(), the initialization will never happen.Venola
@Venola ah understood. All this time the answer was Occam's razor.Gherardi

© 2022 - 2024 — McMap. All rights reserved.