Before I go into this explanation, you need to understand one optimization that compilers do (my explanation is very simplified). Suppose that somewhere in your code you have such a sequence:
int x = a;
int y = a;
It is perfectly valid for a compiler to re-order these into:
// reverse the order
int y = a;
int x = a;
No one writes
to a
here, there are only two reads
of a
, as such this type of re-ordering is allowed.
A slightly more complicated example would be:
// someone, somehow sets this
int a;
public int test() {
int x = a;
if(x == 4) {
int y = a;
return y;
}
int z = a;
return z;
}
A compiler might look at this code and notice that if this is entered if(x == 4) { ... }
, this : int z = a;
never happens. But, at the same time, you can think about it slightly different: if that if statement
is entered, we do not care if int z = a;
is executed or not, it does not change the fact that:
int y = a;
return y;
would still happen. As such let's make that int z = a;
to be eager:
public int test() {
int x = a;
int z = a; // < --- this jumped in here
if(x == 4) {
int y = a;
return y;
}
return z;
}
And now a compiler can further re-order:
// < --- these two have switched places
int z = a;
int x = a;
if(x == 4) { ... }
Armed with this knowledge, we can try to understand now what is going on.
Let's look at your example:
private static Singleton instance; // non-volatile
public static Singleton getInstance() {
if (instance == null) { // < --- read (1)
synchronized (lock) {
if (instance == null) { // < --- read (2)
instance = new Singleton(); // < --- write
}
}
}
return instance; // < --- read (3)
}
There are 3 reads of instance
(also called load
) and a single write
to it (also called store
). As weird at it may sound, but if read (1)
has seen an instance
that is not null (meaning that if (instance == null) { ... }
is not entered), it does not mean that read (3)
will return a non-null instance, it is perfectly valid for read (3)
to still return null
. This should melt your brain (it did mine a few times). Fortunately, there is a way to prove this.
A compiler might add such a small optimization to your code:
public static Singleton getInstance() {
if (instance == null) {
synchronized (lock) {
if (instance == null) {
instance = new Singleton();
// < --- we added this
return instance;
}
}
}
return instance;
}
It inserted a return instance
, semantically this does not change the logic of the code in any way.
Then, there is a certain optimization that compilers do that will help us here. I am not going to dive into the details, but it introduces some local fields (the benefit is in that link) to do all the reads and writes (stores and loads).
public static Singleton getInstance() {
Singleton local1 = instance; // < --- read (1)
if (local1 == null) {
synchronized (lock) {
Singleton local2 = instance; // < --- read (2)
if (local2 == null) {
Singleton local3 = new Singleton();
instance = local3; // < --- write (1)
return local3;
}
}
}
Singleton local4 = instance; // < --- read (3)
return local4;
}
Now a compiler might look at this and see that: if if (local2 == null) { ... }
is entered, Singleton local4 = instance;
never happens (or as said in the example I started this answer with: it does not really matter if Singleton local4 = instance;
happens at all). But in order to enter if (local2 == null) {...}
, we need to enter this if (local1 == null) { ... }
first. And now let's reason about this as a whole:
if (local1 == null) { ... } NOT ENTERED => NEED to do : Singleton local4 = instance
if (local1 == null) { ... } ENTERED && if (local2 == null) { ... } NOT ENTERED
=> MUST DO : Singleton local4 = instance.
if (local1 == null) { ... } ENTERED && if (local2 == null) { ... } ENTERED
=> CAN DO : Singleton local4 = instance. (remember it does not matter if I do it or not)
You can see that in all the cases, there is no harm in doing that : Singleton local4 = instance
before any if checks.
After all this madness, your code could become:
public static Singleton getInstance() {
Singleton local4 = instance; // < --- read (3)
Singleton local1 = instance; // < --- read (1)
if (local1 == null) {
synchronized (lock) {
Singleton local2 = instance; // < --- read (2)
if (local2 == null) {
Singleton local3 = new Singleton();
instance = local3; // < --- write (1)
return local3;
}
}
}
return local4;
}
There are two independent reads of instance
here:
Singleton local4 = instance; // < --- read (3)
Singleton local1 = instance; // < --- read (1)
if(local1 == null) {
....
}
return local4;
You read instance
into local4
(let's suppose a null
), then you read instance
into local1
(let's assume that some thread already changed this into a non-null) and ... your getInstance
will return a null
, not a Singleton
. q.e.d.
Conclusion: these optimizations are only possible when private static Singleton instance;
is non-volatile
, otherwise much of the optimization are prohibited and nothing like this would be even possible. So, yes, using volatile
is a MUST for this pattern to work correctly.
enum
or other implementation approach)? – Laurenlaurena