What's the (hidden) cost of Scala's lazy val?

Asked 14/6, 2010 at 21:59 Answered 16/12, 2014 at 8:14

Solved performance scala lazy-evaluation

171

One handy feature of Scala is lazy val, where the evaluation of a val is delayed until it's necessary (at first access).

Of course, a lazy val must have some overhead - somewhere Scala must keep track of whether the value has already been evaluated and the evaluation must be synchronized, because multiple threads might try to access the value for the first time at the same time.

What exactly is the cost of a lazy val - is there a hidden boolean flag associated with a lazy val to keep track if it has been evaluated or not, what exactly is synchronized and are there any more costs?

In addition, suppose I do this:

class Something {
    lazy val (x, y) = { ... }
}

Is this the same as having two separate lazy vals x and y or do I get the overhead only once, for the pair (x, y)?

Grozny answered 14/6, 2010 at 21:59 Comment(0)

This is taken from the scala mailing list and gives implementation details of lazy in terms of Java code (rather than bytecode):

class LazyTest {
  lazy val msg = "Lazy"
}

is compiled to something equivalent to the following Java code:

class LazyTest {
  public int bitmap$0;
  private String msg;

  public String msg() {
    if ((bitmap$0 & 1) == 0) {
        synchronized (this) {
            if ((bitmap$0 & 1) == 0) {
                synchronized (this) {
                    msg = "Lazy";
                }
            }
            bitmap$0 = bitmap$0 | 1;
        }
    }
    return msg;
  }

}

Ventail answered 15/6, 2010 at 7:51 Comment(9)

I think the implementation must've changed since this Java version was posted in 2007. There is only one synchronized block and the bitmap$0 field is volatile in the current implementation (2.8). – Firecrest 15/6, 2010 at 18:9

Yes - I should have paid more attention to what I was posting! – Ventail 16/6, 2010 at 9:25

So, basically, this means that accessing a lazy value for the first time is much slower than a direct value (an might even create deadlocks in weird cases), but the subsequent accesses are hardly slower than non-lazy values. It seems this is not to be taken lightly, only for really expensive initializations. – Bountiful 4/11, 2010 at 9:36

@Mitch -- I hope the implementation has changed! The double-checked initialization anti-pattern is a classic subtle bug. See en.wikipedia.org/wiki/Double-checked_locking – Delacourt 26/12, 2010 at 20:26

It was antipattern up to Java 1.4. Since Java 1.5 volatile keyword has a bit stricter meaning and now such double checking is OK. – Klemens 28/9, 2011 at 8:21

@Klemens I guess I'm confused, but the volatile keyword isn't being used for bitmap$0 in the double checked locking in oxbow_lakes' response, so the code is not thread safe; of the two, only the 2.8 implementation mentioned by Mitch is OK – Tratner 25/2, 2013 at 18:38

So, as of scala 2.10, what is the current implementation? Also, could please somebody give a hint how much of overhead this means in practice and some rule of thumb when to use, when to avoid? – Crosier 9/3, 2013 at 17:46

Why there is a second synchronized block? – Unfinished 4/8, 2015 at 11:16

@Пуя I guess that must be some sort of bug in the compiler. – Hooper 2/12, 2015 at 15:5

It looks like the compiler arranges for a class-level bitmap int field to flag multiple lazy fields as initialized (or not) and initializes the target field in a synchronized block if the relevant xor of the bitmap indicates it is necessary.

Using:

class Something {
  lazy val foo = getFoo
  def getFoo = "foo!"
}

produces sample bytecode:

 0  aload_0 [this]
 1  getfield blevins.example.Something.bitmap$0 : int [15]
 4  iconst_1
 5  iand
 6  iconst_0
 7  if_icmpne 48
10  aload_0 [this]
11  dup
12  astore_1
13  monitorenter
14  aload_0 [this]
15  getfield blevins.example.Something.bitmap$0 : int [15]
18  iconst_1
19  iand
20  iconst_0
21  if_icmpne 42
24  aload_0 [this]
25  aload_0 [this]
26  invokevirtual blevins.example.Something.getFoo() : java.lang.String [18]
29  putfield blevins.example.Something.foo : java.lang.String [20]
32  aload_0 [this]
33  aload_0 [this]
34  getfield blevins.example.Something.bitmap$0 : int [15]
37  iconst_1
38  ior
39  putfield blevins.example.Something.bitmap$0 : int [15]
42  getstatic scala.runtime.BoxedUnit.UNIT : scala.runtime.BoxedUnit [26]
45  pop
46  aload_1
47  monitorexit
48  aload_0 [this]
49  getfield blevins.example.Something.foo : java.lang.String [20]
52  areturn
53  aload_1
54  monitorexit
55  athrow

Values initialed in tuples like lazy val (x,y) = { ... } have nested caching via the same mechanism. The tuple result is lazily evaluated and cached, and an access of either x or y will trigger the tuple evaluation. Extraction of the individual value from the tuple is done independently and lazily (and cached). So the above double-instantiation code generates an x, y, and an x$1 field of type Tuple2.

Firecrest answered 14/6, 2010 at 23:15 Comment(0)

With Scala 2.10, a lazy value like:

class Example {
  lazy val x = "Value";
}

is compiled to byte code that resembles the following Java code:

public class Example {

  private String x;
  private volatile boolean bitmap$0;

  public String x() {
    if(this.bitmap$0 == true) {
      return this.x;
    } else {
      return x$lzycompute();
    }
  }

  private String x$lzycompute() {
    synchronized(this) {
      if(this.bitmap$0 != true) {
        this.x = "Value";
        this.bitmap$0 = true;
      }
      return this.x;
    }
  }
}

Note that the bitmap is represented by a boolean. If you add another field, the compiler will increase the size of the field to being able to represent at least 2 values, i.e. as a byte. This just goes on for huge classes.

But you might wonder why this works? The thread-local caches must be cleared when entering a synchronized block such that the non-volatile x value is flushed into memory. This blog article gives an explanation.

Diskin answered 25/5, 2014 at 14:47 Comment(0)

Scala SIP-20 proposes a new implementation of lazy val, which is more correct but ~25% slower than the "current" version.

The proposed implementation looks like:

class LazyCellBase { // in a Java file - we need a public bitmap_0
  public static AtomicIntegerFieldUpdater<LazyCellBase> arfu_0 =
    AtomicIntegerFieldUpdater.newUpdater(LazyCellBase.class, "bitmap_0");
  public volatile int bitmap_0 = 0;
}
final class LazyCell extends LazyCellBase {
  import LazyCellBase._
  var value_0: Int = _
  @tailrec final def value(): Int = (arfu_0.get(this): @switch) match {
    case 0 =>
      if (arfu_0.compareAndSet(this, 0, 1)) {
        val result = 0
        value_0 = result
        @tailrec def complete(): Unit = (arfu_0.get(this): @switch) match {
          case 1 =>
            if (!arfu_0.compareAndSet(this, 1, 3)) complete()
          case 2 =>
            if (arfu_0.compareAndSet(this, 2, 3)) {
              synchronized { notifyAll() }
            } else complete()
        }
        complete()
        result
      } else value()
    case 1 =>
      arfu_0.compareAndSet(this, 1, 2)
      synchronized {
        while (arfu_0.get(this) != 3) wait()
      }
      value_0
    case 2 =>
      synchronized {
        while (arfu_0.get(this) != 3) wait()
      }
      value_0
    case 3 => value_0
  }
}

As of June 2013 this SIP hasn't been approved. I expect that it's likely to be approved and included in a future version of Scala based on the mailing list discussion. Consequently, I think you'd be wise to heed Daniel Spiewak's observation:

Lazy val is *not* free (or even cheap). Use it only if you absolutely need laziness for correctness, not for optimization.

Wallboard answered 26/6, 2013 at 20:5 Comment(0)

I've written a post with regard to this issue https://dzone.com/articles/cost-laziness

In nutshell, the penalty is so small that in practice you can ignore it.

Punjab answered 16/12, 2014 at 8:14 Comment(1)

Thanks for that benchmark. Can you also benchmark against the SIP-20 proposed implementations? – Latinist 29/1, 2015 at 17:29

-6

given the bycode generated by scala for lazy, it can suffer thread safety problem as mentioned in double check locking http://www.javaworld.com/javaworld/jw-05-2001/jw-0525-double.html?page=1

Mcnamee answered 15/7, 2012 at 9:26 Comment(1)

This claim was also made by a comment to the accepted answer by mitch and refuted by @iirekm: This pattern is fine from java1.5 onward. – Enamelware 15/7, 2012 at 15:24

Hot tags

Godot Unity Godot Help Programming Godot 4.X GUI GDScript 3D 2D Physics CSharp Godot 3.X VR XR Projects C++

Recommended topics

Hot tags