Scala allows the special keyword lazy in front of val in order to change the val to one that is lazily initialized. While lazy initialization seems tempting at first, the concrete implementation of lazy vals in scalac has some subtle issues. This article takes a look under the hood and explains some of the pitfalls: we see how lazy initialization is implemented, as well as scenarios where a lazy val can crash your program, inhibit parallelism, or show other unexpected behavior.
Introduction
This post was originally inspired by the talk Hands-on Dotty (slides ) by Dmitry Petrashko, given at Scala World 2015. Dmitry gives a wonderful talk about Dotty and explains some of the lazy val pitfalls as currently present in Scala and how their implementation in Dotty differs. This post is a discussion of lazy vals in general followed by some of the examples shown in Dmitry Petrashko’s talk, as well as some further notes and insights.
How lazy works
The main characteristic of a lazy val is that the bound expression is not evaluated immediately, but only once, on the first access. When that initial access happens, the expression is evaluated and the result bound to the identifier of the lazy val. On subsequent accesses, no further evaluation occurs: instead the stored result is returned immediately.
Given the characteristic above, using the lazy modifier seems like an innocent thing to do: when we are defining a val, why not also add the lazy modifier as a speculative "optimization"? In a moment we will see why this is typically not a good idea, but before we dive into this, let's recall the semantics of a lazy val first.
When we assign an expression to a lazy val like this:
lazy val two: Int = 1 + 1
we expect that the expression 1 + 1 is bound to two, but the expression is not yet evaluated. On the first (and only on the first) access of two from somewhere else, the stored expression 1 + 1 is evaluated and the result (2 in this case) is returned. On subsequent accesses of two, no evaluation happens: the stored result of the evaluation was cached and is returned instead.
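A quick way to observe this "evaluate once" behavior yourself is to put a side effect into the bound expression. The following snippet is only a minimal illustration you can paste into a REPL; the println is there purely to make the single evaluation visible:

lazy val two: Int = {
  println("evaluating...") // side effect to make the (single) evaluation visible
  1 + 1
}

println(two) // prints "evaluating..." followed by 2
println(two) // prints only 2: the cached result is returned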
This property of “evaluate once” is a very strong one, especially if we consider a multithreaded scenario: what should happen if two threads access our lazy val at the same time? Given the property that evaluation occurs only once, we have to introduce some kind of synchronization in order to avoid multiple evaluations of the bound expression. In practice, this means the bound expression will be evaluated by one thread, while the other(s) will have to wait until the evaluation has completed, after which the waiting thread(s) will see the evaluated result.
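To see this synchronization at work, here is a small, self-contained sketch (the object and value names are made up for this illustration): two futures race for the first access, exactly one thread evaluates the initializer, and the other simply waits for the cached result.

import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

object EvaluateOnce {
  lazy val expensive: Int = {
    // this line is printed exactly once, by whichever thread wins the race
    println(s"evaluated on ${Thread.currentThread.getName}")
    Thread.sleep(1000)
    42
  }

  def run(): Unit = {
    // both futures access the lazy val concurrently; one thread evaluates,
    // the other blocks until the result is available
    val both = Future.sequence(Seq(Future(expensive), Future(expensive)))
    println(Await.result(both, 1.minute)) // List(42, 42)
  }
}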
How is this mechanism implemented in Scala? Luckily, we can have a look at SIP-20. The example class LazyCell with a lazy val value is defined as follows:
final class LazyCell {
  lazy val value: Int = 42
}
A handwritten snippet equivalent to the code the compiler generates for our LazyCell looks like this:
final class LazyCell {
  @volatile var bitmap_0: Boolean = false // (1)
  var value_0: Int = _                     // (2)
  private def value_lzycompute(): Int = {
    this.synchronized {                    // (3)
      if (!bitmap_0) {                     // (4)
        value_0 = 42                       // (5)
        bitmap_0 = true
      }
    }
    value_0
  }
  def value = if (bitmap_0) value_0 else value_lzycompute() // (6)
}
At (3) we can see the use of a monitor, this.synchronized {...}, in order to guarantee that initialization happens only once, even in a multithreaded scenario. The compiler uses a simple flag ((1)) to track the initialization status ((4) & (6)) of the var value_0 ((2)), which holds the actual value and is mutated on first initialization ((5)).
What we can also see in the above implementation is that a lazy val, unlike a regular val, has to pay the cost of checking the initialization state on each access ((6)). Keep this in mind when you are tempted to (try to) use lazy val as an “optimization”.
Now that we have a better understanding of the underlying mechanisms for the lazy modifier, let's look at some scenarios where things get interesting.
Scenario 1: Concurrent initialization of multiple independent vals is sequential
Remember the use of this.synchronized { } above? This means we lock the whole instance during initialization. As a consequence, multiple lazy vals defined inside e.g. an object, but accessed concurrently from multiple threads, will still all get initialized sequentially. The code snippet below demonstrates this, defining two lazy vals ((1) & (2)) inside the ValStore object. In the object Scenario1 we request each of them from inside its own Future ((3)), but at runtime the two lazy vals are calculated one after the other. This means we have to wait for the initialization of ValStore.fortyFive until we can continue with ValStore.fortySix.
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent._
import scala.concurrent.duration._

def fib(n: Int): Int = n match {
  case x if x < 0 =>
    throw new IllegalArgumentException(
      "Only positive numbers allowed")
  case 0 | 1 => 1
  case _ => fib(n-2) + fib(n-1)
}

object ValStore {
  lazy val fortyFive = fib(45) // (1)
  lazy val fortySix = fib(46)  // (2)
}

object Scenario1 {
  def run = {
    val result = Future.sequence(Seq( // (3)
      Future {
        ValStore.fortyFive
        println("done (45)")
      },
      Future {
        ValStore.fortySix
        println("done (46)")
      }
    ))
    Await.result(result, 1.minute)
  }
}
You can test this by copying the above snippet, :paste-ing it into a Scala REPL, and starting it with Scenario1.run. You will then see how it first evaluates ValStore.fortyFive, prints the text, and afterwards does the same for the second lazy val. Instead of an object, you can also imagine this case for a normal class having multiple lazy vals defined.
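One possible way to get the initializations to run in parallel again, sketched below and by no means the only option, is to give independent lazy vals their own enclosing objects, so that each one is guarded by its own monitor. The object names are made up for this illustration; the sketch reuses fib and the imports from the snippet above.

// Sketch: independent lazy vals in separate objects get separate monitors,
// so their initializations no longer serialize on a single shared lock.
object FortyFive { lazy val value = fib(45) }
object FortySix  { lazy val value = fib(46) }

object Scenario1Parallel {
  def run = {
    val result = Future.sequence(Seq(
      Future { FortyFive.value; println("done (45)") },
      Future { FortySix.value;  println("done (46)") }
    ))
    Await.result(result, 1.minute)
  }
}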
Scenario 2: Potential deadlock when accessing lazy vals
In the previous scenario we only had to suffer from decreased performance when multiple lazy vals inside an instance are accessed from multiple threads at the same time. This may be surprising, but it is not a deal breaker. The following scenario is more severe:
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent._
import scala.concurrent.duration._

object A {
  lazy val base = 42
  lazy val start = B.step
}

object B {
  lazy val step = A.base
}

object Scenario2 {
  def run = {
    val result = Future.sequence(Seq(
      Future { A.start }, // (1)
      Future { B.step }   // (2)
    ))
    Await.result(result, 1.minute)
  }
}
Here we define three lazy vals in two objects, A and B. Here is a picture of the resulting dependencies:
The A.start val depends on B.step, which in turn depends on A.base. Although there is no cyclic relation here, running this code can lead to a deadlock:
scala> :paste
...
scala> Scenario2.run
java.util.concurrent.TimeoutException: Futures timed out after [1 minute]
  at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
  at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
  at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190)
  at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
  at scala.concurrent.Await$.result(package.scala:190)
  ... 35 elided
(If it succeeds by chance on your first try, give it another chance.) So what is happening here? The deadlock occurs because the two Futures in (1) and (2), when trying to access the lazy vals, will both lock the respective object A / B, thereby denying any other thread access. In order to make progress, however, the thread accessing A also needs B.step, and the thread accessing B needs to access A.base. This is a deadlock situation. While this is a fairly simple scenario, imagine a more complex one where more objects/classes are involved, and you can see why overusing lazy vals can get you in trouble. As in the previous scenario, the same can occur inside a class, although it is a little harder to construct the situation. In general this situation is unlikely to happen because of the exact timing required to trigger the deadlock, but that same timing sensitivity makes it hard to reproduce in case you do encounter it.
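If you cannot remove such a cross-object dependency, one possible workaround is to force the shared dependency on a single thread before any concurrent access, so that the later lock acquisitions no longer form a cycle. The following sketch (a hypothetical Scenario2Workaround, reusing A and B from above) illustrates the idea; it is a workaround for this specific shape of dependency, not a general solution.

// Sketch: forcing A.base up front means the thread that later locks B can
// read A.base without having to lock A, so the circular wait cannot occur.
object Scenario2Workaround {
  def run = {
    A.base // initialize the shared dependency before going concurrent
    val result = Future.sequence(Seq(
      Future { A.start },
      Future { B.step }
    ))
    Await.result(result, 1.minute)
  }
}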
Scenario 3: Deadlock in combination with synchronization
Playing with the fact that lazy val initialization uses a monitor (synchronized), there is another scenario where we can get into serious trouble.
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent._
import scala.concurrent.duration._

trait Compute {
  def compute: Future[Int] =
    Future(this.synchronized { 21 + 21 }) // (1)
}

object Scenario3 extends Compute {
  def run: Unit = {
    lazy val someVal: Int =
      Await.result(compute, 1.minute) // (2)
    println(someVal)
  }
}
Again, you can test this for yourself by copying it and doing a :paste inside a Scala REPL:
scala> :paste
...
scala> Scenario3.run
java.util.concurrent.TimeoutException: Futures timed out after [1 minute]
  at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
  at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
  at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190)
  at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
  at scala.concurrent.Await$.result(package.scala:190)
  at Scenario3$.someVal$lzycompute$1(<console>:62)
  at Scenario3$.someVal$1(<console>:62)
  at Scenario3$.run(<console>:63)
  ... 33 elided
The Compute trait on its own is harmless, but note that it uses synchronized in (1). In combination with the synchronized initialization of the lazy val inside Scenario3, however, we have a deadlock situation. When we try to access someVal ((2)) for the println call, the triggered evaluation of the lazy val (which, even though it is local to run, still synchronizes on the enclosing Scenario3 instance, as the stack trace shows) grabs the lock on Scenario3, thereby preventing compute from ever getting access: a deadlock situation.
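One way to defuse this particular interaction, sketched below under the assumption that the trait's locking does not actually need to be on this, is to synchronize on a private lock object instead of the instance itself, so the trait's lock can no longer collide with the monitor used for lazy val initialization. ComputeSafe and Scenario3Fixed are hypothetical names for this illustration.

// Sketch: locking a private object instead of `this` means the Future's body
// cannot contend with the lazy val monitor of the object that mixes it in.
trait ComputeSafe {
  private[this] val lock = new Object
  def compute: Future[Int] =
    Future(lock.synchronized { 21 + 21 })
}

object Scenario3Fixed extends ComputeSafe {
  def run: Unit = {
    lazy val someVal: Int =
      Await.result(compute, 1.minute)
    println(someVal) // now prints 42 instead of timing out
  }
}

Even with a separate lock, blocking inside a lazy val initializer remains a risky pattern, so restructuring the code to avoid the Await is usually the better option.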
Conclusion
Before we sum this post up, please note that in the examples above we use Future and synchronized, but we can easily get into the same situations by using other concurrency and synchronization primitives as well.
In summary, we had a look under the hood of Scala’s implementation of lazy vals and discussed some surprising cases:
- sequential initialization due to the monitor on the enclosing instance
- deadlock on concurrent access of lazy vals, even without a cyclic dependency
- deadlock in combination with other synchronization constructs
As you can see, lazy vals should not be used as a speculative optimization without further thought about the implications. Furthermore, once you are aware of the issues above, you might want to replace some of your lazy vals with a regular val or a def, depending on your initialization needs.
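As a rough guide to that trade-off, here is a tiny sketch (the class and member names are made up for illustration):

class Example {
  def expensive(): Int = { Thread.sleep(100); 42 } // stand-in for real work

  val eager: Int = expensive()       // evaluated once at construction; plain field read afterwards
  def onDemand: Int = expensive()    // evaluated on every call; no caching, no locking
  lazy val cached: Int = expensive() // evaluated once on first access; pays the check (and monitor) discussed above
}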
Luckily, the Dotty platform has an alternative implementation of lazy val initialization (by Dmitry Petrashko) which does not suffer from the unexpected pitfalls discussed in this post. For more information on Dotty, you can watch Dmitry’s talk linked in the references section and head over to the project's GitHub page.
All examples have been tested with Scala 2.11.7.
References
- Hands-on Dotty (slides) by Dmitry Petrashko
- SIP-20 – Improved Lazy Vals Initialization
- Dotty – The Dotty research platform