AWS Lambda is a popular service for hosting microservice functions in the cloud without provisioning actual servers. It supports Node.js, Python, Go, C#, PowerShell and Java – more specifically: java-1.8.0-openjdk. Since Scala 2.12 targets JVM 8, we can also run Scala code serverless in the cloud! But does using Scala have any performance impact compared to plain old Java? How do the cold start and mean response times compare? Let’s find out!
tl;dr: Mean response times are equal, cold start times are slower with Scala than with Java, but improve with increased memory.
Project structure
First we create two projects: one Java project using Maven and one Scala project using sbt, each building a completely independent JAR file. When using AWS Lambda, we have to supply all dependencies in a fat JAR, and by splitting the projects we get a minimal JAR for each Lambda function. Both build files declare dependencies on the AWS Lambda libraries com.amazonaws » aws-lambda-java-core and com.amazonaws » aws-lambda-java-events to provide the application with the APIGatewayProxyRequestEvent, APIGatewayProxyResponseEvent and Context data structures. Those encapsulate the http request and response from an AWS API Gateway and provide a safe way to read the http request and return a valid response. The API Gateway is the gate between the internet and our functions. The Scala JAR file additionally includes the Scala library.
lazy val root = (project in file("."))
  .settings(
    name := "aws_lambda_bench_scala",
    organization := "de.codecentric.amuttsch",
    description := "Benchmark Service for AWS Lambda written in Scala",
    licenses += "Apache License, Version 2.0" -> url("https://www.apache.org/licenses/LICENSE-2.0"),

    version := "0.1",
    scalaVersion := "2.12.8",

    assemblyJarName in assembly := "aws_lambda_bench_scala.jar",

    libraryDependencies ++= Seq(
      "com.amazonaws" % "aws-lambda-java-core" % "1.2.0",
      "com.amazonaws" % "aws-lambda-java-events" % "2.2.5",
    )
  )
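Note that the assembly and assemblyJarName settings above come from the sbt-assembly plugin, which is not part of sbt itself. A minimal project/plugins.sbt to make the assembly task available could look like this (the version number is only an assumption; any reasonably recent release should work):

// project/plugins.sbt – provides the `assembly` task used to build the fat JAR
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.9")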
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>de.codecentric.amuttsch</groupId>
    <artifactId>aws_lambda_bench_java</artifactId>
    <version>0.1</version>

    <packaging>jar</packaging>

    <properties>
        <maven.compiler.source>1.8</maven.compiler.source>
        <maven.compiler.target>1.8</maven.compiler.target>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    </properties>

    <dependencies>
        <dependency>
            <groupId>com.amazonaws</groupId>
            <artifactId>aws-lambda-java-core</artifactId>
            <version>1.2.0</version>
        </dependency>
        <dependency>
            <groupId>com.amazonaws</groupId>
            <artifactId>aws-lambda-java-events</artifactId>
            <version>2.2.5</version>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-shade-plugin</artifactId>
                <version>3.2.1</version>

                <configuration>
                    <createDependencyReducedPom>false</createDependencyReducedPom>
                </configuration>
                <executions>
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>shade</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
</project>
Lambda functions
Next, we implement the actual handler functions in both Scala and Java. They just return an http 200 response and don’t do any processing, so we see the actual impact of the language rather than of some arbitrary computation.
package de.codecentric.amuttsch.awsbench.scala

import com.amazonaws.services.lambda.runtime.Context
import com.amazonaws.services.lambda.runtime.events.{APIGatewayProxyRequestEvent, APIGatewayProxyResponseEvent}

class ScalaLambda {
  def handleRequest(event: APIGatewayProxyRequestEvent, context: Context): APIGatewayProxyResponseEvent = {
    new APIGatewayProxyResponseEvent()
      .withStatusCode(200)
  }
}
package de.codecentric.amuttsch.awsbench.java;

import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.events.APIGatewayProxyRequestEvent;
import com.amazonaws.services.lambda.runtime.events.APIGatewayProxyResponseEvent;

public class JavaLambda {
    public APIGatewayProxyResponseEvent handleRequest(APIGatewayProxyRequestEvent event, Context context) {
        return new APIGatewayProxyResponseEvent()
                .withStatusCode(200);
    }
}
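Purely as an illustration (it is not part of the benchmark), the same response event can also carry headers and a body, which is what a real function behind an API Gateway would typically return. A hypothetical Scala variant might look like this:

package de.codecentric.amuttsch.awsbench.scala

import java.util.Collections

import com.amazonaws.services.lambda.runtime.Context
import com.amazonaws.services.lambda.runtime.events.{APIGatewayProxyRequestEvent, APIGatewayProxyResponseEvent}

// Hypothetical handler variant that returns a JSON body and a Content-Type header.
class ScalaJsonLambda {
  def handleRequest(event: APIGatewayProxyRequestEvent, context: Context): APIGatewayProxyResponseEvent =
    new APIGatewayProxyResponseEvent()
      .withStatusCode(200)
      .withHeaders(Collections.singletonMap("Content-Type", "application/json"))
      .withBody("""{"message":"hello"}""")
}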
The bytecode of the two functions is almost identical. The only difference is how Scala and Java handle the 200 argument of withStatusCode. Java uses java.lang.Integer.valueOf, whereas Scala makes use of its implicit conversion scala.Predef.int2Integer.
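The difference is easy to reproduce outside of Lambda: assigning a Scala Int to a java.lang.Integer triggers exactly this conversion (a tiny illustration, not taken from the benchmark code):

// What the compilers roughly emit for `.withStatusCode(200)`:
// Java boxes via Integer.valueOf(200), Scala inserts the Predef conversion.
val boxed: java.lang.Integer = 200                               // rewritten to scala.Predef.int2Integer(200)
val explicit: java.lang.Integer = scala.Predef.int2Integer(200)  // the same call, written out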
After building the fat JARs with sbt assembly and mvn package, we see the first big difference: the Scala JAR is almost 10 times larger than the Java one – 5.8 MB vs. 0.7 MB. This is due to the included Scala library, which is around 5 MB in size.
Serverless
Now we have to deploy the services to the cloud. For this we use Serverless, a toolkit for building serverless applications. We define our two functions in a YAML configuration file and declare a separate API Gateway http endpoint for each of them. With a single command we can then deploy our serverless application to the cloud.
service: lambda-java-scala-bench

provider:
  name: aws
  runtime: java8
  region: eu-central-1
  logRetentionInDays: 1

package:
  individually: true

functions:
  ScalaLambda:
    handler: de.codecentric.amuttsch.awsbench.scala.ScalaLambda::handleRequest
    reservedConcurrency: 1
    package:
      artifact: scala/target/scala-2.12/aws_lambda_bench_scala.jar
    events:
      - http:
          path: scala
          method: get
  JavaLambda:
    handler: de.codecentric.amuttsch.awsbench.java.JavaLambda::handleRequest
    reservedConcurrency: 1
    package:
      artifact: java/target/aws_lambda_bench_java-0.1.jar
    events:
      - http:
          path: java
          method: get
After defining the name of our service, we set the provider to AWS and the runtime to java8. Since we use separate JAR files for our services, we have to set the individually key to true in the package section. Otherwise, Serverless will look for a global package. For the functions themselves we set the handler, the package and an http event. We do not take concurrent execution into consideration, so we limit the number of simultaneously active Lambdas to one using the reservedConcurrency key. We use the default memorySize of 1024 MB.
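Raising the memory later only means changing that key, for example in the provider section of serverless.yml (shown here only as a sketch; the benchmark further below uses 2048 MB):

provider:
  name: aws
  runtime: java8
  memorySize: 2048  # default is 1024 MB; applies to all functions unless overridden per function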
Now we deploy our stack with serverless deploy. After successful execution we get our service information containing the URLs to our functions:
endpoints:
  GET - https://example.execute-api.eu-central-1.amazonaws.com/dev/scala
  GET - https://example.execute-api.eu-central-1.amazonaws.com/dev/java
Using curl, we can test whether they are available and return a 200 http response: curl -v https://example.execute-api.eu-central-1.amazonaws.com/dev/java.
Benchmarking
The next step is to build a benchmark. For this we use Gatling, a load testing tool written in Scala. It makes it easy to build a load test and export a graphical report after the execution. For our case we are interested in two metrics: the response time on cold and on warm Lambdas. AWS kills inactive Lambda instances after some (unspecified) time to free up resources. When the function is triggered afterwards, the JVM has to start up again, which takes some time. So we create a third project and build a test case:
package de.codecentric.amuttsch.awsbench

import ch.qos.logback.classic.{Level, LoggerContext}
import io.gatling.core.Predef._
import io.gatling.http.Predef._
import org.slf4j.LoggerFactory

import scala.concurrent.duration._

class LambdaBench extends Simulation {
  val context: LoggerContext = LoggerFactory.getILoggerFactory.asInstanceOf[LoggerContext]
  // Suppress logging
  context.getLogger("io.gatling").setLevel(Level.valueOf("WARN"))
  context.getLogger("io.netty").setLevel(Level.valueOf("WARN"))

  val baseFunctionUrl: String = sys.env("AWS_BENCH_BASE_URL")

  val httpProtocol = http
    .baseUrl(baseFunctionUrl)
    .acceptHeader("text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8")
    .acceptLanguageHeader("en-US,en;q=0.5")
    .acceptEncodingHeader("gzip, deflate")
    .userAgentHeader("Mozilla/5.0 (X11; Linux x86_64; rv:64.0) Gecko/20100101 Firefox/64.0")

  val scalaScenario = scenario("ScalaScenario")
    .exec(http("Scala")
      .get("/scala"))

  val javaScenario = scenario("JavaScenario")
    .exec(http("Java")
      .get("/java"))

  setUp(
    scalaScenario.inject(constantConcurrentUsers(1) during (120 seconds)),
    javaScenario.inject(constantConcurrentUsers(1) during (120 seconds))
  ).protocols(httpProtocol)
}
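To run this simulation from sbt, the benchmark project needs the Gatling sbt plugin and the Gatling test dependencies. The coordinates below are the official Gatling ones; the exact versions are only illustrative:

// project/plugins.sbt
addSbtPlugin("io.gatling" % "gatling-sbt" % "3.0.0")

// build.sbt
enablePlugins(GatlingPlugin)

libraryDependencies ++= Seq(
  "io.gatling.highcharts" % "gatling-charts-highcharts" % "3.1.2" % "test",
  "io.gatling"            % "gatling-test-framework"    % "3.1.2" % "test"
)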
First we suppress some logging, as Gatling logs every request to the console. We get our endpoint URL from the environment variable AWS_BENCH_BASE_URL and define an http protocol. In it we set the base URL, some headers and the user agent; it is later used for executing the specific requests. Next, we define two scenarios that point to the Scala and Java http endpoints of our serverless application. In the last step we set up both scenarios so that each constantly keeps one active request open for a duration of 120 seconds. Now we can start sbt and run the benchmark using gatling:test. We have to make sure the Lambdas are cold, otherwise we won’t get any cold boot timings. We can either wait for a few minutes or remove and redeploy the stack. As soon as the run finishes, Gatling prints a text report and provides us with a URL to the graphical report:
Each function was called around 3100 times within the two-minute time span. The time in the max column is the time of the first request when the Lambda function was cold. We can observe that the time until the first response is around 1.6 times as long for Scala as it is for Java. This observation holds true for multiple runs. The mean response time for both Scala and Java is around 38 ms.
Assigning 2048 MB of RAM improved the startup time by ~300 ms for the Scala and ~200 ms for the Java function. The mean response time improved only slightly; the difference is negligible:
Conclusion
Scala works great with AWS Lambda, as it compiles to compatible Java 8 bytecode. You can use all the great features of the language when programming serverless applications. The startup time for a cold function is a bit longer than for the Java counterpart, but improves when the function memory is increased. This test only focuses on the overhead of using the Scala runtime on top of the JVM. The results may vary for production-grade functions that actually perform CPU- or network-intensive tasks, and they depend heavily on the implementation and the libraries used.
You can find the code of the projects and the benchmark here: GitLab