Now that AWS Lambda has added PowerShell to its growing list of supported languages, let’s take a moment to compare and contrast the different languages available to us.
In this post, we’ll take a look at these languages from a number of angles:
- Cold start performance: performance during a cold start
- Warm performance: performance after the initial cold start
- Cost: does it cost you more to run functions in one language over another? If so, why?
- Ecosystem: libraries, deployment tooling, etc.
- Platform support: is the language supported by other function-as-a-service (FaaS) platforms?
We will also talk about specialized use cases such as machine learning (ML) and pay attention to the special needs of the enterprise. Finally, we’ll round off the discussion by looking at a few languages that are not officially supported but that you can use with Lambda via shims.
I should stress that the goal of this post is to consider the relative strengths and weaknesses of each language within the specific context of AWS Lambda. This is not a general purpose language comparison!
Comparison of Officially Supported Languages
Cold Start Performance
In 2017, I analyzed the effect that language runtime, memory, and code size had on Lambda’s cold start performance. At the time, Go was not yet supported and C# .NET Core support was still at v1.0, but several subsequent analyses by others that included these newer runtimes reached similar conclusions.
The common finding is that Python, Node.js, and Go have far better cold start performance than C# and Java, often by an order of magnitude. This makes intuitive sense and corresponds with my personal experience with these languages.
Node.js and Python are both interpreted languages with lightweight runtimes, which is why their cold start performance is so good.
Go’s toolchain strips out unreachable code at link time (much like tree shaking in the JavaScript world), so the binary contains only code that is actually used by your application. That binary is also compiled to native code, so it naturally has very little initialization overhead during a cold start.
With both C# and Java, Lambda has to do a lot of extra work during cold start. It needs to bootstrap a heavy virtual machine (VM) and language runtime, as well as load all the classes and resources defined in your compiled binary even when they won’t all be used in your code.
This overhead has always existed for these languages, but it was usually written off in traditional application development. Before Lambda came along, these applications ran on long-lived servers and didn’t serve user requests until they had fully initialized and passed load balancer health checks.
With Lambda, that’s no longer the case. These initialization overheads are now felt by your users, which is why I’d recommend avoiding Java and C# for APIs. Alternatively, you can throw more CPU (and therefore more money!) at the problem by using higher memory settings for these functions to help improve the cold start time.
For background processing, however, even the 3-7 seconds of cold start time typical of Java functions is not usually a big issue, as the extra latency is not user-facing.
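If you want to see how often your own functions hit a cold start, one common trick relies on the fact that module-level code runs once per container, while the handler runs on every invocation. The following is a minimal sketch in Python; the handler name and the fields in the returned dictionary are hypothetical and chosen only for illustration.

```python
import time

# Module-level code runs once per container, i.e. during a cold start.
_CONTAINER_STARTED_AT = time.time()
_is_cold_start = True


def handler(event, context):
    """Hypothetical handler that reports whether this invocation was a cold start."""
    global _is_cold_start
    cold = _is_cold_start
    _is_cold_start = False  # every later invocation in this container is warm

    return {
        "cold_start": cold,
        "container_age_seconds": round(time.time() - _CONTAINER_STARTED_AT, 3),
    }
```

Logging the `cold_start` flag alongside your latency metrics makes it easy to tell whether slow requests line up with new containers being initialized.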
Warm Performance
After C# .NET Core 2.0 was added to Lambda, Yun Zhi Lin put together a useful analysis of the warm performance for a number of language runtimes.
My main takeaway from the analysis is that there is no meaningful difference in warm performance between the different languages. This, of course, should not be taken as a general statement about these languages. After all, we are talking about a very specific context here:
- Only one request is processed in a container at a time, so even the most efficient concurrency model would not offer any meaningful gains.
- Most functions are small and IO-heavy, and no amount of optimization in the language runtime is going to make the network go faster.
Cost
As discussed above, C# and Java functions have significantly longer cold start times. They also tend to have a larger memory footprint. Putting the two together means that you will likely have to run these functions on a higher memory setting than their equivalents in Node.js, Python, or Go.
Since the cost of Lambda is tied to both execution time and memory, you will likely pay more for these C# and Java functions as a result. This is particularly true given that warm performance is basically indistinguishable between different languages, especially for the type of IO-heavy workload we usually see with Lambda. That said, if strong library support for specialized workloads (such as image processing or machine learning) can make your function run a lot faster, you might be able to compensate for the higher memory allocation.
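To make the relationship between memory and cost concrete, here is a rough sketch of the calculation in Python. The prices and the 100 ms billing increment are assumptions for illustration and may not match current Lambda pricing, and the memory settings in the comparison are hypothetical rather than measured.

```python
import math

# Assumed prices for illustration only; check the current Lambda pricing page.
PRICE_PER_GB_SECOND = 0.00001667
PRICE_PER_REQUEST = 0.0000002
BILLING_INCREMENT_MS = 100  # assumption: duration is rounded up to this increment


def invocation_cost(memory_mb, duration_ms):
    """Estimate the cost of one invocation from its memory setting and duration."""
    billed_ms = math.ceil(duration_ms / BILLING_INCREMENT_MS) * BILLING_INCREMENT_MS
    gb_seconds = (memory_mb / 1024) * (billed_ms / 1000)
    return gb_seconds * PRICE_PER_GB_SECOND + PRICE_PER_REQUEST


# Same warm duration, different (hypothetical) memory settings.
node_cost = invocation_cost(memory_mb=512, duration_ms=100)
java_cost = invocation_cost(memory_mb=1024, duration_ms=100)
print(f"512 MB: ${node_cost:.10f}  1024 MB: ${java_cost:.10f}")
```

Because the duration is the same in both cases, the compute portion of the bill scales linearly with the memory setting. Doubling the memory only pays for itself if it cuts the billed duration by at least half.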
Another unfortunate side effect of running C# and Java functions on higher memory settings is that they’ll require more elastic network interfaces (ENIs) when placed inside a VPC. Although Lambda reuses ENIs where possible, the formula for calculating your ENI capacity requirement tells us that a higher memory setting equates to needing more ENIs.
Needing more ENIs means longer cold starts more often. A 1GB function would need to create a new ENI during one-third of cold starts, whereas a 1.5GB function would need to create a new ENI during one-half of cold starts.
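The one-third and one-half figures above can be reproduced with a short calculation. This sketch assumes the capacity formula that AWS documented for VPC-enabled functions at the time, which as far as I recall was projected peak concurrent executions × (memory in GB / 3 GB). The function names are hypothetical.

```python
import math
from fractions import Fraction

# Assumption: the 3 GB divisor comes from AWS's ENI capacity formula at the time.
ENI_SHARING_MEMORY_GB = 3


def eni_fraction_per_container(memory_gb):
    """Fraction of an ENI that each concurrently running container accounts for."""
    return Fraction(str(memory_gb)) / ENI_SHARING_MEMORY_GB


def eni_capacity(peak_concurrency, memory_gb):
    """ENIs needed for a given peak concurrency, rounded up to whole ENIs."""
    return math.ceil(peak_concurrency * eni_fraction_per_container(memory_gb))


for memory_gb in (1, 1.5):
    print(memory_gb, "GB:",
          eni_fraction_per_container(memory_gb), "of an ENI per container,",
          eni_capacity(100, memory_gb), "ENIs at a peak concurrency of 100")
```

Under this formula, the fraction of an ENI per container is also roughly the share of cold starts that have to wait for a new ENI to be created, which is where the one-third and one-half figures come from.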
Ecosystem
All five officially supported languages are widely adopted and enjoy a rich ecosystem of both open source and commercial libraries. Equally, all five languages are supported by the Serverless Framework from the deployment tooling point of view. Generally speaking, there are many language-agnostic deployment frameworks out there, including AWS’s very own Serverless Application Model (SAM) framework. In fact, why not check out this post by Nitzan Shapira, which offers an informative roundup of many deployment frameworks!
Platform Support
Azure Functions has a wide range of supported languages, but on the 1.x runtime Node.js is stuck at version 6.11, and Python support is only “experimental”. Judging by the published product roadmap, there is no planned support for Go in the next version of Azure Functions either.
Considerations for Specialized Use Cases
For specialized use cases such as machine learning or deep learning, a good, highly optimized library can make all the difference. It seems that Python-based libraries dominate the scene with the likes of TensorFlow, scikit-learn, Caffe2, and Keras.
If you’re working within the data science space, chances are that the libraries that you’re familiar with and want to use would dictate the language in which you would author your Lambda functions.
Considerations for the Enterprise
Similarly, for the enterprise user, company rules would likely take precedence over technical merits when it comes to choosing the language you should use with Lambda. While this can sound like company politics, these restrictions often exist for a good reason.
It is notoriously difficult to enforce any form of standardization and governance across a large, distributed, and often siloed engineering team. To ensure everyone adheres to guidelines and best practices and that production systems meet the required quality standards, we often need to rely on shared libraries and tooling support. To provide consistent support for multiple languages is a huge burden on the teams that maintain these shared libraries and tools.
Another common theme inside the enterprise is that you are required to run your functions inside a VPC in order to stay compliant with the company’s security rules. In my experience, placing a function inside a VPC can add as much as 10 seconds to its cold start time, because creating an ENI is a particularly expensive and complicated step.
This brings us back to the aforementioned point about higher memory settings equaling more ENIs and more frequent slow cold starts. With these settings, you should perhaps avoid C# and Java functions due to the impact of cold starts. Ironically, you’re also most likely forced to use C# or Java in these enterprise environments.
Unofficially Supported Languages
Since Lambda’s support for Java and C# is at the runtime level, i.e., the JVM and .NET Core, any other languages that can run on these VMs are supported.
Kotlin, Clojure, Scala, and Groovy all run on the JVM and are therefore all unofficially supported. In fact, the Serverless Framework already has templates for all four! The same is true for F#, a functional-first language that runs on .NET Core.
Besides these semi-officially supported languages, the community has cleverly exploited the interoperability options in Python 3.7 and Go to add support for other languages. Here are a few of the more noteworthy examples.
Rust is supported via the rust-crowbar project with a Python shim. You can also get a more native experience with the rust-aws-lambda project, which leverages the fact that the Go Lambda runtime requires a Linux binary and doesn’t care what language it’s written in. Clever!
Elixir is supported via the exlam project, which uses a Node.js 4.3 function to wrap an installation of the Erlang VM and an Elixir app by starting it as a child process. Distributed systems researcher Christopher Meiklejohn has since made this work with Erlang and Lasp, a peer-to-peer, fully replicated CRDT database that he has been working on. This is super interesting information, and you can read all about it here.
Haskell is supported through the serverless-haskell plugin for the Serverless Framework, which also uses a Node.js wrapper.
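The wrapper approach used for Elixir and Haskell boils down to a thin handler in a supported language that hands the event to a child process and relays its output. The sketch below illustrates the general pattern in Python rather than Node.js. It is not how exlam or serverless-haskell is actually implemented, and the child command is a hypothetical stand-in (a tiny Python script that echoes the event) for the real compiled binary or VM.

```python
import json
import subprocess
import sys

# Hypothetical stand-in for the wrapped program (e.g. a compiled binary bundled
# with the deployment package). It reads a JSON event on stdin and writes a
# JSON response on stdout.
CHILD_COMMAND = [
    sys.executable,
    "-c",
    "import json, sys; event = json.load(sys.stdin); "
    "json.dump({'echo': event}, sys.stdout)",
]


def handler(event, context):
    """Pass the event to the child process and return its JSON response."""
    completed = subprocess.run(
        CHILD_COMMAND,
        input=json.dumps(event),
        capture_output=True,
        text=True,
        check=True,   # raise if the child exits with a non-zero status
        timeout=10,   # assumption: keep well below the function's own timeout
    )
    return json.loads(completed.stdout)
```

For simplicity, this version spawns a new child on every invocation. A real shim would more likely start the child once per container and keep it alive, so that the cost of booting the VM is only paid during a cold start.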
Conclusion
Of all the things we looked at in this post, the biggest technical reason for picking one language over another is cold start performance. It also has a knock-on effect on the memory setting and therefore on ENI capacity requirements, which in turn impacts cold starts.
This cold start performance is mainly relevant for APIs, where the added latency is user-facing. As I discussed in the previous post, many of the popular use cases for Lambda do not have user-facing latency and can therefore tolerate even the most brutal cold start times.
Aside from cold starts, other factors such as library support and company restrictions can also influence your language choice. By and large though, I think you’ll find that folks naturally gravitate towards the languages with which they are most comfortable, as evidenced by the extraordinary efforts many have put in to run those unofficially supported languages on Lambda.
I’m confident that, with time, limitations such as cold starts will simply go away and the Lambda platform will open up even more. Maybe we’ll be able to bring our own containers or language runtimes, and put all these discussions around what languages to use behind us.
I strongly believe that you can build a great product with any programming language. The fact that our code now runs inside a Lambda function should not change that!