Why AWS Lambda is a Masterstroke from Amazon

Amazon launched AWS Lambda at its re:Invent conference in November 2014. Although more than half a dozen other cloud services were announced at the same event, Lambda stood out as the most innovative. The service runs snippets of JavaScript code in response to events generated by data services such as Amazon S3, Amazon Kinesis, and Amazon DynamoDB. Think of it as a layer sandwiched between data sources and the compute layer, abstracting microservices at a higher level than virtualization and containerization. Gigaom Research’s recent CloudTracker report named Lambda the Disruptive Cloud Technology of the fourth quarter of 2014.

The timing of Lambda’s launch could not have been better. AWS got ahead of the curve with the product at a time when the entire industry was agog over container technology and its impact on public cloud providers, and when competition from Microsoft and Google on core compute, storage, and database services was increasing. Lambda might initially appear to be yet another cloud service exposing compute, but as the following sections illustrate, it is much more than that.

AWS Lambda Functional Microservices Abstraction Layer


Source: Gigaom Research

AWS Lambda offers the perfect middle ground between IaaS and PaaS. It also effectively counters the growing threat that containers pose to Amazon’s business by simplifying the task of running code in the cloud. It’s Amazon’s way of delivering a microservices framework far ahead of its competitors.

The Architecture of AWS Lambda 

Lambda is the latest addition to Amazon’s compute services. A simple invocation of the os.platform() and os.release() methods within a Lambda function shows that it runs on the Amazon Linux AMI (kernel release 3.14.26-24.46.amzn1.x86_64, to be precise). It is powered by Node.js, which runs on the V8 JavaScript engine. Each JavaScript snippet is associated with an identity defined in IAM that has permission to invoke it. A second role grants access to the event source. Associating these two roles with a Lambda function completes the execution loop. Developers can set the function’s maximum timeout anywhere from 1 second to 60 seconds. Memory can be allocated in increments of 64MB, from 128MB to 1GB. When a data source raises an event, the details are passed to the Lambda function as a parameter. This opens up many opportunities for developers, since the function can perform a variety of tasks, including using packages and running Node.js modules.
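To make the os.platform() and os.release() check concrete, here is a minimal sketch of what such a function might look like. It assumes the (event, context) handler signature and the context.done() callback of the Node.js programming model from the 2014 preview; the runLocally helper and its stand-in context object are hypothetical and exist only so the snippet can be run outside Lambda.

```javascript
// Sketch of a Lambda-style function that reports the host operating system.
// Assumption: the (event, context) signature and context.done() follow the
// Node.js programming model of the 2014 preview; check the current AWS docs.
var os = require('os');

function handler(event, context) {
  var info = {
    platform: os.platform(),             // OS platform string, e.g. 'linux'
    release: os.release(),               // kernel release string
    eventKeys: Object.keys(event || {})  // the event arrives as a plain object
  };
  context.done(null, info);
}

exports.handler = handler;

// Hypothetical local dry run with a stand-in context object.
function runLocally(event) {
  var result;
  handler(event, {
    done: function (err, value) {
      if (err) { throw err; }
      result = value;
    }
  });
  return result;
}

console.log(runLocally({ Records: [] }));
```

Run locally, the snippet reports the platform and kernel release of whatever machine executes it; inside Lambda, the same two calls are what reveal the underlying Amazon Linux host.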

Lambda functions are stateless, which allows them to scale rapidly. Depending on the rate at which events are raised, the runtime can run multiple copies of a Lambda function concurrently. Another important aspect of Lambda is that functions cannot be directly exposed to the outside world: they can be invoked only by the supported event sources. That means the code cannot be exposed directly as a REST endpoint. A Lambda function can, however, invoke other services by making outbound calls. This makes it fundamentally different from PaaS and container environments.
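The link between statelessness and scaling can be illustrated with a small sketch. The handler below derives its result only from the incoming event, so it does not matter which copy handles which event or in what order. It assumes events follow the S3 notification shape (Records[].s3.bucket.name and Records[].s3.object.key); the makeEvent helper and the bucket and key names are hypothetical.

```javascript
// Stateless handler sketch: the result depends only on the incoming event,
// so any number of copies can process events in parallel and in any order.
// Assumption: events follow the S3 notification shape
// (Records[].s3.bucket.name and Records[].s3.object.key).
function describeUpload(event) {
  return event.Records.map(function (record) {
    return record.s3.bucket.name + '/' + record.s3.object.key;
  });
}

// Hypothetical helper that builds a minimal S3-style event for local testing.
function makeEvent(bucket, key) {
  return { Records: [{ s3: { bucket: { name: bucket }, object: { key: key } } }] };
}

// Stand-in for the runtime handing each event to an independent copy.
var events = [makeEvent('photos', 'a.jpg'), makeEvent('photos', 'b.jpg')];
var results = events.map(function (event) {
  return describeUpload(event);
});
console.log(results);
```

Because nothing is kept in module-level variables between invocations, reordering or duplicating the events changes nothing about the result for any single event, which is the property that lets the runtime add copies freely.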

Currently, there are three ways of running code in the AWS cloud: Amazon EC2, Amazon ECS, and AWS Elastic Beanstalk. EC2 is a full-blown IaaS, ECS is the hosted container environment, and Elastic Beanstalk is a PaaS layer. AWS Lambda is the fourth service capable of executing code in the cloud, but it is unique in the sense that it sits at the intersection of EC2, ECS, and Elastic Beanstalk.

AWS Lambda with EC2, EC2 Container Service, and Elastic Beanstalk


While it is clear that the Lambda execution environment runs on Amazon EC2, AWS does not disclose how snippets are isolated from each other. To achieve massive scale and strong isolation, AWS could be running containers to host Lambda functions. If so, Lambda is a microservices environment at the highest level of abstraction.

Let’s compare Lambda with other AWS compute services:

Lambda versus EC2

With Amazon EC2, developers need to spin up VMs and install the right software stack before uploading and running code. This involves provisioning, configuring, monitoring, managing, and maintaining VMs throughout the application lifecycle, and there are many ongoing DevOps-related tasks from the moment a VM is provisioned. However, this gives maximum control to administrators, since they are in the driver’s seat. With AWS Lambda, developers don’t deal with VMs at all. The code is simply uploaded or written inline in the browser-based editor. DevOps teams can never SSH into the Linux VM running Lambda; they can only monitor logs and timeout exceptions raised at runtime.

Lambda versus the EC2 Container Service

With Amazon EC2 Container Service, the focus shifts from VMs to containers. Developers and operators need to move code into appropriate containers provisioned on EC2. They need to create container images locally and upload them to the hub, from where ECS provisions them at a later point. Scheduling and orchestrating the containers is left to the ECS service, but DevOps teams still need to manage the container lifecycle. Effectively, DevOps owns the container and the code, and ECS only provides runtime execution services. Though Lambda may be running inside containers, DevOps teams never have to deal with them. Apart from capturing logs, no container maintenance is required for Lambda. The containers responsible for hosting Lambda are not accessible to the outside world. AWS manages the runtime dynamically by creating and terminating containers on the fly.

Lambda versus Elastic Beanstalk

Since AWS Elastic Beanstalk is a PaaS layer, developers push the code along with its metadata, as they would with other PaaS offerings. The metadata contains the details of the AMI, language, framework, and runtime requirements, along with connection information for databases or other dependencies. Based on the metadata, Elastic Beanstalk launches an appropriate AMI and configures it to run the code. The metadata can be simple or comprehensive, depending on the application architecture. Lambda is much simpler than PaaS: it expects just the code and its association with a set of IAM roles. Of course, allocating RAM and defining the timeout may be considered configuration and metadata, but these settings are much simpler and are consistent across all Lambda functions. Most code running within a PaaS is exposed to the outside world as a web page or REST endpoint, whereas Lambda functions are inaccessible from the public Internet and can be invoked only through the supported data sources.
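To show how small that configuration surface is, the following sketch validates the only two tuning knobs discussed in this article. The limits are the ones quoted earlier (1 to 60 seconds, 128MB to 1GB in 64MB steps) and should be read as a snapshot of the service at the time of writing; the validateConfig function and its field names are hypothetical, not part of any AWS API.

```javascript
// Sketch: validate the two settings a Lambda function needs besides its code.
// Assumption: the limits are the ones quoted in this article; they describe
// the service at the time of writing, not necessarily current AWS limits.
var LIMITS = {
  minTimeoutSeconds: 1,
  maxTimeoutSeconds: 60,
  minMemoryMb: 128,
  maxMemoryMb: 1024,
  memoryStepMb: 64
};

// Hypothetical helper: returns a list of problems (empty when config is valid).
function validateConfig(config) {
  var errors = [];
  var t = config.timeoutSeconds;
  var m = config.memoryMb;
  if (!(t >= LIMITS.minTimeoutSeconds && t <= LIMITS.maxTimeoutSeconds)) {
    errors.push('timeout must be between 1 and 60 seconds');
  }
  if (!(m >= LIMITS.minMemoryMb && m <= LIMITS.maxMemoryMb) ||
      (m - LIMITS.minMemoryMb) % LIMITS.memoryStepMb !== 0) {
    errors.push('memory must be 128MB to 1GB in 64MB steps');
  }
  return errors;
}

console.log(validateConfig({ timeoutSeconds: 30, memoryMb: 192 })); // prints []
console.log(validateConfig({ timeoutSeconds: 90, memoryMb: 200 })); // two errors
```

Compare this with a PaaS descriptor, which must also describe the AMI, language, framework, and dependencies: here the whole configuration fits in two numbers.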

Thanks to Docker and containers, microservices are becoming popular. AWS Lambda is one of the first microservices environments on the public cloud. Its innovative pricing model, based on the number of requests, execution time, and allocated memory, makes it very attractive for moving parts of web-scale applications. When AWS adds languages such as Ruby, Python, and Java, and brings support for EC2, CloudTrail, RDS, and other custom event sources, Lambda’s power will grow considerably. It has the potential to become the focal point of the AWS cloud.
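A rough sketch of how a bill built from those three factors might be computed is shown below. It assumes that compute is charged per GB-second (allocated memory in GB multiplied by seconds of execution) on top of a per-request charge. The estimateCost function, its field names, and the rates in the usage example are placeholders for illustration only and are not actual AWS prices.

```javascript
// Sketch of a pricing model based on requests, execution time, and memory.
// Assumption: compute is charged per GB-second (memory in GB x seconds run)
// in addition to a flat charge per request.
function estimateCost(usage, rates) {
  var gbSeconds = usage.requests * usage.avgDurationSeconds *
                  (usage.memoryMb / 1024);
  var requestCharge = usage.requests * rates.perRequest;
  var computeCharge = gbSeconds * rates.perGbSecond;
  return {
    gbSeconds: gbSeconds,
    requestCharge: requestCharge,
    computeCharge: computeCharge,
    total: requestCharge + computeCharge
  };
}

// Placeholder rates for illustration only; NOT actual AWS prices.
var exampleRates = { perRequest: 0.25, perGbSecond: 0.5 };
var exampleUsage = { requests: 1000, avgDurationSeconds: 2, memoryMb: 512 };
console.log(estimateCost(exampleUsage, exampleRates));
```

The point of the sketch is the shape of the formula: a function that is idle costs nothing, and halving either the memory allocation or the execution time halves the compute portion of the bill.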