Invitation Digital Tech Blog

Building Scalable & Responsive Architecture

AWS Lambda: Event-driven systems made simple

Currently, the standard way to perform computation in the cloud is to build a persistent application and host it on a virtual machine (VM). Persistent deployments, with their time-based billing, have become the basic model of cloud computing on platforms such as AWS and Microsoft Azure. However, event-driven programming and its related architecture have been positioned as the next evolution in cloud-based architecture. Many parts of a modern infrastructure are inherently event-driven, or can be represented with an event-driven model, but managing and reacting to all those events can require a complex infrastructure. Amazon is one cloud provider hoping to greatly simplify the development of event-driven systems.

Introducing AWS Lambda

I recently had the opportunity to attend the annual AWS Summit at ExCeL London. There wasn’t a great deal of big news there, but that may be because I’d spoiled all the surprises for myself by reading up on the previous week’s summit in San Francisco. AWS CTO Werner Vogels gave an enthusiastic keynote in the morning session in which he boldly told the audience that their entire infrastructures would soon be hosted in the cloud. The message was that Hybrid IT - systems that are part on-premises and part public cloud - is simply a path to the cloud, not the destination, and Amazon would prefer that destination to be AWS. Hidden amongst the sales pitch and high-profile guest speakers was the announcement of a number of new services aimed at further enabling companies to focus on the products and features that differentiate them rather than on IT. One such service was AWS Lambda.

So what is AWS Lambda? In computer programming, lambda functions are small, throwaway anonymous functions. Functions in AWS Lambda are not lambda functions in the traditional sense, but they are small pieces of code that we can deploy as simple workers to respond to events without having to concern ourselves with the management of servers or event queues. They completely abstract the infrastructure needed to run our code. We never have to worry about the provisioning, sizing and monitoring of EC2 instances and can concentrate on creating useful functionality for our applications. EC2 made such an impact when it launched because it abstracted application stacks from data centre operations, and AWS Lambda takes this even further.

Node.js is currently the only supported runtime. Functions must be written in JavaScript and can make use of any Node.js-compatible libraries. No doubt other runtimes will be added in the future (and some people have already hacked in other languages such as Go). When an event happens, AWS launches your Node.js application and passes in the event information. These events can be generated by a user application or by a number of other AWS services, which currently include CloudTrail, S3, Kinesis and DynamoDB. If the number of concurrent events grows, AWS Lambda can automatically launch as many instances of a function as needed and just as quickly dispose of them. The AWS infrastructure can support running thousands of instances of a function in parallel across multiple AWS Availability Zones.

A simple example

So let’s have a look at just how simple Amazon has made this. If you log in to the AWS console and click on the Lambda link, you’ll be presented with the option to create a new Lambda function. The form that follows looks like this.

Image 1

A name is required and a description can optionally be entered. You can edit the code inline, or you can upload a zip file containing your JavaScript if you prefer to work outside of the browser (this is a necessity for all but simple functions). Amazon even provides a number of templates to get you started. An AWS Lambda function is a Node module which exports an object with one function - the handler - whose name must also be specified on the form. An execution role with the permissions that the function needs is also required, but Amazon simplifies this too: a number of pre-built IAM role policies can be used to create new roles from this form if suitable ones aren’t already set up. The only advanced options are the allocated memory and the function timeout.

The code in the previous screenshot is the Hello World sample provided by Amazon and is about as simple as an AWS Lambda function can be.

console.log('Loading function');

exports.handler = function (event, context) {
    console.log('value1 =', event.key1);
    console.log('value2 =', event.key2);
    console.log('value3 =', event.key3);
    context.succeed(event.key1);
};

The only permission that this function requires is the ability to write logs to Amazon CloudWatch. An appropriate execution role needs to be set up, and AWS Lambda will assume this role when executing the function. The event to which this function responds is similarly lightweight.

{
    "key1": "value1",
    "key2": "value2",
    "key3": "value3"
}

Amazon provides a simple UI for editing and testing our functions and events, as well as changing any of the options that we chose when first setting up our function.

Image 2

For those who prefer the command line and have the AWS Command Line Interface (CLI) correctly configured on their local machines, any of the previously described actions can also be completed via CLI commands.
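It can also be handy to run a handler on a local machine before uploading it. The following is a minimal sketch of that idea, not an official tool: it repeats the Hello World handler and calls it with the sample event and a stub context. The makeStubContext helper is hypothetical and only mimics the succeed, fail and done methods used in this article.

```javascript
// The Hello World handler from above, repeated so this file runs on its own.
var handler = function (event, context) {
    console.log('value1 =', event.key1);
    console.log('value2 =', event.key2);
    console.log('value3 =', event.key3);
    context.succeed(event.key1);
};

// Hypothetical stand-in for the context object that AWS Lambda passes in.
// It forwards the outcome to a Node-style callback.
function makeStubContext(callback) {
    return {
        succeed: function (result) { callback(null, result); },
        fail: function (err) { callback(err); },
        done: function (err, result) { callback(err || null, result); }
    };
}

var sampleEvent = { key1: 'value1', key2: 'value2', key3: 'value3' };
var lastResult;

handler(sampleEvent, makeStubContext(function (err, result) {
    lastResult = result;
    console.log('handler finished:', err, result);
}));
```

A stub like this says nothing about permissions, memory limits or timeouts, so it complements the console's test facility rather than replacing it.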

A more involved example

The samples provided by Amazon do a good job of showing the minimum code required to get AWS Lambda set up with various other AWS services, but most of them simply log the information sent with the event to Amazon CloudWatch. We need a more elaborate example if we are to see the potential of AWS Lambda. So, imagine that we have a number of vouchers stored in our database and we want to construct a list of popular vouchers based on how many times each one has been redeemed by customers. This problem is a good fit for AWS Lambda because each voucher redemption is an independent event and the popular vouchers list doesn’t need to be updated by the client. The popular vouchers list may be read very frequently, so it can’t be computed on every read.

For this example, we’ll use DynamoDB as our database due to the ease of setting it up to work with AWS Lambda. This is not necessarily a bad thing. As with many AWS services, DynamoDB is a managed service. It frees developers from the headaches of provisioning and configuring a distributed database cluster. It is also consistent and very fast. We’ll store our voucher records as items in a DynamoDB table, then denormalize this data by setting up a key in a separate query table against which we’ll store an array containing our popular vouchers. This duplicates some voucher data, but it means that the popular vouchers list can be returned efficiently to the client by looking up a single key.

We probably want to store records of each redemption (perhaps to make sure that customers don’t redeem a voucher too many times) but we’ll just focus on incrementing the redemption count in this example as this is the database operation which will trigger our Lambda. The following flow shows how we might solve the problem:

  1. Customer redeems a voucher and the redemption count in the voucher record is incremented in DynamoDB.
  2. Lambda function is invoked with the redemption event.
  3. Lambda views the new redemption count in the event and if it beats any of the old redemption counts in the popular vouchers list, updates the list.
  4. If changed, the popular voucher list is stored in a well-known DynamoDB key in a separate table to be read by everyone.
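Step 3 is the only part of the flow with real logic in it, so it is worth sketching on its own. The following is one possible approach, assuming the popular list is an array of { id, redemptions } entries capped at a fixed length. The entry shape, the MAX_POPULAR limit and the function names are all assumptions made for illustration.

```javascript
// Hypothetical limit on the length of the popular vouchers list.
var MAX_POPULAR = 3;

// Returns a new list with the voucher's latest count applied, sorted by
// redemptions (highest first) and trimmed to maxSize entries.
function applyRedemption(popular, voucherId, redemptions, maxSize) {
    // Drop any existing entry for this voucher so it is not listed twice.
    var updated = popular.filter(function (v) { return v.id !== voucherId; });
    updated.push({ id: voucherId, redemptions: redemptions });
    updated.sort(function (a, b) { return b.redemptions - a.redemptions; });
    return updated.slice(0, maxSize);
}

// True when the two lists differ, meaning step 4 has something to write.
function listChanged(before, after) {
    return JSON.stringify(before) !== JSON.stringify(after);
}

var popular = [
    { id: 'a', redemptions: 9 },
    { id: 'b', redemptions: 7 },
    { id: 'c', redemptions: 4 }
];

// A voucher with 2 redemptions does not make the list; one with 8 does.
var afterLow = applyRedemption(popular, 'd', 2, MAX_POPULAR);
var afterHigh = applyRedemption(popular, 'd', 8, MAX_POPULAR);
console.log(listChanged(popular, afterLow));  // false
console.log(listChanged(popular, afterHigh)); // true
```

Returning a new list rather than modifying the old one keeps the change check trivial, at the cost of re-sorting a list that is short anyway.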

We could take advantage of DynamoDB’s atomic counters to store the redemption count in the voucher record, as they allow concurrent updates of a numeric attribute. However, atomic counter updates are not idempotent: if the connection drops during an update, we have no way of knowing whether the increment was applied, and retrying may count the same redemption twice. That makes them a poor fit for business-critical data, which our redemption count most certainly is. In this case it is probably more suitable to build on DynamoDB’s conditional write functionality, where an update only succeeds if the item is still in the state we expect.
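To make the conditional write idea concrete, here is a minimal sketch that only builds the parameters for a DynamoDB UpdateItem request, assuming the Id and Redemptions attributes used elsewhere in this article. The buildConditionalIncrement helper is hypothetical, and the sketch stops short of calling AWS.

```javascript
// Builds UpdateItem parameters that set the count to expectedCount + 1, but
// only if the stored count still equals the value we read earlier.
function buildConditionalIncrement(tableName, voucherId, expectedCount) {
    return {
        TableName: tableName,
        Key: { Id: { N: String(voucherId) } },
        UpdateExpression: 'SET Redemptions = :next',
        ConditionExpression: 'Redemptions = :expected',
        ExpressionAttributeValues: {
            ':expected': { N: String(expectedCount) },
            ':next': { N: String(expectedCount + 1) }
        }
    };
}

// Example values only: voucher 42, last seen with 7 redemptions.
var params = buildConditionalIncrement('vouchers-table', 42, 7);
console.log(JSON.stringify(params, null, 2));

// With the aws-sdk these would be passed to ddb.updateItem(params, callback).
// A ConditionalCheckFailedException means the stored count has moved on, so
// the caller should re-read the item before deciding whether to retry.
```

Because the new value is stated explicitly, sending the same request twice cannot increment the count twice. A failed condition does not say whether our own earlier attempt or another customer's redemption changed the count, which is one reason to keep a separate record of each redemption.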

We can implement an API that uses Amazon’s DynamoDB Transaction Library to increment the redemption count in the voucher record at the same time as adding the redemption record to a redemptions table. Our vouchers DynamoDB table needs to be stream-enabled. DynamoDB Streams is new functionality that is currently still in preview. It allows a table to be configured to stream event data: add, update, and delete operations on the table then send records to the stream. AWS Lambda can be configured to poll the DynamoDB stream and invoke our Lambda function when it detects new data. We just need to create an event source mapping in AWS Lambda to associate the stream with our Lambda function, which can easily be done via the DynamoDB UI or with a single AWS CLI command. The following JSON shows an example of what we may expect from incrementing the redemption count. Note that this stream is configured to send both the old and the updated item image when the voucher record is modified.

{  
    "Records":[  
       {  
           "EventName":"MODIFY",
           "EventVersion":"1.0",
           "EventSource":"aws:dynamodb",
           "Dynamodb":{  
               "NewImage":{  
                   "Id":{  
                       "N":"{voucher-id}"
                   },
                   "Redemptions":{
                       "N":"1"
                   },
                   // more attributes...
               },
               "SizeBytes":{item-size},
               "StreamViewType":"NEW_AND_OLD_IMAGES",
               "SequenceNumber":"{sequence-number}",
               "OldImage":{  
                   "Id":{  
                       "N":"{voucher-id}"
                   },
                   "Redemptions":{
                       "N":"0"
                   },
                   // more attributes...
               },
               "Keys":{  
                   "Id":{  
                       "N":"{voucher-id}"
                   }
               }
           },
           "EventID":"{event-id}",
           "eventSourceARN":"arn:aws:dynamodb:eu-west-1:{account-id}:table/vouchers-table/stream/{stream-id}/",
           "AwsRegion":"eu-west-1"
       }
    ]
}
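A small helper can pull the interesting values out of a record like this one. The sketch below assumes the record layout shown above, with a made-up voucher id of 17 in the sample data. Record formats can differ between service versions, so the hypothetical readRedemptionChange helper accepts either a Dynamodb or a dynamodb key.

```javascript
// Extracts the voucher id and the old and new redemption counts from one
// stream record. Returns null when the record has no new image with a
// Redemptions attribute (for example, after a delete).
function readRedemptionChange(record) {
    var body = record.Dynamodb || record.dynamodb; // tolerate either casing
    if (!body || !body.NewImage || !body.NewImage.Redemptions) {
        return null;
    }
    var oldImage = body.OldImage;
    return {
        voucherId: body.Keys.Id.N,
        // DynamoDB sends numbers as strings, so convert them explicitly.
        oldCount: oldImage && oldImage.Redemptions ? Number(oldImage.Redemptions.N) : 0,
        newCount: Number(body.NewImage.Redemptions.N)
    };
}

// Sample record trimmed to the fields the helper reads; the id is made up.
var sampleRecord = {
    EventName: 'MODIFY',
    Dynamodb: {
        Keys: { Id: { N: '17' } },
        OldImage: { Id: { N: '17' }, Redemptions: { N: '0' } },
        NewImage: { Id: { N: '17' }, Redemptions: { N: '1' } }
    }
};

var change = readRedemptionChange(sampleRecord);
console.log(change);
```

Keeping this parsing in one place means that a change in the record format only has to be handled once.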

This record can then be passed to our Lambda function. The following code snippet shows how we might cover each of the steps in the flow described previously. This Lambda function will need an execution role that not only allows it to log to CloudWatch but also grants it read and write access to DynamoDB.

var AWS = require('aws-sdk');
var async = require('async');

exports.handler = function(event, context) {
    var ddb = new AWS.DynamoDB();
    console.log("Event: %j", event);
    async.waterfall([
      function getPopularVouchers(next) {
          var params = {
              // popular vouchers list record info ...
          };
          ddb.getItem(params, function(err, data) {
              if (err) {
                  return next(err); // skip the remaining steps if the read fails
              }
              // pass the popular vouchers to the next step
              next(null, data.Item);
          });
      },
      function readNewRedemptions(popularVouchers, next) {
          var newPopularVouchers = false;
          for (var i = 0; i < event.Records.length; ++i) {
              /* for all the new redemption counts, see if any of them beat the old redemption counts and, if so,
                 insert into the list and set newPopularVouchers to true ... */
          }
          // if the list has changed, update the popular vouchers item
          if (newPopularVouchers) {
              next(null, popularVouchers);
          } else {
              context.done(); // return if there is no change
          }
      },
      function writeNewPopularVouchers(popularVouchers, next) {
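          ddb.putItem({
              Item: popularVouchers,
              // hypothetical name for the separate query table; writing to the
              // stream-enabled vouchers table would trigger this function again
              TableName: "popular-vouchers-table"
          }, function(err, data) {
              next(err);
          });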
    ], function(err) {
        if (err) {
            console.error("Failure! " + err);
        }
        context.done(err); // pass any error back to AWS Lambda
    });
};

Our function should be able to run upon each update of our vouchers table; however, it could also be run on batches of records in our DynamoDB stream in order to reduce the number of calls to the Lambda function. This may be preferable if our popular vouchers list is not required to be always up-to-date.
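When records arrive in batches, the same voucher may appear several times. One way to handle this, sketched below, is to collapse the batch to the highest count seen per voucher before touching the popular list. This assumes the record layout from the earlier JSON example and that redemption counts only ever increase. The highestCounts name and the sample ids are made up for illustration.

```javascript
// Collapses a batch of stream records into a map from voucher id to the
// highest redemption count seen for that voucher in the batch.
function highestCounts(records) {
    var counts = {};
    records.forEach(function (record) {
        var image = record.Dynamodb && record.Dynamodb.NewImage;
        if (!image || !image.Redemptions) {
            return; // nothing to count, e.g. a deleted voucher
        }
        var id = image.Id.N;
        var count = Number(image.Redemptions.N);
        if (!(id in counts) || count > counts[id]) {
            counts[id] = count;
        }
    });
    return counts;
}

// Hypothetical helper that builds a trimmed-down record for the demo.
function makeRecord(id, redemptions) {
    return { Dynamodb: { NewImage: { Id: { N: id }, Redemptions: { N: redemptions } } } };
}

var batch = [makeRecord('1', '2'), makeRecord('2', '1'), makeRecord('1', '3')];
var counts = highestCounts(batch);
console.log(counts); // voucher 1 maps to 3, voucher 2 maps to 1
```

The popular list then only needs to be compared against one count per voucher, however large the batch is.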

Final thoughts

This example would need to be greatly fleshed out before it would be suitable for a production environment, but it has hopefully outlined a useful practical application of the AWS Lambda service. AWS Lambda is a further abstraction of the infrastructure behind our applications and reduces what we have to build and maintain to the absolute minimum. We can build a scalable, event-based infrastructure which encourages us to split tasks into many distinct Lambda functions that we might not have wanted to split previously because of operational complexity. Doing so would allow us to build a service-oriented architecture without the additional operational overhead. AWS Lambda also takes care of the messaging layer between services, which further reduces the complexity of building and maintaining a service-oriented architecture.

The pricing model is aligned with the success of our applications, which means that our infrastructure spend can be tied directly to our applications’ usage. In fact, the free tier includes 1 million free requests and up to 3.2 million seconds of compute time per month, depending on the amount of memory allocated per function. The threat of vendor lock-in may scare some teams away, but you can limit this by making sure that your code is not too tightly coupled to Amazon’s API.

There are still a few things missing from the service. The lack of support for more AWS services may be a big issue for some, as will Lambda functions being JavaScript only; however, solutions to both of these problems appear to be on Amazon’s roadmap. If Amazon delivers, AWS Lambda could become a big part of building apps on AWS in the future.