Unlocking Scalability: Harnessing the Power of SQS and Lambda Integration

AWS Simple Queue Service (SQS) provides a reliable and scalable messaging solution, while AWS Lambda offers serverless computing capabilities. Combining these two services can enable robust and event-driven architectures. In this blog post, we will explore the integration between SQS and Lambda in-depth, understanding the mechanisms behind message consumption and processing. We will delve into the options available for processing messages individually or in batches, highlighting the advantages and considerations for each approach. Additionally, we will discuss noteworthy configurations for Lambda event source mappings and SQS, such as batch size, batch window, maximum concurrency, and queue visibility timeout.

Understanding SQS and Lambda Integration

SQS and Lambda provide a powerful integration for building scalable and event-driven architectures. With a Lambda event source mapping, Lambda polls the queue on your behalf and invokes your function with the messages it receives, which decouples producers from consumers while keeping message delivery reliable. Lambda uses long polling to wait for messages to arrive and consumes them in batches, invoking your function once per batch. If more messages remain in the queue, Lambda scales out: at the time of writing, up to 1,000 concurrent function invocations, adding up to 60 per minute. When a batch is processed successfully, Lambda automatically deletes its messages from the queue.

Consume messages from SQS using Lambda

When working with SQS, you can process messages individually or in batches. The Lambda Event Source Mapping supports larger batches, allowing up to 10,000 messages or 6 MB in a single batch for standard SQS queues. In contrast, the SDK is limited to 10 messages per API call. This is one of the reasons why you should use Lambda Event Source Mapping whenever possible when you consume messages from SQS.

Processing messages individually offers advantages such as lower per-message latency and simpler error handling. However, there are situations where batch processing is more appropriate: when you need higher throughput, want fewer invocations for the same amount of work, or need to optimize cost.

Processing individual messages

Consuming individual SQS messages is relatively simple in Lambda. With a batch size of 1, each message is delivered as its own event and triggers one execution of your function. It is still important to implement appropriate error handling to capture any exceptions that occur during message processing. When your function processes a message successfully, Lambda automatically deletes it from the queue. If the function throws an error instead, the message is not deleted and becomes visible in the queue again once the visibility timeout expires, so it can be retried. This ensures that messages are not lost when errors occur and that failed messages can be reprocessed.

Here's an example of a Lambda Function written in TypeScript that consumes individual messages:

import { SQSHandler, SQSEvent } from "aws-lambda";

export const handler: SQSHandler = async (event: SQSEvent) => {
  try {
    for (const record of event.Records) {
      const message = record.body;
      // Process the message here
      console.log("Processing message:", message);
    }
  } catch (error) {
    console.error("Error processing SQS messages:", error);
    throw error;
  }
};

Processing batches of messages

When Lambda processes a batch of messages from an SQS queue, the messages remain in the queue but are temporarily hidden based on the queue's visibility timeout (more on this later in this blog post). If your Lambda function successfully handles and processes the batch, Lambda automatically deletes the messages from the queue. However, if your function encounters an error while processing a batch, all messages in that batch reappear in the queue, making them visible again.

To avoid reprocessing messages that were already handled successfully, you have a couple of options. First, you can include ReportBatchItemFailures in the FunctionResponseTypes of your event source mapping and return the IDs of the failed messages in a batchItemFailures list in the function response. Lambda then deletes the successful messages and returns only the failed ones to the queue. This is the recommended approach when dealing with batches. Alternatively, you can call the Amazon SQS DeleteMessage API action to explicitly remove each message from the queue as soon as your function has processed it, so that it is not redelivered if a later message in the batch fails.

I will provide two examples of how you can use ReportBatchItemFailures to report partial failures, so that only the failed messages are retried. The first constructs the batchItemFailures function response by hand, and the second uses the Middy middleware.

⚠️ Please note that you need to configure your Lambda event source mapping to include batch item failures for these examples to work.
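As a rough sketch, this is what the relevant part of the mapping could look like when written as request parameters for Lambda's CreateEventSourceMapping API. The queue ARN and function name are placeholders, and the local type only exists to keep the snippet self-contained.

```typescript
// Minimal local type covering only the fields used here (an assumption for
// illustration; the AWS SDK ships its own, much larger input type).
type EventSourceMappingParams = {
  EventSourceArn: string;
  FunctionName: string;
  FunctionResponseTypes?: "ReportBatchItemFailures"[];
};

// Placeholder ARN and function name; replace them with your own resources.
const mappingParams: EventSourceMappingParams = {
  EventSourceArn: "arn:aws:sqs:eu-north-1:123456789012:example-queue",
  FunctionName: "example-consumer",
  // Tells Lambda to read batchItemFailures from the function response.
  FunctionResponseTypes: ["ReportBatchItemFailures"],
};

console.log(JSON.stringify(mappingParams, null, 2));
```

Without this setting, Lambda ignores the batchItemFailures list and treats the invocation as a success as long as the function does not throw.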

Here's an example of a Lambda Function written in TypeScript that consumes a batch of messages and returns BatchItemFailures in the function response:

import { SQSEvent, SQSHandler, SQSBatchResponse } from 'aws-lambda';

export const handler: SQSHandler = async (event: SQSEvent) => {
  const failedMessageIds: string[] = [];

  for (const record of event.Records) {
    try {
      const message = record.body;
      console.log("Processing message:", message);
      // Process the message here
    } catch (error) {
      failedMessageIds.push(record.messageId);
    }
  }

  const response: SQSBatchResponse = {
    batchItemFailures: failedMessageIds.map((id) => ({
      itemIdentifier: id,
    })),
  };

  return response;
};
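To see the partial failure response in action without deploying anything, the same loop can be pulled into a small helper and run against a fake batch. This is only a sketch: the types are local stand-ins for the ones in "aws-lambda", and processMessage is hypothetical business logic that rejects bodies that are not valid JSON.

```typescript
// Minimal local stand-ins for the "aws-lambda" types so the sketch runs
// without extra packages.
type QueueRecord = { messageId: string; body: string };
type BatchResponse = { batchItemFailures: { itemIdentifier: string }[] };

// Hypothetical business logic: throws when the body is not valid JSON.
async function processMessage(record: QueueRecord): Promise<void> {
  JSON.parse(record.body);
}

// Runs the processor for each record and collects the IDs of those that threw.
async function processBatch(
  records: QueueRecord[],
  processor: (record: QueueRecord) => Promise<void>,
): Promise<BatchResponse> {
  const batchItemFailures: { itemIdentifier: string }[] = [];
  for (const record of records) {
    try {
      await processor(record);
    } catch (error) {
      console.error(`Message ${record.messageId} failed:`, error);
      batchItemFailures.push({ itemIdentifier: record.messageId });
    }
  }
  return { batchItemFailures };
}

// Fake batch: the first body is valid JSON, the second is not.
const sampleRecords: QueueRecord[] = [
  { messageId: "id-1", body: '{"orderId": 1}' },
  { messageId: "id-2", body: "not json" },
];

processBatch(sampleRecords, processMessage).then((response) => {
  console.log(JSON.stringify(response));
});
```

Only id-2 ends up in batchItemFailures, so Lambda would delete id-1 from the queue and return id-2 for a retry.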

Here's an example of a Lambda Function written in TypeScript that consumes a batch of messages using the sqs-partial-batch-failure Middy middleware. This code simplifies the consumption of message batches. It automatically includes the failed message IDs as BatchItemFailures in the function response, eliminating the need for manual error tracking and reducing development overhead.

import middy from '@middy/core';
import sqsPartialBatchFailureMiddleware from '@middy/sqs-partial-batch-failure';
import { SQSEvent, SQSRecord } from 'aws-lambda';

async function mainHandler(event: SQSEvent): Promise<any> {
  const messagePromises = event.Records.map(processMessage);
  return Promise.allSettled(messagePromises);
}

async function processMessage(record: SQSRecord): Promise<any> {
  const message = record.body;
  console.log("Processing message:", message);
  // Process the message here
}

export const handler = middy(mainHandler).use(sqsPartialBatchFailureMiddleware());

Noteworthy configurations

Lambda Event Source Mapping

Batch size

The batch size (BatchSize) configuration determines the maximum number of records sent to the Lambda function in each batch. For standard queues, you can set the batch size to a maximum of 10,000 records. However, the total size of the batch cannot exceed 6 MB, regardless of the number of records configured, so a batch may contain fewer records than the configured batch size.

Batch window

The batch window (MaximumBatchingWindowInSeconds) setting determines the maximum time, in seconds, that records are gathered before invoking the Lambda function. Please note that this configuration applies only to standard queues.

Maximum Concurrency

The maximum concurrency setting (MaximumConcurrency in ScalingConfig) lets you cap the number of concurrent invocations for an event source mapping. This avoids the excessive throttling that used to occur when reserved concurrency was the only way to limit an SQS-triggered function. It also gives you better control over concurrency when several event source mappings target the same function.
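The three settings above interact, so it can help to check them together before deploying. The validator below is a minimal sketch for standard queues. The field names mirror the CreateEventSourceMapping API, while the type, the function name, and the limits encoded here reflect my reading of the AWS documentation and should be verified against the current docs.

```typescript
// Subset of the event source mapping settings discussed above. The type is a
// local stand-in, not the SDK's own input type.
type SqsMappingSettings = {
  BatchSize: number;
  MaximumBatchingWindowInSeconds: number;
  ScalingConfig?: { MaximumConcurrency: number };
};

// Returns a list of human-readable problems for a standard (non-FIFO) queue.
function validateStandardQueueSettings(settings: SqsMappingSettings): string[] {
  const problems: string[] = [];
  const { BatchSize, MaximumBatchingWindowInSeconds, ScalingConfig } = settings;

  if (BatchSize < 1 || BatchSize > 10000) {
    problems.push("BatchSize must be between 1 and 10,000");
  }
  if (MaximumBatchingWindowInSeconds < 0 || MaximumBatchingWindowInSeconds > 300) {
    problems.push("MaximumBatchingWindowInSeconds must be between 0 and 300");
  }
  // Batches larger than 10 need a batch window of at least one second.
  if (BatchSize > 10 && MaximumBatchingWindowInSeconds < 1) {
    problems.push("BatchSize above 10 requires MaximumBatchingWindowInSeconds >= 1");
  }
  if (
    ScalingConfig &&
    (ScalingConfig.MaximumConcurrency < 2 || ScalingConfig.MaximumConcurrency > 1000)
  ) {
    problems.push("MaximumConcurrency must be between 2 and 1,000");
  }
  return problems;
}

// A large batch size without a batch window is reported as a problem.
console.log(
  validateStandardQueueSettings({ BatchSize: 100, MaximumBatchingWindowInSeconds: 0 }),
);
```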

SQS configurations

Queue visibility timeout

The visibility timeout (VisibilityTimeout) in SQS determines how long a message remains invisible in the queue after a consumer has retrieved it. When a consumer receives a message, the message is temporarily hidden from other consumers for the duration of the visibility timeout. If you choose a batch window greater than 0 seconds, you need to account for the additional waiting time in your queue's visibility timeout. It's recommended to set the visibility timeout to at least six times your function's timeout, plus the value of the batch window (MaximumBatchingWindowInSeconds). This leaves sufficient time for your Lambda function to process each batch of events and for retries caused by throttling errors. For example, with a function timeout of 30 seconds and a batch window of 20 seconds, the visibility timeout should be at least (30 x 6) + 20 = 200 seconds.

You can read more about this here: https://docs.aws.amazon.com/lambda/latest/dg/with-sqs.html#events-sqs-eventsource
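The recommendation is simple enough to encode as a small helper. This is a sketch with a function name of my own choosing, not something provided by AWS.

```typescript
// Recommended minimum queue visibility timeout in seconds:
// six times the function timeout plus the batch window.
function recommendedVisibilityTimeout(
  functionTimeoutSeconds: number,
  batchWindowSeconds: number,
): number {
  return functionTimeoutSeconds * 6 + batchWindowSeconds;
}

// The example from the text: 30 second timeout, 20 second batch window.
console.log(recommendedVisibilityTimeout(30, 20)); // 200
```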

Dead-letter queues (DLQs)

When a message fails to be processed by Lambda, it becomes visible in the queue again and is retried. To prevent a message that can never succeed from being retried indefinitely and consuming Lambda resources unnecessarily, it is recommended to designate a dead-letter queue (DLQ) and move failed messages there. To control the number of retries, set the Maximum receives value (maxReceiveCount) in the redrive policy of the source queue. Once a message has been received more times than this value without being deleted, SQS moves it to the DLQ. This allows you to inspect and process these failed messages at a later time, separate from the main queue.

You can read more about when to use a DLQ here: https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-dead-letter-queues.html#sqs-dead-letter-queues-when-to-use
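In the SQS API, the DLQ is attached through the RedrivePolicy attribute of the source queue, whose value is a JSON string containing deadLetterTargetArn and maxReceiveCount. The helper below is a minimal sketch of building that value. The function name and the ARN are placeholders, and the receive count of 5 is only an example.

```typescript
// Builds the value of the source queue's RedrivePolicy attribute.
// SQS expects this attribute as a JSON string.
function buildRedrivePolicy(deadLetterQueueArn: string, maxReceiveCount: number): string {
  if (!Number.isInteger(maxReceiveCount) || maxReceiveCount < 1) {
    throw new Error("maxReceiveCount must be a positive integer");
  }
  return JSON.stringify({
    deadLetterTargetArn: deadLetterQueueArn,
    maxReceiveCount,
  });
}

// Placeholder ARN; the policy is set on the source queue, not on the DLQ.
const redrivePolicy = buildRedrivePolicy(
  "arn:aws:sqs:eu-north-1:123456789012:example-dlq",
  5,
);
console.log(redrivePolicy);
```

A very low maxReceiveCount can send messages to the DLQ after a transient error such as throttling, so it is worth leaving room for a few retries.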

Conclusion

In conclusion, the integration between SQS and Lambda provides a powerful and scalable solution for building event-driven architectures. By leveraging event source mappings, configuring batch processing, and utilizing noteworthy features like dead-letter queues and visibility timeouts, you can ensure reliable message processing and optimize resource utilization. Embrace this integration to unlock the full potential of distributed systems and create resilient applications that scale effortlessly.


Elva is a serverless-first consulting company that can help you transform or begin your AWS journey for the future