SQS-Lambda DLQ issue

I’m trying to upgrade the version of LocalStack used by some tests from v0.14.5 to v3.8.1. Infrastructure is deployed using CloudFormation templates. The key components are:

  • Input SQS queue
  • SQS dead-letter/re-drive queue
  • Java 11 Lambda function, configured to read from the input SQS queue and write to a DynamoDB table

The happy-path test sends a message to the input queue and waits until the Lambda writes an entry to DynamoDB.
The failure test sends an invalid message to the input queue and waits until the message ends up on the dead-letter queue.

With LocalStack 0.14.5, both tests run fine. With LocalStack 3.8.1, the second test fails because the invalid message never gets sent to the DLQ, and I’m struggling to work out why. In both cases, I can see that the Lambda throws an exception. With 0.14.5 I see the following logs:

localstack_service-1 | localstack.services.awslambda.lambda_executors.InvocationException: Lambda process returned with error. Result: {"errorType":"java.lang.NullPointerException" ...

and

localstack_service-1 | 2024-10-24T19:32:58.522:INFO:localstack.utils.aws.dead_letter_queue: Sending failed execution arn:aws:sqs:eu-west-1:000000000000:product-updated-input-queue to dead letter queue arn:aws:sqs:eu-west-1:000000000000:product-updated-input-queue-dead-letter

With 3.8.1 I see:

localstack_service-1 | 2024-10-24T20:04:58.345 DEBUG --- [functhread17] l.s.l.e.s.lambda_sender : Pipe target function arn:aws:lambda:eu-west-1:000000000000:function:product-updated failed with FunctionError Unhandled. Payload: {'errorMessage': 'java.lang.NullPointerException', 'errorType': 'java.lang.NullPointerException', ...

But there is nothing about the message being sent to the DLQ. I’m using the exact same CloudFormation template in both cases.

Does anyone have any suggestions as to what I might be doing wrong, or what additional logging or config I could try? Or could this be a change in LocalStack behaviour?

Thanks

I think I may have found the reason for this. Our CloudFormation templates set DeadLetterQueueReceiveCount to 3 and a high value for VisibilityTimeout, so the failed message DOES eventually end up on the DLQ - just nowhere near as fast as it used to. Having now parameterised those values so I can set a receive count of 1 and visibility timeout of 10 seconds when testing, the DLQ check is working again. Not sure how to check, but I wonder if those parameters were not supported via CF in the previous version (0.14.5) we were using?
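
As a rough illustration of why lowering those two values speeds the test up, here is a minimal Python sketch. It assumes that each failed receive hides the message for the full visibility timeout and that SQS moves the message to the DLQ once its receive count exceeds maxReceiveCount; polling latency is ignored. The function names are hypothetical, and the 300-second timeout is an illustrative value, not one taken from the templates above. The attribute dictionary uses the string form that the SQS SetQueueAttributes API expects.

```python
import json


def estimated_dlq_delay_seconds(max_receive_count: int, visibility_timeout: int) -> int:
    """Rough lower bound on how long an always-failing message takes to reach the DLQ.

    Assumption: every failed receive hides the message for the full
    visibility timeout, and the message is only moved to the DLQ after it
    has been received max_receive_count times. Polling latency is ignored.
    """
    if max_receive_count < 1:
        raise ValueError("max_receive_count must be at least 1")
    if visibility_timeout < 0:
        raise ValueError("visibility_timeout must not be negative")
    return max_receive_count * visibility_timeout


def queue_attributes(dlq_arn: str, max_receive_count: int, visibility_timeout: int) -> dict:
    """Build SQS queue attributes as strings, with RedrivePolicy as a JSON string."""
    return {
        "VisibilityTimeout": str(visibility_timeout),
        "RedrivePolicy": json.dumps(
            {
                "deadLetterTargetArn": dlq_arn,
                "maxReceiveCount": str(max_receive_count),
            }
        ),
    }


if __name__ == "__main__":
    # DLQ ARN as it appears in the 0.14.5 log line earlier in the thread.
    dlq_arn = "arn:aws:sqs:eu-west-1:000000000000:product-updated-input-queue-dead-letter"

    # Test settings from the post: receive count 1, visibility timeout 10 s.
    print(estimated_dlq_delay_seconds(1, 10))    # 10
    # Receive count 3 with a hypothetical 300 s timeout.
    print(estimated_dlq_delay_seconds(3, 300))   # 900
    print(queue_attributes(dlq_arn, 1, 10))
```

A test that waits for the DLQ message would need a timeout comfortably above this estimate, which is why a receive count of 3 combined with a long visibility timeout can look like the message never arrives.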

Hello @jonp,

Thank you for your follow-up message. Indeed, the upgrade from 0.14.5 to 3.8.1 has introduced numerous enhancements across all the services you have been using. It is difficult to pinpoint exactly when support for specific operations was implemented. I would suggest checking our release notes in the localstack/localstack GitHub repository.

@jonp Our latest Lambda event source mappings implementation comes with a lot of improvements and more transparent behavior coverage documentation.

Check out our new Lambda event source mapping (ESM) implementation in LocalStack v4, which offers better functional and non-functional support for ESM 🚀. The behavioral coverage documentation in the Lambda docs provides more details for your use case.

Feel free to create a GitHub issue or share your feedback if you experience any challenges 👂