As you said, messages moved to the DLQ are considered erroneous.
If the messages are erroneous because of a bug in the code, redrive them from the DLQ to the source queue once you have fixed the bug, so that they get another chance to be processed.
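For illustration, a redrive can also be started programmatically. This is only a minimal sketch, assuming boto3 and its SQS `start_message_move_task` call; the function name `redrive_dlq` and the ARN in the usage comment are hypothetical placeholders, not something from your setup.

```python
def redrive_dlq(sqs_client, dlq_arn, source_arn=None, max_per_second=None):
    """Start an SQS message move task that redrives messages out of the DLQ.

    sqs_client is expected to be a boto3 SQS client (or anything exposing
    start_message_move_task with the same keyword arguments).
    """
    params = {"SourceArn": dlq_arn}
    # If no destination is given, SQS sends each message back to the
    # source queue it originally came from.
    if source_arn is not None:
        params["DestinationArn"] = source_arn
    # Optional throttle so the consumer is not flooded by the redrive.
    if max_per_second is not None:
        params["MaxNumberOfMessagesPerSecond"] = max_per_second
    response = sqs_client.start_message_move_task(**params)
    return response["TaskHandle"]


# Usage (needs boto3 and AWS credentials; the ARN is a made-up placeholder):
#   import boto3
#   handle = redrive_dlq(
#       boto3.client("sqs"),
#       "arn:aws:sqs:us-east-1:123456789012:my-dlq",
#   )
```

Passing the client in as a parameter keeps the function easy to try out with a stub before pointing it at a real queue.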
It is very unlikely that "temporarily" erroneous messages end up in the DLQ if you have already configured maxReceiveCount as 3 or more on your source queue. Temporary problems are mostly absorbed by this retry configuration.
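In case it helps, this is roughly how that retry configuration looks when set through boto3. It is a sketch under assumptions: the helper names `build_redrive_policy` and `attach_dlq` are made up for this example, while `RedrivePolicy`, `deadLetterTargetArn` and `maxReceiveCount` are the attribute names SQS itself uses.

```python
import json


def build_redrive_policy(dlq_arn, max_receive_count=3):
    """Return the JSON string SQS expects in the RedrivePolicy attribute."""
    if max_receive_count < 1:
        raise ValueError("max_receive_count must be at least 1")
    return json.dumps(
        {
            "deadLetterTargetArn": dlq_arn,
            # A message goes to the DLQ once it has been received more
            # than this many times without being deleted.
            "maxReceiveCount": max_receive_count,
        }
    )


def attach_dlq(sqs_client, source_queue_url, dlq_arn, max_receive_count=3):
    """Point the source queue at the DLQ with the given retry budget."""
    sqs_client.set_queue_attributes(
        QueueUrl=source_queue_url,
        Attributes={
            "RedrivePolicy": build_redrive_policy(dlq_arn, max_receive_count)
        },
    )
```

With a value of 3, a message that fails once or twice because of a short outage is simply received again instead of landing in the DLQ.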
In the end, the DLQ is just an ordinary SQS queue, which retains messages for at most 14 days (depending on its retention setting). Even if there are thousands of messages in it, they will eventually expire. At this point, there are two options:
- The messages in the DLQ are "really" erroneous. Check the metrics, messages and logs to identify the root cause. If there is no bug to fix, you are just keeping data you don't need in the DLQ, so there is nothing wrong with losing it after 14 days. If there is a bug, fix it and simply redrive the messages from the DLQ to the source queue.
- You don't want to dig through the messages to find out why they failed, and you only want to persist the message data for historical reasons (god knows why). You can create a Lambda function that polls the messages and persists them in a target database of your choice.
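For the second option, a Lambda handler could look roughly like the sketch below. It assumes the DLQ is wired to the function as an SQS event source, so the handler receives the usual `Records` list. `to_item`, `make_handler` and `save_item` are hypothetical names for this example, and the DynamoDB table in the usage comment is only one possible target, not a recommendation.

```python
def to_item(record):
    """Flatten one SQS record from the Lambda event into a plain dict."""
    attrs = record.get("attributes", {})
    return {
        "message_id": record["messageId"],
        "body": record["body"],
        "sent_timestamp": attrs.get("SentTimestamp"),
        "receive_count": attrs.get("ApproximateReceiveCount"),
        "source_arn": record.get("eventSourceARN"),
    }


def make_handler(save_item):
    """Build a Lambda handler that passes every DLQ message to save_item.

    save_item is any callable taking one dict, for example a thin wrapper
    around your database client.
    """

    def handler(event, context):
        records = event.get("Records", [])
        for record in records:
            # If save_item raises, the invocation fails and the batch
            # stays in the DLQ to be delivered again.
            save_item(to_item(record))
        return {"persisted": len(records)}

    return handler


# Usage (hypothetical DynamoDB table name, needs boto3):
#   import boto3
#   table = boto3.resource("dynamodb").Table("dlq-archive")
#   lambda_handler = make_handler(lambda item: table.put_item(Item=item))
```

When the handler returns without an error, Lambda deletes the batch from the DLQ, so the persisted copy becomes the only one.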