SQS Messages Not Deleting

I have a small set of messages in an SQS queue that are never deleted, even though the deletion request sent to the AWS endpoint returns a 200 response. My application processes the messages fine, and the deletion request is sent without error.

I'm using the Java AWS SDK 1.3.6.

Has anyone else experienced this problem?

Overhasty answered 17/4, 2012 at 15:35 Comment(3)
Are these SQS messages not deleted at all or does it just take a few seconds? Could you show us some code? - Hardenberg
Hi Daan. They're never deleted, or at least they haven't been in the last few hours. I could show you some code, but it's just regular use of the AWS SDK, so there's not much point! - Overhasty
Hmm, I've never had this problem. Can you delete them manually from the AWS web management console? Is this affecting all of your SQS queues or just this one? If you just have one, could you try creating a new queue and seeing whether the same code works for deleting messages from that queue? I'm not seeing any issues on the AWS service status page, so I don't think SQS is acting up. Could be, though. - Hardenberg

Whoops - the queue was accidentally set to defaultVisibilityTimeout=0. Changing this to a positive value fixed the problem.
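For anyone wanting to check for this condition from code, here is a minimal sketch in Python. It assumes a boto3-style SQS client (something like `boto3.client("sqs")`) is passed in. The function name and the 30-second fallback are made up for illustration and are not from the original setup, which used the Java SDK.

```python
def ensure_positive_visibility_timeout(client, queue_url, fallback_seconds=30):
    """Return the queue's default visibility timeout in seconds.

    If the queue is configured with a timeout of 0, raise it to
    fallback_seconds (a hypothetical default chosen for this sketch).
    """
    response = client.get_queue_attributes(
        QueueUrl=queue_url,
        AttributeNames=["VisibilityTimeout"],
    )
    # SQS returns attribute values as strings.
    current = int(response["Attributes"]["VisibilityTimeout"])
    if current > 0:
        return current

    client.set_queue_attributes(
        QueueUrl=queue_url,
        Attributes={"VisibilityTimeout": str(fallback_seconds)},
    )
    return fallback_seconds
```

A sensible timeout is one comfortably longer than the slowest message takes to process, so the value above should be adjusted to the workload.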

This still raises a few questions though:

  1. Why did this only affect some messages? Perhaps some took longer to process?
  2. Why did Amazon return a 200 for delete when the messages weren't being deleted?
  3. Was the deletion failing because it fell outside of the 0-second window (in which case why did any deletion requests succeed?), or did they fail because another consumer had picked them up by the time the deletion request was received?
Overhasty answered 18/4, 2012 at 10:21 Comment(5)
Hmm. That's mysterious. I don't know the answers to these questions either but I'd love to hear if there's anyone who does. Thanks for sharing the solution, by the way! - Hardenberg
I am experiencing this exact same thing right now using the JavaScript SDK. My timeout was 0 also, but increasing it did not help. I too am getting success messages, but only when I delete via the AWS console does it work. Any updates from anyone? - Ebberta
@Hardenberg can you check my answer and confirm it is the case? Thanks! - Liponis
I had the same problem using the PHP SDK. Changing defaultVisibilityTimeout to not be zero also fixed it for me. - Sweettalk
This also seems to happen if you specify a zero Visibility Timeout per receive message request, rather than as the default on the queue. Hypothesis: you cannot delete messages at all if they are not currently in a visibility timeout (delete will always falsely "succeed"). - Overcome
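The hypothesis in the last comment fits what the DeleteMessage documentation says about receipt handles: a delete must use the handle from the most recent receive of that message, otherwise the request succeeds but the message may not be deleted. With a timeout of 0, a message can presumably be received again before the delete arrives, which would make the earlier handle stale. Below is a sketch of a receive-and-delete loop that always passes a positive per-receive timeout. It assumes a boto3-style client; the function name, batch size, and wait time are illustrative choices only.

```python
def drain_once(client, queue_url, handler, visibility_timeout=30):
    """Receive one batch, process each message, then delete it.

    Each delete uses the receipt handle returned by this receive call.
    Returns the number of delete requests issued.
    """
    if visibility_timeout <= 0:
        # Guard against the zero-timeout case discussed in this thread.
        raise ValueError("visibility_timeout must be positive")

    response = client.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,
        VisibilityTimeout=visibility_timeout,  # per-receive override
        WaitTimeSeconds=5,
    )
    deleted = 0
    # The "Messages" key is absent when the queue returned nothing.
    for message in response.get("Messages", []):
        handler(message["Body"])
        client.delete_message(
            QueueUrl=queue_url,
            ReceiptHandle=message["ReceiptHandle"],
        )
        deleted += 1
    return deleted
```

If `handler` raises, the message is not deleted and becomes visible again once the timeout expires, which is usually the desired retry behaviour.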

Official documentation (version 1.9.13)

IMPORTANT: It is possible you will receive a message even after you have deleted it. This might happen on rare occasions if one of the servers storing a copy of the message is unavailable when you request to delete the message. The copy remains on the server and might be returned to you again on a subsequent receive request. You should create your system to be idempotent so that receiving a particular message more than once is not a problem.
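As a rough illustration of the idempotency advice in that quote, here is a sketch of a consumer that skips messages it has already handled, keyed on the message's `MessageId` field. The class name is hypothetical, and the in-memory set is an assumption made to keep the example short. A real system would need a durable store shared by all consumers.

```python
class IdempotentProcessor:
    """Run a handler at most once per message ID (within this process)."""

    def __init__(self, handler):
        self._handler = handler
        self._seen = set()  # assumption: in-memory only, lost on restart

    def process(self, message):
        """Handle the message unless its ID was already processed.

        Returns True if the handler ran, False for a duplicate delivery.
        """
        message_id = message["MessageId"]
        if message_id in self._seen:
            return False
        self._handler(message["Body"])
        # Record the ID only after the handler succeeded, so a failed
        # attempt can be retried on redelivery.
        self._seen.add(message_id)
        return True
```

This only makes redelivery harmless. It does nothing about messages that are never removed from the queue.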

Liponis answered 1/4, 2015 at 10:28 Comment(4)
One solution is to extend the visibility timeout (ideally programmatically). - Liponis
I don't think this really "answers" the question because, crucially, the quoted documentation does not mention the visibility timeout at all, and implies that this is some rare condition having to do with server availability. The issue being asked about is not rare or unlikely, does not seem to have anything to do with server availability, and is so reliable that it seems delete always fails with a false success if the visibility timeout is zero (and maybe just in general any time that the delete happens after the visibility timeout). - Overcome
And the solution actually proposed in the quoted documentation (idempotency) is a non-solution to the actual problem that happens when messages are literally never deleted - idempotently processing messages that keep being redelivered won't avoid problems with a forever-growing number of queued messages, such as an unboundedly growing bill for SQS usage, and potentially never receiving any new messages at all once there are enough old messages still undeleted. - Overcome
And yes it is intended. - Sherilyn
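The suggestion in the first comment above, extending the visibility timeout programmatically, could look roughly like the sketch below. It assumes a boto3-style client and a message dict as returned by a receive call. The function name and the 120-second value are invented for the example.

```python
def process_with_extension(client, queue_url, message, handler,
                           timeout_seconds=120):
    """Reset the message's visibility timeout, process it, then delete it."""
    receipt_handle = message["ReceiptHandle"]

    # The new timeout counts from the moment of this call. It replaces
    # the remaining time rather than being added to it.
    client.change_message_visibility(
        QueueUrl=queue_url,
        ReceiptHandle=receipt_handle,
        VisibilityTimeout=timeout_seconds,
    )

    handler(message["Body"])

    # Delete while the message is still hidden from other consumers.
    client.delete_message(QueueUrl=queue_url, ReceiptHandle=receipt_handle)
```

For long-running handlers, the same call can be repeated periodically while processing is still under way.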

If your except block contains nothing but pass, there is a good chance you're not actually deleting your SQS message: an error in your processing is being swallowed silently, the delete call is never reached, and the message gets stuck "in flight". Wrap your SQS polling in a try/except like this and see whether any exceptions show up:

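import traceback

try:
    pollsqshere()  # placeholder for your SQS polling and processing code
except Exception:
    # Print the full traceback instead of silently swallowing the error.
    print(traceback.format_exc())
    # Alternatively, traceback.print_exc() writes the same text to stderr.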
Toney answered 20/10, 2021 at 19:27 Comment(0)
