Amazon CloudWatch alarm not triggered
N

3

23

I have a CloudWatch alarm configured as follows:

Threshold: "GreaterThan 0" for 1 consecutive period,

Period: 1 minute,

Statistic: Sum

The alarm is configured on the AWS SQS NumberOfMessagesSent metric. The queue was empty and no messages were being published to it. I sent a message manually. I could see the spike in the metric, but the alarm state was still OK. I am confused about why the alarm is not changing state even though all the conditions to trigger it are met.

Nero answered 9/7, 2015 at 19:28 Comment(3)
Do you have an action associated with the alarm? When did you look? The way this works is that it will go to ALARM and go back to OK in the next minute. So if you looked at the wrong time, or caught the end of a minute for reporting, it's possible that you did not observe it (but it happened). Epperson
I had attached an action for each state: ALARM, OK, and INSUFFICIENT_DATA. The action was to send an email, but I did not receive any email either. Nero
@JuhiKulshreshtha - I'm facing the same problem. Did you find a solution? If so, please share it. Lyonnaise
L
16

I just overcame this problem with the help of AWS support. You need to set the period on your alarm to ~15 minutes. It has to do with how SQS timestamps the events it pushes to CloudWatch.

Don't worry: setting a longer period will not affect how quickly you are alerted. The alarm will still get data from SQS every 5 minutes.

Leeke answered 6/3, 2017 at 21:26 Comment(5)
Although very unintuitive, I can confirm that this works. Selectman
I can confirm this too. Mongo
I ran into this same issue today. For anyone else finding this, the answer above can now be improved upon (if you need it) by using the "M out of N" datapoints feature that has since been added. You can set a period of 5 minutes and "1 out of 2 datapoints", which makes the effective evaluation window 10 minutes. As noted above, your alarm will still trigger within about 5 minutes of a message being sent to the queue, but since the evaluation window is 10 minutes instead of 15, it will reset to OK about 5 minutes sooner than with a 15 minute period. Cuprum
By "set the period on your alarm to ~15 minutes", do you mean the alarm will check the metric every 15 minutes, so scaling can only happen at intervals of at least 15 minutes? Also, when we set the period to 15 minutes, do we need to adjust the metric value (e.g., would NumberOfMessagesSent be 3 times the value we use for 5 minutes)? Floats
FWIW I ran into this using the "AWS/ApiGateway" "5xx" error metric. When I changed the period to 15 minutes it worked. It also worked when I changed the period to 5 minutes and used "1 out of 2" Datapoints to Alarm, as per the comment by @Cuprum. Tanjatanjore
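
For anyone who wants to script the "1 out of 2 datapoints" setup from the comments above, here is a rough sketch using boto3's `put_metric_alarm`. The alarm name and the queue name `my-queue` are made-up placeholders, and the helper functions are only for illustration. Nothing talks to AWS unless you call `create_alarm` yourself, which needs boto3 and credentials.

```python
# Sketch only: the alarm name and queue name below are hypothetical placeholders.
def build_alarm_params(queue_name: str) -> dict:
    """Keyword arguments for put_metric_alarm using a 1-out-of-2 datapoints rule."""
    return {
        "AlarmName": f"{queue_name}-messages-sent",  # hypothetical name
        "Namespace": "AWS/SQS",
        "MetricName": "NumberOfMessagesSent",
        "Dimensions": [{"Name": "QueueName", "Value": queue_name}],
        "Statistic": "Sum",
        "Period": 300,            # 5-minute period, in seconds
        "EvaluationPeriods": 2,   # N: look back over two periods (10 minutes)
        "DatapointsToAlarm": 1,   # M: one breaching datapoint is enough
        "Threshold": 0,
        "ComparisonOperator": "GreaterThanThreshold",
    }


def create_alarm(params: dict) -> None:
    """Create or update the alarm; requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the parameters can be inspected without it

    boto3.client("cloudwatch").put_metric_alarm(**params)


params = build_alarm_params("my-queue")  # "my-queue" is a placeholder
```

Calling `create_alarm(params)` would then create the alarm. To try the 15 minute variant from the answer instead, it should be enough to set `Period` to 900 and both `EvaluationPeriods` and `DatapointsToAlarm` to 1.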
P
1

Sometimes metrics suffer from something called "delayed metric delivery". It is more common when the alarm period is short, such as 1 minute.

When the delayed datapoint arrives, it is too late for the alarm but not for the graph, which ends up drawing it cleanly without a gap.

Experiment with Evaluation Periods and Datapoints to Alarm: instead of 1/1, something like 3/2 or 3/1 may work fine.

Phratry answered 6/1, 2023 at 19:26 Comment(0)
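
To illustrate the answer above, here is a toy model of why a late datapoint can slip past a 1/1 alarm but still be caught by a wider window. It is not how CloudWatch is actually implemented: the `evaluate` function and the `history` values are invented for this sketch, and missing datapoints are simply treated as not breaching.

```python
from typing import Optional, Sequence


def evaluate(window: Sequence[Optional[float]], threshold: float,
             datapoints_to_alarm: int) -> str:
    """Return "ALARM" if at least M datapoints in the window exceed the threshold.

    Simplified model: missing datapoints (None) count as not breaching.
    """
    breaching = sum(1 for value in window if value is not None and value > threshold)
    return "ALARM" if breaching >= datapoints_to_alarm else "OK"


# Per-minute sums, oldest first (made-up values). The spike sits in the
# third-newest minute because its datapoint was delivered late.
history = [0, 0, 1, 0, None]

# 1 evaluation period, 1 datapoint to alarm: only the newest minute is checked.
one_of_one = evaluate(history[-1:], threshold=0, datapoints_to_alarm=1)

# 3 evaluation periods, 1 datapoint to alarm: the late datapoint is still in range.
one_of_three = evaluate(history[-3:], threshold=0, datapoints_to_alarm=1)

print(one_of_one, one_of_three)
```

In this model the 1/1 alarm stays OK while the 3/1 alarm fires, which matches the behaviour people describe in this thread.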
P
0

It could be that the interval is set to less than 300 seconds. The free CloudWatch checks every 5 minutes, so if you set an alarm for less than that you will sometimes get INSUFFICIENT_DATA.

Pronation answered 10/7, 2015 at 12:26 Comment(0)
