Firehose to S3: What happens to data after unsuccessful tries for 24 hours

From AWS documentation:

Data delivery to your S3 bucket might fail for reasons such as the bucket doesn’t exist anymore, the IAM role that Kinesis Firehose assumes doesn’t have access to the bucket, network failure, or similar events. Under these conditions, Kinesis Firehose keeps retrying for up to 24 hours until the delivery succeeds. The maximum data storage time of Kinesis Firehose is 24 hours and your data is lost if data delivery fails for more than 24 hours.

  • Now what happens to data that is lost?
  • Are there any logs or metrics to check for such failures?

I have created an alarm on the DeliveryToS3.Success metric (the alarm triggers if the metric value is < 1 for 1 minute). Whenever a delivery to S3 fails, Firehose keeps retrying for up to 24 hours, but the metric shows a value < 1 throughout that period, so the alarm triggers. I am also not seeing any CloudWatch error logs (S3Delivery).

My aim is to trigger the alarm only when the data ultimately cannot be delivered to S3 (even after 24 hours).

Note: Please let me know if any explanation or correction is required.

Pitchdark answered 6/9, 2017 at 5:59 Comment(2)
I still haven't found any metric that tracks whether data is lost. However, I was able to stop the false alarms by increasing the alarm's observation time from 1 minute to 30 minutes, and otherwise trusting Firehose's reliability. It's been 4 days with no false alarms so far. - Pitchdark
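
The longer observation window described in this comment could be set up roughly as follows. This is only a sketch under assumptions: the comment does not say how the 30 minutes were configured, so the choice of thirty consecutive 1-minute datapoints, the alarm name, the stream name `my-stream`, and the helper names are all hypothetical. `put_metric_alarm` is the boto3 CloudWatch call; it needs boto3 installed and AWS credentials.

```python
def success_alarm_params(stream_name, minutes=30):
    """Build keyword arguments for CloudWatch put_metric_alarm.

    Assumption: the alarm fires only when DeliveryToS3.Success stays
    below 1 for `minutes` consecutive 1-minute datapoints.
    """
    return {
        "AlarmName": f"{stream_name}-s3-delivery-failing",  # hypothetical name
        "Namespace": "AWS/Firehose",
        "MetricName": "DeliveryToS3.Success",
        "Dimensions": [{"Name": "DeliveryStreamName", "Value": stream_name}],
        "Statistic": "Average",
        "Period": 60,                    # one datapoint per minute
        "EvaluationPeriods": minutes,    # look at the last `minutes` datapoints
        "DatapointsToAlarm": minutes,    # all of them must be breaching
        "Threshold": 1.0,
        "ComparisonOperator": "LessThanThreshold",
        # Assumption: no datapoints (no traffic) should not raise the alarm.
        "TreatMissingData": "notBreaching",
    }


def create_alarm(params):
    """Send the alarm definition to CloudWatch (requires boto3 and credentials)."""
    import boto3  # imported lazily so the parameter builder works without it

    boto3.client("cloudwatch").put_metric_alarm(**params)


if __name__ == "__main__":
    # Only prints the definition; call create_alarm(params) to apply it.
    print(success_alarm_params("my-stream"))
```

A short transient failure then no longer triggers the alarm, but note that this still does not detect data that was dropped after the 24-hour retry window; it only reports a sustained delivery failure.
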
I think it would be better to rely on the "DeliveryToS3.DataFreshness" metric. Its value should stay consistent as long as Firehose is able to deliver data to S3 properly; as soon as there's an error, it should spike. - Alegar
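
An alarm on DeliveryToS3.DataFreshness, as suggested in this comment, might look like the sketch below. The metric reports the age in seconds of the oldest record still held by Firehose, so a threshold somewhat below the 24-hour limit quoted in the question (86400 seconds) would warn before data is dropped. The 1-hour default threshold, the alarm name, the stream name `my-stream`, and the helper names are assumptions chosen for illustration, not values from the thread.

```python
MAX_RETENTION_SECONDS = 24 * 60 * 60  # 24-hour limit quoted from the AWS docs


def freshness_alarm_params(stream_name, threshold_seconds=3600):
    """Build keyword arguments for CloudWatch put_metric_alarm.

    Assumption: alarm when the oldest undelivered record is older than
    `threshold_seconds` (hypothetical default of one hour).
    """
    if not 0 < threshold_seconds < MAX_RETENTION_SECONDS:
        raise ValueError("threshold must be between 0 and 86400 seconds")
    return {
        "AlarmName": f"{stream_name}-s3-data-stale",  # hypothetical name
        "Namespace": "AWS/Firehose",
        "MetricName": "DeliveryToS3.DataFreshness",
        "Dimensions": [{"Name": "DeliveryStreamName", "Value": stream_name}],
        "Statistic": "Maximum",          # worst-case age within each period
        "Period": 300,                   # 5-minute datapoints
        "EvaluationPeriods": 1,
        "Threshold": float(threshold_seconds),
        "ComparisonOperator": "GreaterThanThreshold",
        # Assumption: an idle stream (no datapoints) should not raise the alarm.
        "TreatMissingData": "notBreaching",
    }


def create_alarm(params):
    """Send the alarm definition to CloudWatch (requires boto3 and credentials)."""
    import boto3  # imported lazily so the parameter builder works without it

    boto3.client("cloudwatch").put_metric_alarm(**params)


if __name__ == "__main__":
    # Example: warn once the oldest record is 12 hours old.
    print(freshness_alarm_params("my-stream", threshold_seconds=12 * 3600))
```

Because the freshness value keeps growing while Firehose retries, the threshold controls how long a failure is tolerated before the alarm fires, which is closer to the stated goal than alarming on every failed attempt.
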
