Is there any difference in processing times between AWS Kinesis Firehose and Streams?

4

8

Reading over the documentation for both offerings (Firehose and Streams), it sounds like Firehose is "near" real-time, with a potential delay of 60 seconds between producing a message and emitting it, whereas the Streams documentation makes no mention of this potential delay.

Does anyone have any real-world insight into any differences with regard to message delivery times?

[Notes]

Link to Firehose FAQ mentioning the delay, based on buffer size for S3 events.

Yellowwood answered 17/6, 2017 at 18:53
5

With Kinesis Streams you can get your processing times to under a second. In my current streams the latency seems to be 5.5 ms for the Kinesis part and 330 ms for processing the record with a Lambda function. That is with a batch size of 1, which means that the Lambda function processes records one by one.
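If you want to measure this yourself, here is a minimal sketch of a Lambda handler that reports how old each record is when the function sees it. It relies on the `approximateArrivalTimestamp` and base64-encoded `data` fields of the Kinesis event that Lambda passes in; the `now` parameter is a hypothetical addition so the function can be exercised without a real clock, and the actual processing step is left as a placeholder.

```python
import base64
import time


def handler(event, context=None, now=None):
    """Return one latency measurement (in ms) per Kinesis record in the batch."""
    # `now` is a hypothetical test hook; Lambda itself only passes event and context.
    now = time.time() if now is None else now
    latencies_ms = []
    for record in event["Records"]:
        kinesis = record["kinesis"]
        # Record data arrives base64-encoded in the Lambda event.
        payload = base64.b64decode(kinesis["data"])
        # Epoch seconds at which Kinesis received the record.
        arrived = kinesis["approximateArrivalTimestamp"]
        latencies_ms.append(round((now - arrived) * 1000, 1))
        # ... process `payload` here ...
    return latencies_ms
```

With a batch size of 1 the returned list has a single entry per invocation; with larger batches you get one entry per record, which makes the spread within a batch visible.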

Kinesis Streams can be a little expensive. To save some money I used a batch size of 500 in another stream with higher throughput. That added a couple of minutes of latency.

Firehose is generally much cheaper, but also offers limited functionality. If you are streaming a larger amount of data (more than 1 MB/minute), you can get the average processing time to under 60 seconds by adding a buffer size hint.
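The arithmetic behind that is roughly the following. This is a back-of-the-envelope sketch, assuming a steady ingest rate and counting only time spent in the buffer (not the delivery itself); the function name and parameters are made up for illustration.

```python
def max_buffer_wait_seconds(throughput_mb_per_min, buffer_size_mb, buffer_interval_s):
    """Upper bound on how long a record sits in the buffer before a flush.

    The buffer flushes when it is full OR when the interval elapses,
    whichever comes first.
    """
    if throughput_mb_per_min <= 0:
        # Nothing fills the buffer, so only the interval can trigger a flush.
        return float(buffer_interval_s)
    fill_time_s = buffer_size_mb / throughput_mb_per_min * 60
    return min(fill_time_s, float(buffer_interval_s))


# 2 MB/minute into a 1 MB buffer: the size hint wins, flush every 30 s.
print(max_buffer_wait_seconds(2, 1, 60))
# 0.5 MB/minute into a 1 MB buffer: the 60 s interval wins.
print(max_buffer_wait_seconds(0.5, 1, 60))
```

So the size hint only helps once your throughput is high enough to fill the buffer before the interval expires; below that rate you are stuck with the interval.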

Sadness answered 26/1, 2018 at 7:42
1

It does look to me like Kinesis Firehose is more or less a buffer collecting data until the buffer runs full or the oldest message in it is N seconds old (where N is configured by the user; I think 900 seconds is the max), at which point the entire buffer contents are written to their destination (e.g. S3). Unlike with Streams, scaling is not something you need to worry about.

I can't quite comment on Kinesis Streams as I have not worked with them in production. But Streams is much more than a buffer, as the partition keys suggest. It is a different approach to the same problem Firehose is trying to solve, but more flexible in how and where you process the data.
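To make that mental model concrete, here is a toy sketch of a "flush on size or age" buffer. It is only an illustration of the behaviour described above, not how Firehose is implemented; the class and method names are invented, and timestamps are passed in explicitly to keep it deterministic.

```python
class BufferSketch:
    """Toy buffer that flushes when it is full or its oldest record is too old."""

    def __init__(self, max_bytes, max_age_s):
        self.max_bytes = max_bytes
        self.max_age_s = max_age_s
        self.records = []
        self.size = 0
        self.oldest_ts = None

    def put(self, data, now):
        """Add a record; return the flushed batch if a threshold was hit, else None."""
        if self.oldest_ts is None:
            self.oldest_ts = now
        self.records.append(data)
        self.size += len(data)
        return self._maybe_flush(now)

    def tick(self, now):
        """Check the age threshold without adding data."""
        return self._maybe_flush(now)

    def _maybe_flush(self, now):
        if not self.records:
            return None
        full = self.size >= self.max_bytes
        too_old = now - self.oldest_ts >= self.max_age_s
        if full or too_old:
            # Hand over the whole buffer and start again from empty.
            batch, self.records = self.records, []
            self.size, self.oldest_ts = 0, None
            return batch
        return None


buf = BufferSketch(max_bytes=10, max_age_s=60)
buf.put(b"abcd", now=0)          # buffered, nothing delivered yet
buf.put(b"efgh", now=1)          # still buffered
print(buf.put(b"ij", now=2))     # size threshold reached, whole buffer is flushed
```

The point is that a single record never leaves on its own: it waits until either the size or the age condition releases the whole batch.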

Maybe this will be of any use to demystify what's what better than I can :) https://www.sumologic.com/wp-content/uploads/DemystifyingAmazonKinesis_infographic.pdf

Diffidence answered 26/6, 2017 at 14:33
1

This surprised me, so I investigated and am reporting my findings. I'd seen Firehose used in several architectures as a go-between, where adding a minute's latency might have seemed counter-productive. Also, the image of water under pressure may have misled me; a firehose is more concerned with containing and directing that pressure. Fluid dynamics was always hard.

buffer size and buffer interval

Kinesis Data Firehose buffers incoming streaming data to a certain size or for a certain period of time before delivering it to destinations. Buffer Size is in MBs and Buffer Interval is in seconds.

from what is firehose?

Buffer size and buffer interval for the destination

Kinesis Data Firehose buffers incoming data before delivering it to the specified destination. For Amazon S3, Amazon Redshift, and Splunk as your chosen destination, you can choose a buffer size of 1–128 MiBs and a buffer interval of 60–900 seconds. For Amazon Elasticsearch as your chosen destination, you can choose a buffer size of 1–100 MiBs and a buffer interval of 60–900 seconds. For the HTTP endpoint destinations, including Datadog, and New Relic you can choose a buffer size of 1-64 MiBs and a buffer interval of 60-900 seconds. For MongoDB Cloud you can choose a buffer size of 1-16 MiBs and a buffer interval of 60-900 seconds.
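As a rough sketch, the limits quoted above can be turned into a small lookup that validates a setting before you apply it. The numbers are copied from the quote and may have changed since; the destination labels are my own, and the returned keys follow the `SizeInMBs` / `IntervalInSeconds` names of the `BufferingHints` structure in the Firehose API as I understand it for S3 (other destinations use similarly shaped structures).

```python
# (min, max) buffer size in MiB per destination, as quoted above.
# The dictionary keys are hypothetical labels, not AWS identifiers.
SIZE_LIMITS_MIB = {
    "s3": (1, 128),
    "redshift": (1, 128),
    "splunk": (1, 128),
    "elasticsearch": (1, 100),
    "http_endpoint": (1, 64),
    "mongodb": (1, 16),
}
# The quoted interval range is the same for every destination.
INTERVAL_LIMITS_S = (60, 900)


def buffering_hints(destination, size_mib, interval_s):
    """Validate buffer settings and return them as a BufferingHints-style dict."""
    lo, hi = SIZE_LIMITS_MIB[destination]
    if not lo <= size_mib <= hi:
        raise ValueError(f"{destination}: buffer size must be {lo}-{hi} MiB")
    lo_s, hi_s = INTERVAL_LIMITS_S
    if not lo_s <= interval_s <= hi_s:
        raise ValueError(f"buffer interval must be {lo_s}-{hi_s} seconds")
    return {"SizeInMBs": size_mib, "IntervalInSeconds": interval_s}


print(buffering_hints("s3", 1, 60))
```

Note that, per these limits, the smallest interval is 60 seconds for every destination, which is where the "up to a minute" delay in the question comes from.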

from configuration settings

Kayceekaye answered 12/1, 2021 at 8:54
0

After digging into this further, I've found that the buffer size and interval settings on Firehose do indeed add latency. However, Firehose wasn't the right fit for my use case. It seems that if you can tolerate some latency, and especially if you're just ingesting data for downstream analysis, Firehose is the simpler way forward. For real-time processing, Kinesis Streams is the way forward, as the latency is up to the application.

Yellowwood answered 13/8, 2017 at 23:22
