I'm currently sending a series of XML messages to an AWS Kinesis stream. I've used this setup on other projects, so I'm fairly confident that part works. I've then written a Lambda to forward events from the Kinesis stream to a Kinesis Firehose delivery stream:
import os
import base64

import boto3

firehose = boto3.client('firehose')

def lambda_handler(event, context):
    delivery_stream_name = os.environ['FIREHOSE_STREAM_NAME']

    # Forward each record directly to Firehose. Kinesis delivers the payload
    # base64-encoded, so decode it first -- boto3 handles any encoding the
    # Firehose API itself requires.
    for record in event['Records']:
        data = base64.b64decode(record['kinesis']['data'])
        response = firehose.put_record(
            DeliveryStreamName=delivery_stream_name,
            Record={'Data': data}
        )
        print(response)
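As an aside on the code above (not something the question asked for): `put_record` makes one API call per record, and per-record throttling surfaces only as an exception. An alternative is `put_record_batch`, which does *not* raise for individual record failures; it reports them via `FailedPutCount`, so unchecked responses can silently drop data. A hedged sketch, assuming the same `FIREHOSE_STREAM_NAME` environment variable, with the pure helpers kept separate from the AWS calls:

```python
import base64
import os

MAX_BATCH = 500  # PutRecordBatch accepts at most 500 records per call

def chunk(items, size=MAX_BATCH):
    """Split a list into consecutive slices of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]

def failed_entries(batch, responses):
    """Return the records whose per-record response carries an ErrorCode."""
    return [rec for rec, res in zip(batch, responses) if 'ErrorCode' in res]

def lambda_handler(event, context):
    import boto3  # imported here so the helpers above are usable without the SDK
    firehose = boto3.client('firehose')
    stream = os.environ['FIREHOSE_STREAM_NAME']

    records = [
        {'Data': base64.b64decode(r['kinesis']['data'])}
        for r in event['Records']
    ]
    for batch in chunk(records):
        response = firehose.put_record_batch(
            DeliveryStreamName=stream, Records=batch
        )
        # Re-send only the entries that individually failed.
        while response['FailedPutCount'] > 0:
            batch = failed_entries(batch, response['RequestResponses'])
            response = firehose.put_record_batch(
                DeliveryStreamName=stream, Records=batch
            )
```

A production version would cap the retry loop and back off between attempts; this sketch only shows the `FailedPutCount` check itself.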
I've set the Kinesis stream as the Lambda trigger, with a batch size of 1 and starting position LATEST.
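For context, the event the Kinesis trigger hands to the Lambda carries each payload base64-encoded under `Records[i]['kinesis']['data']`, which is why the handler has to decode it. A minimal illustration (fields trimmed to the one that matters here):

```python
import base64

# Minimal shape of a Kinesis trigger event, reduced to the data field.
event = {
    'Records': [
        {'kinesis': {'data': base64.b64encode(b'<message>hello</message>').decode()}}
    ]
}

# What the Lambda handler should actually forward to Firehose:
payload = base64.b64decode(event['Records'][0]['kinesis']['data'])
```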
For the kinesis firehose I have the following config:
Data transformation*: Disabled
Source record backup*: Disabled
S3 buffer size (MB)*: 10
S3 buffer interval (sec)*: 60
S3 Compression: UNCOMPRESSED
S3 Encryption: No Encryption
Status: ACTIVE
Error logging: Enabled
I sent 162 events and read them back from S3; the most I've managed to recover is 160, and it's usually fewer. I've even waited a few hours in case something strange was happening with retries.
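One counting pitfall worth ruling out (an aside, not from the original post): Firehose concatenates records into each S3 object without inserting any delimiter, so counting lines or objects will undercount if the XML messages don't end in a newline. A delimiter-independent count, assuming a hypothetical root tag `<message>`:

```python
def count_messages(blob: bytes, root_tag: bytes = b'<message') -> int:
    """Count XML messages in a Firehose S3 object by opening-root-tag
    occurrences. Firehose adds no record delimiter, so line counting is
    unreliable; the root tag name here is a placeholder for the real one."""
    return blob.count(root_tag)

# Three concatenated messages with no delimiter between them:
blob = b'<message>a</message><message>b</message><message>c</message>'
```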
Has anyone with experience of a Kinesis -> Lambda -> Firehose pipeline seen data loss like this?