Athena can only see the first JSON record written to Firehose by Kinesis Analytics

I am using Kinesis Analytics to read in JSON from Kinesis Firehose. I am successfully filtering out some of the records and writing a subset of the JSON properties to another Firehose.

I wanted to execute an Athena query on the data being written to S3 via the destination Firehose. However, the JSON records written to the files in S3 do not have any newlines. Consequently, when I query the data using Athena, it only returns the first record in each file.

When I write records to the source Firehose, I manually insert a newline between records, but Analytics doesn't seem to do this when writing to the destination.

Is there a way to get Analytics to write out a separator or newline between records, so Athena can see all of the records?
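To illustrate the symptom described above, here is a rough Python model of a line-oriented JSON reader that takes one object per line and ignores anything after it. This is only a sketch of the behaviour reported in the question, not Athena's actual SerDe code; the function name `first_record_per_line` and the sample data are made up for the example.

```python
import json


def first_record_per_line(text):
    """Return the first JSON object found on each non-empty line.

    Hypothetical stand-in for a reader that expects exactly one
    JSON record per line and drops whatever follows it.
    """
    decoder = json.JSONDecoder()
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # raw_decode parses one object and reports where it stopped;
        # the rest of the line is deliberately discarded here.
        obj, _end = decoder.raw_decode(line)
        rows.append(obj)
    return rows


# What the destination Firehose wrote: records back to back, no newline.
concatenated = '{"id": 1}{"id": 2}{"id": 3}'

# What a one-record-per-line reader expects.
delimited = '{"id": 1}\n{"id": 2}\n{"id": 3}\n'

print(len(first_record_per_line(concatenated)))  # 1
print(len(first_record_per_line(delimited)))     # 3
```

With the concatenated file the whole object is a single line, so only the first record comes back, which matches what the query returns.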

Perfectionism answered 6/10, 2017 at 12:34 Comment(2)
Looks like for now the only way is to add a Lambda function to the Firehose :( - Selinaselinda
You need to check the answer here: #48226972 - Reflect
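A minimal sketch of the Lambda approach mentioned in the comment above, assuming the function is attached to the destination Firehose as a data transformation. Firehose passes records in as base64-encoded `data` and expects each one back with its `recordId`, a `result`, and re-encoded `data`. The handler name `lambda_handler` is just the conventional default, and the choice to skip records that already end in a newline is an assumption of this example.

```python
import base64


def lambda_handler(event, context):
    """Append a newline to every record passing through Firehose."""
    output = []
    for record in event["records"]:
        payload = base64.b64decode(record["data"])

        # Avoid doubling the delimiter if the producer already added one.
        if not payload.endswith(b"\n"):
            payload += b"\n"

        output.append({
            "recordId": record["recordId"],
            "result": "Ok",
            "data": base64.b64encode(payload).decode("utf-8"),
        })
    return {"records": output}
```

Each object then lands on its own line in the S3 files, which is the layout a one-record-per-line JSON reader needs.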

Enabling "New line delimiter" in the Destination settings of the Firehose delivery stream resolved the issue for me.

Sokoto answered 23/2, 2023 at 6:42 Comment(1)
"New line delimiter" is only available when dynamic partitioning is enabled. - Sherrillsherrington
