We are evaluating Amazon Redshift for real time data warehousing.
Data will be streamed and processed through a Java service and it should be stored in the database. We process row by row (real time) and we will only insert one row per transaction.
What is best practice for real time data loading to Amazon Redshift?
Shall we use JDBC and perform INSERT INTO
statements, or try to use Kinesis Firehose, or perhaps AWS Lambda?
I'm concerned about using one of these services because both will use Amazon S3 as a middle layer and perform the COPY
command which is suitable for bigger data sets, not for "one-row" inserts.