I know that in KSQL we can set the offset to earliest or latest, but can we get data from a specific time period? For example, I need the data inserted into a topic since 06-May-2020.
Is there an option in KSQL streams to get data from a specific time period or timestamp?
In ksqlDB you can query from the beginning (SET 'auto.offset.reset' = 'earliest';) or end of a topic (SET 'auto.offset.reset' = 'latest';).
You cannot currently (0.8.1 / CP 5.5) seek to an arbitrary offset.
What you can do is start from the earliest offset and then use ROWTIME in your predicate to identify messages that match your requirement.
SELECT *
FROM MY_SOURCE_STREAM
WHERE ROWTIME >= 1588772149620
EMIT CHANGES;
Note that this scans through the topic sequentially, so depending on how much data you have in your topic it may not be particularly fast.
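ROWTIME is compared against epoch milliseconds, so a calendar date such as 06-May-2020 has to be converted first. The following is a minimal sketch of that conversion in Python, not part of the original answer. The helper names `to_rowtime_ms` and `build_query` are hypothetical, the date is assumed to mean midnight UTC, and the trailing EMIT CHANGES assumes a ksqlDB version in which push queries require it.

```python
from datetime import datetime, timezone


def to_rowtime_ms(date_text: str, fmt: str = "%Y-%m-%d") -> int:
    """Convert a date string to epoch milliseconds.

    Assumption: the date is interpreted as midnight UTC.
    """
    dt = datetime.strptime(date_text, fmt).replace(tzinfo=timezone.utc)
    return int(dt.timestamp()) * 1000


def build_query(stream: str, start_ms: int) -> str:
    """Build a query that keeps only rows at or after start_ms."""
    return (
        f"SELECT * FROM {stream} "
        f"WHERE ROWTIME >= {start_ms} EMIT CHANGES;"
    )


if __name__ == "__main__":
    start_ms = to_rowtime_ms("2020-05-06")
    # Paste the printed statement into the ksqlDB CLI after running
    # SET 'auto.offset.reset' = 'earliest';
    print(build_query("MY_SOURCE_STREAM", start_ms))
```

Adjust the timezone if your dates are meant to be local time rather than UTC, since the resulting millisecond value shifts accordingly.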
Thanks @Robin Moffatt. I understand that starting from earliest and scanning the data is a performance issue. I can use this solution when KSQL struggles to stream data. Can you please suggest how to add ROWTIME to the predicate in a KSQL query? Please provide a sample KSQL query if my ROWTIME is '1588772149620'. –
Fca
I've added an example –
Miskolc
Can I get messages from ROWTIME=1588772149620? I believe this query will fetch only a single record. Basically, KSQL streams stop if no data is pushed for a couple of days, so I want to restart a stream from the last message that I consumed. –
Fca
I've updated my example to use greater than or equal to instead of equal to in the predicate –
Miskolc
If you had a consumer and it was restarted, is there any way to get the records that were missed during the restart process? –
Randellrandene