Do we have an option to get data in KSQL streams from a specific time period/timestamp?

I know that in KSQL we can set the offset to earliest or latest. But can we get data from a specific time period? For example, I need to get the data inserted into a topic from 06-May-2020.

Fca answered 6/5, 2020 at 11:2 Comment(0)

In ksqlDB you can query from the beginning (SET 'auto.offset.reset' = 'earliest';) or end of a topic (SET 'auto.offset.reset' = 'latest';).

You cannot currently (0.8.1 / CP 5.5) seek to an arbitrary offset.

What you can do is start from the earliest offset and then use ROWTIME in your predicate to identify messages that match your requirement.

-- ROWTIME is in epoch milliseconds; 1588772149620 is 2020-05-06 13:35:49.620 UTC
SELECT *
  FROM MY_SOURCE_STREAM
 WHERE ROWTIME >= 1588772149620
  EMIT CHANGES;

Note that this scans through the topic sequentially, so depending on how much data you have in your topic, it may not be particularly fast.
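
Since ROWTIME is compared as epoch milliseconds, a date such as 06-May-2020 has to be converted before it can go in the predicate. The following is a minimal sketch of that conversion in Python, not part of the original answer. It assumes the cut-off is midnight UTC; the helper name `epoch_millis` is made up for illustration, and the `EMIT CHANGES` clause assumes a ksqlDB version that requires it for push queries.

```python
from datetime import datetime, timezone


def epoch_millis(year, month, day):
    """Return midnight UTC of the given date as epoch milliseconds (hypothetical helper)."""
    dt = datetime(year, month, day, tzinfo=timezone.utc)
    # timestamp() gives seconds since the epoch; ROWTIME uses milliseconds
    return int(dt.timestamp()) * 1000


# Assumption: "from 06-May-2020" means from 2020-05-06 00:00:00 UTC
start_ms = epoch_millis(2020, 5, 6)

# Build the statement text to paste into the ksqlDB CLI
query = (
    "SELECT * FROM MY_SOURCE_STREAM "
    f"WHERE ROWTIME >= {start_ms} EMIT CHANGES;"
)

print(start_ms)
print(query)
```

If the cut-off should be midnight in a local time zone instead, the resulting number shifts by that zone's UTC offset, so it is worth checking which one the requirement means.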

Miskolc answered 6/5, 2020 at 11:42 Comment(5)
Thanks @Robin Moffatt. I understand that setting earliest and reading all the data is a performance issue. I can use this solution when KSQL struggles to stream data. Can you please suggest how to add ROWTIME to the predicate in a KSQL query? Please provide a sample KSQL query if my ROWTIME is '1588772149620'. - Fca
I've added an example. - Miskolc
Can I get messages from ROWTIME=1588772149620? I believe this query will fetch only a single record. Basically, KSQL streams stop if no data is pushed for a couple of days, so I want to restart a stream from the last message I consumed. - Fca
I've updated my example to use greater than or equal to instead of equal to in the predicate. - Miskolc
If you had a consumer and it was restarted, is there any way to get the records that were missed during the restart process? - Randellrandene
