Why is retention.ms of Kafka Streams repartition topics set to -1 by default? Doesn't this retain messages in the repartition topic indefinitely?

I think it's related to the below links, but I don't understand.

It's possible to provide topic configurations such as "retention.ms" and "cleanup.policy" for Kafka Streams internal topics like *-changelog topics, so that logs that are no longer needed get deleted.

But for internal topics like *-repartition topics, it's not possible to provide topic configuration values, even though the default "retention.ms" for a repartition topic is "-1", which means infinite retention. How can I delete or manage repartition topics? Otherwise a repartition topic could grow too large and fill up the disk.

How can I manage repartition topics? What is purgeData? I couldn't find any related explanation in the documentation.

Rapprochement answered 30/1, 2021 at 18:59

Fact

  • retention.ms for repartition topics is -1 by default, and there's no way to override this value in Kafka Streams client code.

What I misunderstood

  • The size of a repartition topic would grow without bound, since retention.ms for repartition topics is -1.

Fix misunderstanding

  • There's a method called maybeCommit in the StreamThread class.
  • The maybeCommit method is called repeatedly inside the loop that handles stream records.
  • Inside the maybeCommit method (version 2.7.1), there's the following comment:

    try to purge the committed records for repartition topics if possible

  • Based on this, my understanding is that once records in a repartition topic have been processed downstream and their offsets committed, those records are purged periodically.
  • Therefore, there's no need to clean up repartition topics manually or to manage their retention.ms.
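
As a rough illustration of the reasoning above, here is a toy Java model of a single repartition-topic partition. It is not Kafka code: the class RepartitionPartitionModel and its methods are hypothetical names made up for this sketch. It assumes that purging behaves like a delete-records request that drops every record below an already committed offset (in the real client, the admin client's deleteRecords call with RecordsToDelete.beforeOffset plays this role, as far as I can tell). Under that assumption, the retained data stays bounded by the commit interval even though time-based retention never fires.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy in-memory model of one repartition-topic partition. Hypothetical, not Kafka code. */
final class RepartitionPartitionModel {
    private final Deque<String> records = new ArrayDeque<>();
    private long logStartOffset = 0L; // offset of the oldest record still retained
    private long logEndOffset = 0L;   // offset the next appended record will receive

    /** The upstream sub-topology writes a record; returns the offset assigned to it. */
    long append(String value) {
        records.addLast(value);
        return logEndOffset++;
    }

    /** Models a delete-records request: drops every record with offset < beforeOffset. */
    void purgeBefore(long beforeOffset) {
        long target = Math.min(beforeOffset, logEndOffset);
        while (logStartOffset < target) {
            records.removeFirst();
            logStartOffset++;
        }
    }

    int retainedRecords() {
        return records.size();
    }

    long logStartOffset() {
        return logStartOffset;
    }

    public static void main(String[] args) {
        RepartitionPartitionModel partition = new RepartitionPartitionModel();
        for (int i = 0; i < 1050; i++) {
            partition.append("record-" + i);
            // Assumption: the downstream task commits after every 100 records,
            // and a purge of the committed range follows each commit.
            if ((i + 1) % 100 == 0) {
                long committedOffset = i + 1;
                partition.purgeBefore(committedOffset);
            }
        }
        // Only the 50 records written after the last commit are still retained.
        System.out.println("retained=" + partition.retainedRecords()
                + " logStart=" + partition.logStartOffset());
    }
}
```

In this model the partition never holds more than one commit interval of records, which is the behaviour the maybeCommit comment suggests. How often the real purge runs is not covered here.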

Reference

Please leave a comment or correct this if I'm wrong.

Rapprochement answered 11/5, 2022 at 4:20

I was facing the same issue with ksqlDB. Internal topics grew to terabytes of data in a matter of days because of the default infinite retention. We altered them, setting retention.ms to a finite value instead of infinite (-1), but after that everything broke.

Today I executed this command:

    set topic.retention.ms=3600000

After that, I created a table, and all internal topics were created with retention.ms set to 1 hour instead of infinite. I will try this next week in the production environment to see whether ksqlDB (0.28.2) evicts segments and everything keeps working.

Source: https://docs.confluent.io/platform/current/streams/developer-guide/config-streams.html#internal-topic-parameters

Hope it helps. Regards
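
For a plain Kafka Streams application, the same "topic." prefix from the linked page can be set in the properties passed to the Streams client. The following is a minimal sketch using only java.util.Properties. The class name, application id, and broker address are hypothetical, and whether a finite retention is safe for a given repartition topic is an assumption that needs testing, since records older than the retention could be deleted before they are processed.

```java
import java.util.Properties;

/** Hypothetical helper that builds Streams properties with an internal-topic retention override. */
final class InternalTopicRetentionConfig {
    // "topic." is the documented prefix for internal-topic parameters; in application
    // code, StreamsConfig.topicPrefix("retention.ms") produces the same key.
    static final String TOPIC_PREFIX = "topic.";

    static Properties streamsProperties(long retentionMs) {
        Properties props = new Properties();
        props.put("application.id", "retention-demo");    // hypothetical application id
        props.put("bootstrap.servers", "localhost:9092"); // hypothetical broker address
        // Applies to internal topics that Streams creates after this setting is in place.
        props.put(TOPIC_PREFIX + "retention.ms", Long.toString(retentionMs));
        return props;
    }

    public static void main(String[] args) {
        Properties props = streamsProperties(3_600_000L); // 1 hour, as in the command above
        System.out.println("topic.retention.ms=" + props.getProperty("topic.retention.ms"));
    }
}
```

Topics that already exist keep their current settings, which matches the observation above that only newly created internal topics picked up the 1 hour value.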

Anemia answered 3/12, 2022 at 19:50
