Understanding Spring Cloud Stream Kafka and Spring Retry

I have a Spring Cloud Stream project using the Kafka binder and I'm trying to understand and eventually customize the RetryTemplate used by Cloud Stream.

I'm not finding a lot of documentation on how this works, but what I've read leads me to the following assumptions:

  • Cloud Stream configures and enables Spring Retry by default, including default retry and backoff policies.
  • By default, any uncaught exception in a @StreamListener will trigger Spring Retry.
  • Cloud Stream will somehow track RetryContext information for each message (how? I'm not sure)

Are these assumptions correct?

Now, in my application, I have a pattern where some messages can be handled immediately, but others must be deferred to be tried again later (using exponential backoff etc).

Should I be throwing an exception causing Spring Cloud Stream to retry these messages at the binder layer, or implementing retry myself and tracking my own retry contexts?

If I should be relying on Cloud Stream's retry setup, how should I customize the backoff policies, etc?

Shieh asked 8/6, 2020 at 20:49

The default retry configuration is 3 attempts, 1 second initial delay, 2.0 multiplier, max delay 10 seconds.

By default, stateless retry is used, meaning that the retries are performed in memory, inside the listener call, without re-fetching the record from Kafka.

The aggregate delay for all retries for all records returned by a poll() must not exceed max.poll.interval.ms; otherwise the consumer is considered failed and the group rebalances.
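To make the constraint concrete, here is a rough arithmetic sketch in plain Java. It assumes that maxAttempts counts the first try (so 3 attempts means 2 backoff delays) and uses the Kafka consumer defaults max.poll.records=500 and max.poll.interval.ms=300000; the class and method names are made up for illustration, and listener processing time is ignored.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Spring: estimates how long retries can block a poll loop.
class RetryBudget {

    /** Delays (ms) slept between attempts; maxAttempts includes the first try. */
    static List<Long> delaysMillis(int maxAttempts, long initial, double multiplier, long max) {
        List<Long> delays = new ArrayList<>();
        double next = initial;
        for (int attempt = 1; attempt < maxAttempts; attempt++) {
            delays.add((long) Math.min(next, max)); // each delay is capped at the max interval
            next *= multiplier;
        }
        return delays;
    }

    /** Sum of all backoff delays for one record that fails every attempt. */
    static long totalDelayMillis(int maxAttempts, long initial, double multiplier, long max) {
        long total = 0;
        for (long delay : delaysMillis(maxAttempts, initial, multiplier, max)) {
            total += delay;
        }
        return total;
    }

    public static void main(String[] args) {
        long perRecord = totalDelayMillis(3, 1_000L, 2.0, 10_000L); // binder defaults
        long maxPollRecords = 500L;        // assumed Kafka default for max.poll.records
        long maxPollIntervalMs = 300_000L; // assumed Kafka default for max.poll.interval.ms
        long worstCase = perRecord * maxPollRecords; // every record in the batch fails

        System.out.println("Delay per failing record (ms): " + perRecord);
        System.out.println("Worst case per poll (ms): " + worstCase);
        System.out.println("Exceeds max.poll.interval.ms: " + (worstCase > maxPollIntervalMs));
    }
}
```

With the defaults, one failing record costs 3 seconds of backoff, so a batch in which many records fail can exceed the poll interval well before all 500 have been tried.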

With modern versions of Spring for Apache Kafka (used by the binder), it is better to disable binder retries (maxAttempts=1) and use a SeekToCurrentErrorHandler with an appropriate BackOff configured.

You can set the error handler by declaring a ListenerContainerCustomizer<AbstractMessageListenerContainer<?, ?>> @Bean that returns (container, dest, grp) -> container.setErrorHandler(handler).

This avoids the problem mentioned above: the error handler seeks back so that failed records are re-fetched on the next poll(), and only the maximum delay interval for a single record must be less than max.poll.interval.ms.

You can also classify which exceptions are retryable and which are not, and configure a dead-letter recoverer that is invoked after retries are exhausted.
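The error-handler approach described above might look roughly like the following. This is a sketch, not a drop-in configuration: the class name RetryConfig, the backoff numbers, and the logging recoverer are illustrative assumptions, and it targets the Spring for Apache Kafka versions current at the time of this answer (in 2.8 and later, SeekToCurrentErrorHandler is deprecated in favor of DefaultErrorHandler).

```java
import org.springframework.cloud.stream.config.ListenerContainerCustomizer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.listener.AbstractMessageListenerContainer;
import org.springframework.kafka.listener.SeekToCurrentErrorHandler;
import org.springframework.util.backoff.ExponentialBackOff;

// Assumes binder retries are disabled for the binding, e.g. (binding name "input" is hypothetical):
//   spring.cloud.stream.bindings.input.consumer.max-attempts=1
@Configuration
public class RetryConfig {

    @Bean
    public ListenerContainerCustomizer<AbstractMessageListenerContainer<?, ?>> containerCustomizer() {
        // Illustrative values: 1s initial delay, doubling, capped at 10s per delay.
        ExponentialBackOff backOff = new ExponentialBackOff(1_000L, 2.0);
        backOff.setMaxInterval(10_000L);
        // Stop retrying once roughly 30s of backoff has accumulated for a record.
        backOff.setMaxElapsedTime(30_000L);

        // The recoverer runs once retries are exhausted; here it only logs.
        // A dead-letter recoverer could be plugged in at this point instead.
        SeekToCurrentErrorHandler handler = new SeekToCurrentErrorHandler(
                (record, ex) -> System.err.println("Giving up on " + record + ": " + ex),
                backOff);

        return (container, dest, grp) -> container.setErrorHandler(handler);
    }
}
```

Setting a maximum elapsed time matters here because ExponentialBackOff otherwise keeps retrying indefinitely; keep the largest single delay well below max.poll.interval.ms.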

See the reference documentation.

Fillagree answered 8/6, 2020 at 22:37 Comment(1)
Thanks for this, Gary. Don't know how I missed the docs you linked. - Shieh
