Difference between Kafka idempotent and transactional producer setup?

When setting up a Kafka producer for idempotent behaviour and for transactional behaviour:

I understand that for idempotency we set enable.idempotence=true. By changing this one flag on the producer, are we guaranteed exactly-once event delivery?

For transactions, we must go further and set transactional.id=<some value>. Does setting this value also set idempotence to true?

Also, by enabling one or both of the above, the producer will also set acks=all.

Given the above, should I be able to get 'exactly-once delivery' simply by changing the enable.idempotence setting? If I wanted to go further and enable transactional support, would I only need to change one setting on the consumer side, isolation.level=read_committed? Does this image reflect how to set up the producer in terms of EOS?

enter image description here

Passed answered 18/2, 2020 at 14:57 Comment(0)

Yes, you have understood the main concepts.

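By enabling idempotence, the producer automatically sets acks to all and guarantees that its retries do not write duplicate records to a partition. This guarantee only holds for the lifetime of the Producer instance: after a restart, the new instance cannot deduplicate against records sent by the old one.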
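As a rough sketch of what that configuration looks like, here are the settings written out as plain `java.util.Properties`, so it runs without the Kafka client on the classpath. The class name, the `build` helper and the broker address are made up for illustration. Only `enable.idempotence` is strictly needed; the other values are spelled out to show what idempotence relies on.

```java
import java.util.Properties;

// Hypothetical helper: collects the settings an idempotent producer relies on.
public class IdempotentProducerConfig {

    static Properties build(String bootstrapServers) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.setProperty("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");

        // The one flag that turns idempotence on.
        props.setProperty("enable.idempotence", "true");

        // Written out explicitly here; idempotence needs values compatible with these.
        props.setProperty("acks", "all");
        props.setProperty("retries", Integer.toString(Integer.MAX_VALUE));
        props.setProperty("max.in.flight.requests.per.connection", "5");
        return props;
    }

    public static void main(String[] args) {
        // "localhost:9092" is an assumed broker address.
        build("localhost:9092").forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

The resulting `Properties` object could then be passed to a `KafkaProducer` constructor.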

By enabling transactions, the producer automatically enables idempotence (and acks=all). Transactions let you group produce requests and consumer offset commits so that either all of them or none of them are committed to Kafka.

When using transactions, you can make consumers see only records from committed transactions by setting isolation.level to read_committed. Otherwise, with the default read_uncommitted, they see all records, including those from aborted transactions.
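To make the two sides concrete, here is a minimal sketch of the producer and consumer settings, again as plain `java.util.Properties`. The class and method names, the transactional id and the group id are placeholders I picked, not anything from the question. Turning off `enable.auto.commit` is my assumption for the case where offsets are committed through the producer's transaction instead of by the consumer.

```java
import java.util.Properties;

// Hypothetical helper: the settings that differ for a transactional setup.
public class TransactionalConfigs {

    static Properties producer(String bootstrapServers, String transactionalId) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        // Setting a transactional.id turns on transactions, which imply idempotence.
        props.setProperty("transactional.id", transactionalId);
        return props;
    }

    static Properties consumer(String bootstrapServers, String groupId) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("group.id", groupId);
        // Only return records from committed transactions.
        props.setProperty("isolation.level", "read_committed");
        // Assumption: offsets are committed inside the producer's transaction.
        props.setProperty("enable.auto.commit", "false");
        return props;
    }

    public static void main(String[] args) {
        // Placeholder values for illustration only.
        System.out.println(producer("localhost:9092", "my-tx-producer-1"));
        System.out.println(consumer("localhost:9092", "my-group"));
    }
}
```

With the real client, a producer built from these properties still has to call `initTransactions()` once, and then wrap its sends in `beginTransaction()` / `commitTransaction()`.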

Pm answered 18/2, 2020 at 20:33 Comment(1)
idempotence is not only acks=all, but also retries=Integer.MAX_VALUE - Very

Actually, idempotency by itself does not always guarantee exactly-once delivery. Say you have a consumer that consumes an event, processes it, and produces a new event. Somewhere in this process the consumer's offset must be advanced and persisted. Without a transactional producer, if that happens before the producer sends its message, a failure in between means the message is never sent, and you get at-most-once delivery. If you persist the offset after the message is sent, you might fail to persist it; the consumer then reads the same event again and the producer sends a duplicate, so you get at-least-once delivery. The all-or-nothing mechanism of a transactional producer prevents both scenarios, provided you store your offsets in Kafka: writing the new message and advancing the consumer's offset become one atomic action.
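The three orderings above can be played through with a toy in-memory model. To be clear, this is not the Kafka API: the "output topic" is just a list, the "committed offset" is an int, and a single crash is injected between the two writes for the first record. All names here are made up. In a real application the atomic case would roughly correspond to calling `sendOffsetsToTransaction` between `beginTransaction()` and `commitTransaction()` on the producer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model (NOT the Kafka API) of a consume-process-produce loop that crashes once.
public class DeliveryModel {

    enum Strategy { COMMIT_THEN_SEND, SEND_THEN_COMMIT, TRANSACTIONAL }

    /** Processes every input record, crashing once mid-step, and returns the output log. */
    static List<String> run(Strategy strategy, List<String> input) {
        List<String> output = new ArrayList<>(); // stands in for the output topic
        int committed = 0;                       // stands in for the committed offset
        boolean crashPending = true;             // inject exactly one crash

        // After a "crash" the loop simply resumes from the last committed offset.
        while (committed < input.size()) {
            String result = input.get(committed).toUpperCase(); // the "processing"
            switch (strategy) {
                case COMMIT_THEN_SEND:
                    committed++;
                    // Crash here: offset is saved but the message was never sent.
                    if (crashPending) { crashPending = false; continue; }
                    output.add(result);
                    break;
                case SEND_THEN_COMMIT:
                    output.add(result);
                    // Crash here: message is sent but the offset was not saved.
                    if (crashPending) { crashPending = false; continue; }
                    committed++;
                    break;
                case TRANSACTIONAL:
                    // Crash here: the transaction aborts, so neither write is visible.
                    if (crashPending) { crashPending = false; continue; }
                    output.add(result);
                    committed++;
                    break;
            }
        }
        return output;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a", "b");
        for (Strategy s : Strategy.values()) {
            System.out.println(s + " -> " + run(s, input));
        }
    }
}
```

For the input `a, b` the model loses a record in the first case (`[B]`), duplicates one in the second (`[A, A, B]`), and delivers each exactly once in the third (`[A, B]`).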

Vinitavinn answered 15/2, 2022 at 16:30 Comment(0)
