What is a partition leader in Apache Kafka?
T

4

11

Are Kafka leaders partitions themselves, or are they brokers? My initial understanding was that they were partitions that acted as read/write agents and then deferred their value to the ISRs.

However recently I have been hearing them mentioned as though they happen at the "broker" level, hence my confusion.

I know there are other posts which aim to answer this question, but the answers there did not help.

Tamarin answered 24/3, 2020 at 17:3 Comment(0)
B
32

Some answers here are not absolutely correct so I would like to make it clearer.

Every partition has exactly one partition leader which handles all the read/write requests of that partition. (update: from Kafka 2.4.0, consumers are allowed to read from replicas)
If replication factor is greater than 1, the additional partition replications act as partition followers.
Kafka guarantees that every partition replica resides on a different broker (whether it is the leader or a follower), so the maximum replication factor is the number of brokers in the cluster.

Every partition follower reads messages from the partition leader (acting like a kind of consumer) and, by default, does not serve any consumers of that partition (only the partition leader serves reads and writes).
A partition follower is considered in-sync if it is reading records from the partition leader without lagging behind and without losing its connection to ZooKeeper (at the time of writing the default maximum lag is 10 seconds and the default ZooKeeper session timeout is 6 seconds; Kafka 2.5 raised these defaults to 30 and 18 seconds, and both are configurable).
If a partition follower is lagging behind or has lost its connection to ZooKeeper, it is considered out-of-sync.
When a partition leader shuts down for any reason (e.g. a broker shuts down), one of its in-sync partition followers becomes the new leader.
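
To make the in-sync and failover rules above concrete, here is a minimal toy model. It is only a sketch: the `Partition` class, its fields, and the "first in-sync follower wins" choice are hypothetical simplifications, not Kafka's actual controller logic, and the 10-second limit simply mirrors the default quoted above.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Toy model only: names are hypothetical and this is not Kafka's internal code.
MAX_LAG_SECONDS = 10.0  # mirrors the 10-second default lag limit quoted above


@dataclass
class Partition:
    replicas: list[int]   # broker ids holding a replica, one per broker
    leader: int           # broker id of the current leader
    last_caught_up: dict[int, float] = field(default_factory=dict)
    down: set[int] = field(default_factory=set)  # brokers that are offline

    def isr(self, now: float) -> list[int]:
        """Return the replicas considered in-sync at time `now`."""
        in_sync = []
        for broker in self.replicas:
            if broker in self.down:
                continue  # an offline broker cannot be in-sync
            lag = now - self.last_caught_up.get(broker, float("-inf"))
            if broker == self.leader or lag <= MAX_LAG_SECONDS:
                in_sync.append(broker)
        return in_sync

    def fail_leader(self, now: float) -> int:
        """Simulate the leader's broker going down and promote a follower."""
        self.down.add(self.leader)
        candidates = self.isr(now)  # the old leader is excluded, it is down
        if not candidates:
            raise RuntimeError("no in-sync follower available to take over")
        self.leader = candidates[0]
        return self.leader


# Broker 2 caught up 5 s ago (in-sync); broker 3 caught up 20 s ago (out-of-sync).
p = Partition(replicas=[1, 2, 3], leader=1, last_caught_up={2: 95.0, 3: 80.0})
print(p.isr(now=100.0))          # in-sync set before the failure
print(p.fail_leader(now=100.0))  # broker id of the new leader
```

In this model only broker 2 is eligible to take over, because broker 3 has fallen more than 10 seconds behind.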

The replication section in the Kafka documentation explains this in detail.
Confluent also wrote a nice blog about this topic.

Burble answered 24/3, 2020 at 18:33 Comment(7)
It's a really good and comprehensive answer, but I think the last part of it is not correct. I guess you assume that acks=all is set. I know that if acks=all and there are not enough ISRs according to min.insync.replicas, the producer cannot produce data and gets a NotEnoughReplicas exception. But that doesn't mean that the "partition is not available": consumers that subscribe to this topic can still send fetch requests, and the partition is still available. Other producers that have acks=1 or acks=0 can also still produce messages. You may want to change this part. Regards. - Tammeratammi
Thank you! As far as I know, the acks configuration only controls when a producer gets an acknowledgement for its sends, but independently of that, a partition may be unavailable if it doesn't have enough in-sync follower partitions. So even for acks=0, if the partition is unavailable due to too many out-of-sync partitions, the producer will fail to produce any new messages to that topic. Anyway, I will dig further and try to back up this explanation with a trusted source. If it turns out that I'm wrong I'll update the answer :) - Burble
I checked the documentation and you are completely right. They say: "If a less stringent acknowledgement is requested by the producer, then the message can be committed, and consumed, even if the number of in-sync replicas is lower than the minimum (e.g. it can be as low as just the leader)." I removed the last part about the minimum in-sync replicas. I don't think adding more information about this and the acks parameter is relevant to this question. If you think I should add the explanation, please tell me! - Burble
I think it is okay now :) - Tammeratammi
This answer was outdated even on March 24, 2020. Kafka 2.4.0, released on Dec 16, 2019, supports consuming from replicas. See KIP-392. - Coppola
If the partition leader dies or goes down, how does one of its followers become the leader? Does partition leader election need to be forced, or is it automatic? - Paean
When you talk about brokers in a cluster, you mean the servers in the cluster, right? - Jealous
D
9

tl;dr

Are kafka leaders partitions themselves or are they brokers?

The partition leader is a Kafka Broker.


Partition Leader

This is clearly mentioned in Kafka Docs:

Each partition has one server which acts as the "leader" and zero or more servers which act as "followers". The leader handles all read and write requests for the partition while the followers passively replicate the leader. If the leader fails, one of the followers will automatically become the new leader. Each server acts as a leader for some of its partitions and a follower for others so load is well balanced within the cluster.

Therefore, a partition leader is actually the broker that serves this purpose and is responsible for all read and write requests for this particular partition.


Partition Leader Election

The assignment of a leader for a particular partition happens during a process called partition leader election. This process happens when the topic/partition is created or when the partition leader (i.e. the broker) is unavailable for any reason.

Additionally, you can force preferred replica election by using Preferred Replica Leader Election Tool:

With replication, each partition can have multiple replicas. The list of replicas for a partition is called the "assigned replicas". The first replica in this list is the "preferred replica". When topic/partitions are created, Kafka ensures that the "preferred replica" for the partitions across topics are equally distributed amongst the brokers in a cluster. In an ideal scenario, the leader for a given partition should be the "preferred replica". This guarantees that the leadership load across the brokers in a cluster are evenly balanced. However, over time the leadership load could get imbalanced due to broker shutdowns (caused by controlled shutdown, crashes, machine failures etc). This tool helps to restore the leadership balance between the brokers in the cluster.

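To do so, run the following command (note that this tool was deprecated in Kafka 2.4 in favor of kafka-leader-election.sh and removed in Kafka 3.0):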

bin/kafka-preferred-replica-election.sh --zookeeper localhost:12913/kafka --path-to-json-file topicPartitionList.json

where the content of topicPartitionList.json should look like the one below:

{
 "partitions":
  [
    {"topic": "topic1", "partition": 0},
    {"topic": "topic1", "partition": 1},
    {"topic": "topic1", "partition": 2},
    {"topic": "topic2", "partition": 0},
    {"topic": "topic2", "partition": 1}
  ]
}
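
If you do not want to write this file by hand, it can be generated. The sketch below assumes you already know each partition's current leader and assigned replicas; the `state` dictionary and the helper names are made up for illustration. It lists only the partitions whose leader is not the preferred (first) replica.

```python
import json

# Hypothetical cluster state: (topic, partition) -> current leader and
# assigned replicas. The first assigned replica is the "preferred replica".
state = {
    ("topic1", 0): {"leader": 1, "replicas": [1, 2, 3]},
    ("topic1", 1): {"leader": 3, "replicas": [2, 3, 1]},
    ("topic2", 0): {"leader": 1, "replicas": [3, 1, 2]},
}


def needs_preferred_election(info):
    """True when the current leader is not the preferred replica."""
    return info["leader"] != info["replicas"][0]


def build_partition_list(cluster_state):
    """Build the JSON structure expected by --path-to-json-file."""
    partitions = [
        {"topic": topic, "partition": partition}
        for (topic, partition), info in sorted(cluster_state.items())
        if needs_preferred_election(info)
    ]
    return {"partitions": partitions}


payload = build_partition_list(state)
# Write this text to topicPartitionList.json to use it with the tool.
print(json.dumps(payload, indent=2))
```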

How to find which broker serves as the partition leader

In order to find which broker serves as the partition leader and which serve as In-Sync Replicas (ISR), you have to run the following command:

kafka-topics.sh --bootstrap-server localhost:9092 --describe --topic myTopic

and the output should look similar to the one below:

Topic:myTopic       PartitionCount:4        ReplicationFactor:1     Configs:
    Topic: myTopic      Partition: 0    Leader: 2       Replicas: 2     Isr: 2
    Topic: myTopic      Partition: 1    Leader: 3       Replicas: 3     Isr: 3
    Topic: myTopic      Partition: 2    Leader: 4       Replicas: 4     Isr: 4
    Topic: myTopic      Partition: 3    Leader: 0       Replicas: 0     Isr: 0
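
If you need the leaders in a script rather than on screen, the describe output can be parsed. This is only a sketch: the regular expression assumes the line layout shown above with numeric broker ids (the exact format can differ between Kafka versions), and `parse_describe` is a hypothetical helper, not part of Kafka.

```python
import re

# Sample text copied from the output above.
SAMPLE = """\
Topic:myTopic       PartitionCount:4        ReplicationFactor:1     Configs:
    Topic: myTopic      Partition: 0    Leader: 2       Replicas: 2     Isr: 2
    Topic: myTopic      Partition: 1    Leader: 3       Replicas: 3     Isr: 3
    Topic: myTopic      Partition: 2    Leader: 4       Replicas: 4     Isr: 4
    Topic: myTopic      Partition: 3    Leader: 0       Replicas: 0     Isr: 0
"""

# Matches per-partition lines only; the summary line has no "Partition:" field.
PARTITION_LINE = re.compile(
    r"Topic:\s*(\S+)\s+Partition:\s*(\d+)\s+Leader:\s*(-?\d+)"
    r"\s+Replicas:\s*(\S+)\s+Isr:\s*(\S*)"
)


def parse_describe(text):
    """Map (topic, partition) to its leader, replicas and ISR broker ids."""
    result = {}
    for line in text.splitlines():
        match = PARTITION_LINE.search(line)
        if not match:
            continue  # skip the summary line and anything unexpected
        topic, partition, leader, replicas, isr = match.groups()
        result[(topic, int(partition))] = {
            "leader": int(leader),
            "replicas": [int(b) for b in replicas.split(",")],
            "isr": [int(b) for b in isr.split(",") if b],
        }
    return result


parsed = parse_describe(SAMPLE)
for (topic, partition), info in sorted(parsed.items()):
    print(topic, partition, "leader is broker", info["leader"])
```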
Dehypnotize answered 24/3, 2020 at 19:28 Comment(0)
A
3

The partition leader concept becomes relevant when a Kafka topic has a --replication-factor greater than 1 (which also means the cluster must have at least as many brokers as the replication factor).

In that scenario, whenever a producer pushes a message to a topic's partition, the request first goes to the partition's leader (one of the partition's replicas in the Kafka cluster). The leader stores the message, the message is replicated to the follower partitions, and only then does the leader send an acknowledgement for the message to the producer (this is the behavior when the producer uses acks=all).

Only after this process completes is the message available for consumers to consume.

I recommend the official documentation for more details.
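
The visibility rule can be illustrated with a toy model. The sketch below assumes Kafka's usual notion of a high watermark, taken here as the lowest log end offset among the in-sync replicas; the broker ids, offsets, and function names are made up for illustration.

```python
def high_watermark(log_end_offsets, isr):
    """Lowest log end offset among the in-sync replicas."""
    return min(log_end_offsets[broker] for broker in isr)


def visible_to_consumers(log, log_end_offsets, isr):
    """Messages a consumer may read: everything below the high watermark."""
    return log[:high_watermark(log_end_offsets, isr)]


leader_log = ["m0", "m1", "m2"]
# Broker 1 is the leader; followers 2 and 3 have not replicated "m2" yet.
offsets = {1: 3, 2: 2, 3: 2}
isr = [1, 2, 3]

before = visible_to_consumers(leader_log, offsets, isr)
offsets[2] = offsets[3] = 3  # both followers catch up
after = visible_to_consumers(leader_log, offsets, isr)

print(before)  # "m2" is not yet visible
print(after)   # all three messages are visible
```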

Acyl answered 31/8, 2020 at 17:10 Comment(0)
J
0

Every topic-partition in Kafka has one leader and, if the replication factor is greater than 1, the leader has one or more followers. Partition leaders can be checked with this command:

bin/kafka-topics.sh --bootstrap-server localhost:9092 --topic myTopic --describe

In the output of this command, the broker ids of the partition leaders are shown as Leader: xx

Jaenicke answered 24/3, 2020 at 17:48 Comment(0)
