tl;dr
Are kafka leaders partitions themselves or are they brokers?
The partition leader is a Kafka Broker.
Partition Leader
This is stated in the Kafka documentation:
Each partition has one server which acts as the "leader" and zero or
more servers which act as "followers". The leader handles all read and
write requests for the partition while the followers passively
replicate the leader. If the leader fails, one of the followers will
automatically become the new leader. Each server acts as a leader for
some of its partitions and a follower for others so load is well
balanced within the cluster.
Therefore, a partition leader is the broker that handles all read and write requests for that particular partition.
Partition Leader Election
The assignment of a leader for a particular partition happens during a process called partition leader election. This process happens when the topic/partition is created or when the partition leader (i.e. the broker) is unavailable for any reason.
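As a rough illustration of what an election decides, here is a minimal Python sketch. It assumes a simplified rule: the new leader is the first broker in the partition's assigned replica list that is both alive and in the ISR. The function name elect_leader and the broker ids are hypothetical, and the real controller logic covers more cases (for example unclean leader election), so treat this as a model rather than Kafka's implementation.

```python
from typing import Optional, Sequence, Set


def elect_leader(
    assigned_replicas: Sequence[int],
    isr: Set[int],
    live_brokers: Set[int],
) -> Optional[int]:
    """Return the first assigned replica that is alive and in sync, or None."""
    for broker_id in assigned_replicas:
        if broker_id in live_brokers and broker_id in isr:
            return broker_id
    return None  # no eligible replica: the partition has no leader for now


# Hypothetical partition replicated on brokers 2, 3 and 4.
replicas = [2, 3, 4]
print(elect_leader(replicas, isr={2, 3, 4}, live_brokers={2, 3, 4}))  # 2
print(elect_leader(replicas, isr={3, 4}, live_brokers={3, 4}))        # 3
```

In the second call broker 2 is down, so leadership moves to the next in-sync replica in the list.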
Additionally, you can force a preferred replica election by using the Preferred Replica Leader Election Tool:
With replication, each partition can have multiple replicas. The list
of replicas for a partition is called the "assigned replicas". The
first replica in this list is the "preferred replica". When
topic/partitions are created, Kafka ensures that the "preferred
replica" for the partitions across topics are equally distributed
amongst the brokers in a cluster. In an ideal scenario, the leader for
a given partition should be the "preferred replica". This guarantees
that the leadership load across the brokers in a cluster are evenly
balanced. However, over time the leadership load could get imbalanced
due to broker shutdowns (caused by controlled shutdown, crashes,
machine failures etc). This tool helps to restore the leadership
balance between the brokers in the cluster.
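To make the idea of an imbalance concrete, the following Python sketch lists the partitions whose current leader is not the preferred replica, i.e. not the first broker in the assigned replica list. The function name non_preferred_leaders and the sample cluster state are made up for illustration; in practice the data would come from the cluster metadata.

```python
from typing import Dict, List, Tuple

Partition = Tuple[str, int]  # (topic, partition number)


def non_preferred_leaders(
    assigned: Dict[Partition, List[int]],
    leaders: Dict[Partition, int],
) -> List[Partition]:
    """List partitions whose current leader is not the first assigned replica."""
    return sorted(
        tp
        for tp, replicas in assigned.items()
        if replicas and leaders.get(tp) != replicas[0]
    )


# Hypothetical state after broker 1 restarted and lost its leadership.
assigned = {("topic1", 0): [1, 2], ("topic1", 1): [2, 1]}
leaders = {("topic1", 0): 2, ("topic1", 1): 2}
print(non_preferred_leaders(assigned, leaders))  # [('topic1', 0)]
```

Partitions returned by this check are the ones a preferred replica election would move back to their preferred broker.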
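To trigger a preferred replica election, run the following command (on Kafka versions that still ship this script; it was deprecated in 2.4 and removed in 3.0 in favor of kafka-leader-election.sh):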
bin/kafka-preferred-replica-election.sh --zookeeper localhost:12913/kafka --path-to-json-file topicPartitionList.json
where the content of topicPartitionList.json should look like the one below:
{
  "partitions": [
    {"topic": "topic1", "partition": 0},
    {"topic": "topic1", "partition": 1},
    {"topic": "topic1", "partition": 2},
    {"topic": "topic2", "partition": 0},
    {"topic": "topic2", "partition": 1}
  ]
}
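Writing this file by hand gets tedious for topics with many partitions. A small Python sketch like the one below could generate it; the helper name build_partition_list and the topic-to-partition-count mapping are assumptions for illustration, not part of any Kafka tooling.

```python
import json
from typing import Dict


def build_partition_list(partition_counts: Dict[str, int]) -> dict:
    """Build the {"partitions": [...]} structure for the given topics."""
    return {
        "partitions": [
            {"topic": topic, "partition": p}
            for topic, count in partition_counts.items()
            for p in range(count)
        ]
    }


# Hypothetical topics: topic1 with 3 partitions, topic2 with 2 partitions.
payload = build_partition_list({"topic1": 3, "topic2": 2})
print(json.dumps(payload, indent=2))
```

Redirecting the printed output to topicPartitionList.json yields a file with the same five entries as the example above.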
How to find which broker serves as the partition leader
To find which broker serves as the partition leader and which brokers are In-Sync Replicas (ISR), run the following command:
kafka-topics.sh --bootstrap-server localhost:9092 --describe --topic myTopic
and the output should look similar to the one below:
Topic:myTopic PartitionCount:4 ReplicationFactor:1 Configs:
Topic: myTopic Partition: 0 Leader: 2 Replicas: 2 Isr: 2
Topic: myTopic Partition: 1 Leader: 3 Replicas: 3 Isr: 3
Topic: myTopic Partition: 2 Leader: 4 Replicas: 4 Isr: 4
Topic: myTopic Partition: 3 Leader: 0 Replicas: 0 Isr: 0
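If you want to consume this output in a script, a Python sketch along these lines could extract the leader and ISR per partition. The regular expression is an assumption based on the sample output above; the exact layout differs between Kafka versions, so adjust it to what your installation prints. The function name parse_describe is hypothetical.

```python
import re
from typing import Dict

# Matches the per-partition lines; the topic summary line is skipped because
# it contains "PartitionCount:" rather than "Partition:".
PARTITION_LINE = re.compile(
    r"Partition:\s*(\d+)\s+Leader:\s*(\S+)\s+Replicas:\s*([\d,]*)\s+Isr:\s*([\d,]*)"
)


def parse_describe(output: str) -> Dict[int, dict]:
    """Map partition number -> leader, replicas and ISR broker ids."""
    result = {}
    for line in output.splitlines():
        match = PARTITION_LINE.search(line)
        if match is None:
            continue
        partition, leader, replicas, isr = match.groups()
        result[int(partition)] = {
            # A non-numeric leader (e.g. "none") is reported as None.
            "leader": int(leader) if leader.lstrip("-").isdigit() else None,
            "replicas": [int(b) for b in replicas.split(",") if b],
            "isr": [int(b) for b in isr.split(",") if b],
        }
    return result


sample = """Topic:myTopic PartitionCount:4 ReplicationFactor:1 Configs:
Topic: myTopic Partition: 0 Leader: 2 Replicas: 2 Isr: 2
Topic: myTopic Partition: 1 Leader: 3 Replicas: 3 Isr: 3
Topic: myTopic Partition: 2 Leader: 4 Replicas: 4 Isr: 4
Topic: myTopic Partition: 3 Leader: 0 Replicas: 0 Isr: 0"""

for partition, info in parse_describe(sample).items():
    print(partition, info["leader"], info["isr"])
```

The numbers in the Leader, Replicas and Isr columns are broker ids, so partition 0 of myTopic is led by broker 2 in this sample.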
acks=all is set. I know, if acks=all and there are not enough ISRs according to min.insync.replicas, the producer cannot produce data and gets a NotEnoughReplicas exception. But it doesn't mean that the "partition is not available". Consumers that subscribe to this topic can still send fetch requests and the partition is still available. Also, other producers that have acks=1 or acks=0 can still produce messages. You may consider changing this part. Regards. – Tammeratammi