I have the following structure:
zookeeper: 3.4.12
kafka: kafka_2.11-1.1.0
server1: zookeeper + kafka
server2: zookeeper + kafka
server3: zookeeper + kafka
Created topic with replication factor 3 and partitions 3 by kafka-topics shell script.
./kafka-topics.sh --create --zookeeper localhost:2181 --topic test-flow --partitions 3 --replication-factor 3
And use group localConsumers. it works fine when leader is ok.
./kafka-topics.sh --describe --zookeeper localhost:2181 --topic test-flow
Topic:test-flow PartitionCount:3 ReplicationFactor:3 Configs:
Topic: test-flow Partition: 0 Leader: 3 Replicas: 3,2,1 Isr: 3,2,1
Topic: test-flow Partition: 1 Leader: 1 Replicas: 1,3,2 Isr: 1,3,2
Topic: test-flow Partition: 2 Leader: 2 Replicas: 2,1,3 Isr: 2,1,3
Consumers' log
Received FindCoordinator response ClientResponse(receivedTimeMs=1529508772673, latencyMs=217, disconnected=false, requestHeader=RequestHeader(apiKey=FIND_COORDINATOR, apiVersion=1, clientId=consumer-1, correlationId=0), responseBody=FindCoordinatorResponse(throttleTimeMs=0, errorMessage='null', error=NONE, node=myserver3:9092 (id: 3 rack: null)))
But if leader is down - I get the error in consumer (systemctl stop kafka):
Node 3 is unavailable. ok
./kafka-topics.sh --describe --zookeeper localhost:2181 --topic test-flow
Topic:test-flow PartitionCount:3 ReplicationFactor:3 Configs:
Topic: test-flow Partition: 0 Leader: 2 Replicas: 3,2,1 Isr: 2,1
Topic: test-flow Partition: 1 Leader: 1 Replicas: 1,3,2 Isr: 1,2
Topic: test-flow Partition: 2 Leader: 2 Replicas: 2,1,3 Isr: 2,1
Consumers' log
Received FindCoordinator response
ClientResponse(receivedTimeMs=1529507314193, latencyMs=36,
disconnected=false,
requestHeader=RequestHeader(apiKey=FIND_COORDINATOR, apiVersion=1,
clientId=consumer-1, correlationId=149),
responseBody=FindCoordinatorResponse(throttleTimeMs=0,
errorMessage='null', error=COORDINATOR_NOT_AVAILABLE, node=:-1 (id: -1
rack: null)))
- Group coordinator lookup failed: The coordinator is not available.
- Coordinator discovery failed, refreshing metadata
Consumer unable to connect until leader is down or reconnect with another consumer group.
Can't understand why it happens? Consumer should be rebalanced to another broker, but it doesn't.