Kafka - Broker: Group coordinator not available

I have the following structure:

zookeeper: 3.4.12
kafka: kafka_2.11-1.1.0
server1: zookeeper + kafka
server2: zookeeper + kafka
server3: zookeeper + kafka

I created a topic with replication factor 3 and 3 partitions using the kafka-topics shell script:

./kafka-topics.sh --create --zookeeper localhost:2181 --topic test-flow --partitions 3 --replication-factor 3

The consumers use the group localConsumers. Everything works fine while the leader is up:

./kafka-topics.sh --describe --zookeeper localhost:2181 --topic test-flow
Topic:test-flow PartitionCount:3    ReplicationFactor:3 Configs:
    Topic: test-flow    Partition: 0    Leader: 3   Replicas: 3,2,1 Isr: 3,2,1
    Topic: test-flow    Partition: 1    Leader: 1   Replicas: 1,3,2 Isr: 1,3,2
    Topic: test-flow    Partition: 2    Leader: 2   Replicas: 2,1,3 Isr: 2,1,3

Consumers' log

Received FindCoordinator response ClientResponse(receivedTimeMs=1529508772673, latencyMs=217, disconnected=false, requestHeader=RequestHeader(apiKey=FIND_COORDINATOR, apiVersion=1, clientId=consumer-1, correlationId=0), responseBody=FindCoordinatorResponse(throttleTimeMs=0, errorMessage='null', error=NONE, node=myserver3:9092 (id: 3 rack: null)))

But when the leader is stopped (systemctl stop kafka), the consumer gets an error.

Node 3 is unavailable, which is expected:

./kafka-topics.sh --describe --zookeeper localhost:2181 --topic test-flow
Topic:test-flow PartitionCount:3    ReplicationFactor:3 Configs:
    Topic: test-flow    Partition: 0    Leader: 2   Replicas: 3,2,1 Isr: 2,1
    Topic: test-flow    Partition: 1    Leader: 1   Replicas: 1,3,2 Isr: 1,2
    Topic: test-flow    Partition: 2    Leader: 2   Replicas: 2,1,3 Isr: 2,1

Consumers' log

Received FindCoordinator response ClientResponse(receivedTimeMs=1529507314193, latencyMs=36, disconnected=false, requestHeader=RequestHeader(apiKey=FIND_COORDINATOR, apiVersion=1, clientId=consumer-1, correlationId=149), responseBody=FindCoordinatorResponse(throttleTimeMs=0, errorMessage='null', error=COORDINATOR_NOT_AVAILABLE, node=:-1 (id: -1 rack: null)))

- Group coordinator lookup failed: The coordinator is not available.
- Coordinator discovery failed, refreshing metadata

The consumer cannot connect for as long as the leader is down, unless it reconnects with a different consumer group.

I can't understand why this happens. The consumer should be rebalanced to another broker, but it isn't.

Bussard answered 20/6, 2018 at 15:38 Comment(0)

Try adding these properties to the broker configuration (server.properties) and cleaning the ZooKeeper cache. It should help:

offsets.topic.replication.factor=3
default.replication.factor=3

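The root cause is that the offsets topic cannot be distributed across the nodes. Kafka picks a group's coordinator from the brokers that lead the partitions of that topic, so if a partition has only one replica and its broker goes down, the groups stored in that partition are left without a coordinator.

The topic in question is the auto-generated __consumer_offsets.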
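To see why only some consumer groups are affected when a single broker goes down, here is a minimal Python sketch of how a group id is mapped to a partition of __consumer_offsets. It assumes, as far as I understand the broker code, that the partition is the non-negative Java String.hashCode() of the group id modulo offsets.topic.num.partitions (default 50). The function names are my own, not part of any Kafka API.

```python
def java_string_hash(s: str) -> int:
    """Replicate Java's String.hashCode() as a signed 32-bit integer."""
    h = 0
    # Java hashes UTF-16 code units, so iterate over the UTF-16 encoding.
    data = s.encode("utf-16-be")
    for i in range(0, len(data), 2):
        unit = (data[i] << 8) | data[i + 1]
        h = (31 * h + unit) & 0xFFFFFFFF
    # Convert the unsigned 32-bit value into Java's signed int range.
    return h - 0x100000000 if h >= 0x80000000 else h


def offsets_partition_for(group_id: str, num_partitions: int = 50) -> int:
    """Partition of __consumer_offsets a group maps to (assumed formula)."""
    h = java_string_hash(group_id)
    # Assumption: Integer.MIN_VALUE is mapped to 0 rather than overflowing.
    non_negative = 0 if h == -0x80000000 else abs(h)
    return non_negative % num_partitions


if __name__ == "__main__":
    # The broker leading this partition acts as the group's coordinator.
    print(offsets_partition_for("localConsumers"))
```

This would also explain why reconnecting with another consumer group can work: a different group id may hash to a partition whose only replica lives on a broker that is still up.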

You can check its replica assignment with:

$ ./kafka-topics.sh --describe --zookeeper localhost:2181 --topic __consumer_offsets
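With 50 partitions the describe output is long, so a small script can help spot the problem. The following is a rough Python sketch that parses output in the format shown in the question and lists the partitions that would be lost if a given broker went down. The sample text is made up for illustration, and the helper names are hypothetical.

```python
import re

# Matches one partition line of `kafka-topics.sh --describe` output.
PARTITION_LINE = re.compile(
    r"Partition:\s*(\d+)\s+Leader:\s*(-?\w+)\s+Replicas:\s*([\d,]+)\s+Isr:\s*([\d,]*)"
)


def parse_describe(output):
    """Return one dict per partition line; header lines are skipped."""
    partitions = []
    for line in output.splitlines():
        m = PARTITION_LINE.search(line)
        if m is None:
            continue
        partitions.append({
            "partition": int(m.group(1)),
            "leader": m.group(2),
            "replicas": [int(x) for x in m.group(3).split(",")],
            "isr": [int(x) for x in m.group(4).split(",") if x],
        })
    return partitions


def partitions_lost_if_down(partitions, broker_id):
    """Partitions whose only replica lives on the given broker."""
    return [p["partition"] for p in partitions if p["replicas"] == [broker_id]]


# Illustrative sample only, not real cluster output.
SAMPLE = """Topic:__consumer_offsets PartitionCount:2 ReplicationFactor:1 Configs:
    Topic: __consumer_offsets    Partition: 0    Leader: 3   Replicas: 3 Isr: 3
    Topic: __consumer_offsets    Partition: 1    Leader: 1   Replicas: 1 Isr: 1
"""

if __name__ == "__main__":
    print(partitions_lost_if_down(parse_describe(SAMPLE), 3))
```

If the list for the stopped broker is non-empty, every group that maps to one of those partitions has no coordinator until that broker is back.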

See the production configuration section of the documentation: https://kafka.apache.org/documentation/#prodconfig

With the sample server.properties shipped with Kafka, __consumer_offsets is created with a replication factor of 1.

It is important to configure the replication factor before the Kafka cluster is first started. Otherwise, reconfiguring existing instances can cause problems like the one in your case.
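If the cluster is already running and __consumer_offsets exists with a single replica, one option that avoids deleting the topic is a partition reassignment that adds replicas. Below is a hedged Python sketch that builds the reassignment JSON. The function name, the broker ids 1, 2, 3 and the partition count of 50 are assumptions for illustration, so adjust them to your cluster.

```python
import json


def build_reassignment(topic, num_partitions, broker_ids, replication_factor):
    """Build a reassignment plan that gives every partition the requested RF."""
    if replication_factor > len(broker_ids):
        raise ValueError("replication factor exceeds the number of brokers")
    partitions = []
    for p in range(num_partitions):
        # Rotate the starting broker so preferred leaders are spread out.
        replicas = [
            broker_ids[(p + i) % len(broker_ids)]
            for i in range(replication_factor)
        ]
        partitions.append({"topic": topic, "partition": p, "replicas": replicas})
    return {"version": 1, "partitions": partitions}


if __name__ == "__main__":
    # Assumed values: 50 partitions, brokers 1, 2 and 3, target RF 3.
    plan = build_reassignment("__consumer_offsets", 50, [1, 2, 3], 3)
    with open("increase-offsets-rf.json", "w") as f:
        json.dump(plan, f, indent=2)
```

The resulting file can then be passed to kafka-reassign-partitions.sh with --zookeeper localhost:2181 --reassignment-json-file increase-offsets-rf.json --execute. Note that this simple rotation ignores where the existing replicas currently live, so it may move data around more than strictly necessary.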

Leith answered 20/6, 2018 at 18:27 Comment(6)
I think an explanation of why the error happened and why we should configure it like this would be better than only providing the solution. - Denim
@Oleksandr Loushkin Thanks, you saved my life. I was running an in-memory Kafka for unit testing, and the producers and consumers were not working. Then I changed the value of offsets.topic.replication.factor to 1, and it worked. Apparently, if auto topic creation is enabled, the default value of this property is 3. - Institutionalism
In the latest version of Kafka, the default value has been changed to 3, but you still need to configure it manually. I had used the default server.properties file, and it had the value set to 1, but the comment above the property said to use a value of 3 or above if it is not a development server. - Tumult
Or, after you reconfigure Kafka, you can go to the ZooKeeper CLI and run rmr /brokers/topics/__consumer_offsets. That will regenerate __consumer_offsets with the new replication factor configuration. - Seaweed
I cannot agree more with @Institutionalism. This was a major headache for me in some integration tests. I very much appreciate your answer. - Lemos
This did not resolve the issue in my environment. I still see the error "Group coordinator lookup failed: The coordinator is not available". - Leech
