I am using Kafka Connect from Confluent to consume a Kafka stream and write it to HDFS in Parquet format. I am running the Schema Registry service on one node and it is working fine. Now I want to run Schema Registry in cluster mode to handle failover. Any link or snippet on how to achieve that would be very useful.
It is hard to find, but we covered this architecture in our documentation: http://docs.confluent.io/3.0.0/schema-registry/docs/deployment.html#multi-dc-setup
To quote from the docs a bit (although you should read the docs, lots of good architecture advice and a recovery runbook are included):
Assuming you have Schema Registry running, here are the recommended steps to add Schema Registry instances in a new “slave” datacenter (call it DC B):
1. In DC B, make sure Kafka has unclean.leader.election.enable set to false.
2. In Kafka in DC B, create the _schemas topic. It should have 1 partition, kafkastore.topic.replication.factor of 3, and min.insync.replicas at least 2.
3. In DC B, run MirrorMaker with Kafka in the “master” datacenter as the source and Kafka in DC B as the target.
4. In the Schema Registry config files in DC B, set kafkastore.connection.url and schema.registry.zk.namespace to match the instances already running, and set master.eligibility to false.
5. Start your new Schema Registry instances with these configs.
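The Schema Registry settings described above for DC B could be generated with something like the following. This is only a sketch: the function name slave_registry_config is made up for illustration, and the two arguments are placeholders for whatever values your existing instances already use.

```shell
# Print a properties fragment for a non-master ("slave") Schema Registry instance.
# $1: kafkastore.connection.url used by the instances already running (placeholder)
# $2: schema.registry.zk.namespace used by the instances already running (placeholder)
slave_registry_config() {
  cat <<EOF
kafkastore.connection.url=$1
schema.registry.zk.namespace=$2
master.eligibility=false
EOF
}
```

For example, slave_registry_config ip1:2181,ip2:2181,ip3:2181 schema_registry prints three lines that can be appended to the properties file of each new instance in DC B.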
8081, and another listening on 18081; do both of them accept GET requests? Although I am able to get a response when calling kafka-host:8081/subjects, I am not able to get a response for kafka-host:18081/subjects. Is this normal behaviour? – Pizzicato
I used the Confluent schema-registry Docker image to form the cluster.
docker run --restart always -d -p 8081:8081 --name=schema-registry-1 -e SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL=ip1:2181,ip2:2181,ip3:2181 -e SCHEMA_REGISTRY_HOST_NAME=schema-registry-1 -e SCHEMA_REGISTRY_LISTENERS=http://0.0.0.0:8081 -e SCHEMA_REGISTRY_DEBUG=true confluentinc/cp-schema-registry:5.2.1-1
docker run --restart always -d -p 8081:8081 --name=schema-registry-2 -e SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL=ip1:2181,ip2:2181,ip3:2181 -e SCHEMA_REGISTRY_HOST_NAME=schema-registry-2 -e SCHEMA_REGISTRY_LISTENERS=http://0.0.0.0:8081 -e SCHEMA_REGISTRY_DEBUG=true confluentinc/cp-schema-registry:5.2.1-1
docker run --restart always -d -p 8081:8081 --name=schema-registry-3 -e SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL=ip1:2181,ip2:2181,ip3:2181 -e SCHEMA_REGISTRY_HOST_NAME=schema-registry-3 -e SCHEMA_REGISTRY_LISTENERS=http://0.0.0.0:8081 -e SCHEMA_REGISTRY_DEBUG=true confluentinc/cp-schema-registry:5.2.1-1
Once this was up and running, I verified that the schema-registry cluster had formed and that leader election had succeeded by checking the ZooKeeper contents.
$ docker exec -it zookeeper bash
# /usr/bin/zookeeper-shell localhost:2181
Connecting to localhost:2181
Welcome to ZooKeeper!
JLine support is enabled
WATCHER::
WatchedEvent state:SyncConnected type:None path:null
[zk: localhost:2181(CONNECTED) 0] ls /
[schema_registry, cluster, controller, brokers, zookeeper, admin, isr_change_notification, log_dir_event_notification, controller_epoch, kafka-manager, CruiseControlBrokerList, consumers, latest_producer_id_block, config]
[zk: localhost:2181(CONNECTED) 1] ls /schema_registry
[schema_registry_master, schema_id_counter]
[zk: localhost:2181(CONNECTED) 4] get /schema_registry/schema_registry_master
{"host":"schema-registry-1","port":8081,"master_eligibility":true,"scheme":"http","version":1}
#
Hope this helps.
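If you want to script the check above, a small helper can pull the elected master out of the znode contents. This is a rough sketch: master_endpoint is a hypothetical name, and the sed pattern assumes the JSON has "host" before "port", as in the output shown above.

```shell
# Read zookeeper-shell output on stdin and print "host:port" of the current master.
# Only lines containing both "host" and "port" (in that order) produce output;
# banner lines such as "Connecting to ..." are silently skipped.
master_endpoint() {
  sed -n 's/.*"host":"\([^"]*\)".*"port":\([0-9]*\).*/\1:\2/p'
}
```

It could then be used along the lines of /usr/bin/zookeeper-shell localhost:2181 get /schema_registry/schema_registry_master | master_endpoint, assuming your zookeeper-shell version accepts a command as trailing arguments. Stopping the reported master and running it again should show a different host if failover works.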
You just need to put this in connect-avro-distributed.properties to use multiple Schema Registry instances:
key.converter.schema.registry.url=http://node1:8081,http://node2:8081
value.converter.schema.registry.url=http://node1:8081,http://node2:8081
Hope this is useful for you.
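Before relying on a multi-URL setting like this, it may be worth confirming that every listed instance actually answers GET /subjects. The sketch below is one way to do that; the function names are invented for illustration, and check_registries needs curl plus network access to the instances.

```shell
# Split a comma-separated schema.registry.url value into one /subjects URL per line.
subjects_urls() {
  printf '%s\n' "$1" | tr ',' '\n' | sed 's|/*$|/subjects|'
}

# Probe each instance and print OK or FAIL per URL.
check_registries() {
  subjects_urls "$1" | while read -r url; do
    if curl -sf --max-time 5 -o /dev/null "$url"; then
      echo "OK   $url"
    else
      echo "FAIL $url"
    fi
  done
}
```

For example: check_registries "http://node1:8081,http://node2:8081". A FAIL line for one node points at that instance (or its port mapping) rather than at the Connect configuration.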
Don't forget to set master.eligibility=true on all the nodes.