Confluent Schema Registry Cluster Mode
I am using Kafka Connect from Confluent to consume a Kafka stream and write it to HDFS in Parquet format. Schema Registry currently runs on a single node and works fine. I now want to run Schema Registry as a cluster so that it can handle failover. Any link or snippet showing how to achieve that would be very useful.

Drenthe answered 26/8, 2016 at 9:4 Comment(0)

It is hard to find, but we covered this architecture in our documentation: http://docs.confluent.io/3.0.0/schema-registry/docs/deployment.html#multi-dc-setup

To quote from the docs a bit (although you should read the docs, lots of good architecture advice and a recovery runbook are included):

Assuming you have Schema Registry running, here are the recommended steps to add Schema Registry instances in a new “slave” datacenter (call it DC B):

1. In DC B, make sure Kafka has unclean.leader.election.enable set to false.
2. In Kafka in DC B, create the _schemas topic. It should have 1 partition, kafkastore.topic.replication.factor of 3, and min.insync.replicas at least 2.
3. In DC B, run MirrorMaker with Kafka in the “master” datacenter as the source and Kafka in DC B as the target.
4. In the Schema Registry config files in DC B, set kafkastore.connection.url and schema.registry.zk.namespace to match the instances already running, and set master.eligibility to false.
5. Start your new Schema Registry instances with these configs.
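
The Schema Registry settings named in the quoted steps can be sketched as a small shell helper that writes a config fragment. This is only an illustration: the function name write_follower_config and the file path in the usage note are made up here, and a real config file will need further settings (listeners and so on) that the quote does not cover.

```shell
# Hypothetical helper: write the three settings named in the quoted steps.
# Arguments: output path, ZooKeeper connection string, ZooKeeper namespace.
# The connection string and namespace must match the instances already running.
write_follower_config() {
  cat > "$1" <<EOF
kafkastore.connection.url=$2
schema.registry.zk.namespace=$3
master.eligibility=false
EOF
}
```

For example, write_follower_config /tmp/schema-registry-dcb.properties ip1:2181,ip2:2181,ip3:2181 schema_registry would produce a fragment with master.eligibility=false, so that instance never takes part in master election.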

Gyre answered 26/8, 2016 at 18:39 Comment(4)
Thanks for the info, Gwen. My use case is not multi-DC. I just need my currently running Schema Registry to handle failover by running it as a cluster (master-slaves). I have only one DC and one Kafka cluster. - Drenthe
Ah, in that case, just install two Schema Registry servers and point them to the same ZooKeeper path. One of them will become the leader and the other a follower automatically. You can see the current leader at the /<schema.registry.zk.namespace>/schema_registry_master path in ZooKeeper. - Gyre
Thanks, Gwen. Could you also tell me how to address the Schema Registry service? Currently I use http://<registry-server-hostname>:port. After adding another server, how can I point clients at a common name for the service? - Drenthe
@GwenShapira Assuming that I have two instances of Schema Registry up and running, one listening on 8081 and another listening on 18081, do both of them accept GET requests? I get a response when calling kafka-host:8081/subjects, but not when calling kafka-host:18081/subjects. Is this normal behaviour? - Pizzicato

I used the Confluent schema-registry Docker image to form the cluster.

docker run --restart always -d -p 8081:8081 --name=schema-registry-1 \
  -e SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL=ip1:2181,ip2:2181,ip3:2181 \
  -e SCHEMA_REGISTRY_HOST_NAME=schema-registry-1 \
  -e SCHEMA_REGISTRY_LISTENERS=http://0.0.0.0:8081 \
  -e SCHEMA_REGISTRY_DEBUG=true \
  confluentinc/cp-schema-registry:5.2.1-1

docker run --restart always -d -p 8081:8081 --name=schema-registry-2 \
  -e SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL=ip1:2181,ip2:2181,ip3:2181 \
  -e SCHEMA_REGISTRY_HOST_NAME=schema-registry-2 \
  -e SCHEMA_REGISTRY_LISTENERS=http://0.0.0.0:8081 \
  -e SCHEMA_REGISTRY_DEBUG=true \
  confluentinc/cp-schema-registry:5.2.1-1

docker run --restart always -d -p 8081:8081 --name=schema-registry-3 \
  -e SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL=ip1:2181,ip2:2181,ip3:2181 \
  -e SCHEMA_REGISTRY_HOST_NAME=schema-registry-3 \
  -e SCHEMA_REGISTRY_LISTENERS=http://0.0.0.0:8081 \
  -e SCHEMA_REGISTRY_DEBUG=true \
  confluentinc/cp-schema-registry:5.2.1-1

Once the containers were up and running, I verified that the Schema Registry cluster had formed and that leader election had succeeded by checking the contents of ZooKeeper.

$ docker exec -it zookeeper bash
# /usr/bin/zookeeper-shell localhost:2181
Connecting to localhost:2181
Welcome to ZooKeeper!
JLine support is enabled

WATCHER::

WatchedEvent state:SyncConnected type:None path:null
[zk: localhost:2181(CONNECTED) 0] ls /
[schema_registry, cluster, controller, brokers, zookeeper, admin, isr_change_notification, log_dir_event_notification, controller_epoch, kafka-manager, CruiseControlBrokerList, consumers, latest_producer_id_block, config]
[zk: localhost:2181(CONNECTED) 1] ls /schema_registry
[schema_registry_master, schema_id_counter]
[zk: localhost:2181(CONNECTED) 4] get /schema_registry/schema_registry_master
{"host":"schema-registry-1","port":8081,"master_eligibility":true,"scheme":"http","version":1}
#
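
If you want to script this check, the JSON stored in the schema_registry_master znode can be reduced to a host:port pair. The following is a rough sketch that uses only sed; the function name master_endpoint is made up for illustration, and it assumes the flat, single-line JSON shape shown in the session above.

```shell
# Hypothetical helper: print "host:port" of the current master from the
# JSON payload of the schema_registry_master znode.
# Assumes single-line JSON with "host" as a string and "port" as a number.
master_endpoint() {
  json=$1
  host=$(printf '%s\n' "$json" | sed -n 's/.*"host":"\([^"]*\)".*/\1/p')
  port=$(printf '%s\n' "$json" | sed -n 's/.*"port":\([0-9][0-9]*\).*/\1/p')
  # Fail if either field could not be extracted.
  [ -n "$host" ] && [ -n "$port" ] || return 1
  printf '%s:%s\n' "$host" "$port"
}
```

Passing it the payload from the get command above should yield the elected master's address; stopping that container and repeating the get should then show a different host once re-election completes.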

Hope this helps.

Ribosome answered 19/7, 2019 at 11:4 Comment(0)

You just need to put this in connect-avro-distributed.properties to use multiple Schema Registry instances:

key.converter.schema.registry.url=http://node1:8081,http://node2:8081
value.converter.schema.registry.url=http://node1:8081,http://node2:8081
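
Before relying on a multi-URL setting like this, it may be worth confirming that every listed instance answers on its own. A minimal sketch, assuming curl is installed; the function names split_urls and check_registries are made up for illustration, and /subjects is the endpoint mentioned in the comments above.

```shell
# Hypothetical helper: split a comma-separated schema.registry.url value
# into one URL per line.
split_urls() {
  printf '%s\n' "$1" | tr ',' '\n'
}

# Hypothetical helper: request GET /subjects on each instance and print
# "<url> <http status>". curl reports 000 when the connection fails.
check_registries() {
  split_urls "$1" | while IFS= read -r url; do
    code=$(curl -s -o /dev/null -m 5 -w '%{http_code}' "$url/subjects")
    printf '%s %s\n' "$url" "$code"
  done
}
```

Running check_registries 'http://node1:8081,http://node2:8081' prints one line per instance; a status other than 200 for any of them suggests that instance is not reachable or not serving requests.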

Hope this is useful for you.

Saki answered 20/1, 2017 at 1:55 Comment(1)
Hmm, is this truly legit? What if you want to add a new Connect instance with a new Schema Registry, and so on? You would need to add the new Schema Registry URL to every Connect instance, which means restarting every single one, and that can of course be catastrophic. Isn't that what you want to avoid? - Cara

Don't forget to set master.eligibility=true on all the nodes.

Ecospecies answered 18/6, 2020 at 8:41 Comment(0)
