We have a few SolrCloud and ZooKeeper setups running in AWS EC2. For the most part they run smoothly, but after a recent failure of one of our ZooKeeper nodes I started wondering whether any one method of having the clients address the ZooKeepers is better than the others. Our clients are Java-based and use the Solr 4.1 Java client.
Originally we used hosts-file entries to identify the ZooKeepers, but keeping the entries in /etc/hosts up to date became very tedious given the nature of AWS. So we now use custom DNS via Route53 instead. We still identify the ZooKeeper nodes individually, though, so as an example we currently specify this when launching our clients:
-Dsolr.zookeeperHosts='zk-1.mydomain.com:2181,zk-2.mydomain.com:2181,zk-3.mydomain.com:2181'
The hosts zk-1.mydomain.com etc. are simply CNAME'd to the DNS name of each ZooKeeper EC2 instance. So now if Amazon forces us to reboot a ZooKeeper, which causes it to get a new IP address, the client will eventually get the new IP once the DNS record is updated.
My question is whether there's an even better approach to handling this. Suppose we wanted to add more ZooKeepers to the mix, so that we had an ensemble of 5 nodes instead of 3. (I actually want to do this.) Would it make more sense to have a single DNS round-robin record that contains all the ZooKeepers and pass that single DNS name to the client?
For example, set up the DNS record zookeepers.mydomain.com as a CNAME that points to zk-1.mydomain.com, zk-2.mydomain.com and zk-3.mydomain.com, and then simply pass this to my clients:
-Dsolr.zookeeperHosts='zookeepers.mydomain.com:2181'
This way, when I add new ZooKeepers to the cluster I could simply add another CNAME record to zookeepers.mydomain.com and not need to worry about updating the configs on all the clients.
Is the Solr client smart enough to make use of a DNS name that resolves to multiple records? Specifically, if one ZooKeeper happens to be down and the client tries to connect to it, will the client know enough to query DNS again to get the IP of the next ZooKeeper and attempt to communicate with it?
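To make the idea concrete, here is a minimal sketch of what "using all the records" could look like if it were done by hand at client startup: resolve the single name to every address behind it and build the same comma-separated host list we pass today. The class `ZkHostExpander` and its methods are hypothetical names of my own, not part of Solr or ZooKeeper, and it only considers IPv4 addresses. Whether the Solr client does anything equivalent internally is exactly what I don't know.

```java
import java.net.Inet4Address;
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Solr or ZooKeeper.
class ZkHostExpander {

    // Return every IPv4 address the resolver reports for the given name.
    // The JVM caches lookups (see the networkaddress.cache.ttl security
    // property), so repeated calls may return a cached answer.
    static List<String> resolveAll(String hostname) throws UnknownHostException {
        List<String> addresses = new ArrayList<String>();
        for (InetAddress address : InetAddress.getAllByName(hostname)) {
            if (address instanceof Inet4Address) {
                addresses.add(address.getHostAddress());
            }
        }
        return addresses;
    }

    // Join hosts into the "host:port,host:port" form used on the command line.
    static String toConnectString(List<String> hosts, int port) {
        StringBuilder sb = new StringBuilder();
        for (String host : hosts) {
            if (sb.length() > 0) {
                sb.append(',');
            }
            sb.append(host).append(':').append(port);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws UnknownHostException {
        // Pass e.g. zookeepers.mydomain.com; defaults to localhost.
        String name = args.length > 0 ? args[0] : "localhost";
        System.out.println(toConnectString(resolveAll(name), 2181));
    }
}
```

Running it against zookeepers.mydomain.com would at least show which addresses a client sees for the round-robin name at that moment. It does not re-resolve after startup, so a ZooKeeper that comes back with a new IP would still need a client restart with this approach.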