My solution for this has been to use the IP as the ID: trim the dots and you get a unique ID that is also available outside of the container to other containers.
With a Service you can get access to the multiple containers's IPs (see my answer here on how to do this:
what's the best way to let kubenetes pods communicate with each other?
so you can get their IDs too if you use IPs as the unique ID.
The only issue is that IDs are not continuous or start at 0, but zookeeper / kafka don't seem to mind.
EDIT 1:
The follow up concerns configuring Zookeeper:
Each ZK node needs to know of the other nodes. The Kubernetes discovery service knowns of nodes that are within a Service so the idea is to start a Service with the ZK nodes.
This Service needs to be started BEFORE creating the ReplicationController (RC) of the Zookeeper pods.
The start-up script of the ZK container will then need to:
- wait for the discovery service to populate the ZK Service with its nodes (that takes a few seconds, for now I just add a sleep 10 at the beginning of my startup script but more reliably you should look for the service to have at least 3 nodes in it.)
- look up the containers forming the Service in the discovery service:
this is done by querying the API.
the
KUBERNETES_SERVICE_HOST
environment variable is available in each container.
The endpoint to find service description is then
URL="http(s)://$USERNAME:$PASSWORD@${KUBERNETES_SERVICE_HOST/api/v1/namespaces/${NAMESPACE}/endpoints/${SERVICE_NAME}"
where NAMESPACE
is default
unless you changed it, and SERVICE_NAME
would be zookeeper if you named your service zookeeper.
there you get the description of the containers forming the Service, with their ip in a "ip" field.
You can do:
curl -s $URL | grep '\"ip\"' | awk '{print $2}' | awk -F\" '{print $2}'
to get the list of IPs in the Service.
With that, populate the zoo.cfg on the node using the ID defined above
You might need the USERNAME and PASSWORD to reach the endpoint on services like google container engine. These need to be put in a Secret volume (see doc here: http://kubernetes.io/v1.0/docs/user-guide/secrets.html )
You would also need to use curl -s --insecure
on Google Container Engine unless you go through the trouble of adding the CA cert to your pods
Basically add the volume to the container, and look up the values from file. (contrary to what the doc says, DO NOT put the \n at the end of the username or password when base64 encoding: it just make your life more complicated when reading those)
EDIT 2:
Another thing you'll need to do on the Kafka nodes is get the IP and hostnames, and put them in the /etc/hosts file.
Kafka seems to need to know the nodes by hostnames, and these are not set within service nodes by default
EDIT 3:
After much trial and thoughts using IP as an ID may not be so great: it depends on how you configure storage.
for any kind of distributed service like zookeeper, kafka, mongo, hdfs, you might want to use the emptyDir type of storage, so it is just on that node (mounting a remote storage kind of defeats the purpose of distributing these services!)
emptyDir will relaod with the data on the same node, so it seems more logical to use the NODE ID (node IP) as the ID, because then a pod that restarts on the same node will have the data.
That avoid potential corruption of the data (if a new node starts writing in the same dir that is not actually empty, who knows what can happen) and also with Kafka, the topics being assigned a broker.id, if the broker id changes, zookeeper does not update the topic broker.id and the topic looks like it is available BUT points to the wrong broker.id and it's a mess.
So far I have yet to find how to get the node IP though, but I think it's possible to lookup in the API by looking up the service pods names and then the node they are deployed on.
EDIT 4
To get the node IP, you can get the pod hostname == name from the endpoints API
/api/v1/namespaces/default/endpoints/
as explained above.
then you can get the node IP from the pod name with
/api/v1/namespaces/default/pods/
PS: this is inspired by the example in the Kubernetes repo (example for rethinkdb here: https://github.com/kubernetes/kubernetes/tree/master/examples/rethinkdb