Zookeeper: Hostname resolution fails

I am running ZooKeeper in an OpenShift/Kubernetes environment. I have set up ZooKeeper as a StatefulSet in order to reliably persist configuration data.

I configured the three servers in my zoo.cfg by hostname, but on startup, hostname resolution fails. I verified that the hostnames are in fact resolvable using nslookup inside my cluster; see the check below.
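
For reference, a lookup like the following from another pod in the cluster succeeds (the test pod name and the busybox image are illustrative, not part of my setup):

kubectl run dns-test --rm -it --image=busybox --restart=Never -- \
  nslookup zookeeper-2.zookeeper-headless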

zoo.cfg:

clientPort=2181
dataDir=/var/lib/zookeeper/data
dataLogDir=/var/lib/zookeeper/log
tickTime=2000
initLimit=10
syncLimit=2000
maxClientCnxns=60
minSessionTimeout=4000
maxSessionTimeout=40000
autopurge.snapRetainCount=3
autopurge.purgeInterval=0
server.1=zookeeper-0.zookeeper-headless:2888:3888
server.2=zookeeper-1.zookeeper-headless:2888:3888
server.3=zookeeper-2.zookeeper-headless:2888:3888

Relevant parts of my OpenShift / Kubernetes configuration:

  # StatefulSet
  - apiVersion: apps/v1beta1
    kind: StatefulSet
    metadata:
      labels:
        app: zookeeper
      name: zookeeper
    spec:
      serviceName: zookeeper-headless
      replicas: 3
      template:
        metadata:
          labels:
            app: zookeeper
        spec:
          containers:
            - image: 172.30.158.156:5000/os-cloud-platform/zookeeper:latest
              name: zookeeper
              ports:
                - containerPort: 2181
                  protocol: TCP
                  name: client
                - containerPort: 2888
                  protocol: TCP
                  name: server
                - containerPort: 3888
                  protocol: TCP
                  name: leader-election
          dnsPolicy: ClusterFirst
          schedulerName: default-scheduler

  # Service
  - apiVersion: v1
    kind: Service
    metadata:
      labels:
        app: zookeeper
      name: zookeeper
    spec:
      ports:
        - name: client
          port: 2181
          protocol: TCP
          targetPort: 2181
      selector:
        app: zookeeper
      sessionAffinity: None
      type: ClusterIP

  - apiVersion: v1
    kind: Service
    metadata:
      name: zookeeper-headless
      labels:
        app: zookeeper
    spec:
      ports:
        - port: 2888
          name: server
        - port: 3888
          name: leader-election
      clusterIP: None
      selector:
        app: zookeeper
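
For context: because the StatefulSet's serviceName points at the headless Service above, Kubernetes creates one DNS record per pod, and the short names used in zoo.cfg resolve within the same namespace. The fully qualified names (the namespace myproject is an assumption here) look like this:

zookeeper-0.zookeeper-headless.myproject.svc.cluster.local
zookeeper-1.zookeeper-headless.myproject.svc.cluster.local
zookeeper-2.zookeeper-headless.myproject.svc.cluster.local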

Nevertheless, the OpenShift logs show UnknownHostExceptions:

2017-10-06 10:59:18,289 [myid:] - WARN  [main:QuorumPeer$QuorumServer@155] - Failed to resolve address: zookeeper-2.zookeeper-headless
java.net.UnknownHostException: zookeeper-2.zookeeper-headless: No address associated with hostname
    at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
    at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928)
    at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323)
    at java.net.InetAddress.getAllByName0(InetAddress.java:1276)
    at java.net.InetAddress.getAllByName(InetAddress.java:1192)
    at java.net.InetAddress.getAllByName(InetAddress.java:1126)
    at java.net.InetAddress.getByName(InetAddress.java:1076)
    at org.apache.zookeeper.server.quorum.QuorumPeer$QuorumServer.recreateSocketAddresses(QuorumPeer.java:148)
    at org.apache.zookeeper.server.quorum.QuorumPeer$QuorumServer.<init>(QuorumPeer.java:133)
    at org.apache.zookeeper.server.quorum.QuorumPeerConfig.parseProperties(QuorumPeerConfig.java:228)
    at org.apache.zookeeper.server.quorum.QuorumPeerConfig.parse(QuorumPeerConfig.java:140)
    at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:101)
    at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)
...

What could be the cause? I verified that the hostnames (e.g. zookeeper-2.zookeeper-headless) are resolvable from other pods via nslookup.

Weston asked 6/10/2017 at 12:24

Comments (2):
- You might want to look at the official docs on how to run ZooKeeper in Kubernetes: kubernetes.io/docs/tutorials/stateful-application/zookeeper (Cyclostome)
- I used this documentation to set up my ZooKeeper ensemble. (Weston)

I found a working solution for this issue. On startup, ZooKeeper reads the list of servers in the ensemble and looks for its "own" entry, which it then uses to determine which port and interface to listen on. My original configuration was:

server.1=zookeeper-0.zookeeper-headless:2888:3888
server.2=zookeeper-1.zookeeper-headless:2888:3888
server.3=zookeeper-2.zookeeper-headless:2888:3888

Since a server's own hostname resolves to 127.0.0.1 inside its pod, ZooKeeper ends up listening on the loopback interface only and therefore does not accept connections from the other ZooKeeper servers. The fix is to replace the server's own entry with 0.0.0.0; on zookeeper-0, the configuration becomes:

server.1=0.0.0.0:2888:3888
server.2=zookeeper-1.zookeeper-headless:2888:3888
server.3=zookeeper-2.zookeeper-headless:2888:3888
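
The entry to replace differs per pod: on zookeeper-1, for example, it is server.2 that becomes 0.0.0.0:

server.1=zookeeper-0.zookeeper-headless:2888:3888
server.2=0.0.0.0:2888:3888
server.3=zookeeper-2.zookeeper-headless:2888:3888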

To automate this in the cluster, I wrote a bash script that replaces the single "own" entry on container startup.

EDIT: As asked in the comments, here is my ENTRYPOINT script. It writes the myid file and patches the pod's own hostname in zoo.cfg:

#!/bin/bash
set -e

# This script extracts the ordinal from the pod's hostname and derives ZooKeeper's id from it.

# Exact paths may vary according to your setup
MYID_FILE="/var/lib/zookeeper/data/myid"
ZOOCFG_FILE="/conf/zoo.cfg"

# Create the myid file.
# Extract only the digits from the hostname (e.g. "zookeeper-0" -> "0"),
# then add 1: pod ordinals are 0-based, but the server IDs in the
# zoo.cfg above (server.1 ... server.3) are 1-based.
ordinal=$(hostname | tr -dc '0-9')
id=$((ordinal + 1))
echo "${id}" > "${MYID_FILE}"

# Change the pod's own hostname entry to 0.0.0.0;
# otherwise, the own hostname resolves to 127.0.0.1 and ZooKeeper
# listens on loopback only.
# https://mcmap.net/q/327411/-zookeeper-error-cannot-open-channel-to-x-at-election-address
fullHostname="$(hostname).zookeeper-headless"
sed -i -e "s/${fullHostname}/0.0.0.0/g" "${ZOOCFG_FILE}"

echo "Executing: $*"
exec "$@"
Weston answered 16/10/2017 at 15:15

Comments (8):
- Hi Franz, I am having the same issue. How do you change a particular server entry so that it resolves to localhost on its own pod? Also, can you share the bash script? That would be really helpful. Thanks a lot. (Papert)
- Thank you so much for the script. Can you please tell me where to run it (in an initContainer or something)? Sorry, I am really new to Kubernetes. (Papert)
- This script is the ENTRYPOINT of your Docker image. It will exec whatever args you pass in your Kubernetes setup. (Weston)
- I used a ConfigMap to mount a volume and then ran the script from that volume in my container, like so: command: ['/bin/bash', '/var/lib/zk-init.sh']. But I got the error sed: /conf/zoo.cfg: No such file or directory, although the running container does contain /conf/zoo.cfg. Can you tell me what I am doing wrong? Note: /var/lib is the mountPath. (Papert)
- Sorry to bother you again, but I really need to know how you ran the script as the ENTRYPOINT of the Docker image. (Papert)
- You have to add the script to your build context and ADD it within your Dockerfile. There is no need for a volume for that. (Weston)
- @FranzWimmer, just wanted to tell you that this helped in a Kubernetes/Istio context too. ZooKeeper was running fine in Kubernetes, but as soon as I deployed it with the Istio sidecar, the instances couldn't talk to each other anymore. After changing the server config as described, it worked again. Thank you very much. (Aspergillum)
- @FranzWimmer, I cannot figure out how you implemented this with a ConfigMap. Is it run before or after the start-zookeeper command? Can you please share your actual ZooKeeper manifest? No matter what I try, name resolution continues to fail. (Vulgarian)
