Kubernetes pod: resolve external Kafka hostnames via CoreDNS, not hostAliases inside the pod

I have a Spring Boot app where we specify the property below in application.properties. Kafka is installed on a remote machine (outside the Kubernetes cluster) with a self-signed certificate.

camel.component.kafka.configuration.brokers=kafka-worker1.abc.com:9092,kafka-worker2.abc.com:9092,kafka-worker3.abc.com:9092

At startup the application tries to look up the Kafka brokers. If I add hostAliases to the Deployment, it works fine, like below:

  hostAliases:
  - ip: 10.76.XX.XX
    hostnames:
    - kafka-worker1.abc.com
  - ip: 10.76.XX.XX
    hostnames:
    - kafka-worker2.abc.com
  - ip: 10.76.XX.XX
    hostnames:
    - kafka-worker3.abc.com

This works, but I don't want it: hostAliases is not good practice, since we may need to restart the pod whenever an IP changes. We want hostname resolution to happen in CoreDNS, i.e. to resolve without adding IPs to the pod's hosts file.

How can I achieve this? I followed "Cannot connect to external database from inside kubernetes pod" and created a Service endpoint like below (the same for kafka-worker2 & kafka-worker3 with their respective IPs):

    kind: Service
    apiVersion: v1
    metadata:
     name: kafka-worker1
    spec:
     clusterIP: None
     ports:
     - port: 9092
       targetPort: 9092
     externalIPs:
       - 10.76.XX.XX

and added this in property file

camel.component.kafka.configuration.brokers=kafka-worker1.default:9092,kafka-worker2.default:9092,kafka-worker3.default:9092

I'm still getting the same WARN:

2020-05-13T11:57:12.004+0000 Etc/UTC docker-desktop WARN  [main] org.apache.kafka.clients.ClientUtils(:74) - Couldn't resolve server hal18-coworker2.default:9092 from bootstrap.servers as DNS resolution failed for kafka-worker1.default
2020-05-13T11:57:12.318+0000 Etc/UTC docker-desktop WARN  [main] org.apache.kafka.clients.ClientUtils(:74) - Couldn't resolve server hal18-coworker1.default:9092 from bootstrap.servers as DNS resolution failed for kafka-worker2.default
2020-05-13T11:57:12.567+0000 Etc/UTC docker-desktop WARN  [main] org.apache.kafka.clients.ClientUtils(:74) - Couldn't resolve server hal18-coworker3.default:9092 from bootstrap.servers as DNS resolution failed for kafka-worker3.default

Update

Used "Services without selectors" as below still getting same error

2020-05-18T14:47:10.865+0000 Etc/UTC docker-desktop WARN  [Camel (SMP-Proactive-Camel) thread #1 - KafkaConsumer[recommendations-topic]] org.apache.kafka.clients.NetworkClient(:750) - [Consumer clientId=consumer-hal-tr69-streaming-1, groupId=hal-tr69-streaming] Connection to node -1 (kafka-worker.default.svc.cluster.local/10.100.153.152:9092) could not be established. Broker may not be available.
2020-05-18T14:47:12.271+0000 Etc/UTC docker-desktop WARN  [Camel (SMP-Proactive-Camel) thread #1 - KafkaConsumer[recommendations-topic]] org.apache.kafka.clients.NetworkClient(:750) - [Consumer clientId=consumer-hal-tr69-streaming-1, groupId=hal-tr69-streaming] Connection to node -1 (kafka-worker.default.svc.cluster.local/10.100.153.152:9092) could not be established. Broker may not be available.
2020-05-18T14:47:14.191+0000 Etc/UTC docker-desktop WARN  [Camel (SMP-Proactive-Camel) thread #1 - KafkaConsumer[recommendations-topic]] org.apache.kafka.clients.NetworkClient(:750) - [Consumer clientId=consumer-hal-tr69-streaming-1, groupId=hal-tr69-streaming] Connection to node -1 (kafka-worker.default.svc.cluster.local/10.100.153.152:9092) could not be established. Broker may not be available.

Service & Endpoints YAML:

apiVersion: v1
kind: Service
metadata:
 name: kafka-worker
spec:
 type: ClusterIP
 ports:
 - port: 9092
   targetPort: 9092
---
apiVersion: v1
kind: Endpoints
metadata:
 name: kafka-worker
subsets:
 - addresses:
   - ip: 10.76.XX.XX # kafka worker 1
   - ip: 10.76.XX.XX # kafka worker 2
   - ip: 10.76.XX.XX # kafka worker 3
   ports:
   - port: 9092
     name: kafka-worker

kubectl.exe get svc,ep
NAME                                         TYPE           CLUSTER-IP       EXTERNAL-IP   PORT(S)                      AGE
service/ingress-nginx-controller             LoadBalancer   10.99.101.185    localhost     80:31247/TCP,443:31340/TCP   11d
service/ingress-nginx-controller-admission   ClusterIP      10.103.212.117   <none>        443/TCP                      11d
service/kafka-worker                         ClusterIP      10.100.153.152   <none>        9092/TCP                     97s
service/kubernetes                           ClusterIP      10.96.0.1        <none>        443/TCP                      17d

NAME                                           ENDPOINTS                                            AGE
endpoints/ingress-nginx-controller             10.1.0.XX:80,10.1.0.XX:443                           11d
endpoints/ingress-nginx-controller-admission   10.1.0.xx:8443                                       11d
endpoints/kafka-worker                         10.76.xx.xx:9092,10.76.xx.xx:9092,10.76.xx.xx:9092   97s
endpoints/kubernetes                           192.168.XX.XX:6443                                   17d
Hermeneutics asked 13/5, 2020 at 12:03. Comments (3):

What about ExternalName? — Esquimau
If the IP changes, how are you planning to update CoreDNS anyway? — Rager
How do I add entries to the CoreDNS resolv.conf like this: 10.76.XX.1 kafka-worker1.abc.com, 10.76.XX.2 kafka-worker2.abc.com? — Hermeneutics

Thank you for the question and showing your effort to solve the problem.

You are right that adding hostAliases is not good practice: if a Kafka host's IP changes, you have to apply the new IP to the Deployment, which triggers a pod reload.

I am not sure how externalIPs fits here as a solution, since:

Traffic that ingresses into the cluster with the external IP (as destination IP), on the Service port, will be routed to one of the Service endpoints. externalIPs are not managed by Kubernetes and are the responsibility of the cluster administrator.

But even if I take for granted for a moment that the externalIP solution works, the way you are accessing your service is still not correct.

DNS resolution is failing because your domain name is wrong: changing camel.component.kafka.configuration.brokers=kafka-worker1.default:9092 to camel.component.kafka.configuration.brokers=kafka-worker1.default.svc.cluster.local:9092 may fix it. Note: if your k8s cluster has a domain other than the default, replace cluster.local with your cluster's domain.
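
For example, the full broker list (assuming the three Services exist in the default namespace and the cluster uses the default cluster.local domain):

camel.component.kafka.configuration.brokers=kafka-worker1.default.svc.cluster.local:9092,kafka-worker2.default.svc.cluster.local:9092,kafka-worker3.default.svc.cluster.local:9092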

Check the DNS debugging REF.
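
A quick way to verify resolution from inside the cluster, as a sketch (assuming the dnsutils pod from that guide; the manifest URL is the one used in the Kubernetes docs):

kubectl apply -f https://k8s.io/examples/admin/dns/dnsutils.yaml
kubectl exec -ti dnsutils -- nslookup kafka-worker.default.svc.cluster.local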

There are two solutions which I can think of:

First, a Service without selectors with manual Endpoints creation:

(Example code below.) The name of the Endpoints object is what attaches it to the Service, so use the same name, kafka-worker, for both the Service and the Endpoints.

apiVersion: v1
kind: Service
metadata:
 name: kafka-worker
spec:
 type: ClusterIP
 ports:
 - port: 9092
   targetPort: 9092
---
apiVersion: v1
kind: Endpoints
metadata:
 name: kafka-worker
subsets:
 - addresses:
   - ip: 10.76.XX.XX # kafka worker 1
   - ip: 10.76.XX.XX # kafka worker 2
   - ip: 10.76.XX.XX # kafka worker 3
   ports:
   - port: 9092
     name: kafka-worker

The way to access this would be camel.component.kafka.configuration.brokers=kafka-worker.default.svc.cluster.local:9092

Notes:
- You can add more information to your Endpoints IPs, like nodeName and hostName; check out this API ref. A minimal sketch follows below.
- An advantage of this approach is that k8s will load-balance across the Kafka workers for you.
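
A minimal sketch of those optional EndpointAddress fields (the values here are placeholders; hostname must be a plain lowercase DNS label, not an FQDN):

subsets:
 - addresses:
   - ip: 10.76.XX.XX # kafka worker 1
     hostname: kafka-worker-1 # optional: DNS label for this address
     nodeName: worker-node-1  # optional: hypothetical node name
   ports:
   - port: 9092
     name: kafka-worker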

Second, ExternalName:

For this approach you need a single domain name defined already; how to do that is out of scope of this answer, but say kafka-worker.abc.com is your domain name. It is then your responsibility to attach all three Kafka worker node IPs to it, for example in round-robin fashion, on your DNS server (an illustration follows below). Note: this kind of load balancing (via DNS) is not always preferred, because the DNS server performs no health checks to tell which nodes are alive and which are dead.
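
For illustration, a hypothetical zone snippet giving kafka-worker.abc.com three A records, which most DNS servers will rotate in round-robin order (IPs are placeholders):

kafka-worker.abc.com.  300  IN  A  10.76.XX.1
kafka-worker.abc.com.  300  IN  A  10.76.XX.2
kafka-worker.abc.com.  300  IN  A  10.76.XX.3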

This approach is not guaranteed to work and may need additional tweaks depending on your system's networking: the node where your coredns/kube-dns runs must itself be able to resolve kafka-worker.abc.com, otherwise when k8s returns the CNAME your application will fail to resolve it!

Here is an example:

apiVersion: v1
kind: Service
metadata:
  name: kafka-worker
spec:
  type: ExternalName
  externalName: kafka-worker.abc.com
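
With that in place you could sanity-check the CNAME from inside the cluster (again a sketch, assuming the dnsutils pod from the DNS debugging guide):

kubectl exec -ti dnsutils -- nslookup kafka-worker.default.svc.cluster.local
# should return a CNAME for kafka-worker.abc.com, which the node's upstream DNS must then resolve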

Update: following the update in your question. Looking at the first error, it seems you created 3 Services, which generates 3 DNS names:

kafka-worker3.default.svc.cluster.local
kafka-worker2.default.svc.cluster.local
kafka-worker1.default.svc.cluster.local

I suggest you check my example code again: you do NOT need to create 3 Services, just one Service attached to one Endpoints object that carries the 3 IPs of your 3 brokers.

For your second error: a hostname is not a domain name; a hostname is typically the name given to the machine (please check the difference). Just for the sake of simplicity, I would suggest using only IPs in the Endpoints object.

Tildie answered 15/5, 2020 at 15:27. Comments (6):

Thanks @Tildie. I tried the first solution ("Headless Service"); DNS now resolves, but I get a "could not be established. Broker may not be available." error. Check the update section in the question. — Hermeneutics
Apologies for writing the wrong name: the first solution is called "Services without selectors", not "Headless Service". I have updated the answer accordingly, and answered your other problems as well. — Tildie
Still getting the same error, Francium; I updated the question with the latest solution. Here is resolv.conf: kubectl exec -ti dnsutils -- cat /etc/resolv.conf → nameserver 10.96.0.10, search default.svc.cluster.local svc.cluster.local cluster.local, options ndots:5 — Hermeneutics
Did you check that Kafka is actually working? For example, can you curl a health endpoint from the node hosting your k8s cluster? Are you sure Kafka is listening on 9092? I don't think your ingress is interfering, but it wouldn't hurt to take it down for a while to debug. Caution: do this in a development lab, never in production! — Tildie
Yes, ping works, and if I add hostAliases to the deployment it works: 2020-05-19T07:07:01.029+0000 Etc/UTC docker-desktop INFO [Camel (SMP-Proactive-Camel) thread #1 - KafkaConsumer[recommendations-topic]] org.apache.kafka.clients.consumer.internals.ConsumerCoordinator(:267) - [Consumer clientId=consumer-hal-tr69-streaming-1, groupId=hal-tr69-streaming] Adding newly assigned partitions: recommendations-topic-2. I also brought down the ingress to check. — Hermeneutics
Check this out: #47678049 — Tildie