Kubernetes - Container image already present on machine

So I have 2 similar deployments on k8s that pull the same image from GitLab. Apparently this caused my second deployment to go into CrashLoopBackOff, and I can't seem to connect to the port to check the /healthz endpoint of my pod. Logging the pod shows that it received an interrupt signal, while describing the pod shows the following events:

 FirstSeen  LastSeen    Count   From            SubObjectPath                   Type        Reason          Message
  --------- --------    -----   ----            -------------                   --------    ------          -------
  29m       29m     1   default-scheduler                           Normal      Scheduled       Successfully assigned java-kafka-rest-kafka-data-2-development-5c6f7f597-5t2mr to 172.18.14.110
  29m       29m     1   kubelet, 172.18.14.110                          Normal      SuccessfulMountVolume   MountVolume.SetUp succeeded for volume "default-token-m4m55" 
  29m       29m     1   kubelet, 172.18.14.110  spec.containers{consul}             Normal      Pulled          Container image "..../consul-image:0.0.10" already present on machine
  29m       29m     1   kubelet, 172.18.14.110  spec.containers{consul}             Normal      Created         Created container
  29m       29m     1   kubelet, 172.18.14.110  spec.containers{consul}             Normal      Started         Started container
  28m       28m     1   kubelet, 172.18.14.110  spec.containers{java-kafka-rest-development}    Normal      Killing         Killing container with id docker://java-kafka-rest-development:Container failed liveness probe.. Container will be killed and recreated.
  29m       28m     2   kubelet, 172.18.14.110  spec.containers{java-kafka-rest-development}    Normal      Created         Created container
  29m       28m     2   kubelet, 172.18.14.110  spec.containers{java-kafka-rest-development}    Normal      Started         Started container
  29m       27m     10  kubelet, 172.18.14.110  spec.containers{java-kafka-rest-development}    Warning     Unhealthy       Readiness probe failed: Get http://10.5.59.35:7533/healthz: dial tcp 10.5.59.35:7533: getsockopt: connection refused
  28m       24m     13  kubelet, 172.18.14.110  spec.containers{java-kafka-rest-development}    Warning     Unhealthy       Liveness probe failed: Get http://10.5.59.35:7533/healthz: dial tcp 10.5.59.35:7533: getsockopt: connection refused
  29m       19m     8   kubelet, 172.18.14.110  spec.containers{java-kafka-rest-development}    Normal      Pulled          Container image "r..../java-kafka-rest:0.3.2-dev" already present on machine
  24m       4m      73  kubelet, 172.18.14.110  spec.containers{java-kafka-rest-development}    Warning     BackOff         Back-off restarting failed container
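(For context, the events above come from describing the pod; the usual invocation, using the pod name from the Scheduled event, would be:)

kubectl describe pod java-kafka-rest-kafka-data-2-development-5c6f7f597-5t2mr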

I have tried redeploying the deployments with different images and everything works just fine. However, I don't think that is efficient, since the images are identical anyway. How do I go about this?

Here's what my deployment file looks like:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: "java-kafka-rest-kafka-data-2-development"
  labels:
    repository: "java-kafka-rest"
    project: "java-kafka-rest"
    service: "java-kafka-rest-kafka-data-2"
    env: "development"
spec:
  replicas: 1
  selector:
    matchLabels:
      repository: "java-kafka-rest"
      project: "java-kafka-rest"
      service: "java-kafka-rest-kafka-data-2"
      env: "development"
  template:
    metadata:
      labels:
        repository: "java-kafka-rest"
        project: "java-kafka-rest"
        service: "java-kafka-rest-kafka-data-2"
        env: "development"
        release: "0.3.2-dev"
    spec:
      imagePullSecrets:
      - name: ...
      containers:
      - name: java-kafka-rest-development
        image: registry...../java-kafka-rest:0.3.2-dev
        env:
        - name: DEPLOYMENT_COMMIT_HASH
          value: "0.3.2-dev"
        - name: DEPLOYMENT_PORT
          value: "7533"
        livenessProbe:
          httpGet:
            path: /healthz
            port: 7533
          initialDelaySeconds: 30
          timeoutSeconds: 1
        readinessProbe:
          httpGet:
            path: /healthz
            port: 7533
          timeoutSeconds: 1
        ports:
        - containerPort: 7533
        resources:
          requests:
            cpu: 0.5
            memory: 6Gi
          limits:
            cpu: 3
            memory: 10Gi
        command:
          - /envconsul
          - -consul=127.0.0.1:8500
          - -sanitize
          - -upcase
          - -prefix=java-kafka-rest/
          - -prefix=java-kafka-rest/kafka-data-2
          - java
          - -jar
          - /build/libs/java-kafka-rest-0.3.2-dev.jar
        securityContext:
          readOnlyRootFilesystem: true
      - name: consul
        image: registry.../consul-image:0.0.10
        env:
        - name: SERVICE_NAME
          value: java-kafka-rest-kafka-data-2
        - name: SERVICE_ENVIRONMENT
          value: development
        - name: SERVICE_PORT
          value: "7533"
        - name: CONSUL1
          valueFrom:
            configMapKeyRef:
              name: consul-config-...
              key: node1
        - name: CONSUL2
          valueFrom:
            configMapKeyRef:
              name: consul-config-...
              key: node2
        - name: CONSUL3
          valueFrom:
            configMapKeyRef:
              name: consul-config-...
              key: node3
        - name: CONSUL_ENCRYPT
          valueFrom:
            configMapKeyRef:
              name: consul-config-...
              key: encrypt
        ports:
        - containerPort: 8300
        - containerPort: 8301
        - containerPort: 8302
        - containerPort: 8400
        - containerPort: 8500
        - containerPort: 8600
        command: [ entrypoint, agent, -config-dir=/config, -join=$(CONSUL1), -join=$(CONSUL2), -join=$(CONSUL3), -encrypt=$(CONSUL_ENCRYPT) ]
      terminationGracePeriodSeconds: 30
      nodeSelector:
        env: ...
Reisinger answered 29/11, 2018 at 9:21 Comment(8)
Could be that your readinessProbe is killing your container. Is this a Kafka broker image or ...? – Mantoman
@Urosh T. Yes, that's what I assume as well. It is indeed a Kafka image used to produce Kafka messages. However, I'm confused as to what causes the readinessProbe to trigger that way; to my understanding, the image that's pulled from GitLab should be put on the k8s pod regardless of what image other pods pull. – Reisinger
Yes, but the readinessProbe is defined in your k8s deployment file, so you might need to increase the values (if Kafka needs a lot of time to start) or even remove the probe to see if that is what is killing your pod. – Mantoman
Actually, Kafka doesn't even have any health check endpoints as far as I know. Have you implemented any custom health checks or ...? – Mantoman
@UroshT. I have indeed implemented a custom health check, which I've pasted on pastebin, and I've added my deployment file for clarity. However, even if the readinessProbe is indeed the cause of this, why would it affect my deployment when both pull the same image but not when they pull individual images? – Reisinger
I am actually wrong, it is your livenessProbe that killed your pod; it says so in the logs: Killing container with id docker://java-kafka-rest-development:Container failed liveness probe.. Container will be killed and recreated.. So what you are trying to say is that when you pull the image explicitly, you have no issues, but when the image is not pulled (the same image), the issue occurs? – Mantoman
@UroshT. Yes, if one deployment pulls the image 0.3.2-dev while the other pulls 0.3.2.1-dev, everything works fine, but if they're both pulling the same image, the issue occurs. It seems to me the problem stems from the Pulled Container image "..../java-kafka-rest:0.3.2-dev" already present on machine event, which somehow triggers the readinessProbe/livenessProbe to fail. – Reisinger
That would be a pretty weird (but maybe possible) bug. Have you tried increasing the timeoutSeconds value for the livenessProbe? 1 second might just not cut it sometimes (and it happens to be when the images are already present)... – Mantoman
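For anyone following the suggestion above, a more forgiving probe configuration for the java-kafka-rest-development container might look roughly like this (the numbers are illustrative assumptions, not values from the original deployment):

        livenessProbe:
          httpGet:
            path: /healthz
            port: 7533
          initialDelaySeconds: 60   # give the JVM more time to start before the first check (assumed value)
          periodSeconds: 10
          timeoutSeconds: 5         # more forgiving than the original 1 second
        readinessProbe:
          httpGet:
            path: /healthz
            port: 7533
          initialDelaySeconds: 15   # assumed value
          periodSeconds: 10
          timeoutSeconds: 5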

To those having this problem: I've discovered the cause of and solution to my issue. The problem was in my service.yml, where targetPort pointed at a different port than the one exposed by my Docker image. Make sure the port that's opened in the Docker image matches the Service's targetPort.
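
For illustration, a Service that lines up with the deployment above could look like this (the name and selector are assumptions based on the deployment's labels, not the actual service.yml):

apiVersion: v1
kind: Service
metadata:
  name: java-kafka-rest-kafka-data-2-development
spec:
  selector:
    service: "java-kafka-rest-kafka-data-2"
    env: "development"
  ports:
  - protocol: TCP
    port: 7533        # port other pods use to reach the Service
    targetPort: 7533  # must match the containerPort the app actually listens on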

Hope this helps.

Reisinger answered 30/11, 2018 at 3:8 Comment(0)

You can also check the logs of the pod; in my case, the error was visible there:

kubectl logs <pod> -n your-namespace
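
If the container keeps restarting, the logs of the previous (crashed) instance and the pod's events are usually the most telling; the standard kubectl commands for both are:

kubectl logs <pod> -n your-namespace --previous
kubectl describe pod <pod> -n your-namespace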
Silkstocking answered 20/9, 2022 at 11:55 Comment(0)
