Deployment affinity

I have a k8s cluster with 3 nodes.

I would like a sample deployment with 3 replicas, like the following, where each pod gets scheduled on a different node. How can I achieve this?

apiVersion: apps/v1
kind: Deployment
metadata:
  name: tomcat-deployment
  labels:
    app: tomcat
spec:
  replicas: 3
  selector:
    matchLabels:
      app: tomcat
  template:
    metadata:
      labels:
        app: tomcat
    spec:
      containers:
      - name: tomcat
        image: tomcat:9.0
        ports:
        - containerPort: 80
Resupinate answered 9/6, 2020 at 21:52 Comment(1)
Check on DaemonSets. – Myna

You can use podAntiAffinity to make sure that pods of the same deployment never run on the same node (depending on the topology key). See the Assigning Pods to Nodes documentation.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: tomcat-deployment
  labels:
    app: tomcat
spec:
  replicas: 3
  selector:
    matchLabels:
      app: tomcat
  template:
    metadata:
      labels:
        app: tomcat
    spec:
      containers:
      - name: tomcat
        image: tomcat:9.0
        ports:
        - containerPort: 80
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: "app"
                operator: In
                values:
                - tomcat
            topologyKey: "kubernetes.io/hostname"
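
After applying this manifest, a quick way to verify the spread is to list the pods along with the nodes they landed on; the NODE column should show three different nodes:

kubectl get pods -l app=tomcat -o wide
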
Caudillo answered 10/6, 2020 at 2:37 Comment(0)

The Kubernetes scheduler will by default try to schedule deployment replicas on different nodes if possible (as long as a node satisfies the memory/CPU requirements).

If it doesn't, two (or more) pod replicas can end up scheduled on one node, and you can use several techniques to prevent this.

One of these techniques is called inter-pod anti-affinity. In the k8s documentation you can read:

Inter-pod affinity and anti-affinity allow you to constrain which nodes your pod is eligible to be scheduled on based on labels on pods that are already running on the node rather than based on labels on nodes. The rules are of the form "this pod should (or, in the case of anti-affinity, should not) run in an X if that X is already running one or more pods that meet rule Y"

With required pod anti-affinity you need to be aware that if a pod for some reason cannot be scheduled on any node (lack of resources or a tainted node), it will end up in Pending state.
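
If getting stuck in Pending is a concern, there is a soft variant that prefers spreading but still allows co-location when no other node fits. A minimal sketch of the relevant fragment, assuming the same app: tomcat labels as in the answer above:

      affinity:
        podAntiAffinity:
          # "preferred" is a soft rule: the scheduler tries to place the pods
          # on different nodes, but will still co-locate them if it has to
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchLabels:
                  app: tomcat
              topologyKey: "kubernetes.io/hostname"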

You should also remember that when running a 3-node cluster (1 master + 2 workers) it is common to have a NoSchedule taint on the master node (which is typical for clusters created with e.g. kubeadm) that disallows scheduling pods on the master node.

If this applies to you and you still want to schedule pods on the master node, you need to either delete the NoSchedule taint:

kubectl taint nodes $(hostname) node-role.kubernetes.io/master:NoSchedule-
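
You can check which taints a node carries before (and after) removing them; <node-name> here is a placeholder:

kubectl describe node <node-name> | grep -i taint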

Or use a toleration:

apiVersion: apps/v1
kind: Deployment
spec:
  template:
    spec:
      tolerations:
      - key: "node-role.kubernetes.io/master"
        operator: "Exists"
        effect: "NoSchedule"
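
Note that recent kubeadm clusters (roughly v1.24 and later) taint control-plane nodes with node-role.kubernetes.io/control-plane rather than node-role.kubernetes.io/master, so the key in the taint removal command and the toleration above may need adjusting, e.g.:

kubectl taint nodes <node-name> node-role.kubernetes.io/control-plane:NoSchedule-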

In the comments @Myna mentioned DaemonSets, which can be used in some cases, but when scaling your cluster your application will scale with it, and that may not be desired.
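
For reference, a minimal DaemonSet sketch (the name tomcat-daemonset is just an example) that runs exactly one pod on every eligible node, so the number of pods follows the number of nodes:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: tomcat-daemonset
  labels:
    app: tomcat
spec:
  selector:
    matchLabels:
      app: tomcat
  template:
    metadata:
      labels:
        app: tomcat
    spec:
      containers:
      - name: tomcat
        image: tomcat:9.0
        ports:
        - containerPort: 80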

Tsarevitch answered 10/6, 2020 at 10:56 Comment(5)
Thanks a lot for your answer; so actually what I need is to define this pod anti-affinity inside the deployment so that each replica gets scheduled to a different node? – Resupinate
Is the only way via a DaemonSet? – Resupinate
Yes, you probably want to use pod anti-affinity. – Tsarevitch
RE "Kubernetes scheduler will by default try to schedule deployment replicas on different nodes": I was trying to find the documentation for what you said, but I cannot find it. Can you provide a link to the documentation? I ask because this is exactly what I want/need, but I need to verify that Kubernetes is actually doing this on purpose (so I don't end up with 10 instances accidentally running all on one node). – Shearwater
@Shearwater I wish load balancing of pods between all available nodes happened periodically... but it does not, and sooner or later, sometimes after a single downtime, all pods can end up running on one node (even if there are hundreds of them). – Hydracid
