How to make k8s allocate GPU/NPU devices following a specific rule

I have multiple GPU cards in one machine, and I need to let Kubernetes allocate GPU/NPU devices following some rules I set.

For example, suppose there are 8 GPU cards with IDs 0-7, and only device0, device1, device6 and device7 are available. Now I need to create one pod that requests 2 devices, and these two devices must be either (device0, device1) or (device6, device7). Other combinations such as (device0, device6) are not valid.

Is there any way to do that? I am using Kubernetes 1.18 and have implemented my own device plugin.

Vogul asked 27/5, 2020 at 15:29 Comment(4)
Where will you write those rules? To use specific GPUs, you could set the NVIDIA_VISIBLE_DEVICES env variable instead of using an nvidia.com/gpu resource request for the pod; that's what the NVIDIA k8s device plugin does. – Fondue
From a custom device plugin, a custom device manager, or whatever else lets me achieve it. @Fondue – Vogul
So you would write something like "gpu/rule": "smaller than 4" in the pod spec and have a device plugin parse that rule? You can either set the NVIDIA_VISIBLE_DEVICES env to 0,1,2 directly in the pod spec without using the device plugin, or modify the Allocate function: read the request rule "smaller than 4" and set m.allocateEnvvar to "0,1,2". – Fondue
What do you mean by "modify the Allocate function"? Which component does this Allocate function belong to, the device plugin or the device manager? Could you give more details? – Vogul
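
For context on that last comment: Allocate belongs to the device plugin, not to the kubelet's device manager; it is one of the gRPC methods the kubelet calls on your plugin, and the kubelet passes in the device IDs it has already selected. Below is a minimal, hypothetical sketch of the approach suggested in the comments, assuming the v1beta1 device plugin API (import path k8s.io/kubelet/pkg/apis/deviceplugin/v1beta1); rulePlugin, validPairs and isValidPair are made-up names, the other DevicePluginServer methods (ListAndWatch and so on) are omitted, and the pairing rule is only illustrative.

package ruleplugin

import (
	"context"
	"fmt"
	"sort"
	"strings"

	pluginapi "k8s.io/kubelet/pkg/apis/deviceplugin/v1beta1"
)

// validPairs encodes the illustrative rule from the question:
// a 2-device request may only be served by (0,1) or (6,7).
var validPairs = [][2]string{{"0", "1"}, {"6", "7"}}

// rulePlugin is a hypothetical device plugin; only Allocate is shown.
type rulePlugin struct{}

// Allocate receives the device IDs the kubelet has already picked.
// The plugin can veto an invalid combination and translate a valid one
// into an env var the container runtime understands.
func (p *rulePlugin) Allocate(ctx context.Context, reqs *pluginapi.AllocateRequest) (*pluginapi.AllocateResponse, error) {
	resp := &pluginapi.AllocateResponse{}
	for _, req := range reqs.ContainerRequests {
		ids := append([]string(nil), req.DevicesIDs...)
		sort.Strings(ids)
		if len(ids) == 2 && !isValidPair(ids) {
			// Returning an error only fails this pod's admission; the kubelet
			// will not retry with a different device combination.
			return nil, fmt.Errorf("device combination %v violates the pairing rule", ids)
		}
		resp.ContainerResponses = append(resp.ContainerResponses, &pluginapi.ContainerAllocateResponse{
			Envs: map[string]string{
				"NVIDIA_VISIBLE_DEVICES": strings.Join(ids, ","),
			},
		})
	}
	return resp, nil
}

// isValidPair checks a sorted pair of device IDs against validPairs.
func isValidPair(ids []string) bool {
	for _, p := range validPairs {
		if ids[0] == p[0] && ids[1] == p[1] {
			return true
		}
	}
	return false
}

Note that on 1.18 the kubelet chooses the devices before calling Allocate, so a plugin can only reject or remap the choice, not steer it; and, as the first comment says, you can also skip the plugin entirely and set NVIDIA_VISIBLE_DEVICES in the pod spec env, provided the node's NVIDIA container runtime honours that variable.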

I don't understand why you would write a rule like this:

every device-id be smaller than 4

If you want to limit the number of GPUs, you should use limits and requests, which is nicely explained in Schedule GPUs. So you can limit the resource to 4 GPUs like so:

apiVersion: v1
kind: Pod
metadata:
  name: cuda-vector-add
spec:
  restartPolicy: OnFailure
  containers:
    - name: cuda-vector-add
      # https://github.com/kubernetes/kubernetes/blob/v1.7.11/test/images/nvidia-cuda/Dockerfile
      image: "k8s.gcr.io/cuda-vector-add:v0.1"
      resources:
        limits:
          nvidia.com/gpu: 4 # requesting 4 GPUs

If you have different types of GPUs on different nodes, you can use node labels, which you can read about in Clusters containing different types of GPUs.

# Label your nodes with the accelerator type they have.
kubectl label nodes <node-with-k80> accelerator=nvidia-tesla-k80
kubectl label nodes <node-with-p100> accelerator=nvidia-tesla-p100

If your nodes are running different versions of GPUs, then use Node Labels and Node Selectors to schedule pods to the appropriate GPUs. The following is an illustration of this workflow:

As part of your Node bootstrapping, identify the GPU hardware type on your nodes and expose it as a node label.

NVIDIA_GPU_NAME=$(nvidia-smi --query-gpu=gpu_name --format=csv,noheader --id=0)
source /etc/default/kubelet
KUBELET_OPTS="$KUBELET_OPTS --node-labels='alpha.kubernetes.io/nvidia-gpu-name=$NVIDIA_GPU_NAME'"
echo "KUBELET_OPTS=$KUBELET_OPTS" > /etc/default/kubelet

Specify the GPU types a pod can use via Node Affinity rules.

kind: Pod
apiVersion: v1
metadata:
  annotations:
    scheduler.alpha.kubernetes.io/affinity: >
      {
        "nodeAffinity": {
          "requiredDuringSchedulingIgnoredDuringExecution": {
            "nodeSelectorTerms": [
              {
                "matchExpressions": [
                  {
                    "key": "alpha.kubernetes.io/nvidia-gpu-name",
                    "operator": "In",
                    "values": ["Tesla K80", "Tesla P100"]
                  }
                ]
              }
            ]
          }
        }
      }
spec:
  containers:
    - name: gpu-container-1
      resources:
        limits:
          alpha.kubernetes.io/nvidia-gpu: 2

This will ensure that the pod will be scheduled to a node that has a Tesla K80 or a Tesla P100 Nvidia GPU.

You can find other relevant information on the unofficial-kubernetes Scheduling GPUs page.

Mlle answered 28/5, 2020 at 13:40 Comment(5)
The nodeSelector is for selecting a node whose devices are all of the same type; it doesn't fit my requirement of picking specific devices of different versions within the same machine. – Vogul
E.g. I have 8 GPU cards in one machine and they are of different versions. For certain reasons, I need to pick some of them by a selection rule before mounting them into the container. – Vogul
nodeAffinity with Node Labels and Node Selectors is for different versions. Please read the docs. – Mlle
Different versions between different nodes, or between different GPU devices of one node? The latter case is what I need here. – Vogul
Unfortunately it's the first case, from my understanding. But you can test it yourself and post an answer that will be helpful to the community. – Mlle
