Allocate or limit resources for pods in Kubernetes?

The resource limits of the Pod have been set as:

resources:
  limits:
    cpu: 500m
    memory: 5Gi

and there is 10G of memory left on the node.

I've created 5 pods in a short time successfully, and the node may still have some memory left, e.g. 8G.

Memory usage grows as time goes on and eventually reaches the limits (5G x 5 = 25G > 10G), at which point the node becomes unresponsive.

In order to ensure usability, is there a way to set a resource limit on the node itself?

Update

The core problem is that a pod's memory usage does not always equal its limit, especially right after it starts. So an unbounded number of pods can be created in a short time, which eventually drives all nodes to full load. That's not good. There should be a way to allocate (reserve) resources rather than just setting a limit.

Update 2

I've tested again with both limits and requests:

resources:
  limits:
    cpu: 500m
    memory: 5Gi
  requests:
    cpu: 500m
    memory: 5Gi

The node's total memory is 15G with 14G free, yet 3 pods are scheduled and running successfully:

> free -mh
              total        used        free      shared  buff/cache   available
Mem:            15G        1.1G        8.3G        3.4M        6.2G         14G
Swap:            0B          0B          0B

> docker stats

CONTAINER           CPU %               MEM USAGE / LIMIT     MEM %               NET I/O             BLOCK I/O
44eaa3e2d68c        0.63%               1.939 GB / 5.369 GB   36.11%              0 B / 0 B           47.84 MB / 0 B
87099000037c        0.58%               2.187 GB / 5.369 GB   40.74%              0 B / 0 B           48.01 MB / 0 B
d5954ab37642        0.58%               1.936 GB / 5.369 GB   36.07%              0 B / 0 B           47.81 MB / 0 B

It seems that the node will crash soon XD

Update 3

Now I change the resources, setting the request to 8G and the limit to 5G:

resources:
  limits:
    cpu: 500m
    memory: 5Gi
  requests:
    cpu: 500m
    memory: 8Gi

The results are: [screenshot]

According to the k8s source code about the resource check:

[screenshot of the scheduler's resource check in the k8s source code]

The total memory is only 15G, and the pods together request 24G, so all the pods may be killed. (A single one of my containers usually uses more than 16G if it is not limited.)

It means that you'd better keep the requests exactly equal to the limits in order to avoid pods being killed or the node crashing. When the requests value is not specified, it defaults to the limit, so what exactly are requests used for? I think limits alone are entirely enough, or, IMO, contrary to what K8s claims, I would rather set the resource request greater than the limit in order to keep the nodes usable.

Update 4

Kubernetes 1.1 schedules pods based on their memory requests via the formula:

(capacity - memoryRequested) >= podRequest.memory
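
To make this concrete, here is a rough worked check using the numbers from Update 2 (a sketch only; the scheduler works on the node's reported capacity, not on the output of free):

capacity                     = 15Gi   (node total)
memoryRequested (2 pods)     = 10Gi
podRequest.memory            = 5Gi
(15Gi - 10Gi) >= 5Gi         -> true, so the third 5Gi pod is admitted
actual usage (1.1G used in free -mh) never appears in the check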

It seems that Kubernetes does not take actual memory usage into account, as Vishnu Kannan said. So the node will crash if other apps use a lot of the memory.

Fortunately, since commit e64fe822, the formula has been changed to:

(allocatable - memoryRequested) >= podRequest.memory

Waiting for k8s v1.2!

Anton answered 25/2, 2016 at 6:51 Comment(0)

Kubernetes resource specifications have two fields, requests and limits.

limits place a cap on how much of a resource a container can use. For memory, if a container goes above its limit, it will be OOM killed. For CPU, its usage may be throttled.

requests are different in that they ensure the node that the pod is put on has at least that much capacity available for it. If you want to make sure that your pods will be able to grow to a particular size without the node running out of resources, specify a request of that size. This will limit how many pods you can schedule, though -- a 10G node will only be able to fit 2 pods with a 5G memory request.
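
For reference, here is a minimal pod spec following this advice (a sketch; the pod name, container name and image are placeholders, not from the question). It reserves the full 5Gi the container may grow to, so the scheduler will never pack more such pods onto a node than the node can back with memory:

apiVersion: v1
kind: Pod
metadata:
  name: mem-reserved-demo          # placeholder name
spec:
  containers:
  - name: app                      # placeholder container and image
    image: nginx
    resources:
      requests:
        cpu: 500m
        memory: 5Gi                # reserved on the node at scheduling time
      limits:
        cpu: 500m
        memory: 5Gi                # hard cap; exceeding it gets the container OOM killed

With this spec, a 10G node fits at most two such pods; any further pods stay Pending instead of overcommitting the node.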

Deberadeberry answered 25/2, 2016 at 17:35 Comment(6)
Thanks! Do you mean 10G is the remaining memory? I've tested again: my total memory is 15G with 14G left, and three 5Gi pods are scheduled and running successfully while other pods are pending. Is it possible that k8s does not count the used memory?Anton
I'd expect three 5GB pods to be scheduled if your total memory is 15GB, and the rest to be left pending, because once the requests of the already scheduled pods are accounted for there wouldn't be enough unreserved memory left. You could schedule more pods that don't request a specific amount of memory, and they'll be able to use the memory that's been reserved but isn't yet being used.Deberadeberry
The k8s source code shows exactly what you said. Does it mean that, if there's only 1G of memory left on a node whose total memory is 15G, it's still possible to create three 5GB pods? If that's true, all nodes will crash like mine; please check the details in "Update 2" of this question. Thanks!Anton
It might be possible to create them, but only if the memory on the node isn't reserved for the pods that are already there. If all the memory on the node gets used up by the pods there, one or more of them may be killed, as Vishnu describes in his answer.Deberadeberry
Sorry to bother you again :) In the situation described in "Update 3", will the three pods be killed soon? Also, could you point me to some documentation links? Resource limitation is critical in my project; it means usability to me. Thank you very much!Anton
Looks like you already found it in update 4 :) Best of luck with your project!Deberadeberry

Kubernetes supports Quality of Service. If your Pods have limits set, they belong to the Guaranteed class and the likelihood of them getting killed due to system memory pressure is extremely low. If the docker daemon or some other daemon you run on the node consumes a lot of memory, that's when there is a possibility for Guaranteed Pods to get killed.
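
For instance (a sketch using the values from this question), a resources stanza where the requests equal the limits, or where requests are omitted and therefore default to the limits, puts the pod in the Guaranteed class:

resources:
  limits:
    cpu: 500m
    memory: 5Gi
  requests:            # may be omitted; requests then default to the limits
    cpu: 500m
    memory: 5Gi

Pods that request less than their limits land in a lower QoS class and are more likely to be evicted first under memory pressure.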

The Kube scheduler does take into account memory capacity and memory allocated while scheduling. For instance, you cannot schedule more than two pods each requesting 5 GB on a 10GB node.

Memory usage is not considered by Kubernetes as of now for the purposes of scheduling.

Egwin answered 25/2, 2016 at 21:24 Comment(1)
Thanks! IMO, the 'available mem' should be computed as totalMem - usedMem - RequestMemByScheduledPod, and if that is greater than the pod's request, the pod should be scheduled, but the k8s source code seems to do it differently.Anton
