Persistent Storage in EKS failing to provision volume
I followed the steps from the AWS knowledge base article to create persistent storage: Use persistent storage in Amazon EKS

Unfortunately, the PersistentVolume (PV) wasn't created:

kubectl get pv
No resources found

When I checked the PVC events, I saw the following provisioning failure messages:

storageclass.storage.k8s.io "ebs-sc" not found

failed to provision volume with StorageClass "ebs-sc": rpc error: code = DeadlineExceeded desc = context deadline exceeded
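
These events show up when you describe the claim, for example (assuming the claim is named ebs-claim, as in the AWS walkthrough):

kubectl describe pvc ebs-claim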

I'm using Kubernetes v1.21.2-eks-0389ca3


Update:

The storageclass.yaml used in the example has provisioner set to ebs.csi.aws.com:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: ebs-sc
provisioner: ebs.csi.aws.com
volumeBindingMode: WaitForFirstConsumer

When I updated it per @gohm'c's answer, it created a PV:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-sc
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
reclaimPolicy: Retain
volumeBindingMode: WaitForFirstConsumer
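
For completeness, a minimal claim that consumes this class could look like the following; the claim name and size here are just placeholders borrowed from the AWS example:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ebs-claim
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ebs-sc
  resources:
    requests:
      storage: 4Gi
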
Ref answered 22/9, 2021 at 20:19 Comment(1)
DeadlineExceeded means it failed to complete the task. Can you please check whether all the nodes in the cluster are in the same zone? – Methylal
storageclass.storage.k8s.io "ebs-sc" not found

failed to provision volume with StorageClass "ebs-sc"

You need to create the StorageClass "ebs-sc" after the EBS CSI driver is installed, for example:

cat << EOF | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-sc
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
reclaimPolicy: Retain
volumeBindingMode: WaitForFirstConsumer
EOF

See here for more options.
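
Note that with volumeBindingMode: WaitForFirstConsumer the PV is only provisioned once a pod that uses the claim gets scheduled, so kubectl get pv will stay empty until then. A quick sanity check that the class exists:

kubectl get storageclass ebs-sc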

Ossieossietzky answered 23/9, 2021 at 1:50 Comment(1)
This is it. The example from AWS has provisioner ebs.csi.aws.com. After updating it to kubernetes.io/aws-ebs, it works. Thanks! – Ref

Check whether the aws-ebs-csi-driver is running:

kubectl get deployment -n kube-system
NAME                 READY   UP-TO-DATE   AVAILABLE   AGE
coredns              2/2     2            2           14d
ebs-csi-controller   2/2     2            2           53m

If not, add the add-on to your EKS cluster: https://docs.aws.amazon.com/eks/latest/userguide/managing-ebs-csi.html
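
For example, a sketch of installing the managed add-on with the AWS CLI (the cluster name is a placeholder):

aws eks create-addon \
  --cluster-name my-cluster \
  --addon-name aws-ebs-csi-driver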

Don't forget to add the AmazonEBSCSIDriverPolicy to your node role.
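
For example, attaching the managed policy with the AWS CLI (the role name is a placeholder; use your actual node role):

aws iam attach-role-policy \
  --role-name my-eks-node-role \
  --policy-arn arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy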

And don't forget to thank me :D

Trejo answered 1/9, 2022 at 20:35 Comment(2)
AmazonEBSCSIDriverPolicy is what was missing for me, many thanks!! Side note: I installed the add-on via the AWS console; when it asks for the role, there is a comment that if we don't choose one, it will take the node role... they could have mentioned the need for the policy on that screen, right? – Phenazine
The role was the key in my case. – Hartsfield

Your question has already been asked several times and has remained unanswered.

E.g. here: SweetOps #kubernetes for March, 2020

Or here (login to the AWS console required): AWS Developer Forums: PVC are in Pending state that are ...

The relevant source code is here:

    opComplete := util.OperationCompleteHook(plugin.GetPluginName(), "volume_provision")
    volume, err = provisioner.Provision(selectedNode, allowedTopologies)
    opComplete(volumetypes.CompleteFuncParam{Err: &err})
    if err != nil {
        // Other places of failure have nothing to do with VolumeScheduling,
        // so just let controller retry in the next sync. We'll only call func
        // rescheduleProvisioning here when the underlying provisioning actually failed.
        ctrl.rescheduleProvisioning(claim)

        strerr := fmt.Sprintf("Failed to provision volume with StorageClass %q: %v", storageClass.Name, err)
        klog.V(2).Infof("failed to provision volume for claim %q with StorageClass %q: %v", claimToClaimKey(claim), storageClass.Name, err)
        ctrl.eventRecorder.Event(claim, v1.EventTypeWarning, events.ProvisioningFailed, strerr)
        return pluginName, err
    }

But there is a solution in another repo, kubernetes-sigs/aws-ebs-csi-driver:

the issue was resolved after fixing a misconfigured CNI setup, which prevented inter-node communication, and thus provisioning of storage never got triggered.

We have not tried upgrading our current working cluster (v1.15.x) to any newer versions, but we can confirm that mounting volumes and provisioning storage works on v1.17.x when starting from scratch (aka. building a new test-cluster in our case).

we are using the specs provided above by @gini-schorsch - but since opening this issue we also moved to the external AWS cloud-controller-manager (a.k.a. aws-cloud-controller-manager)

we have been using the provided IAM profiles for both components (CSI and CCM) and cut them down to the use-cases we require for our operations and did not see any problems with that so far.

So, check your connectivity. And maybe @muni-kumar-gundu is right; you may also want to check the AZs of your nodes, as sketched below.
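
For example, one way to see which availability zone each node is in (assuming the standard topology labels, which EKS sets):

kubectl get nodes -L topology.kubernetes.io/zone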

Galway answered 22/9, 2021 at 21:10 Comment(0)
