
Issue and Root Cause

This post records how I responded when a PVC was accidentally deleted in the production environment, and how I recreated it.

Mistaking the cluster for the dev environment, I ran the following command:

[root@kube ~]# kubectl delete pvc/spinnaker-minio -n spinnaker
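
(In hindsight, a habit of checking the active context before running anything destructive would have caught the dev/production mix-up, e.g.:)

# Show which cluster kubectl is currently pointed at
kubectl config current-context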

Checking with kubectl get volumeattachment showed that the volume was still attached, so the PV itself had not been deleted.
(Once again a reminder that production needs tighter permission management and access control... a minimal RBAC sketch follows the output below.)

[root@kube ~]# kubectl get volumeattachment -n spinnaker
NAME                                                                   ATTACHER           PV                                         NODE                 ATTACHED   AGE
csi-1c55703a3cb26172ebdb32009649025c70e69d304ae7450f7f1578811d4022df   rbd.csi.ceph.com   pvc-03a39b1f-529e-4593-a5c3-a0b2405605b8   labs-kube-infra003   true       160d
csi-1fb754fc8fe8099b1a192cac40d2697d43c95b4330d8f4dd50911791e9c25634   rbd.csi.ceph.com   pvc-de078b0b-3971-4ff4-869b-94286ac8e25b   labs-kube-infra003   true       160d
csi-6ab5b4d38a317ecf49de5e89d18a309bd606f7191169be2c93b0c357352deafa   rbd.csi.ceph.com   pvc-724c6e4a-c31e-4fd8-bb1e-4c31b2115d2b   labs-kube-infra001   true       26h
csi-78aba44159c7b7dcb2ab1ddbf9a29636d0d523f307530a5fac75c8b0dfb2a649   rbd.csi.ceph.com   pvc-18ec78bc-7c2f-4c32-a0cd-17d1ef39a5da   labs-kube-infra003   true       26d
csi-8040726a7148a1b974a3e0d768d1149c93a3f016320cdebf163f27cdfb94ea9a   rbd.csi.ceph.com   pvc-4c17447d-9296-486f-86d0-e3fe97212d5e   labs-kube-infra003   true       154d
csi-a60256901e0359b09a21b191f0dae84cfbf13f9d5b5fb59e50707bf5d6993cc1   rbd.csi.ceph.com   pvc-b58f1a67-ef46-477e-9855-4dbd99a779ee   labs-kube-infra001   true       99d
csi-b97e177375b47e804630cbdd375b5143571dd6439522f1185e0db29e7cf75c1a   rbd.csi.ceph.com   pvc-88ad71d2-76a4-4a84-b551-e6a0528c3b6c   labs-kube-infra003   true       160d
csi-c0526b6e8e5c48a4cd03a33379c1243ba9a19f1f4dc5351e36ec58cf85fa3f61   rbd.csi.ceph.com   pvc-cb233626-914a-49e4-8a9a-2df36299334f   labs-kube-infra001   true       47d
csi-c294ba3a06564c9a61516528bf69196197c3ea50fbda647e23f0bd011cec385c   rbd.csi.ceph.com   pvc-bd62608d-4290-4452-b05e-427e3358e927   labs-kube-infra001   true       154d
csi-c94771c39008c9c5ca2ffd0fc2eded39519d54a84705cdd0fa166b3fbb5da587   rbd.csi.ceph.com   pvc-6e30e1c4-25e5-4028-a750-8d345607a0ea   labs-kube-infra001   true       47d
csi-caeb766845d9b67da72a9ce7722c1371baa0cf4b25bb88bad0106a81167a3c3d   rbd.csi.ceph.com   pvc-ec02fe57-9dc1-423d-9029-dffde5d6cf7c   labs-kube-infra003   true       26h
csi-d6325abe1a5025e5fef9542e7d4ed921cababb4c245fa210673722ea78e252ba   rbd.csi.ceph.com   pvc-abc9480e-4a12-47c4-8d93-0045aa3df8c6   labs-kube-infra002   true       26h
csi-e5dd81617dc89f8ac4f5ee2ad5a06cf8fccbfe19da9f1f8aab4052ac9cb9f844   rbd.csi.ceph.com   pvc-b2f32f76-46fe-4ad0-a7f4-018955b11e71   labs-kube-infra001   true       154d
csi-e95796dd692ddd3da775d8cb5152d315bd0999474262781cf065dcd6eefe60db   rbd.csi.ceph.com   pvc-3685c8f5-aba0-4e0b-99df-ea339842a9d7   labs-kube-infra003   true       32d
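
As an aside on the access-control point above: one simple guard is an RBAC Role for the production namespace that grants read verbs on PVCs but deliberately omits delete. A minimal sketch (role, binding, and user names here are hypothetical):

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pvc-read-only            # hypothetical name
  namespace: spinnaker
rules:
- apiGroups: [""]
  resources: ["persistentvolumeclaims"]
  verbs: ["get", "list", "watch"]  # deliberately no delete
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: pvc-read-only-binding    # hypothetical name
  namespace: spinnaker
subjects:
- kind: User
  name: ops-user                 # hypothetical operator account
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pvc-read-only
  apiGroup: rbac.authorization.k8s.io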

Meanwhile, the PVC itself was stuck in the Terminating state:

[root@kube ~]# kubectl get pvc -n spinnaker
NAME                                         STATUS        VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
halyard-home-spinnaker-spinnaker-halyard-0   Bound         pvc-bd62608d-4290-4452-b05e-427e3358e927   10Gi       RWO            csi-rbd-sc     154d
redis-data-spinnaker-redis-master-0          Bound         pvc-abc9480e-4a12-47c4-8d93-0045aa3df8c6   8Gi        RWO            csi-rbd-sc     154d
redis-data-spinnaker-redis-slave-0           Bound         pvc-b2f32f76-46fe-4ad0-a7f4-018955b11e71   8Gi        RWO            csi-rbd-sc     154d
redis-data-spinnaker-redis-slave-1           Bound         pvc-4c17447d-9296-486f-86d0-e3fe97212d5e   8Gi        RWO            csi-rbd-sc     154d
spinnaker-minio                              Terminating   pvc-3685c8f5-aba0-4e0b-99df-ea339842a9d7   10Gi       RWO            csi-rbd-sc     32d
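
What keeps it in Terminating is the kubernetes.io/pvc-protection finalizer: as long as a running pod still mounts the claim (the describe output further down shows it mounted by spinnaker-minio-76fb7f68c9-hrsf5), Kubernetes holds off the actual deletion. The finalizer can be checked directly, for example:

# Print the finalizers that are holding the PVC in Terminating
kubectl get pvc spinnaker-minio -n spinnaker -o jsonpath='{.metadata.finalizers}'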

Response and Resolution

My first thought was that the PVC needed to be deleted and recreated,
so I removed the stuck PVC by patching its metadata with a command of the following form:

kubectl patch pvc db-pv-claim -p '{"metadata":{"finalizers":null}}'
  • https://github.com/kubernetes/kubernetes/issues/69697
    That is, nulling out the finalizers metadata as above allows a PVC stuck in Terminating to actually be deleted.
  • I then copied over a previously backed-up PVC manifest and tried to restore from it.
    However, as shown below, the restored PVC's status came up as Lost instead of the healthy Bound.

    [root@kube ~]# kubectl get pvc -n spinnaker
    NAME                                         STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
    halyard-home-spinnaker-spinnaker-halyard-0   Bound    pvc-bd62608d-4290-4452-b05e-427e3358e927   10Gi       RWO            csi-rbd-sc     154d
    redis-data-spinnaker-redis-master-0          Bound    pvc-abc9480e-4a12-47c4-8d93-0045aa3df8c6   8Gi        RWO            csi-rbd-sc     154d
    redis-data-spinnaker-redis-slave-0           Bound    pvc-b2f32f76-46fe-4ad0-a7f4-018955b11e71   8Gi        RWO            csi-rbd-sc     154d
    redis-data-spinnaker-redis-slave-1           Bound    pvc-4c17447d-9296-486f-86d0-e3fe97212d5e   8Gi        RWO            csi-rbd-sc     154d
    spinnaker-minio                              Lost     pvc-3685c8f5-aba0-4e0b-99df-ea339842a9d7   0                         csi-rbd-sc     106s

    Investigating showed that the PV this PVC was trying to claim was still bound elsewhere (to the old claim's identity), which is why the PVC ended up in the Lost state:

    [root@kube ~]# kubectl describe pvc/spinnaker-minio -n spinnaker
    Name:          spinnaker-minio
    Namespace:     spinnaker
    StorageClass:  csi-rbd-sc
    Status:        Lost
    Volume:        pvc-3685c8f5-aba0-4e0b-99df-ea339842a9d7
    Labels:        app=minio
                   chart=minio-1.6.3
                   heritage=Helm
                   release=spinnaker
    Annotations:   pv.kubernetes.io/bind-completed: yes
                   pv.kubernetes.io/bound-by-controller: yes
                   volume.beta.kubernetes.io/storage-provisioner: rbd.csi.ceph.com
    Finalizers:    [kubernetes.io/pvc-protection]
    Capacity:      0
    Access Modes:  
    VolumeMode:    Filesystem
    Mounted By:    spinnaker-minio-76fb7f68c9-hrsf5
    Events:
      Type     Reason         Age    From                         Message
      ----     ------         ----   ----                         -------
      Warning  ClaimMisbound  3m50s  persistentvolume-controller  Two claims are bound to the same volume, this one is bound incorrectly

    Inspecting the PV itself showed why: its claimRef carried a UID that was not the newly created PVC's UID, but the old one's.

    [root@kube ~]# kubectl get pv/pvc-3685c8f5-aba0-4e0b-99df-ea339842a9d7 -n spinnaker -o yaml
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      annotations:
        pv.kubernetes.io/provisioned-by: rbd.csi.ceph.com
      creationTimestamp: "2020-05-22T07:53:30Z"
      finalizers:
      - kubernetes.io/pv-protection
      name: pvc-3685c8f5-aba0-4e0b-99df-ea339842a9d7
      resourceVersion: "32467045"
      selfLink: /api/v1/persistentvolumes/pvc-3685c8f5-aba0-4e0b-99df-ea339842a9d7
      uid: 43e88b05-c89f-4eb3-b81a-eaee251c2247
    spec:
      accessModes:
      - ReadWriteOnce
      capacity:
        storage: 10Gi
      claimRef:
        apiVersion: v1
        kind: PersistentVolumeClaim
        name: spinnaker-minio
        namespace: spinnaker
        resourceVersion: "25843213"
        uid: 3685c8f5-aba0-4e0b-99df-ea339842a9d7
      csi:
        driver: rbd.csi.ceph.com
        fsType: ext4
        nodeStageSecretRef:
          name: csi-rbd-secret
          namespace: default
        volumeAttributes:
          clusterID: 5fb1204b-6152-41a8-b4cb-f819e8728a6c
          pool: kubernetes
          storage.kubernetes.io/csiProvisionerIdentity: 1589350154546-8081-rbd.csi.ceph.com
        volumeHandle: 0001-0024-5fb1204b-6152-41a8-b4cb-f819e8728a6c-0000000000000003-495c8d33-9c01-11ea-875b-3acb21213cec
      mountOptions:
      - discard
      persistentVolumeReclaimPolicy: Delete
      storageClassName: csi-rbd-sc
      volumeMode: Filesystem
    status:
      phase: Released
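
    To see the mismatch directly, the two UIDs can be printed side by side with jsonpath (field paths as in the YAML above):

    # UID still recorded in the PV's claimRef (the old, deleted PVC)
    kubectl get pv pvc-3685c8f5-aba0-4e0b-99df-ea339842a9d7 -o jsonpath='{.spec.claimRef.uid}'
    # UID of the newly created PVC (different, hence the ClaimMisbound event)
    kubectl get pvc spinnaker-minio -n spinnaker -o jsonpath='{.metadata.uid}'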
    

    Searching around, the recommendation was to update the PV's claimRef uid to the new PVC's UID, either with kubectl patch or with kubectl edit.
    I changed the claimRef uid via kubectl edit, and after that the PVC bound normally, as shown below (the equivalent patch command is sketched after the output).

    [root@kube ~]# kubectl get pvc -n spinnaker
    NAME                                         STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
    halyard-home-spinnaker-spinnaker-halyard-0   Bound    pvc-bd62608d-4290-4452-b05e-427e3358e927   10Gi       RWO            csi-rbd-sc     154d
    redis-data-spinnaker-redis-master-0          Bound    pvc-abc9480e-4a12-47c4-8d93-0045aa3df8c6   8Gi        RWO            csi-rbd-sc     154d
    redis-data-spinnaker-redis-slave-0           Bound    pvc-b2f32f76-46fe-4ad0-a7f4-018955b11e71   8Gi        RWO            csi-rbd-sc     154d
    redis-data-spinnaker-redis-slave-1           Bound    pvc-4c17447d-9296-486f-86d0-e3fe97212d5e   8Gi        RWO            csi-rbd-sc     154d
    spinnaker-minio                              Bound    pvc-3685c8f5-aba0-4e0b-99df-ea339842a9d7   10Gi       RWO            csi-rbd-sc     113s
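
    For reference, the same claimRef fix can be done with kubectl patch instead of kubectl edit. A sketch, fetching the new PVC's UID first:

    # Grab the UID of the freshly created PVC
    NEW_UID=$(kubectl get pvc spinnaker-minio -n spinnaker -o jsonpath='{.metadata.uid}')
    # Point the PV's claimRef at the new claim; clearing resourceVersion lets
    # the binding controller re-evaluate the bind
    kubectl patch pv pvc-3685c8f5-aba0-4e0b-99df-ea339842a9d7 --type merge \
      -p "{\"spec\":{\"claimRef\":{\"uid\":\"$NEW_UID\",\"resourceVersion\":null}}}"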

    In short, the fix was to delete the Terminating PVC, create a new one, and then update the old PV's claimRef, which still referenced the deleted claim, with the new PVC's information.
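
    One more takeaway from the PV YAML above: persistentVolumeReclaimPolicy was Delete, so if the deletion had gone through while nothing protected the volume, the underlying RBD image (and its data) would have been removed along with the PV. For volumes that matter, switching the policy to Retain is cheap insurance:

    kubectl patch pv pvc-3685c8f5-aba0-4e0b-99df-ea339842a9d7 \
      -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'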

References

  • https://github.com/kubernetes/kubernetes/issues/20753