Azure/AKS

[BUG] Creating cloned PVC based on Disk from another ResourceGroup fails with malformatted path in `sourceResourceId`

Opened this issue · 7 comments

Describe the bug
My goal is to copy a PVC from one AKS cluster to another.

I am doing so by first cloning the disk from the source MC_ resource group into a manually managed resource group, then importing that disk via volumeHandle into the new cluster.
This is based on piecing together building blocks from
https://learn.microsoft.com/en-us/azure/aks/azure-disk-csi#volume-snapshots
https://learn.microsoft.com/en-us/azure/aks/csi-disk-move-subscriptions

Following the guides exactly works. It breaks down when you want to clone the newly imported PVC by using it as a dataSource. That results in a ProvisioningFailed error event on the cloned PVC, like:
failed to provision volume with StorageClass "managed-csi": rpc error: code = Internal desc = sourceResourceID(/subscriptions/SUBSCRIPTION_ID/resourceGroups/mc_destcluster_westeurope/providers/Microsoft.Compute/disks//subscriptions/SUBSCRIPTION_ID/resourcegroups/IntermediateRG/providers/Microsoft.Compute/disks/restored-disk2) is invalid, correct format: .*/subscriptions/(?:.*)/resourceGroups/(?:.*)/providers/Microsoft.Compute/disks/(.+)

Notice that the path appears to be partially duplicated: it points to both resource groups, with an odd /disks//subscriptions in the middle.
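The duplicated path is consistent with a case-sensitive check on the source ID followed by a fallback that treats an unrecognized value as a bare disk name. The sketch below is a hypothetical reconstruction in Python, not the driver's actual code: the pattern is copied from the error message, while `build_source_resource_id` and its fallback behavior are assumptions made purely for illustration.

```python
import re

# Pattern copied verbatim from the error message.
# Python's re module matches case-sensitively by default.
MANAGED_DISK_RE = re.compile(
    r".*/subscriptions/(?:.*)/resourceGroups/(?:.*)/providers/Microsoft.Compute/disks/(.+)"
)


def build_source_resource_id(subscription_id: str, cluster_rg: str, source: str) -> str:
    """Hypothetical helper: assumed behavior, not taken from the driver source."""
    if MANAGED_DISK_RE.match(source):
        # Recognized as a full resource ID: use it unchanged.
        return source
    # Assumed fallback: treat `source` as a bare disk name in the cluster's resource group.
    return (
        f"/subscriptions/{subscription_id}/resourceGroups/{cluster_rg}"
        f"/providers/Microsoft.Compute/disks/{source}"
    )


# volumeHandle as used in the imported PV (lowercase "resourcegroups").
lowercase_handle = (
    "/subscriptions/SUBSCRIPTION_ID/resourcegroups/IntermediateRG"
    "/providers/Microsoft.Compute/disks/restored-disk2"
)

result = build_source_resource_id(
    "SUBSCRIPTION_ID", "mc_destcluster_westeurope", lowercase_handle
)
print(result)
```

Under these assumptions, the printed value is the same string that appears inside `sourceResourceID(...)` in the error event, including the `/disks//subscriptions` join.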

To Reproduce
First, in the source cluster:

  1. Create a PVC using azure disk CSI driver, called original-pvc
  2. Create a VolumeSnapshotClass that creates snapshots in a different resource group (IntermediateRG):
apiVersion: snapshot.storage.k8s.io/v1
deletionPolicy: Delete
driver: disk.csi.azure.com
kind: VolumeSnapshotClass
metadata:
  name: csi-azuredisk-vsc-to-other-rg
parameters:
  incremental: "true"
  resourceGroup: IntermediateRG
  3. Create a VolumeSnapshot that takes a snapshot of original-pvc in IntermediateRG:
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: my-snapshot
spec:
  volumeSnapshotClassName: csi-azuredisk-vsc-to-other-rg
  source:
    persistentVolumeClaimName: original-pvc

  4. Inspect the VolumeSnapshotContent that this creates, find the resource ID, and open it in the Azure portal
  5. Click Create Disk and complete the wizard; we call the new disk restored-disk2

Moving on, in the destination cluster:
7. Create a PV that imports the Disk from the intermediate resource group, using volumeHandle

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-moveddisk2
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: managed-csi
  csi:
    driver: disk.csi.azure.com
    readOnly: false
    volumeHandle: /subscriptions/SUBSCRIPTION_ID/resourcegroups/IntermediateRG/providers/Microsoft.Compute/disks/restored-disk2
    volumeAttributes:
      fsType: ext4
  8. Create a PVC, pvc-moveddisk2, and a pod that mounts it. So far, everything works well. 👍
  9. Create a PVC that clones pvc-moveddisk2, using dataSource:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-clone
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: managed-csi
  resources:
    requests:
      storage: 10Gi
  dataSource:
    kind: PersistentVolumeClaim
    name: pvc-moveddisk2
  10. Create a pod that mounts pvc-clone
  11. Observe the error in kubectl describe pvc/pvc-clone:
    failed to provision volume with StorageClass "managed-csi": rpc error: code = Internal desc = sourceResourceID(/subscriptions/SUBSCRIPTION_ID/resourceGroups/mc_destcluster_westeurope/providers/Microsoft.Compute/disks//subscriptions/SUBSCRIPTION_ID/resourcegroups/IntermediateRG/providers/Microsoft.Compute/disks/restored-disk2) is invalid, correct format: .*/subscriptions/(?:.*)/resourceGroups/(?:.*)/providers/Microsoft.Compute/disks/(.+)

Expected behavior

  • A new PV is created.
  • The new PV is bound to a new disk in the destination MC_ resource group.
  • After the move operation is done, the source disk can be removed and one can continue to rely on only the Kubernetes-managed resources.

Environment (please complete the following information):

  • AKS Kubernetes version: tested both 1.29.7 and 1.30.2

Hmmm, it appears to work if, in step 7, I change the
volumeHandle: /subscriptions/SUBSCRIPTION_ID/resourcegroups/IntermediateRG/providers/Microsoft.Compute/disks/restored-disk2
to
volumeHandle: /subscriptions/SUBSCRIPTION_ID/resourceGroups/IntermediateRG/providers/Microsoft.Compute/disks/restored-disk2

The only difference is the camelCasing of resourceGroups.

This is inconsistent with the full resource ID given by the Azure portal: I used its copy-to-clipboard button, and it gave me the faulty resourcegroups value, without the capital G.
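As a stopgap, a resource ID copied from the portal can be normalized before it is pasted into volumeHandle. This is only a minimal sketch of the workaround described above: `normalize_volume_handle` is a hypothetical helper name, and it rewrites nothing but the casing of the resource group segment.

```python
import re

# Matches the segment name in any casing, e.g. /resourcegroups/ or /ResourceGroups/.
_RG_SEGMENT = re.compile(r"/resourcegroups/", re.IGNORECASE)


def normalize_volume_handle(resource_id: str) -> str:
    """Hypothetical helper: rewrite the first resource group segment as /resourceGroups/."""
    return _RG_SEGMENT.sub("/resourceGroups/", resource_id, count=1)


# Value as copied from the portal in this report (lowercase "resourcegroups").
portal_id = (
    "/subscriptions/SUBSCRIPTION_ID/resourcegroups/IntermediateRG"
    "/providers/Microsoft.Compute/disks/restored-disk2"
)

print(normalize_volume_handle(portal_id))
```

The rest of the ID, including the resource group name and disk name, is left untouched, so the output is the camelCased volumeHandle that worked in step 7.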

Issue needing attention of @Azure/aks-leads
