Azure/azure-monitor-baseline-alerts

RV Backup Policy Definition not automatically remediating

Closed this issue · 9 comments

Check for previous/existing GitHub issues

  • I have checked for previous/existing GitHub issues

Description

I tried this in my own test environment and could not reproduce it. However, the customer reported that this happened on 5 RSVs in his sandbox environment. The issue appears after we deploy this policy:
https://github.com/Azure/alz-monitor/blob/main/src/resources/Microsoft.Authorization/policyDefinitions/deploy-rv_backuphealth_monitor.bicep

It shows the customer's Recovery Services vaults as non-compliant, as expected.

and this seems accurate

After the customer tried remediation, it didn't bring any of them into compliance.
I manually checked that box (shown in the attached screenshot), and the next time the policy ran it showed the RV as compliant (manual workaround).

We had some internal discussion about this and were told:
The non-compliance detail you shared is correct: any RSV that has not been remediated will show empty current values. By default this configuration doesn't exist on the RSV, so the value is empty. (You can ignore MonitorDisable; it is only used when you want to exclude certain resources from being monitored.)

We were asked to investigate these things:
Can you review the following in the customer environment:

  • Get the details of the remediation task, in particular the related events.
  • Does the RSV have any resource locks? A read-only lock will cause the remediation to fail.
  • Can you review the Managed Identity of the Policy Assignment? For the remediation to work, it needs sufficient permissions to modify the RSV. (By default Contributor is assigned.)
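
The lock and permission checks above can be sketched as a small helper. This is only an illustration: the input shapes (a list of lock dicts and a list of role names) and the `WRITE_ROLES` set are assumptions about data you would export yourself, not something taken from the policy or from this thread.

```python
from typing import Dict, List

# Assumption: roles we treat as sufficient to modify a vault.
WRITE_ROLES = {"Owner", "Contributor"}


def remediation_blockers(locks: List[Dict[str, str]], identity_roles: List[str]) -> List[str]:
    """Return reasons why a modify remediation might fail on a vault.

    locks: hypothetical export of the locks on the vault, e.g.
           [{"name": "ro-lock", "level": "ReadOnly"}]
    identity_roles: role names held by the policy assignment's managed identity.
    """
    blockers = []
    for lock in locks:
        # Only read-only locks block writes; CanNotDelete locks do not.
        if lock.get("level") == "ReadOnly":
            blockers.append(f"read-only lock '{lock.get('name', '?')}' blocks writes to the vault")
    if not WRITE_ROLES.intersection(identity_roles):
        blockers.append("policy assignment identity has no role that can modify the vault")
    return blockers


if __name__ == "__main__":
    print(remediation_blockers(
        locks=[{"name": "no-delete", "level": "CanNotDelete"}],
        identity_roles=["Reader"],
    ))
```

An empty result does not prove remediation will succeed; it only rules out the two causes listed above.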

Hello @stdistef, can you let me know the status of this issue? Were you able to review the remediation task/ resource locks and managed identity?

@stdistef can you share an update here?

We now seem to have what I think may be a newly added resource. The new resource and its details were shown in the attached screenshots.

Here is the activity log for that resource. It's the only activity log I could find, and I still need to confirm with the customer whether it was the result of the manual remediation:
{
"authorization": {
"action": "Microsoft.RecoveryServices/vaults/write",
"scope": "/subscriptions//resourceGroups/rgp-ins-seas-itr-uiq-dv02/providers/Microsoft.RecoveryServices/vaults/rsv-ins-seas-itr-uiq-dv02"
},
"caller": "cffbb479-12fc-4811-8bb6-2ff1806f13aa",
"channels": "Operation",
"claims": {
"aud": "https://management.core.windows.net/",
"iss": "https://sts.windows.net/6d58556a-30e2-499f-b4b4-db8169da027a/",
"iat": "1696935125",
"nbf": "1696935125",
"exp": "1696939025",
"aio": "E2FgYOgv/PB12f3Qo9F1yw5E/NxuCwA=",
"appid": "a64119a7-fc22-4e88-b334-30efbd2111fc",
"appidacr": "1",
"groups": "e5b27b69-2a66-4150-9266-2249c2228c70",
"http://schemas.microsoft.com/identity/claims/identityprovider": "https://sts.windows.net/6d58556a-30e2-499f-b4b4-db8169da027a/",
"idtyp": "app",
"http://schemas.microsoft.com/identity/claims/objectidentifier": "",
"rh": "0.AVIAalVYbeIwn0m0tNuBadoCekZIf3kAutdPukPawfj2MBPNAAA.",
"http://schemas.xmlsoap.org/ws/2005/05/identity/claims/nameidentifier": "cffbb479-12fc-4811-8bb6-2ff1806f13aa",
"http://schemas.microsoft.com/identity/claims/tenantid": "6
",
"uti": "UZ4Q8HL4Mk-f9-eiJKbSAA",
"ver": "1.0",
"xms_cae": "1",
"xms_tcdt": "1586891129"
},
"correlationId": "6ffc6c09-6661-4e48-8f3d-878fa9ae199a",
"description": "",
"eventDataId": "f6d85ba0-bc81-43b6-bb3b-60899ee6e808",
"eventName": {
"value": "EndRequest",
"localizedValue": "End request"
},
"category": {
"value": "Policy",
"localizedValue": "Policy"
},
"eventTimestamp": "2023-10-10T10:57:06.6689938Z",
"id": "/subscriptions/***/resourceGroups/rgp-ins-seas-itr-uiq-dv02/providers/Microsoft.RecoveryServices/vaults/rsv-ins-seas-itr-uiq-dv02/events/f6d85ba0-bc81-43b6-bb3b-60899ee6e808/ticks/638325322266689938",
"level": "Informational",
"operationId": "6ffc6c09-6661-4e48-8f3d-878fa9ae199a",
"operationName": {
"value": "Microsoft.Authorization/policies/modify/action",
"localizedValue": "Microsoft.Authorization/policies/modify/action"
},
"resourceGroupName": "rgp-ins-seas-itr-uiq-dv02",
"resourceProviderName": {
"value": "Microsoft.RecoveryServices",
"localizedValue": "Microsoft.RecoveryServices"
},
"resourceType": {
"value": "Microsoft.RecoveryServices/vaults",
"localizedValue": "Microsoft.RecoveryServices/vaults"
},
"resourceId": "/subscriptions/7
01/resourceGroups/rgp-ins-seas-itr-uiq-dv02/providers/Microsoft.RecoveryServices/vaults/rsv-ins-seas-itr-uiq-dv02",
"status": {
"value": "Succeeded",
"localizedValue": "Succeeded"
},
"subStatus": {
"value": "",
"localizedValue": ""
},
"submissionTimestamp": "2023-10-10T10:58:57Z",
"subscriptionId": "74
*********",
"tenantId": "6d58556*****************",
"properties": {
"isComplianceCheck": "False",
"resourceLocation": "southeastasia",
"ancestors": "INS-Sandbox,GMS-INS-Syseng,GMS-Infra-systems,6d58556a-30e2-499f-b4b4-db8169da027a",
"policies": "[{"policyDefinitionId":"/providers/Microsoft.Management/managementGroups/INS-Sandbox/providers/Microsoft.Authorization/policyDefinitions/Deploy_RecoveryVault_BackupHealthMonitor_Alert","policySetDefinitionId":"/providers/Microsoft.Management/managementGroups/INS-Sandbox/providers/Microsoft.Authorization/policySetDefinitions/Alerting-LandingZone","policyDefinitionReferenceId":"ALZ_RVBackupHealthMonitor","policySetDefinitionName":"Alerting-LandingZone","policySetDefinitionDisplayName":"Deploy ALZ Landing zone alerts","policyDefinitionName":"Deploy_RecoveryVault_BackupHealthMonitor_Alert","policyDefinitionDisplayName":"Deploy RV Backup Health Monitoring Alerts","policyDefinitionEffect":"modify","policyAssignmentId":"/providers/Microsoft.Management/managementGroups/INS-Sandbox/providers/Microsoft.Authorization/policyAssignments/ALZ-Monitor_LandingZones","policyAssignmentName":"ALZ-Monitor_LandingZones","policyAssignmentDisplayName":"ALZ Monitoring Alerts for LandingZones","policyAssignmentScope":"/providers/Microsoft.Management/managementGroups/INS-Sandbox","policyExemptionIds":[]}]",
"fields": "[]",
"skippedFields": "[]",
"conflictingFields": "",
"eventCategory": "Policy",
"entity": "/subscriptions/740e28***********/resourceGroups/rgp-ins-seas-itr-uiq-dv02/providers/Microsoft.RecoveryServices/vaults/rsv-ins-seas-itr-uiq-dv02",
"message": "Microsoft.Authorization/policies/modify/action",
"hierarchy": ""
},
"relatedEvents": []
}

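For anyone reading an entry like the one above, a minimal sketch of pulling out the relevant parts may help. It assumes you have a valid JSON export of the event (the pasted, redacted text above will not parse as-is) and relies on the `policies` and `fields` properties being JSON documents stored as strings, as they appear in the log. The function name is hypothetical.

```python
import json
from typing import Any, Dict


def summarize_policy_event(event: Dict[str, Any]) -> Dict[str, Any]:
    """Summarize a policy activity-log event that has already been parsed from JSON."""
    props = event.get("properties", {})
    # "policies" and "fields" are strings containing JSON, so decode them a second time.
    policies = json.loads(props.get("policies", "[]"))
    fields = json.loads(props.get("fields", "[]"))
    return {
        "operation": event.get("operationName", {}).get("value"),
        "status": event.get("status", {}).get("value"),
        "definitions": [p.get("policyDefinitionName") for p in policies],
        "effects": [p.get("policyDefinitionEffect") for p in policies],
        "fields": fields,
    }


# Trimmed-down sample using values from the log above.
SAMPLE = {
    "operationName": {"value": "Microsoft.Authorization/policies/modify/action"},
    "status": {"value": "Succeeded"},
    "properties": {
        "policies": json.dumps([{
            "policyDefinitionName": "Deploy_RecoveryVault_BackupHealthMonitor_Alert",
            "policyDefinitionEffect": "modify",
        }]),
        "fields": "[]",
    },
}

if __name__ == "__main__":
    print(summarize_policy_event(SAMPLE))
```

In the entry above the decoded `fields` list is empty even though the status is Succeeded, which may be worth comparing against an event from a vault where remediation is known to have worked.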
@arjenhuitema and @paulgrimley, I talked with @stdistef today and this is still an issue.

@arjenhuitema and I discussed this today. I shared my reader access for the Cx issue. However, I cannot see any of the remediation logs since I only have the Reader role on the sub, not the MG. I booked a call with the Cx for next Thursday; we will see what's cooking then.

I am seeing the same issue. The remediation task fails for the following reason:

Failed to remediate resource: '/subscriptions/<deleted>/resourceGroups/<deleted>-lab0-rg/providers/Microsoft.RecoveryServices/vaults/RSVault-<deleted>'. The 'PUT' request failed with status code: 'BadRequest'. Inner Error: 'redundancySettings parameter is invalid. Please provide a valid redundancySettings', Correlation Id: 'fdfeb80a-b98f-44ce-89a9-f45419acbe5e'.
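
If several vaults fail like this, it can help to pull the status code, inner error, and correlation ID out of each message for comparison. The sketch below assumes every failure message follows the same layout as the one quoted above; the pattern and function name are hypothetical and were written against that single example.

```python
import re
from typing import Dict, Optional

# Pattern derived from the single failure message quoted above (an assumption).
FAILURE_PATTERN = re.compile(
    r"Failed to remediate resource: '(?P<resource>[^']*)'\. "
    r"The '(?P<method>[A-Z]+)' request failed with status code: '(?P<status>[^']*)'\. "
    r"Inner Error: '(?P<inner_error>.*)', Correlation Id: '(?P<correlation_id>[^']*)'"
)


def parse_remediation_failure(message: str) -> Optional[Dict[str, str]]:
    """Return the named parts of a remediation failure message, or None if it does not match."""
    match = FAILURE_PATTERN.search(message)
    return match.groupdict() if match else None


if __name__ == "__main__":
    msg = (
        "Failed to remediate resource: '/subscriptions/<deleted>/resourceGroups/"
        "<deleted>-lab0-rg/providers/Microsoft.RecoveryServices/vaults/RSVault-<deleted>'. "
        "The 'PUT' request failed with status code: 'BadRequest'. "
        "Inner Error: 'redundancySettings parameter is invalid. "
        "Please provide a valid redundancySettings', "
        "Correlation Id: 'fdfeb80a-b98f-44ce-89a9-f45419acbe5e'."
    )
    print(parse_remediation_failure(msg))
```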

Hi @alexeyzolotukhin, thanks for your feedback. I'm not sure the error you get with remediation relates to any configuration set by the policy. It's complaining about settings for crossRegionRestore and standardTierStorageRedundancy, which are not properties that the policy modifies.

What is the configuration of the particular vault? Do you have crossRegionRestore enabled?

If you create a new vault, is it compliant?

I believe we can close this issue now. Here is what we discovered in our investigation.
My customer was using an old API version in the PS script he uses to deploy RSVs. The older version didn't offer the AZ or CRR "Backup properties", which were missing when we looked at the non-compliant vaults in the portal (or queried them via the REST API).
We asked the customer to delete one vault and recreate it using the updated PS script, and the problem with policy remediation was no longer seen. So, while we will continue to investigate RSVs created without redundancy options separately, AMBA is not the problem here.
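
To find other vaults in the same state, one option is to export the vault definitions and look for the ones without redundancy settings. The sketch below assumes each vault is a dict shaped like a REST API response with the settings under `properties.redundancySettings`; that location, the function name, and the sample vault names and values are assumptions for illustration only.

```python
from typing import Any, Dict, List


def vaults_missing_redundancy(vaults: List[Dict[str, Any]]) -> List[str]:
    """Return the names of vaults that have no (or empty) redundancy settings.

    Assumption: settings live under properties.redundancySettings in each vault dict.
    """
    missing = []
    for vault in vaults:
        properties = vault.get("properties") or {}
        if not properties.get("redundancySettings"):
            missing.append(vault.get("name", "<unnamed>"))
    return missing


if __name__ == "__main__":
    # Hypothetical sample data: one vault created with an older API version, one recreated.
    sample = [
        {"name": "vault-old", "properties": {}},
        {
            "name": "vault-new",
            "properties": {
                "redundancySettings": {
                    "standardTierStorageRedundancy": "GeoRedundant",
                    "crossRegionRestore": "Enabled",
                }
            },
        },
    ]
    print(vaults_missing_redundancy(sample))
```

Vaults reported by this check would be candidates for the delete-and-recreate step described above.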

I forgot to click Close in my previous comment.