Azure/bicep

DependsOn changes and breaks deployments when using a feature in Bicep that switches to ARM v2

Bicep version
0.30.23

Describe the bug
When using something that changes the language version from v1 to v2, the dependsOn array of the resource changes and generates ARM templates that cannot be deployed.

To Reproduce
Consider this template:

param redisName string = ''
param location string = resourceGroup().location

var hasRedisCache = (redisName != '')

resource redisCache 'Microsoft.Cache/redis@2024-04-01-preview' existing = if (hasRedisCache) {
  name: redisName!
}

resource functionApp 'Microsoft.Web/sites@2023-12-01' = {
  name: 'functionapp'
  location: location

  resource functionAppConfig 'config' = {
    name: 'appsettings'
    properties: hasRedisCache == false
      ? {}
      : {
          'Cache:CacheConnection': '${redisCache.name}.redis.cache.windows.net,abortConnect=false,ssl=true,password=${redisCache.listKeys().primaryKey}'
        }
  }
}

This becomes

{
  "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "metadata": {
    "_generator": {
      "name": "bicep",
      "version": "0.30.23.60470",
      "templateHash": "12087235466515744284"
    }
  },
  "parameters": {
    "redisName": {
      "type": "string",
      "defaultValue": ""
    },
    "location": {
      "type": "string",
      "defaultValue": "[resourceGroup().location]"
    }
  },
  "variables": {
    "hasRedisCache": "[not(equals(parameters('redisName'), ''))]"
  },
  "resources": [
    {
      "type": "Microsoft.Web/sites/config",
      "apiVersion": "2023-12-01",
      "name": "[format('{0}/{1}', 'functionapp', 'appsettings')]",
      "properties": "[if(equals(variables('hasRedisCache'), false()), createObject(), createObject('Cache:CacheConnection', format('{0}.redis.cache.windows.net,abortConnect=false,ssl=true,password={1}', parameters('redisName'), listKeys(resourceId('Microsoft.Cache/redis', parameters('redisName')), '2024-04-01-preview').primaryKey)))]",
      "dependsOn": [
        "[resourceId('Microsoft.Web/sites', 'functionapp')]"
      ]
    },
    {
      "type": "Microsoft.Web/sites",
      "apiVersion": "2023-12-01",
      "name": "functionapp",
      "location": "[parameters('location')]"
    }
  ]
}

If I change the top of the bicep file to

param redisName string?
param location string = resourceGroup().location

var hasRedisCache = (redisName != null)

the generated ARM becomes

{
  "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
  "languageVersion": "2.0",
  "contentVersion": "1.0.0.0",
  "metadata": {
    "_generator": {
      "name": "bicep",
      "version": "0.30.23.60470",
      "templateHash": "570918317012327008"
    }
  },
  "parameters": {
    "redisName": {
      "type": "string",
      "nullable": true
    },
    "location": {
      "type": "string",
      "defaultValue": "[resourceGroup().location]"
    }
  },
  "variables": {
    "hasRedisCache": "[not(equals(parameters('redisName'), null()))]"
  },
  "resources": {
    "functionApp::functionAppConfig": {
      "type": "Microsoft.Web/sites/config",
      "apiVersion": "2023-12-01",
      "name": "[format('{0}/{1}', 'functionapp', 'appsettings')]",
      "properties": "[if(equals(variables('hasRedisCache'), false()), createObject(), createObject('Cache:CacheConnection', format('{0}.redis.cache.windows.net,abortConnect=false,ssl=true,password={1}', parameters('redisName'), listKeys(resourceId('Microsoft.Cache/redis', parameters('redisName')), '2024-04-01-preview').primaryKey)))]",
      "dependsOn": [
        "functionApp",
        "redisCache"
      ]
    },
    "redisCache": {
      "condition": "[variables('hasRedisCache')]",
      "existing": true,
      "type": "Microsoft.Cache/redis",
      "apiVersion": "2024-04-01-preview",
      "name": "[parameters('redisName')]"
    },
    "functionApp": {
      "type": "Microsoft.Web/sites",
      "apiVersion": "2023-12-01",
      "name": "functionapp",
      "location": "[parameters('location')]"
    }
  }
}

The issue with this ARM template is that the dependsOn of the function app config always contains redisCache, even if hasRedisCache is false, and there is no redisCache because of that condition.

The deployments engine should ignore any entries in dependsOn that point to a resource with a condition of false. What error are you seeing when you try to deploy the template?

I tried to put together a reproduction case that didn't require any resource, but it was successfully deployed whether using languageVersion 2.0 or not.

The repro case used these Bicep files:

main.bicep

type foo = string
param condition bool = false

module a 'mod.bicep' = if (condition) {
  name: 'a'
  params: {
    in: 'a_input'
  }
}

module b 'mod.bicep' = {
  name: 'b'
  params: {
    in: condition ? a.outputs.out : 'b_input'
  }
}

output out string = b.outputs.out

mod.bicep

param in string

output out string = in

This compiled to:

{
  "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
  "languageVersion": "2.0",
  "contentVersion": "1.0.0.0",
  "metadata": {
    "_generator": {
      "name": "bicep",
      "version": "0.30.3.12046",
      "templateHash": "18159484721557549296"
    }
  },
  "definitions": {
    "foo": {
      "type": "string"
    }
  },
  "parameters": {
    "condition": {
      "type": "bool",
      "defaultValue": false
    }
  },
  "resources": {
    "a": {
      "condition": "[parameters('condition')]",
      "type": "Microsoft.Resources/deployments",
      "apiVersion": "2022-09-01",
      "name": "a",
      "properties": {
        "expressionEvaluationOptions": {
          "scope": "inner"
        },
        "mode": "Incremental",
        "parameters": {
          "in": {
            "value": "a_input"
          }
        },
        "template": {
          "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
          "contentVersion": "1.0.0.0",
          "metadata": {
            "_generator": {
              "name": "bicep",
              "version": "0.30.3.12046",
              "templateHash": "2122539223069228788"
            }
          },
          "parameters": {
            "in": {
              "type": "string"
            }
          },
          "resources": [],
          "outputs": {
            "out": {
              "type": "string",
              "value": "[parameters('in')]"
            }
          }
        }
      }
    },
    "b": {
      "type": "Microsoft.Resources/deployments",
      "apiVersion": "2022-09-01",
      "name": "b",
      "properties": {
        "expressionEvaluationOptions": {
          "scope": "inner"
        },
        "mode": "Incremental",
        "parameters": {
          "in": "[if(parameters('condition'), createObject('value', reference('a').outputs.out.value), createObject('value', 'b_input'))]"
        },
        "template": {
          "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
          "contentVersion": "1.0.0.0",
          "metadata": {
            "_generator": {
              "name": "bicep",
              "version": "0.30.3.12046",
              "templateHash": "2122539223069228788"
            }
          },
          "parameters": {
            "in": {
              "type": "string"
            }
          },
          "resources": [],
          "outputs": {
            "out": {
              "type": "string",
              "value": "[parameters('in')]"
            }
          }
        }
      },
      "dependsOn": [
        "a"
      ]
    }
  },
  "outputs": {
    "out": {
      "type": "string",
      "value": "[reference('b').outputs.out.value]"
    }
  }
}

Notice that in the compiled template, b depends on a, and a has a condition of false.
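
The expected behavior described above (a dependsOn entry is ignored when it points at a resource whose condition is false) can be modeled roughly as follows. This is only an illustration in Python, not the deployment engine's actual code: `prune_depends_on` and the pre-evaluated `conditions` mapping are assumptions made for the sketch.

```python
def prune_depends_on(resources, conditions):
    """Drop dependsOn entries whose target resource has a false condition.

    resources:  the "resources" object of a languageVersion 2.0 template
    conditions: hypothetical, already-evaluated condition per symbolic name;
                a resource without an entry is treated as deployed
    """
    pruned = {}
    for name, resource in resources.items():
        pruned[name] = [
            dep for dep in resource.get("dependsOn", [])
            if conditions.get(dep, True)
        ]
    return pruned


# Trimmed-down stand-in for the compiled repro template above.
resources = {
    "a": {"condition": "[parameters('condition')]"},
    "b": {"dependsOn": ["a"]},
}

# condition = false: b is left without dependencies.
print(prune_depends_on(resources, {"a": False}))  # {'a': [], 'b': []}
# condition = true: the dependency on a is kept.
print(prune_depends_on(resources, {"a": True}))   # {'a': [], 'b': ['a']}
```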

The error I get from the actual template is:

"Deployment template validation failed: 'The template reference 'otherRegionRedisCache' is not valid: could not find template resource or resource copy with this name. Please see https://aka.ms/arm-function-reference for usage details.

The critical part is this: if this template is compiled to v1, the generated ARM is:

"resources": [
  {
      "type": "Microsoft.Web/sites/config",
      "apiVersion": "2021-03-01",
      "name": "[format('{0}/{1}', parameters('functionAppName'), 'appsettings')]",
      "properties": "[union([..] if(equals(variables('hasOtherRegionRedisCache'), false()), createObject(), createObject('Cache:AlternativeCacheConnection', format('{0}.redis.cache.windows.net,abortConnect=false,ssl=true,password={1}', parameters('otherRegionRedisCacheName'), listKeys(extensionResourceId(format('/subscriptions/{0}/resourceGroups/{1}', subscription().subscriptionId, parameters('otherRegionRedisCacheResourceGroupName')), 'Microsoft.Cache/redis', parameters('otherRegionRedisCacheName')), '2021-06-01').primaryKey))), [..])]",
      "dependsOn": [
          "[resourceId('Microsoft.Web/sites', parameters('functionAppName'))]"
      ]
  },
]

With v2 it becomes this:

"resources": {
  "functionApp::functionAppConfig": {
      "type": "Microsoft.Web/sites/config",
      "apiVersion": "2021-03-01",
      "name": "[format('{0}/{1}', parameters('functionAppName'), 'appsettings')]",
      "properties": "[union([..] if(equals(variables('hasOtherRegionRedisCache'), false()), createObject(), createObject('Cache:AlternativeCacheConnection', format('{0}.redis.cache.windows.net,abortConnect=false,ssl=true,password={1}', parameters('otherRegionRedisCacheName'), listKeys(extensionResourceId(format('/subscriptions/{0}/resourceGroups/{1}', subscription().subscriptionId, parameters('otherRegionRedisCacheResourceGroupName')), 'Microsoft.Cache/redis', parameters('otherRegionRedisCacheName')), '2021-06-01').primaryKey))), [..])]",
      "dependsOn": [
          "appInsights",
          "functionApp",
          "otherRegionRedisCache",
          "redisCache",
          "sharedResourcesUid",
          "storageAccount"
      ]
  },
}

otherRegionRedisCache isn't in the template. otherRegionRedisCache is defined as

resource otherRegionRedisCache 'Microsoft.Cache/redis@2021-06-01' existing = if (hasOtherRegionRedisCache) {
  name: otherRegionRedisCacheName
  scope: resourceGroup(subscription().subscriptionId, otherRegionRedisCacheResourceGroupName)
}

and gets completely optimized away because otherRegionRedisCache.listKeys().primaryKey can be expressed with a listKeys() call and a resource ID. So it appears that the problem is not v2 per se, but rather that resources that are optimized away keep appearing in dependsOn.
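
One way to check this hypothesis on a compiled template is to look for dependsOn entries that name a symbolic resource that is not present in the "resources" object. The following Python sketch does that; `find_dangling_depends_on` is a hypothetical helper, it assumes a languageVersion 2.0 template that is already loaded as a dict (for example with json.load), and it skips entries written as expressions (those starting with "[").

```python
def find_dangling_depends_on(template):
    """Map each resource to the dependsOn names missing from the template."""
    resources = template.get("resources", {})
    if not isinstance(resources, dict):
        # languageVersion 1 templates use an array and resource ID expressions.
        return {}
    dangling = {}
    for name, resource in resources.items():
        missing = [
            dep for dep in resource.get("dependsOn", [])
            if not dep.startswith("[") and dep not in resources
        ]
        if missing:
            dangling[name] = missing
    return dangling


# Reduced stand-in for the v2 excerpt above (most resources omitted).
template = {
    "languageVersion": "2.0",
    "resources": {
        "functionApp::functionAppConfig": {
            "type": "Microsoft.Web/sites/config",
            "dependsOn": ["functionApp", "otherRegionRedisCache"],
        },
        "functionApp": {"type": "Microsoft.Web/sites"},
    },
}

print(find_dangling_depends_on(template))
# {'functionApp::functionAppConfig': ['otherRegionRedisCache']}
```

The sketch only inspects the top-level template; a module compiles to a nested deployment, so its inner template (under properties.template) would have to be passed to the function separately.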

Would you be able to share the full template? If you don't feel comfortable sharing it here, you can email it to me at <my github username>@microsoft.com.

That exact error message is usually raised when there is a reference() function whose target can't be found. The deployments engine can't really tell if that function expression is in an active branch or not, so you can see errors like that from expressions like hasOtherRegionRedisCache ? otherRegionRedisCache.properties.<something> : null
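
To look for reference() calls of that kind in a compiled template, a rough Python sketch like the one below can help. It is only a heuristic under stated assumptions: `unresolved_references` is a hypothetical helper, the regular expression only catches calls whose first argument is a string literal, and the sample expressions in `template` are made up for illustration.

```python
import re

# Matches reference('<name>' ...) where the first argument is a string literal.
REFERENCE_CALL = re.compile(r"\breference\('([^']+)'")


def iter_strings(node):
    """Yield every string value found anywhere in a JSON-like structure."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from iter_strings(value)
    elif isinstance(node, list):
        for value in node:
            yield from iter_strings(value)


def unresolved_references(template):
    """List symbolic names used in reference() that are not template resources."""
    resources = template.get("resources", {})
    if not isinstance(resources, dict):
        return []  # only languageVersion 2.0 templates use symbolic names
    targets = set()
    for text in iter_strings(template):
        targets.update(REFERENCE_CALL.findall(text))
    # Literal resource IDs (starting with "/") are not symbolic names.
    return sorted(t for t in targets if t not in resources and not t.startswith("/"))


# Hypothetical compiled output, reduced to the relevant expressions.
template = {
    "languageVersion": "2.0",
    "resources": {"functionApp": {"type": "Microsoft.Web/sites"}},
    "outputs": {
        "host": {"type": "string", "value": "[reference('functionApp').defaultHostName]"},
        "cacheHost": {
            "type": "string",
            "value": "[if(variables('hasOtherRegionRedisCache'), reference('otherRegionRedisCache').hostName, null())]",
        },
    },
}

print(unresolved_references(template))  # ['otherRegionRedisCache']
```

Because the walk also descends into nested deployment templates, names that are only defined inside a module's inner template may show up as false positives.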

The full template is pretty massive and belongs to a customer, so I've extracted the part that allows me to reproduce the error when I deploy it to a resource group containing a consumption app service plan.

main.bicep

param location string = resourceGroup().location

module functionapp 'function.bicep' = {
  name: 'function-module'
  params: {
    location: location
  }
}

function.bicep

param redisName string = ''
param location string

var hasRedisCache = (redisName != '')

resource redisCache 'Microsoft.Cache/redis@2024-04-01-preview' existing = if (hasRedisCache) {
  name: redisName!
}

resource functionApp 'Microsoft.Web/sites@2023-12-01' = {
  name: 'functionapp'
  location: location
  kind: 'functionapp'
  properties: {
    serverFarmId: resourceId('Microsoft.Web/serverfarms', 'ASP-functionapp')
  }

  resource functionAppConfig 'config' = {
    name: 'appsettings'
    properties: hasRedisCache == false
      ? {}
      : {
          'Cache:CacheConnection': '${redisCache.name}.redis.cache.windows.net,abortConnect=false,ssl=true,password=${redisCache.listKeys().primaryKey}'
        }
  }
}

Changing the string = '' to string? triggers the issue. I think it being inside a module is important for causing the issue.

There was a change that recently rolled out to address #2371 that did not account for dependsOn entries pointing at resources with a false condition that failed validation (and as a result were not included in the deployment). I'm working on a fix in the backend, but it'll take some time to roll out.

In the meantime, you can work around this error by updating the redisCache resource to the following:

resource redisCache 'Microsoft.Cache/redis@2024-04-01-preview' existing = if (hasRedisCache) {
  name: hasRedisCache ? redisName : 'placeholder'
}

Hey @jeskew,
I may be facing a very similar, if not the same issue in the Digital Twins module of AVM. It's a bit of a fun one as there are numerous conditional existing resources in there - yet only one of them does not work 😄

The source (which currently exists in my development branch here) is, for example, deployed via this test file and the part that causes an issue is specifically this line:

resource endpoint 'Microsoft.DigitalTwins/digitalTwinsInstances/endpoints@2023-01-31' = {
  name: name
  parent: digitalTwinsInstance
  properties: {  
    // Event Grid
    ...(properties.endpointType == 'EventGrid'
      ? {
          authenticationType: 'KeyBased'
          // Should use the commented code for simplification (allows one less user input), but this introduces a bug where all deployments not using the eventGridTopic resourceId will fail as they cannot resolve the dependency (that they're not using). Asking for the TopicEndpoints is a workaround.
          // TopicEndpoint: eventGridTopic.properties.endpoint // Introduces a breaking dependency. Would be value: E.g., https://dep-dtdmax-evgt-01.eastus-1.eventgrid.azure.net/api/events
          TopicEndpoint: properties.eventGridTopicEndpoint
          accessKey1: eventGridTopic.listkeys().key1
          accessKey2: eventGridTopic.listkeys().key2
        }
      : {})
    // (...)
  }
}

What we're looking at here is a subset of the original logic: the properties block expects a TopicEndpoint value (e.g., https://dep-dtdmax-evgt-01.eastus-1.eventgrid.azure.net/api/events), which I'd like to read from an existing resource declared above like this:

resource eventGridTopic 'Microsoft.EventGrid/topics@2022-06-15' existing = if (properties.endpointType == 'EventGrid') {
  name: last(split(properties.?eventGridTopicResourceId, '/'))
  scope: resourceGroup(
    split((properties.?eventGridTopicResourceId ?? '//'), '/')[2],
    split((properties.?eventGridTopicResourceId ?? '////'), '/')[4]
  )
}

Now this logic works like a charm 💪 - BUT - if I dare to not use an endpointType of 'EventGrid' (i.e., the conditions evaluate to false), I start getting the error "The template reference 'eventGridTopic' is not valid: could not find template resource or resource copy with this name.", much like the original author.

As you can see in the linked template, I have several other existing resources in the template (i.e., serviceBus & eventHub), each corresponding to a different endpointType. The thing is, those work perfectly. It is only the EventGrid with its TopicEndpoint that causes the issue, which is why my current workaround is to ask the user for this value (which is quite unfortunate, as I 'should' be able to just grab it).

Originally I thought it's due to a bug regarding the dependencies as the main.json file looks like this

    "endpoint": {
      "type": "Microsoft.DigitalTwins/digitalTwinsInstances/endpoints",
      "apiVersion": "2023-01-31",
      "name": "[format('{0}/{1}', parameters('digitalTwinInstanceName'), parameters('name'))]",
      "properties": "[shallowMerge(createArray(createObject('endpointType', parameters('properties').endpointType, 'identity', variables('identity'), 'deadLetterSecret', tryGet(parameters('properties'), 'deadLetterSecret'), 'deadLetterUri', tryGet(parameters('properties'), 'deadLetterUri')), if(equals(parameters('properties').endpointType, 'EventGrid'), createObject('authenticationType', 'KeyBased', 'TopicEndpoint', tryGet(tryGet(reference('eventGridTopic', '2022-06-15', 'full'), 'properties'), 'endpoint'), 'accessKey1', listkeys(extensionResourceId(format('/subscriptions/{0}/resourceGroups/{1}', split(coalesce(tryGet(parameters('properties'), 'eventGridTopicResourceId'), '//'), '/')[2], split(coalesce(tryGet(parameters('properties'), 'eventGridTopicResourceId'), '////'), '/')[4]), 'Microsoft.EventGrid/topics', last(split(tryGet(parameters('properties'), 'eventGridTopicResourceId'), '/'))), '2022-06-15').key1, 'accessKey2', listkeys(extensionResourceId(format('/subscriptions/{0}/resourceGroups/{1}', split(coalesce(tryGet(parameters('properties'), 'eventGridTopicResourceId'), '//'), '/')[2], split(coalesce(tryGet(parameters('properties'), 'eventGridTopicResourceId'), '////'), '/')[4]), 'Microsoft.EventGrid/topics', last(split(tryGet(parameters('properties'), 'eventGridTopicResourceId'), '/'))), '2022-06-15').key2), createObject()), (...)]",
      "dependsOn": [
        "eventGridTopic" // look at me
      ]
    }

i.e., it has a dependency on that specific resource, but no other existing resource (like the serviceBus). But when I checked other modules that e.g., conditionally reference a KeyVault for CustomerManagedKeys, they also have a dependency, but it does not fail the deployment if the key vault is not configured. Which aligns with your previous statement that ARM should ignore dependencies if they evaluate to false (which I can confirm). Still found it interesting though.

To reproduce the issue, I'd recommend just copying the module and giving the testMe test case a spin with the line TopicEndpoint: eventGridTopic.properties.endpoint commented in. It takes about 5-7 minutes to deploy - and, well, fail.

If you have any guidance, please let me know.

cc: @anthony-c-martin

jeskew commented

@AlexanderSehr I think the error you're running into has the same root cause as the one reported by OP. There's a fix staged that got caught in the holiday code freeze, but it should roll out soon.

This is music to my ears - or rather eyes. Anyways, very much looking forward to the next Bicep release version then :) Once a nightly is available, feel free to give me a ping.

jeskew commented

It's a backend change included in the w3 release.