WHY, AZURE???

This is a list of MY SUBJECTIVE impressions of the Microsoft Azure Cloud, NOT those of my employer. The list contains a lot of pain points with cloud offerings that really should work, but most are just broken AF. List items are prefixed with the name of the specific offering they apply to. Some points on this list might have improved by now! (Probably not...)

  • ARM: Code upload for functions is not automatic. It needs to be done manually afterwards, followed by a function restart. This makes it buggy sometimes, because the second restart is more or less ignored in favor of the first one triggered by ARM.
  • ARM: Deleting resources that were removed from a template is only possible when using one Resource Group per template (so it can be fully managed by ARM).
  • ARM: Role assignments for deleted resources are not deleted, resulting in deployment failures because of already existing role assignments. (Somewhere I read that they will be cleaned up at some point, but that does not seem to be true since I still see one that was orphaned maaaany months ago. edit: years ago)
  • ARM: DNS TTL can only be set for all records of a subdomain, not individually per record.
  • PowerApps (PA): Connectors not updated automatically in apps after modifying them.
  • ARM: No good way for automatic function certificates for custom domain names. (Azure/bicep#5006, https://github.com/Azure/bicep/tree/v0.5.6/docs/examples/301/function-app-with-custom-domain-managed-certificate)
  • ARM: Code must be uploaded to blob storage manually, but it is not possible to query the current user to assign the permissions to do it. (https://stackoverflow.com/questions/73150997/roleassignment-with-current-user-id) You have to build a custom solution around that to first query the necessary details and then pass them into the template as a parameter for role assignments.
  • Azure: No email service??? (They are working on one but it seems it still will take years...)
  • Azure: Function Protection with AAD not working if set up with ARM during function creation (need to disable, then re-enable to start blocking requests).
  • ARM: No automatic way to create Application Registration.
  • Azure: When a cron-triggered function is triggered via the web-portal manually, it is possible to get a 503 as result if the function takes long to execute.
  • Azure: Function: KeyVault references are not re-resolved when restarting functions. Trigger a re-resolve by creating a dummy configuration entry.
  • Azure: ARM: Object variables in bicep that evaluate to null will break ARM because null variables are interpreted as missing. Solution: inline the expression in bicep, so it ends up inlined in ARM...
  • Azure: ARM: You can deploy a VM scale set but when you update cloud-init, the VMs are not updated automatically. You will have to find a way. Good luck!
  • Azure: Functions: Premium and Consumption plans in the same ResourceGroup can randomly fail on first deploy (persistent) https://github.com/Azure/Azure-Functions/wiki/Creating-Function-Apps-in-an-existing-Resource-Group
  • Azure: Functions: Functions secured by Microsoft Auth might simply refuse to work for some functions because of IDX10214: Audience validation failed. Audiences: 'api://816584cf-4330-4e5c-be73-f115ccc063e3'. Did not match: validationParameters.ValidAudience: '816584cf-4330-4e5c-be73-f115ccc063e3' or validationParameters.ValidAudiences: 'null'. Normally, the "guardian" of the function should know that the 'api://...' is the URI configured in the app registration, but for some stupid reason, it sometimes does not work whatever you do. The only fix is to add the URI to the valid token audiences in the function config manually.
  • Power Automate: The ALM as designed does not work - https://powerusers.microsoft.com/t5/Building-Flows/Flow-inside-managed-solution-cannot-be-shared-run-only-users/m-p/1611483#M179578
  • Azure: Functions: If you deploy functions via bicep and change the plan from consumption to something else, a plan is created where the target instance count is 1 but no instance is ever actually created. The fix is to manually change the scaling to another target and then back to 1. Then the stuff takes ages (multiple minutes) to apply while cycling through various 5xx errors until it finally works. (Maybe.)
  • Azure: Storage: Migration from LRS to ZRS is generally available. Except... when it's not. Like for some unimportant regions like West Europe...
  • Azure: Functions: Cold-start of consumption is extremely slow and performance generally is neither good nor consistent. Pretty useless for user-facing APIs.
  • Azure: Function: The plan (other than consumption) you choose does not make an obvious difference in performance. It only increases RAM and cores, but that only helps to add more functions into it (since your functions probably only use a single core and not much RAM).
  • Azure: Functions: The whole plan-stuff makes it potentially difficult to share plans across multiple functions and even inconvenient or impossible to share between resource groups using bicep since you have to pass references around (and possibly permissions).
  • Azure: Functions: The always-on feature is a bit weird. Why would you not have it on, since otherwise you could just use the infinitely cheaper consumption plan? (Except if you need to control the number of cores and memory.)
  • Azure: Functions: Even using the always-on feature there is still a cold-start time. Maybe they are being cheap and share the RAM, which results in having to load it back from disk.
  • Azure: CDN: The combination of CDN and ARM is one of the biggest piles of crap in existence. It looks ok at first, but you quickly find out that it is very cumbersome and sometimes impossible to write bicep for CDN because the CDN ARM template makes heavy use of recursion that bicep does not (really) support. The worst part about ARM and CDN is that updating does not work most of the time. Updating, adding, or removing in bicep (and ARM) can result in a successful operation without any work being done. Also, it might complain about missing stuff that it should create itself as stated in the template.
  • Azure: Functions: The token audience of a secured function behaves strangely in general. It should only accept the defined audiences, but by some weird random chance, it sometimes works and sometimes doesn't. This applies to the function as a whole, so the check either works from the moment the function was created or never. (Not per request.)
  • Azure: Functions: Another "quirk" similar (related?) to the token audience is that a protected function is sometimes not callable via the Azure portal. In other words, for two equally configured functions A and B (the resource in Azure, not the function inside a function...) with Entra protection, an HTTP request triggered via the portal can successfully cause an invocation for function A, while for function B you just get 401 Unauthorized without any explanation and no invocation.
  • Azure: Functions: For whatever fucking reason, gzip compression (or any other for that matter) only works on Windows functions, NOT on Linux functions. ??? https://learn.microsoft.com/en-us/answers/questions/353996/gzip-on-azure-functions-v3-not-working
  • Azure: Functions: Logs are not simply taken from stdout. Logging libraries are instrumented to extract the messages via magic. It is the most overengineered and complicated solution, and if it does not work for some reason you are screwed. Also, that means that it only works for frameworks for which the instrumentation was implemented. If you want to use something else, you can't.
  • Azure: SDK (Java): For some stupid reason, the Java SDK is not made with Java (libraries and conventions) in mind but is most likely a port from C#. This results in extremely dumb circumstances with testing where the SDK classes cannot be mocked easily. It is perfectly illustrated in their own article, where they talk about the basics of unit testing at first and then continue to the challenges with certain design decisions. The pinnacle is that the buildup results in testing with their own stuff because you need to work around their "decisions". https://github.com/Azure/azure-sdk-for-java/wiki/Unit-Testing
  • Graph: Calendars: It is unclear what /me/calendars returns. (https://learn.microsoft.com/en-us/answers/questions/230904/shared-calendars-in-calendar-groups-not-showing-up) Also, calendars shared by default for the whole organization cannot be retrieved because they result in a 404. Calendars are only returned if shared explicitly, not implicitly. (But documentation is nowhere to be found.)
  • Power Apps: Using Dataverse tables in Power Apps makes sense because where else would you store your data? It is no problem when you use the default components, but when you want to do something advanced and use a connector... then you find out that the connector for Dataverse is "Premium", which carries over to your PA, and now you have to pay a lot.
  • Azure: Blob storage: If you have previously deployed a blob storage with immutability and the point-in-time restore feature enabled, you are in for a ride. This was possible to deploy years/months ago, but it is now an illegal state. Deploying from scratch does not work, and if you want to change the config it fails with a useless error message that tries to tell you that both are incompatible. Anyways, as soon as you try to change something now, your options are either to disable point-in-time restore (which is most likely what you want to keep) or disable immutability (which is not possible by design). Therefore, it is stuck in an illegal state forever and cannot be updated (or newly deployed) ever again.
  • Azure: ARM: Versioning (like Microsoft.Storage/storageAccounts@2022-09-01) is only useful if it is actually used. For some dumb reason, allowBlobPublicAccess changed its default from true to false at some point. Normal "update" deployments are unaffected, but deployments in a new environment suddenly fail for no reason. ???
  • Azure: Support: is beyond useless. They do not understand any problem you have. Even if you provide a deployed sample app, sample code, step-by-step reproducer, step-by-step video with spoken clarifications, ...
  • Azure: Nitpick: They say that one should not use access keys to share stuff from blob storage. That generally makes sense for the default keys (why do those even exist?) but you need them if you generate short-lived download URLs. Also, there is not much choice when using Functions either, because you somehow need them to access the code.
  • Azure: Functions: Timed triggers only work when the function loads them. It is somehow possible (for a new deployment) that the function's new code is not loaded, so it does not know about the timed trigger and never does anything. Clicking on it in the portal, however, loads the function and makes the trigger start working.
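
The audience mismatch from the IDX10214 item above boils down to a string comparison: the token carries 'api://<client-id>' while the check only knows the bare '<client-id>'. Here is a minimal sketch in Python of what the manual fix amounts to, assuming the Application ID URI follows the 'api://<client-id>' pattern seen in the error message. The names allowed_audiences and audience_is_valid are hypothetical helpers for illustration, not part of any Azure SDK, and the real validation happens inside the platform.

```python
def allowed_audiences(client_id: str) -> list[str]:
    """Return both audience forms a token for this app registration may carry.

    Assumption: the Application ID URI is 'api://<client-id>'.
    """
    return [client_id, f"api://{client_id}"]


def audience_is_valid(token_aud: str, valid_audiences: list[str]) -> bool:
    """Plain membership check, standing in for the platform's audience check."""
    return token_aud in valid_audiences


client_id = "816584cf-4330-4e5c-be73-f115ccc063e3"  # GUID from the error message
token_aud = f"api://{client_id}"  # audience as it appears in the token

# Only the bare client id is configured: the check fails, as in IDX10214.
print(audience_is_valid(token_aud, [client_id]))  # False

# Both forms are configured: the same token passes.
print(audience_is_valid(token_aud, allowed_audiences(client_id)))  # True
```

The list returned by allowed_audiences is what would have to end up in the valid token audiences of the function config.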

Typical conversation in forum: https://learn.microsoft.com/en-us/answers/questions/583467/automate-azure-app-registration-client-secret-rota