/vm-start-stop

This repository contains an example solution for turning VMs on and off automatically. Such a solution is often built from scratch in many organisations, so this repository provides an example implementation.

Primary language: C#. License: MIT.

Introduction

This repository is part of the '100 commitów' challenge, in which I will work on a project for 100 days. The project aims to provide a simple solution for starting and stopping virtual machines in Azure based on a schedule. The schedule will be defined in resource tags.

Motivation

We can save money by turning virtual machines on/off in Azure. We can turn off VMs during the night, weekends, or holidays. We can also turn on VMs during working hours. This solution can be helpful for development, testing, and staging environments.

Example calculation

The cost of the Azure environment can be calculated with Azure Pricing Calculator. Of course, the end cost depends on many factors, like networking configuration, storage, etc.

Let's assume that we have a VM with the following configuration:

Do some calculations:

  • Running VM cost: €147.37/month
  • If we run the VM only 10 hours per day, 7 days per week (300h per month), the cost will be €60.56/month, which saves €86.81/month (58.9%).
  • If we run the VM only 10 hours per day, 5 days per week (220h per month), the cost will be €44.01/month, which saves €103.36/month (70%).
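
The savings above are simple arithmetic on the monthly prices. A minimal Python sketch of that calculation follows; the function name `monthly_savings` is hypothetical, and the prices are the ones quoted in the list, not values fetched from the Azure Pricing Calculator.

```python
def monthly_savings(full_cost: float, reduced_cost: float) -> tuple[float, float]:
    """Return (absolute saving, saving as a percentage of the full cost)."""
    saved = full_cost - reduced_cost
    return round(saved, 2), round(saved / full_cost * 100, 1)


always_on = 147.37  # EUR/month, VM running all the time (from the list above)
scenarios = [
    ("10 h/day, 7 days/week", 60.56),
    ("10 h/day, 5 days/week", 44.01),
]
for label, cost in scenarios:
    saved, percent = monthly_savings(always_on, cost)
    print(f"{label}: save EUR {saved:.2f}/month ({percent}%)")
```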

Requirements

The solution needs to turn VMs on and off based on a schedule defined in tags at the VM or resource group (RG) level. A tag at the RG level means that all VMs in the RG should be turned on/off simultaneously.

  • We want to avoid administrative overhead, such as secret management/rotation, manual authentication, etc.
  • The solution should be monitored to see if it is working. If any problem appears, we want to be notified:
    • Email notification
    • Slack/Discord notification
  • A single solution should cover different subscriptions.
  • The solution should work in 15-minute time frames.

Existing options

We have two options to achieve this goal with the existing Azure services:

Design

Miro board

High-level components

Process design

Infrastructure design

Diagram test on AzureDiagrams.com

On the 5th day of the challenge, I created a diagram of the infrastructure on AzureDiagrams.com. The diagram is available at the following link: VM Start/Stop diagram on AzureDiagrams.com

Infrastructure design on AzureDiagrams.com

The tool is handy, but it has some limitations. I will use it for the initial design, but I will use another tool for the final version.

Diagram on draw.io

The raw version of the diagram is stored in the file: docs/assets/vm-start-stop.drawio.

Infrastructure design on draw.io

Access

To allow the function app to start and stop VMs, the service principal (or managed identity, as in our case) should have the following roles:

  • Reader,
  • Virtual Machine Contributor.

With these roles, the function app can read subscriptions and resource groups, look for VMs, and start/stop them. To avoid assigning the roles manually in every subscription, it is recommended to assign them at the management group level.

Resource naming convention

It is an excellent practice to have a naming convention for resources. The naming convention can help identify the resource's purpose, environment, owner, etc. It supports proper resource management and cost allocation and helps to determine the resources in the logs, monitor them, etc.

Useful links:

Azure Naming Tool

Installation instructions for the tool can be found in its documentation. The preferred way is to run it as a Docker container.

The global configuration file for the Azure Naming Tool is placed here: src/naming-convention/globalconfig.json.

The components configuration file for the Azure Naming Tool is placed here: src/naming-convention/componentsconfig.json.

Deployment

Deployments into Azure are done with PowerShell commands and scripts stored in the src/scripts folder.

The GitHub Actions workflow is authorized with identity federation, as described in the documentation: Quickstart: Deploy Bicep files by using GitHub Actions

The deployment workflows are using:

The workflow environments were configured separately to deploy resources into the production and development environments. It was also necessary to allow connections based on federated identity from different branches. More information: Configure a federated identity credential on an app: GitHub Actions (point 4).

The workflow also uses conditional deployments to ensure that resources are deployed to the production environment only from the main branch. More information: Using conditions to control job execution.

Splatting

Splatting is a PowerShell syntax that simplifies calls to cmdlets with a large number of parameters by passing them as a single collection.

about_Splatting

Approved verbs

Approved Verbs for PowerShell Commands

Role assignments

Role assignments with Bicep are described in the documentation: Create Azure RBAC resources by using Bicep.

The principalId value is an object ID of the Enterprise Application related to the service principal.

Deployment mode

In 'Complete' mode, the deployment manages all resources at the resource group level: resources that exist in the resource group but are not defined in the template are deleted. More information:

WhatIf deployment check

To preview the changes before the actual deployment, the WhatIf parameter can be used. More information: ARM template deployment what-if operation

Architecture Decision Records

The Architecture Decision Records (ADR) will keep a history of architectural changes. More information about ADR can be found:

Template of ADRs: Decision record template by Michael Nygard.

[ADR001] Azure Functions Consumption plan on Windows (2024-03-24)

Status

ACCEPTED

Context

During the function app deployment, there was an error:

Requested features are not supported in region. Please try another region. (Target: /subscriptions/88a99f8e-abc3-4f87-b5d1-6582ecf72501/resourceGroups/eit-vms-plc-dev-rg-1/providers/Microsoft.Web/serverfarms/eit-vms-plc-dev-plan-1)

After checking the Products available by region documentation, it was found that the Azure Functions Consumption plan on Linux is not available in the Poland Central region.

Decision

The Azure Functions Consumption plan on Windows will be deployed in the Poland Central region.

Consequences

Code that runs on other platforms (like Linux) needs to be double-checked to confirm that it also works on Windows.

[ADR002] Removal of the App Configuration service (2024-03-26)

Status

ACCEPTED

Context

During the function app deployment, there was an error:

10:14:29 - The deployment 'vms_20240325101322' failed with error(s).
     | Showing 1 out of 1 error(s). Status Message: The subscription has
     | reached its limit of 'configurationStores' resources with the 'free'
     | SKU. (Code:SkuPerSubscriptionLimitReached)  CorrelationId:
     | beefdcbe-5a3c-489b-bb14-a9a3795a2673

The error message indicates that the subscription has reached its limit of 'configurationStores' resources with the 'free' SKU. The App Configuration service was added to store the configuration data, and according to the documentation, each subscription is limited to one free configuration store. Source: Which App Configuration tier should I use?

The App Configuration service in the Standard tier costs about 33 Euros per month.

Decision

As the App Configuration service is not required for the current solution and the application's configuration can be handled on the function app level, it will be removed.

Consequences

The application's configuration has to be handled on the function app level.

Tag design

The tag can be defined at the subscription, resource group, or VM level. The tag key is the same for all levels.

The tag key is: VM-START-STOP-SCHEDULE.

The tag value will be created in the following way:

<ON/OFF>;HH:MM-HH:MM;<TIMEZONE>;<MONDAY/TUESDAY/WEDNESDAY/THURSDAY/FRIDAY/SATURDAY/SUNDAY/WORKWEEK/WEEK/WEEKEND>

The delimiter character is ';'.

The first part defines whether the tag should apply to the scope. The value can be ON or OFF. If the value is ON, the tag should be evaluated. If the value is OFF, the tag should be ignored.
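
The tag format can be illustrated with a short parser. This is a minimal sketch in Python for illustration only, not the repository's C# implementation; the names `ScheduleTag` and `parse_schedule_tag` are hypothetical, and trimming whitespace and accepting lowercase `on`/`off` are assumptions that the format description does not state.

```python
from dataclasses import dataclass

TAG_KEY = "VM-START-STOP-SCHEDULE"


@dataclass(frozen=True)
class ScheduleTag:
    enabled: bool    # first part: ON -> evaluate the tag, OFF -> ignore it
    time_range: str  # second part, e.g. "08:00-18:00"
    timezone: str    # third part, e.g. "Europe/Warsaw"
    days: str        # fourth part, e.g. "WORKWEEK"


def parse_schedule_tag(value: str) -> ScheduleTag:
    """Split a tag value on ';' and validate the ON/OFF switch."""
    parts = [part.strip() for part in value.split(";")]
    if len(parts) != 4:
        raise ValueError(f"expected 4 parts separated by ';', got {len(parts)}")
    state, time_range, timezone, days = parts
    if state.upper() not in ("ON", "OFF"):
        raise ValueError(f"first part must be ON or OFF, got {state!r}")
    return ScheduleTag(state.upper() == "ON", time_range, timezone, days)


tag = parse_schedule_tag("ON;08:00-18:00;Europe/Warsaw;WORKWEEK")
print(TAG_KEY, "->", tag)
```

A caller would skip any scope whose parsed tag has `enabled` set to `False`.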

The second part defines the time range during which the VM should be turned on. The time range is given in the format HH:MM-HH:MM, using the 24-hour clock. The - character separates the start time from the end time.
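
A minimal Python sketch of checking a time against such a range is shown below. The function names are hypothetical. Two behaviours are assumptions, because the format description does not cover them: the end time is treated as exclusive, and a range whose start is later than its end (for example 22:00-06:00) is treated as wrapping past midnight.

```python
from datetime import time


def parse_time_range(value: str) -> tuple[time, time]:
    """Parse 'HH:MM-HH:MM' into (start, end) times."""
    start_text, end_text = value.split("-")
    return time.fromisoformat(start_text), time.fromisoformat(end_text)


def is_within(now: time, start: time, end: time) -> bool:
    """Return True if 'now' falls inside the range (end exclusive, by assumption)."""
    if start <= end:
        return start <= now < end
    # Assumption: start > end means the range wraps past midnight.
    return now >= start or now < end


start, end = parse_time_range("08:00-18:00")
print(is_within(time(9, 30), start, end))
print(is_within(time(18, 0), start, end))
```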

The third part defines the time zone. It should be given as an IANA time zone name (see the IANA time zone database). Daylight saving time should be applied automatically.
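
The automatic daylight saving adjustment can be illustrated with Python's standard `zoneinfo` module, which reads the IANA database. This is only a sketch of the idea; the helper name `local_time` and the choice of Europe/Warsaw as the example zone are assumptions, and the repository's C# code may resolve time zones differently.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def local_time(utc_now: datetime, tz_name: str) -> datetime:
    """Convert an aware UTC datetime to the given IANA time zone."""
    return utc_now.astimezone(ZoneInfo(tz_name))


# The same UTC hour maps to different local hours in winter and summer.
winter = datetime(2024, 1, 15, 12, 0, tzinfo=timezone.utc)
summer = datetime(2024, 7, 15, 12, 0, tzinfo=timezone.utc)
print(local_time(winter, "Europe/Warsaw").strftime("%H:%M"))  # 13:00 (UTC+1)
print(local_time(summer, "Europe/Warsaw").strftime("%H:%M"))  # 14:00 (UTC+2)
```

Comparing the schedule against the converted local time means the tag value does not need to change when the clocks change.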

The fourth part defines the days when the VM should be turned on. The value can be: MONDAY/TUESDAY/WEDNESDAY/THURSDAY/FRIDAY/SATURDAY/SUNDAY/WORKWEEK/WEEK/WEEKEND. Multiple values can be combined with commas, for example: WEEKEND,MONDAY,TUESDAY.
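
The day values can be expanded into a set of weekdays before the schedule is evaluated. The Python sketch below is illustrative; `expand_days` is a hypothetical name, and it assumes that WORKWEEK means Monday to Friday, WEEKEND means Saturday and Sunday, WEEK means all seven days, and that THURSDAY is a valid value alongside the other days.

```python
DAY_NAMES = ["MONDAY", "TUESDAY", "WEDNESDAY", "THURSDAY",
             "FRIDAY", "SATURDAY", "SUNDAY"]

# Assumed meaning of the group values.
GROUPS = {
    "WORKWEEK": DAY_NAMES[:5],
    "WEEKEND": DAY_NAMES[5:],
    "WEEK": DAY_NAMES,
}


def expand_days(value: str) -> set[int]:
    """Return weekday numbers (Monday=0 ... Sunday=6, as in datetime.weekday())."""
    result = set()
    for token in value.split(","):
        token = token.strip().upper()
        for name in GROUPS.get(token, [token]):
            if name not in DAY_NAMES:
                raise ValueError(f"unknown day value: {name!r}")
            result.add(DAY_NAMES.index(name))
    return result


print(sorted(expand_days("WEEKEND,MONDAY,TUESDAY")))  # [0, 1, 5, 6]
```

The resulting set can be compared directly with `datetime.weekday()` of the current local time.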

Easter eggs

There is an Easter egg for you. But this is about resting, spending time with your family, and taking care of yourself. If you are thinking about how to be a better developer, engineer, and so on, remember that you need to take care of yourself first. So, take a break, go outside, and enjoy the time with your family and friends.

Useful materials

Links

Time zone conversion links

Tools