This Ansible project provides a framework for automating Windows system recovery, with a particular focus on critical scenarios such as Blue Screen of Death (BSoD) incidents. Inspired by real-world challenges like the 2024 CrowdStrike incident, it demonstrates how to use Red Hat Ansible Automation Platform (AAP) for rapid, scalable recovery across diverse Windows environments. Although it was conceived for BSoD scenarios, the principles and techniques shown here can be adapted to a wide range of Windows recovery needs, offering a flexible, automated way to maintain system health and minimize downtime on both VMware and OpenShift Virtualization.
The project consists of several playbooks and roles that allow you to:
- Generate a Windows Preinstallation Environment (WinPE) ISO
- Upload the WinPE ISO to your virtualization platform
- Simulate a BSoD scenario
- Boot a problematic VM into WinPE
- Apply fixes to recover from the BSoD
Listen to an AI-generated discussion about this project:
Automating.Windows.Recovery.with.AAP.-.Pod.Cast.mp4
Note: This podcast is AI-generated based on the project's README and other related content. It's intended as an experimental, supplementary way to learn about the project. Please treat it as a fun, unofficial overview rather than a definitive technical resource.
We'd love to hear your thoughts on this AI-generated content! Join the discussion in our GitHub Discussions section.
- Did you find the podcast helpful or engaging?
- What aspects of the project would you like to hear discussed in more detail?
- How can we improve our project communication?
The following diagram illustrates the high-level architecture and workflow of the BSoD recovery process:
- Clone this repository to your local machine or Ansible control node.
- Ensure you have Ansible installed (version 2.9 or higher recommended).
- Install required collections:
ansible-galaxy collection install -r collections/requirements.yml
- Install required roles:
ansible-galaxy role install -r roles/requirements.yml
- Modify the group_vars/all.yml file to match your environment settings.
- Choose the appropriate playbook for your scenario and run it using ansible-playbook.
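For orientation, a collections requirements file of the kind installed in the steps above might look like the following sketch. The collection names are assumptions chosen to match the platforms this README mentions (Windows, VMware, OpenShift Virtualization); the repository's actual collections/requirements.yml is the authoritative list and may differ.

```yaml
---
# Illustrative sketch only, not the project's actual collections/requirements.yml.
# Every collection listed here is an assumption based on the platforms named
# in this README.
collections:
  - name: ansible.windows     # assumption: modules for managing Windows hosts
  - name: community.vmware    # assumption: modules for the VMware platform
  - name: kubevirt.core       # assumption: modules for OpenShift Virtualization
```

A file in this format is what `ansible-galaxy collection install -r` reads, so adding or pinning a collection is a matter of editing the list and re-running the install command.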
- generate_winpe.yml: Creates a custom WinPE ISO with recovery scripts. This is where the specific fix (e.g., for CrowdStrike-like issues or other BSoD scenarios) is embedded.
- upload_winpe_iso.yml: Uploads the generated WinPE ISO to the virtualization platform.
- produce_bsod.yml: Triggers a simulated BSoD on target Windows VMs.
- execute_winpe_recovery.yml: Boots the VM into WinPE and executes the embedded recovery script.
- check_system.yml: Performs a health check on the Windows systems after recovery.
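If you want to run the whole demo end to end from the command line, one option is a small wrapper playbook that chains the playbooks listed above. This is a minimal sketch, not a file shipped with the project: the wrapper's file name is hypothetical, and it assumes the five playbooks sit in the same directory as the wrapper and share one inventory.

```yaml
---
# run_full_demo.yml (hypothetical file name, not part of the repository)
# Chains the project's playbooks in the order the demo describes.
# Assumption: this file lives next to the playbooks it imports.

- name: Build the WinPE ISO with the embedded recovery script
  ansible.builtin.import_playbook: generate_winpe.yml

- name: Upload the ISO to the virtualization platform
  ansible.builtin.import_playbook: upload_winpe_iso.yml

- name: Simulate a BSoD on the target VMs
  ansible.builtin.import_playbook: produce_bsod.yml

- name: Boot into WinPE and run the recovery script
  ansible.builtin.import_playbook: execute_winpe_recovery.yml

- name: Verify the systems are healthy again
  ansible.builtin.import_playbook: check_system.yml
```

Because `import_playbook` is resolved when the wrapper is parsed, a missing or misnamed playbook fails immediately rather than halfway through a recovery run.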
This project was inspired by the 2024 CrowdStrike incident, where a widespread BSoD issue affected numerous Windows systems. However, the project's scope extends beyond this single incident, providing a framework for automating recovery from various BSoD scenarios.
The workflow is designed as follows:
- WinPE Generation: The generate_winpe.yml playbook creates a custom WinPE ISO. This ISO includes specific scripts tailored to address particular BSoD scenarios, such as the CrowdStrike-like issue or other common BSoD causes. The recovery logic is embedded within this ISO.
- Recovery Execution: The execute_winpe_recovery.yml playbook is a generic orchestrator. It's responsible for:
  - Booting the affected VM into the custom WinPE environment
  - Initiating the embedded recovery script
  - Monitoring the recovery process
  - Handling the VM reboot after successful recovery
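To make the orchestration steps above more concrete, here is a rough sketch of how they could be expressed for VMware with modules from the community.vmware collection. It is not the project's actual execute_winpe_recovery.yml. All variable names and values are hypothetical, vCenter connection details are assumed to come from the VMWARE_HOST, VMWARE_USER, and VMWARE_PASSWORD environment variables, the VM is assumed to already have a CD-ROM device at IDE 0:0, the recovery script is assumed to start on its own when WinPE boots, and the fixed pause is only a stand-in for real monitoring.

```yaml
---
- name: Boot a VM into WinPE and return it to normal boot (illustrative sketch)
  hosts: localhost
  connection: local
  gather_facts: false
  vars:
    # Hypothetical placeholders; replace with values from your environment.
    vm_name: win-demo-01
    vm_datacenter: DC1
    winpe_iso_path: "[datastore1] iso/winpe_recovery.iso"
    recovery_wait_seconds: 300
  tasks:
    - name: Attach the WinPE ISO to the existing CD-ROM device
      community.vmware.vmware_guest:
        datacenter: "{{ vm_datacenter }}"
        name: "{{ vm_name }}"
        cdrom:
          - controller_number: 0
            unit_number: 0
            type: iso
            iso_path: "{{ winpe_iso_path }}"
            state: present

    - name: Put the CD-ROM first in the boot order
      community.vmware.vmware_guest_boot_manager:
        name: "{{ vm_name }}"
        boot_order:
          - cdrom
          - disk

    - name: Hard-reset the VM so it boots into WinPE
      community.vmware.vmware_guest_powerstate:
        name: "{{ vm_name }}"
        state: restarted

    - name: Wait while the embedded script runs (stand-in for monitoring)
      ansible.builtin.pause:
        seconds: "{{ recovery_wait_seconds }}"

    - name: Restore disk-first boot order
      community.vmware.vmware_guest_boot_manager:
        name: "{{ vm_name }}"
        boot_order:
          - disk
          - cdrom

    - name: Reset the VM so it boots back into Windows
      community.vmware.vmware_guest_powerstate:
        name: "{{ vm_name }}"
        state: restarted
```

Note that nothing in this sketch refers to a particular fix. That is the point of the design: the fix lives in the ISO, so the same sequence applies regardless of which BSoD scenario the ISO was built for.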
This design allows for great flexibility:
- To address different BSoD scenarios, you only need to modify the recovery script embedded in the WinPE ISO during the generation phase.
- The execution playbook remains consistent across various BSoD types, providing a unified interface for recovery operations.
By separating the specific recovery logic (in the WinPE ISO) from the execution process, this project offers a scalable and adaptable solution for automated Windows system recovery.
To import this project into AAP:
- Ensure you have access to an AAP instance.
- Modify the setup_demo.yml playbook to match your AAP environment settings.
- Run the setup_demo.yml playbook. This playbook will:
  - Create necessary inventories
  - Set up job templates
  - Configure projects
  - Set up workflows
- Once imported, you can access and run the workflow templates from the AAP web interface.
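As an illustration of the kind of objects such a setup playbook creates, the sketch below uses modules from the ansible.controller collection. It is not the contents of setup_demo.yml. The object names, the organization, and the repository URL are hypothetical placeholders, and controller credentials are assumed to be supplied through the CONTROLLER_HOST, CONTROLLER_USERNAME, and CONTROLLER_PASSWORD environment variables.

```yaml
---
- name: Register demo objects in automation controller (illustrative sketch)
  hosts: localhost
  connection: local
  gather_facts: false
  vars:
    # Hypothetical placeholders; adjust to your AAP environment.
    demo_org: Default
    demo_repo_url: "https://example.com/your-fork.git"
    demo_project: Windows Recovery Demo
    demo_inventory: Windows Recovery Inventory
  tasks:
    - name: Create the project that points at the repository
      ansible.controller.project:
        name: "{{ demo_project }}"
        organization: "{{ demo_org }}"
        scm_type: git
        scm_url: "{{ demo_repo_url }}"
        state: present

    - name: Create the inventory
      ansible.controller.inventory:
        name: "{{ demo_inventory }}"
        organization: "{{ demo_org }}"
        state: present

    - name: Create a job template for the recovery playbook
      ansible.controller.job_template:
        name: Execute WinPE Recovery
        job_type: run
        organization: "{{ demo_org }}"
        inventory: "{{ demo_inventory }}"
        project: "{{ demo_project }}"
        playbook: execute_winpe_recovery.yml
        state: present
```

A workflow would then link job templates like this one in sequence; that step is omitted here to keep the sketch short.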
- roles/: Contains custom roles for WinPE creation, VM operations, and BSoD fixes.
- group_vars/: Stores common variables for the project.
- node-config/: Contains node-specific configurations for different scenarios.
- collections/: Lists required Ansible collections.
- aap_config/: Holds AAP-specific configuration files.
Note: This project is for demonstration purposes only and should not be used in production environments without proper testing and validation.