ministryofjustice/modernisation-platform

Tuning Alerts from Security Hub

Closed this issue · 3 comments

User Story

As an MP Engineer
I want to tune the alerts we are receiving from Security Hub to reduce duplication
So that it's easier to see the important findings from the service

Value / Purpose

As part of ticket #8076, we enabled alerting for Security Hub findings categorised as CRITICAL or HIGH.

A large number of alerts are being sent to the #modernisation-platform-security-hub-alerts Slack channel, and a large number of incidents are being raised in the Security Hub Alerts - Modernisation Platform PagerDuty service.

Issues:

  1. We seem to be getting multiple alerts for the same type of finding. For example, an SSH port open to 0.0.0.0/0 triggers about 4 separate alerts because we have multiple standards applied (AWS Foundational Security Best Practices, CIS AWS Foundations and PCI DSS v3.2.1). Here are some example alerts of this particular issue. This could be fixed by using consolidated control findings.
  2. We are getting repeat alerts because the Config rules associated with the standards run periodically: each re-check after an issue is first raised triggers a follow-up alert. We could consider removing Config rules, or using Security Hub automation rules to update finding statuses.
  3. We are getting the same alerts for rules across all 5 enabled regions of the baseline.

To consider:

  • Do we turn off some of the standards due to the overlap?
  • Do we suppress specific alerts where there is overlap?
  • Do we turn on consolidated control findings?
  • Can we stop receiving alerts from other regions (currently we are alerting from all 5 enabled baseline regions) if cross-region aggregation is already enabled?
  • Should we use Security Hub automation rules to control the statuses of findings and so tame the alerts?

Useful Contacts

No response

Additional Information

No response

Definition of Done

  • Review alerts for overlap/duplication
  • Determine the best ways of suppressing/taming alerts based on the suggestions above
  • Take action on the agreed process
  • Test/Review alerts after making changes for improvement

Findings on Tuning Duplicate Alerts:

Enable Consolidated Control Findings:
By enabling consolidated control findings in the organisation-security account, we can reduce the duplicate alerts triggered by different standards and make the alerts clearer. With this setting, a check that is shared by several enabled standards produces a single finding instead of one finding per standard, which cuts noise and redundancy. The change needs to be coordinated with various teams, and Dave has taken responsibility for discussing it with them to ensure a smooth rollout across our organisation.
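To make the effect concrete, here is a minimal Python sketch of the idea behind consolidation: several per-standard findings for the same check on the same resource collapse into one. The field names (`check`, `resource`, `standard`) and the sample values are hypothetical and are not the real Security Hub finding format; Security Hub does this itself once the setting is on, so this is an illustration only.

```python
from collections import defaultdict

# Hypothetical, trimmed-down findings for illustration only.
# Real Security Hub findings use the AWS Security Finding Format.
findings = [
    {"standard": "AWS Foundational Security Best Practices",
     "check": "ssh-open-to-world", "resource": "sg-example"},
    {"standard": "CIS AWS Foundations",
     "check": "ssh-open-to-world", "resource": "sg-example"},
    {"standard": "PCI DSS v3.2.1",
     "check": "ssh-open-to-world", "resource": "sg-example"},
    {"standard": "AWS Foundational Security Best Practices",
     "check": "bucket-public-access", "resource": "bucket-example"},
]


def consolidate(findings):
    """Collapse findings that share the same check and resource into one entry."""
    grouped = defaultdict(list)
    for finding in findings:
        grouped[(finding["check"], finding["resource"])].append(finding["standard"])
    return [
        {"check": check, "resource": resource, "standards": sorted(standards)}
        for (check, resource), standards in grouped.items()
    ]


if __name__ == "__main__":
    for entry in consolidate(findings):
        print(entry["check"], entry["resource"], len(entry["standards"]))
```

In this sample, the three SSH findings become a single entry that lists all three standards, which is the reduction in alert volume we are after.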

Security Hub in All Regions:
Although our infrastructure is not currently deployed in all regions, we are opting to keep AWS Security Hub enabled in the regions we do not use. This ensures that we still receive alerts if new infrastructure is created in a currently unused region, particularly if those resources introduce security vulnerabilities. This proactive setup will alert us to potential issues immediately, even in regions with minimal or no active infrastructure.

Suppression of Duplicate Findings:
Currently, there is no direct way to suppress duplicate findings within AWS, which contributes to alert noise. AWS has acknowledged this limitation and is actively working on a feature to address it. However, until AWS releases this functionality, we will need to manage duplicate findings as best we can with the existing configurations.

Automation Rules Limitations:
We explored using automation rules to reduce alert volume by adjusting severity levels. However, we found that automation rules only let us change a finding's severity level; they do not let us change the workflow status once an alert has been delivered to Slack. This means we cannot automatically manage alerts after the initial notification through automation rules alone.

Workflow Status to Slack:
We considered using a Lambda function to automatically change the workflow status to NOTIFIED after sending alerts to Slack. However, we've decided against this approach. Receiving these alerts daily in Slack without changing their status lets the team monitor security alerts consistently. This daily feed ensures visibility, allowing us to stay on top of new and existing issues as they arise.
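For reference, the approach we decided against could look roughly like the sketch below. It assumes the caller passes in a Security Hub client (for example `boto3.client("securityhub")`) and a list of findings that each carry `Id` and `ProductArn`. The function name `mark_notified` and the batch size of 100 are assumptions for illustration, not code from our Lambda.

```python
def mark_notified(securityhub_client, findings, batch_size=100):
    """Set the workflow status of the given findings to NOTIFIED.

    securityhub_client is expected to expose batch_update_findings, as the
    boto3 Security Hub client does. batch_size is an assumed per-call limit.
    """
    identifiers = [
        {"Id": finding["Id"], "ProductArn": finding["ProductArn"]}
        for finding in findings
    ]
    responses = []
    # Send the identifiers in chunks so one call never carries too many.
    for start in range(0, len(identifiers), batch_size):
        responses.append(
            securityhub_client.batch_update_findings(
                FindingIdentifiers=identifiers[start:start + batch_size],
                Workflow={"Status": "NOTIFIED"},
            )
        )
    return responses
```

Because findings marked NOTIFIED would no longer look new, a filter on workflow status would stop them re-alerting, which is exactly the daily visibility we chose to keep.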

EventBridge and SNS Adjustment:
To further reduce duplicate alerts, I have removed the EventBridge and SNS configurations in all regions except eu-west-2, as all findings are already being centralised there. Previously, having EventBridge and SNS configured in multiple regions caused duplicate alerts, because the same finding was reported from both eu-west-2 and its local region. With this change, we now receive findings from a single, centralised source, significantly reducing redundant alerts.

Severity Filtering Update:
Previously, we received alerts with both high and critical severity. We have now adjusted this to receive only critical alerts. This prioritization allows us to focus on the most urgent issues first, as the shorter list makes monitoring more manageable. Once we address the critical alerts, we plan to expand our monitoring to include high-severity alerts as well.
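As an illustration of the severity filter, the sketch below shows an EventBridge event pattern that matches only CRITICAL Security Hub findings, plus a simplified local check of which events it would let through. The pattern shape follows the standard Security Hub event format, but the exact pattern used in the baseline module may differ, and `would_alert` is a hypothetical helper that only approximates EventBridge matching for these three fields.

```python
import json

# Event pattern that matches imported Security Hub findings with CRITICAL severity.
# Adding "HIGH" to the Label list would restore the previous behaviour.
CRITICAL_ONLY_PATTERN = {
    "source": ["aws.securityhub"],
    "detail-type": ["Security Hub Findings - Imported"],
    "detail": {"findings": {"Severity": {"Label": ["CRITICAL"]}}},
}


def would_alert(event, pattern=CRITICAL_ONLY_PATTERN):
    """Simplified local approximation of how the pattern filters an event."""
    allowed_labels = pattern["detail"]["findings"]["Severity"]["Label"]
    labels = [
        finding.get("Severity", {}).get("Label")
        for finding in event.get("detail", {}).get("findings", [])
    ]
    return (
        event.get("source") in pattern["source"]
        and event.get("detail-type") in pattern["detail-type"]
        and any(label in allowed_labels for label in labels)
    )


if __name__ == "__main__":
    # The JSON form is what an EventBridge rule's event pattern would contain.
    print(json.dumps(CRITICAL_ONLY_PATTERN, indent=2))
```

Keeping the severity list in one place means expanding to HIGH later is a one-line change.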

I updated the baseline module to restrict the Security Hub alerting resources to eu-west-2 only, released a new tag, and updated the module reference in the MP repository. This ensures all alerts are centralised in eu-west-2, reducing duplicates across regions.

ministryofjustice/modernisation-platform-terraform-baselines#650
#8425

I've reviewed this and am happy it's complete.