aecid-alert-aggregation

A method for grouping, clustering, and merging semi-structured alerts.

Please note that the following dependencies must be installed to run the code:

pip3 install cdifflib
pip3 install editdistance

To get started, just clone this repository and execute

python3 aggregate.py

to run the aecid-alert-aggregation with the default input files and configurations. To change the configuration, edit the aggregate_config.py file. The input files provided in this repository are alerts generated by aminer and Wazuh IDS that were used to analyze the AIT-LDSv1.1.

When running the python script, the current status of the aggregation is printed on console. In its standard configuration, the script runs for several minutes and then outputs the generated meta-alerts in the directory specified in the configuration file.

The directory 'samples' contains several examples that are useful for understanding the aggregation technique. The samples include:

sample_similarity.py similarities of sample alerts
sample_group_similarity.py similarities of sample alert groups
sample_merge.py aggregation of sample alerts
sample_group_merge.py aggregation of sample alert groups
sample_hierarchical_clustering.py execution of the hierarchical clustering method on sample data
sample.py execution of incremental meta-alert generation on sample data (corresponds to scenario 2 in paper)

The directory 'evaluation' contains several scripts that measure the performance of the approach. Note that the respective configurations are inside the scripts instead of the aggregate_config.py file. Evaluation scripts include:

mds.py generates a similarity matrix for multi-dimensional scaling
hierarchical_clustering.py generates an R script for plotting a dendrogram
evaluate.py uses unsupervised clustering for meta-alert generation
cross_validation.py uses supervised training for alert classification
noise_evaluate.py measures the robustness of the approach

Copy any of the sample and evaluation scripts into the main directory to execute it, e.g.:

cp samples/sample.py ./sample.py
python3 sample.py

The output will be generated on the console or in the respective directory in data/out.

More information on the aecid-alert-aggregation is provided in the following paper:

Landauer M., Skopik F., Wurzenberger M., Rauber A.: Dealing with Security Alert Flooding: Using Machine Learning for Domain-independent Alert Aggregation. ACM Transactions on Privacy and Security, 25(3), 1-36. [PDF]

AbdelliNasredine/aecid-alert-aggregation

aecid-alert-aggregation