COAD

Combination Optimization Anomaly Detection

Primary language: Python. License: Apache License 2.0 (Apache-2.0).

The dataset and code implementation of COAD

Our research paper has been published at the APSEC 2023 conference!

Here is the link: https://conf.researchr.org/details/apsec-2023/apsec-2023-technical-track/39/Effective-Anomaly-Detection-for-Microservice-Systems-with-Real-Time-Feature-Selection

The data folder

The folder data contains the dataset produced by hipstershop, a real deployed microservice system.

The data come from three testbeds: dataset1, dataset2, and dataset3.

Each testbed's data consists of several days of multivariate time series (for example, instance1 is one day's data).

Each day's data folder contains three files: ground_truth.csv, test_df.csv, and train_df.csv.

ground_truth.csv records the times at which the true faults occurred; train_df.csv contains normal-state data for training anomaly detection models; test_df.csv contains the data on which anomalies are detected (i.e., it includes the injected faults, which are the same faults listed in ground_truth.csv).
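As a minimal sketch of how one day's data could be read, the snippet below loads the three files with Python's standard csv module. The function name load_instance and the path layout data/<dataset>/<instance>/ are assumptions made for illustration; only the three file names come from this README, and the column names are whatever each file's header row provides.

```python
import csv
from pathlib import Path

# File names are taken from this README; everything else is an assumption.
FILES = ("ground_truth.csv", "train_df.csv", "test_df.csv")


def load_instance(data_root, dataset, instance):
    """Read the three CSV files of one day's folder into lists of row dicts.

    Hypothetical helper: assumes the layout <data_root>/<dataset>/<instance>/.
    """
    folder = Path(data_root) / dataset / instance
    tables = {}
    for name in FILES:
        with open(folder / name, newline="") as handle:
            # DictReader uses the first row of each file as the column names.
            tables[name] = list(csv.DictReader(handle))
    return tables
```

Under these assumptions, load_instance("data", "dataset1", "instance1") would return a dictionary keyed by file name, with every cell kept as a string; converting timestamps or metric values to numeric types is left to the caller.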

The functions folder

The main implementation of COAD is in the folder detection; the helper functions are in the folder utils.

The result and result_all folders

The result folder is where the output of the anomaly detection is written. The result_all folder is an archive of past results: we ran the experiment several times, and the earlier results were moved from result to result_all.

The script folder

This folder is as important as the folder functions, because it contains the entry points for COAD's experiments.

The folder analyze provides the script for analyzing the anomaly detection results (for example, outputting the M-value or R-value of a result).

The folder detection provides the script for running anomaly detection on one day's data. You can choose between COAD and the original anomaly detection algorithm, select the testbed and the day to run on, and pick the anomaly detection algorithm and the metaheuristic algorithm.

The folder plan records the experiment's plan.