Data-Drift

Data Drift Analysis and Anomaly detection tools

INTRODUCTION

This repository collects univariate and multivariate methods for anomaly and drift detection, tested on two datasets: KDDCUP99 and Bike Sharing.

KDDCUP99

A dataset commonly used as a benchmark for anomaly detection. It contains features of computer network connections used to distinguish possible cyber attacks from normal traffic. A more detailed description can be found here.

BIKE DATASET

A dataset commonly used as a baseline in regression and time series forecasting tasks. It contains geographic and meteorological information for forecasting the hourly number of rented bikes. A more detailed description can be found here.

METHODOLOGY

  • Univariate: Population Stability Index (PSI), Kolmogorov-Smirnov test
  • Multivariate: PCA Reconstruction, VAE Reconstruction, MC-Dropout
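
As an illustration of the two univariate methods listed above, here is a minimal sketch that is not taken from the notebooks in this repository. It uses only the Python standard library and synthetic Gaussian data. The function names `psi` and `ks_statistic`, the quantile-based binning, the number of bins, and the `eps` smoothing are all assumptions made for the example. In practice, `scipy.stats.ks_2samp` also returns a p-value for the Kolmogorov-Smirnov test.

```python
import bisect
import math
import random


def psi(reference, current, n_bins=10, eps=1e-6):
    """Population Stability Index between two 1-D samples.

    Assumption: bin edges are quantiles of the reference sample
    (equal-width bins are another common choice).
    """
    ref_sorted = sorted(reference)
    # Interior edges at the 1/n_bins, 2/n_bins, ... quantiles of the reference.
    edges = [ref_sorted[(len(ref_sorted) * i) // n_bins] for i in range(1, n_bins)]

    def bin_fractions(sample):
        counts = [0] * n_bins
        for x in sample:
            counts[bisect.bisect_right(edges, x)] += 1
        # eps avoids log(0) and division by zero for empty bins.
        return [max(c / len(sample), eps) for c in counts]

    p = bin_fractions(reference)
    q = bin_fractions(current)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


def ks_statistic(reference, current):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between the two ECDFs."""
    ref_sorted = sorted(reference)
    cur_sorted = sorted(current)
    d = 0.0
    # The maximum gap between two step functions occurs at an observed value.
    for x in ref_sorted + cur_sorted:
        f_ref = bisect.bisect_right(ref_sorted, x) / len(ref_sorted)
        f_cur = bisect.bisect_right(cur_sorted, x) / len(cur_sorted)
        d = max(d, abs(f_ref - f_cur))
    return d


# Synthetic data (hypothetical, for illustration only).
rng = random.Random(0)
reference = [rng.gauss(0.0, 1.0) for _ in range(2000)]
same = [rng.gauss(0.0, 1.0) for _ in range(2000)]      # same distribution
shifted = [rng.gauss(1.0, 1.0) for _ in range(2000)]   # mean shifted by 1

psi_same, psi_shifted = psi(reference, same), psi(reference, shifted)
ks_same, ks_shifted = ks_statistic(reference, same), ks_statistic(reference, shifted)

print(f"PSI  no drift: {psi_same:.3f}   drift: {psi_shifted:.3f}")
print(f"KS   no drift: {ks_same:.3f}   drift: {ks_shifted:.3f}")
```

Both scores stay close to zero when the two samples come from the same distribution and grow when the current sample is shifted. Applied per feature, they give a univariate drift check for each column of a dataset.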

CONTRIBUTION

Lorenzo Loschi