The goal of this project is to experiment with a software architecture for detecting anomalies in time series data. The main concerns are:
- timeseries data extraction
- prediction / anomaly detection model development and training
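One common way to turn a prediction model into an anomaly detector is to compare each observation with the value the model predicted for it, and flag the points where the error is large. The sketch below illustrates that idea only. The class name `ResidualAnomalyDetector`, the fixed threshold, and the sample values are assumptions made for illustration; they are not taken from this project, which may use a different criterion.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of this project: flags observations whose
// prediction error exceeds a fixed threshold.
class ResidualAnomalyDetector {

    // Returns the indices i where |observed[i] - predicted[i]| > threshold.
    static List<Integer> detect(double[] observed, double[] predicted, double threshold) {
        if (observed.length != predicted.length) {
            throw new IllegalArgumentException("observed and predicted must have the same length");
        }
        List<Integer> anomalies = new ArrayList<>();
        for (int i = 0; i < observed.length; i++) {
            if (Math.abs(observed[i] - predicted[i]) > threshold) {
                anomalies.add(i);
            }
        }
        return anomalies;
    }

    public static void main(String[] args) {
        // Made-up values: the third observation is far from its prediction.
        double[] observed = {10.0, 11.0, 30.0, 12.0};
        double[] predicted = {10.5, 10.8, 11.2, 11.9};
        System.out.println(detect(observed, predicted, 5.0)); // prints [2]
    }
}
```

In practice the threshold would likely be derived from the error distribution on the training data rather than fixed by hand.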
Here is a list of frameworks that may be of interest:
- Deeplearning4j (Deep Learning for Java): for building the prediction / anomaly detection model
- Spring Cloud Stream : for building a data-driven architecture
- Spring Cloud Data Flow : for creating and orchestrating data pipelines
The dataset used to test the architecture contains Internet traffic data (in bits) from an ISP: aggregated traffic in the United Kingdom academic network backbone. It was collected between 19 November 2004 at 09:30 and 27 January 2005 at 11:11, at five-minute intervals.
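As a quick sanity check when loading the data, the collection period and sampling interval give an upper bound on the number of observations to expect. The sketch below works through that arithmetic. It assumes the first sample is taken exactly at the start time and that no samples are missing; the actual file may contain fewer rows. The class name `SampleCount` is hypothetical.

```java
import java.time.Duration;
import java.time.LocalDateTime;

// Hypothetical helper: counts the samples a gap-free series would contain.
class SampleCount {

    // Number of samples when one is taken at 'start' and then every 'step'
    // for as long as the sample time does not exceed 'end'.
    static long expectedSamples(LocalDateTime start, LocalDateTime end, Duration step) {
        long fullSteps = Duration.between(start, end).toMinutes() / step.toMinutes();
        return fullSteps + 1; // +1 counts the sample at the start instant
    }

    public static void main(String[] args) {
        LocalDateTime start = LocalDateTime.of(2004, 11, 19, 9, 30);
        LocalDateTime end = LocalDateTime.of(2005, 1, 27, 11, 11);
        // 99461 minutes span 19892 full five-minute steps, hence 19893 samples.
        System.out.println(expectedSamples(start, end, Duration.ofMinutes(5))); // prints 19893
    }
}
```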
$ cd anomaly-detection-train
$ mvn spring-boot:run
So that the trained model can be reused, two files are saved after the training process:
- anomaly-detection-network-model_<version>.zip : trained neural network model
- anomaly-detection-data-normalizer_<version> : data normalizer used for data engineering
See the DL4J documentation page "Saving and Loading a Neural Network" for details on saving and loading a neural network.
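The normalizer is saved alongside the network because the statistics computed on the training data must be applied unchanged to every observation at prediction time. The sketch below illustrates that idea with a plain-Java min-max scaler persisted to a properties file. It is a stand-in, not the DL4J normalizer this project saves: the class `MinMaxNormalizer`, the min-max scheme, and the file format are all assumptions made for illustration.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

// Hypothetical stand-in for a data normalizer: scales values to [0, 1]
// using the minimum and maximum seen in the training data.
class MinMaxNormalizer {
    private final double min;
    private final double max;

    MinMaxNormalizer(double min, double max) {
        if (!(max > min)) {
            throw new IllegalArgumentException("max must be greater than min");
        }
        this.min = min;
        this.max = max;
    }

    // Computes the statistics from the training data.
    static MinMaxNormalizer fit(double[] data) {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double v : data) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        return new MinMaxNormalizer(min, max);
    }

    double transform(double value) { return (value - min) / (max - min); }

    double revert(double scaled) { return scaled * (max - min) + min; }

    // Persists the statistics so the prediction side can reuse them.
    void save(Path file) throws IOException {
        Properties p = new Properties();
        p.setProperty("min", Double.toString(min));
        p.setProperty("max", Double.toString(max));
        try (Writer w = Files.newBufferedWriter(file)) {
            p.store(w, "min-max statistics");
        }
    }

    static MinMaxNormalizer load(Path file) throws IOException {
        Properties p = new Properties();
        try (Reader r = Files.newBufferedReader(file)) {
            p.load(r);
        }
        return new MinMaxNormalizer(
                Double.parseDouble(p.getProperty("min")),
                Double.parseDouble(p.getProperty("max")));
    }

    public static void main(String[] args) throws IOException {
        MinMaxNormalizer fitted = fit(new double[]{2.0, 4.0, 10.0});
        Path file = Files.createTempFile("normalizer", ".properties");
        fitted.save(file);
        MinMaxNormalizer restored = load(file);
        System.out.println(restored.transform(4.0)); // prints 0.25
    }
}
```

If the normalizer were refitted on incoming data instead of being restored from disk, the network would receive inputs on a different scale from the one it was trained on.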
Making predictions consists of building a data flow from a source of Internet traffic observations to the prediction service.
The data flow involves three message-driven microservice applications:
- anomaly-detection-source-file, to stream Internet traffic observations
- anomaly-detection-predict, to make predictions based on the streamed Internet traffic observations
- anomaly-detection-sink, to display the predictions
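The three roles above (source, processor, sink) can be sketched in a single process with the standard `java.util.function` interfaces. This is only an illustration of the shape of the flow: in the project each role is a separate application and messages travel through the messaging middleware. The class `PipelineSketch`, the `run` helper, and the threshold rule standing in for the trained model are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical in-process imitation of a source -> processor -> sink flow.
class PipelineSketch {

    // Pulls messages from the source until it returns null, passes each one
    // through the processor, and hands the result to the sink.
    static <I, O> void run(Supplier<I> source, Function<I, O> processor, Consumer<O> sink) {
        I message;
        while ((message = source.get()) != null) {
            sink.accept(processor.apply(message));
        }
    }

    public static void main(String[] args) {
        // Source: made-up traffic observations, emitted one at a time.
        Iterator<Double> observations = List.of(100.0, 110.0, 500.0).iterator();
        Supplier<Double> source = () -> observations.hasNext() ? observations.next() : null;

        // Processor: a placeholder rule standing in for the trained model.
        Function<Double, String> predict =
                bits -> bits > 400.0 ? "ANOMALY " + bits : "NORMAL " + bits;

        // Sink: collects what would be displayed.
        List<String> displayed = new ArrayList<>();
        run(source, predict, displayed::add);

        System.out.println(displayed); // prints [NORMAL 100.0, NORMAL 110.0, ANOMALY 500.0]
    }
}
```

Keeping the three roles independent means any one of them can be replaced, for example swapping the file source for a live feed, without touching the others.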
Kafka, which is used as the messaging middleware, must be installed and running.
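Before starting the applications, it can help to confirm that something is listening on the broker port. The sketch below attempts a plain TCP connection. It assumes a broker on `localhost:9092` (Kafka's default listener port), which may differ from your setup, and it only shows that the port is open, not that the listener is actually Kafka. The class `BrokerCheck` is hypothetical.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: checks whether a TCP port accepts connections.
class BrokerCheck {

    // Returns true if a TCP connection to host:port succeeds within the timeout.
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed address; adjust host and port to match your broker configuration.
        boolean up = isReachable("localhost", 9092, 2000);
        System.out.println(up
                ? "Port 9092 is open on localhost"
                : "Nothing reachable on localhost:9092");
    }
}
```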
$ cd anomaly-detection-predict
$ mvn spring-boot:run
$ cd anomaly-detection-sink
$ mvn spring-boot:run
Finally, start the source application. This sets the data flowing through the pipeline.
$ cd anomaly-detection-source-file
$ mvn spring-boot:run
$ mvn clean package