SecuML is a Python tool that aims to foster the use of Machine Learning in Computer Security. It is distributed under the GPL2+ license. It lets security experts train models easily and comes with a web user interface to visualize the results and interact with the models. SecuML can be applied to any detection problem. As input, it requires numerical features representing each instance. It supports binary labels (malicious vs. benign) and categorical labels, which represent families of malicious or benign behaviours.
- Training and analysing a detection model before deployment
- Collecting a labelled dataset with a reduced workload thanks to active learning
- Exploring a dataset interactively with rare category detection
- Clustering data
- Projecting data
- Computing descriptive statistics of each feature
See the documentation for more detail.
- rabbit-mq server (>= 3.3.5) (only for active learning and rare category detection)
- Python packages:
  - celery (>= 3.1.13) (only for active learning and rare category detection)
  - flask (>= 0.10.1)
  - flask_sqlalchemy (>= 1.0)
  - metric-learn (>= 0.3.0)
  - numpy (>= 1.8.2)
  - pandas (>= 0.14.1)
  - scikit-learn (>= 0.18.1)
  - sqlalchemy (>= 1.0.12)
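
The minimum versions listed above can be checked against the current environment before running SecuML. The following is a minimal sketch and not part of SecuML: the helper names (`version_tuple`, `meets_minimum`, `check_requirements`) are hypothetical, and it assumes Python 3.8+ and that each package is installed under the distribution name used in the list, which may not hold for every package.

```python
import re
from importlib import metadata

# Minimum versions copied from the requirements list.
# Assumption: these strings match the installed distribution names.
MINIMUM_VERSIONS = {
    "celery": "3.1.13",
    "flask": "0.10.1",
    "flask_sqlalchemy": "1.0",
    "metric-learn": "0.3.0",
    "numpy": "1.8.2",
    "pandas": "0.14.1",
    "scikit-learn": "0.18.1",
    "sqlalchemy": "1.0.12",
}


def version_tuple(version):
    """Turn '1.8.2' into (1, 8, 2), keeping only leading digits of each part."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, required):
    """Return True if the installed version is at least the required one."""
    a, b = version_tuple(installed), version_tuple(required)
    width = max(len(a), len(b))
    # Pad with zeros so that '1.0' and '1.0.0' compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_requirements(minimums=MINIMUM_VERSIONS):
    """Map each package name to 'ok', 'missing', or 'too old (<version>)'."""
    report = {}
    for name, required in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if meets_minimum(installed, required):
            report[name] = "ok"
        else:
            report[name] = "too old (%s)" % installed
    return report


if __name__ == "__main__":
    for package, status in check_requirements().items():
        print("%-20s %s" % (package, status))
```

The comparison only looks at the numeric components, so pre-release suffixes such as `rc1` are ignored; a dedicated version-parsing library would be more precise.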
SecuML requires access to a database (MySQL or PostgreSQL) on which the user has the following permissions: SELECT, INSERT, UPDATE, DELETE.
- MySQL server (>= 5.5.49)
  - Python package: mysql.connector (>= 2.1.3)
- PostgreSQL server (>= 9.4.13)
  - Python package: psycopg2 (>= 2.5.4)
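
Since SecuML depends on sqlalchemy, the database is presumably addressed with a SQLAlchemy-style URI of the form `dialect+driver://user:password@host/database`. The sketch below builds such a URI for either backend. The `build_db_uri` helper, the user name, password, and database name are hypothetical; that SecuML expects exactly this URI form is an assumption, so check the documentation.

```python
from urllib.parse import quote_plus


def build_db_uri(dialect, driver, user, password, host, database, port=None):
    """Build a SQLAlchemy-style database URI (hypothetical helper).

    The user name and password are percent-encoded so that characters
    such as '@' or '/' do not break the URI.
    """
    location = host if port is None else "%s:%d" % (host, port)
    return "%s+%s://%s:%s@%s/%s" % (
        dialect,
        driver,
        quote_plus(user),
        quote_plus(password),
        location,
        database,
    )


# Placeholder credentials, for illustration only.
mysql_uri = build_db_uri(
    "mysql", "mysqlconnector", "secuml_user", "p@ss/word", "localhost", "secuml"
)
postgresql_uri = build_db_uri(
    "postgresql", "psycopg2", "secuml_user", "secret", "localhost", "secuml", port=5432
)

print(mysql_uri)
print(postgresql_uri)
```

The `mysql+mysqlconnector` and `postgresql+psycopg2` prefixes are the SQLAlchemy dialect names for the two drivers listed above.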
The required libraries can be downloaded with the script download_libraries.
The environment variable SECUMLCONF must be set to the path of the configuration file, which must use the following format (see SecuML_travis_conf.yml):
input_data_dir: <directory containing the input datasets>
output_data_dir: <directory where the results of the experiments are stored>
db_uri: <URI of the database>
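
To verify a configuration file before launching SecuML, one can read the path from SECUMLCONF and check that the three keys are present. This is a minimal sketch, not SecuML's own loading code: the function names are hypothetical, and the parser handles only flat `key: value` lines as in the format above. A full YAML parser would be more robust.

```python
import os

# The three keys described in the configuration format.
REQUIRED_KEYS = ("input_data_dir", "output_data_dir", "db_uri")


def parse_flat_config(text):
    """Parse flat 'key: value' lines (not a general YAML parser)."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Split on the first colon only, so URIs in the value stay intact.
        key, separator, value = line.partition(":")
        if not separator:
            raise ValueError("not a 'key: value' line: %r" % line)
        config[key.strip()] = value.strip().strip("'\"")
    return config


def load_secuml_conf(environ=os.environ):
    """Load and validate the file that SECUMLCONF points to."""
    path = environ.get("SECUMLCONF")
    if not path:
        raise RuntimeError("the SECUMLCONF environment variable is not set")
    with open(path) as handle:
        config = parse_flat_config(handle.read())
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise KeyError("missing configuration keys: " + ", ".join(missing))
    return config
```

Calling `load_secuml_conf()` with no argument uses the process environment; passing a dictionary instead makes the function easy to try without modifying the environment.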
- Beaugnon, Anaël, Pierre Chifflier, and Francis Bach. "ILAB: An Interactive Labelling Strategy for Intrusion Detection." International Symposium on Research in Attacks, Intrusions, and Defenses. Springer, Cham, 2017.
- Bonneton, Anaël. "Machine Learning for Computer Security Experts using Python & scikit-learn", PyParis, 2017.
- Bonneton, Anaël, and Antoine Husson. "Le Machine Learning confronté aux contraintes opérationnelles des systèmes de détection.", SSTIC, 2017.
- Anaël Beaugnon (anael.beaugnon@ssi.gouv.fr)
- Pierre Collet (pierre.collet@ssi.gouv.fr)
- Antoine Husson (antoine.husson@ssi.gouv.fr)