Machine Learning for Computer Security

SecuML

SecuML is a Python tool that aims to foster the use of Machine Learning in Computer Security. It is distributed under the GPL2+ license. It lets security experts train models easily and comes with a web user interface for visualizing the results and interacting with the models. SecuML can be applied to any detection problem. As input, it requires numerical features representing each instance. It supports binary labels (malicious vs. benign) as well as categorical labels, which represent families of malicious or benign behaviours.

Features

See the documentation for more detail.

Requirements

  • RabbitMQ server (>= 3.3.5) (only for active learning and rare category detection)
  • Python packages:
    • celery (>= 3.1.13) (only for active learning and rare category detection)
    • flask (>= 0.10.1)
    • flask_sqlalchemy (>= 1.0)
    • metric-learn (>= 0.3.0)
    • numpy (>= 1.8.2)
    • pandas (>= 0.14.1)
    • scikit-learn (>= 0.18.1)
    • sqlalchemy (>= 1.0.12)
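
The minimum versions listed above can be checked before installing SecuML. The following is a minimal sketch using only the Python standard library (Python 3.8+), not part of SecuML itself. The helper names `parse_version` and `check_requirements` are hypothetical, and the distribution names in `MINIMUM_VERSIONS` are assumed to match the names the packages are published under.

```python
from importlib import metadata

# Minimum versions copied from the requirements list above.
# The keys are assumed distribution names; adjust them if your
# package index uses different ones.
MINIMUM_VERSIONS = {
    "celery": "3.1.13",   # only for active learning and rare category detection
    "Flask": "0.10.1",
    "Flask-SQLAlchemy": "1.0",
    "metric-learn": "0.3.0",
    "numpy": "1.8.2",
    "pandas": "0.14.1",
    "scikit-learn": "0.18.1",
    "SQLAlchemy": "1.0.12",
}


def parse_version(text):
    """Turn '1.8.2' into (1, 8, 2); anything after the leading digits
    of a component (e.g. 'rc1') is ignored."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for char in piece:
            if not char.isdigit():
                break
            digits += char
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def check_requirements(minimums=MINIMUM_VERSIONS):
    """Return a dict mapping each unmet requirement to a short description."""
    problems = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems[name] = "not installed"
            continue
        if parse_version(installed) < parse_version(minimum):
            problems[name] = f"{installed} < {minimum}"
    return problems
```

Calling `check_requirements()` returns an empty dictionary when every listed package is present at a sufficient version. The version comparison is deliberately simple and does not handle every versioning scheme.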

Database

SecuML requires access to a database (MySQL or PostgreSQL) on which the user has the following permissions: SELECT, INSERT, UPDATE, DELETE.

MySQL database
  • MySQL server (>= 5.5.49)
  • Python package:
    • mysql.connector (>= 2.1.3)
PostgreSQL database
  • PostgreSQL server (>= 9.4.13)
  • Python package:
    • psycopg2 (>= 2.5.4)
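
Since SecuML depends on sqlalchemy, the database URI is presumably a SQLAlchemy-style URL. The sketch below shows one way to assemble such a URL for either backend. It assumes the SQLAlchemy dialect names `mysql+mysqlconnector` and `postgresql+psycopg2`, which correspond to the two Python packages listed above. The function `build_db_uri` and all the example credentials are hypothetical.

```python
from urllib.parse import quote_plus

# SQLAlchemy dialect+driver names for the two supported backends.
# Using these exact schemes with SecuML is an assumption.
DRIVERS = {
    "mysql": "mysql+mysqlconnector",
    "postgresql": "postgresql+psycopg2",
}


def build_db_uri(backend, user, password, host, database, port=None):
    """Assemble a SQLAlchemy-style database URI.

    The user name and password are percent-encoded so that characters
    such as '@' or '/' do not break the URI.
    """
    scheme = DRIVERS[backend]
    netloc = f"{quote_plus(user)}:{quote_plus(password)}@{host}"
    if port is not None:
        netloc += f":{port}"
    return f"{scheme}://{netloc}/{database}"
```

For example, `build_db_uri("mysql", "secuml", "p@ss", "localhost", "secuml")` yields `mysql+mysqlconnector://secuml:p%40ss@localhost/secuml`.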

JS and CSS libraries

The required libraries can be downloaded with the script download_libraries.

Configuration

The environment variable SECUMLCONF must be set to the path of the configuration file, which must use the following format (see SecuML_travis_conf.yml):

input_data_dir: <directory containing the input datasets>
output_data_dir: <directory where the results of the experiments are stored>
db_uri: <URI of the database>

Papers and Presentations

Authors