Machine Learning based web attacks detection.
Webhawk is a Machine Leatning powered Web attack detection system. It uses your web logs as training data. Webhawk offers a REST API that makes it easy to integrate within your SoC ecosystem. To train a detection model and use it as an extra security level in your organization, follwo the following steps.
Rename copy to settings.conf and have fill it with the required parameters as the following.
[MODEL]
model:MODELS/the_model_you_will_train.pkl
[FEATURES]
features:length,param_number,return_code,size,upper_cases,lower_cases,special_chars,depth
$ python encode.py -a -l ./DATA/raw-http-logs-samples/aug_sep_oct_2021.log -d ./DATA/labeled-data-samples/aug_sep_oct_2021.csv
Use the http log data from May to July 2021 to train a model, and test it with the data from August to October 2021.
$ python train.py -a 'dt' -t ./DATA/labeled-data-samples/may_jun_jul_2021.csv -v ./DATA/labeled-data-samples/aug_sep_oct_2021.csv
$ python predict.py -l '198.72.227.213 - - [16/Dec/2018:00:39:22 -0800] "GET /self.logs/access.log.2016-07-20.gz HTTP/1.1" 404 340 "-" "python-requests/2.18.4"'
In order to use the API to need first to launch it's server as the following
$ python3 -m uvicorn api:app --reload --host 0.0.0.0 --port 8000
You can use the following code which based on Python 'requests' (the same in test_api.py) to make a prediction using the REST API
import requests
import json
headers = {
'accept': 'application/json',
'Content-Type': 'application/json',
}
data = {
'http_log_line': '187.167.57.27 - - [15/Dec/2018:03:48:45 -0800] "GET /honeypot/Honeypot%20-%20Howto.pdf HTTP/1.1" 200 1279418 "http://www.secrepo.com/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/534.24 (KHTML, like Gecko) Chrome/61.0.3163.128 Safari/534.24 XiaoMi/MiuiBrowser/9.6.0-Beta"'
}
response = requests.post('http://127.0.0.1:8000/predict', headers=headers, data=json.dumps(data))
print (response.text)
It will return the following:
{"prediction":"0","proba":"0.9975490196078431","log_line":"187.167.57.27 - - [15/Dec/2018:03:48:45 -0800] \"GET /honeypot/Honeypot%20-%20Howto.pdf HTTP/1.1\" 200 1279418 \"http://www.secrepo.com/\" \"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/534.24 (KHTML, like Gecko) Chrome/61.0.3163.128 Safari/534.24 XiaoMi/MiuiBrowser/9.6.0-Beta\""}
To launch the prediction server using docker
$ docker compose build
$ docker compose up
The data you will find in DATA folder comes from https://www.secrepo.com.
Details on how this tools is build could be found at http://enigmater.blogspot.fr/2017/03/intrusion-detection-based-on-supervised.html