This is a modification of an original work by Isao Takaesu (@13o-bbr-bbq) for a planned competition event at DreamPort. The competition can be found online. To repeat: the original work was not done by DreamPort engineers, but by the very talented Isao Takaesu. The original project URL is:
https://github.com/13o-bbr-bbq/machine_learning_security
I strongly encourage you to review his original README.
DreamPort is working on an updated version of this fantastic project to successfully exploit hosts in a self-contained but larger virtualized network, and this required multiple updates to the original work product. For instance, the original has trouble with Metasploitable 3. I moved the DeepExploit folder out of its sub-directory into a separate project to help with organization.
I have conducted extensive testing of the original DeepExploit work on Kali and Ubuntu Linux and have found that the original product does not work out of the box on either, but comes closest on Ubuntu 18.04 Desktop. The biggest issues that required changes were related to Scrapy.
Beyond this, we made one significant change to how regular expression searching is performed. When the original code searches an HTML response that contains a substantially large inline attribute, the system hangs, which causes the authenticated RPC connection to Metasploit to time out. I have changed how this search is performed. I also added a timeout decorator to another instance of regular expression matching.
All of this being said, there are a number of reasons why the project needs work. First, in its current state it has trouble recognizing additional target types. To extend it, you would need to understand how the original author identifies products using intelligent string matching with regular expressions. You would also need to understand the time limitations imposed by conducting recon through a proxychains tunnel over a Meterpreter-routed connection, which slows down the system considerably.
If you simply check out the project and follow the instructions below to start using DeepExploit, you will find that the website crawler does not work. This is related to the following file:
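To illustrate the idea of guarding a regular expression search with a timeout, here is a minimal sketch. It is not the code used in this project: the names `timeout` and `find_signature` are hypothetical, and the approach assumes a Unix system and a call made from the main thread, because it relies on `SIGALRM` from the standard `signal` module.

```python
import functools
import re
import signal


def timeout(seconds):
    """Raise TimeoutError if the wrapped call runs longer than `seconds`.

    Assumption: Unix only, main thread only (uses SIGALRM).
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            def on_alarm(signum, frame):
                raise TimeoutError(f"{func.__name__} exceeded {seconds}s")

            # Install our handler and arm a one-shot timer.
            previous = signal.signal(signal.SIGALRM, on_alarm)
            signal.setitimer(signal.ITIMER_REAL, seconds)
            try:
                return func(*args, **kwargs)
            finally:
                # Always disarm the timer and restore the old handler.
                signal.setitimer(signal.ITIMER_REAL, 0)
                signal.signal(signal.SIGALRM, previous)
        return wrapper
    return decorator


@timeout(2.0)
def find_signature(pattern, body):
    """Return the first match of `pattern` in `body`, or None."""
    match = re.search(pattern, body, flags=re.IGNORECASE)
    return match.group(0) if match else None
```

A caller would catch `TimeoutError` and skip the offending response, so that one oversized page cannot stall the crawl long enough for the Metasploit RPC session to expire.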
/usr/local/lib/python3.6/dist-packages/scrapy/core/downloader/tls.py
As of 29 Sept, this file contains some strange legacy code that relates to SSLv3. Ironically, only 3 or 4 days ago (at the time of this writing) the Scrapy team (great product, by the way) updated the most current release of Scrapy to finally remove this code. The drawback? Upgrading to the latest version, whether via pip or a git-based install, will tell you that this version requires Python 3.7 or above. I will only certify that this system works with Python 3.6.
We provide a copy of the original requirements.txt file used in the upstream project within the doc/ sub-directory. As stated above, the assumption is that you are working with Ubuntu Linux 18.04 for any testing or research you may do.
The following versions of Linux are believed to be supported:
- Ubuntu 18.04.3 LTS
The following setup process should make your Ubuntu 18 machine capable of running DeepExploit:
cd ~
mkdir workspace && cd workspace
curl https://raw.githubusercontent.com/rapid7/metasploit-omnibus/master/config/templates/metasploit-framework-wrappers/msfupdate.erb > msfinstall
chmod +x msfinstall
sudo ./msfinstall
msfdb init
sudo apt install -y autoconf git vim htop net-tools wireshark build-essential libssh2-1-dev libssl-dev python3-pip python3.6-venv rust-all
git clone https://github.com/nmap/nmap.git
cd nmap/
./configure
make && sudo make install
cd ~/workspace
git clone https://github.com/rofl0r/proxychains-ng.git
cd proxychains-ng
./configure
make && sudo make install
sudo curl -o /usr/local/share/nmap/scripts/banner-plus.nse https://gist.githubusercontent.com/littleairmada/b04319742c29efe44d5662d842c20e1c/raw/c500449760e7a97f780d0b3627dac37823168a00/banner-plus.nse
sudo curl -o /usr/local/share/nmap/scripts/elasticsearch.nse https://raw.githubusercontent.com/theMiddleBlue/nmap-elasticsearch-nse/master/elasticsearch.nse
cd ~/workspace
sudo python3 -m pip install virtualenv
git clone https://github.com/TheDreamPort/deep_exploit.git
cd deep_exploit
python3 -m venv virtualenv
source ./virtualenv/bin/activate
pip install setuptools-rust==1.1.2 wheel==0.37.1
pip install -r requirements.txt
sudo apt install -y nvidia-cuda-toolkit
sudo cp doc/script/msfrpcd.service /lib/systemd/system
sudo systemctl daemon-reload
sudo systemctl enable msfrpcd
sudo systemctl start msfrpcd.service
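After running the steps above, a quick sanity check can confirm that the expected tools are on the PATH and that the interpreter is Python 3.6. The sketch below is only a suggestion: the command names in `REQUIRED_TOOLS` are assumptions about what the installers place on the PATH (for example, `proxychains4` for proxychains-ng), and both function names are hypothetical.

```python
import shutil
import sys

# Assumed command names; adjust them to match your installation.
REQUIRED_TOOLS = ["nmap", "proxychains4", "msfconsole", "msfrpcd"]


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools from `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if which(tool) is None]


def python_is_supported(version_info=sys.version_info):
    """Return True only for Python 3.6, the version this README certifies."""
    return tuple(version_info[:2]) == (3, 6)
```

Calling `missing_tools()` inside the activated virtual environment should return an empty list, and `python_is_supported()` should return `True`, before you attempt to run DeepExploit.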
Although the project does not work on Kali Linux, I did not want to lose the original research on how to at least make it functional. The steps I used on Kali follow:
sudo python3.10 -m pip install virtualenv==20.16.3
sudo apt install python3.10-venv
python3.10 -m venv virtualenv
source ./virtualenv/bin/activate
./virtualenv/bin/pip3.10 install -r requirements.txt
./virtualenv/bin/python3.10 DeepExploit.py -h
sudo apt install -y nvidia-cuda-toolkit
sudo systemctl enable postgresql
sudo systemctl start postgresql
msfdb init
sudo curl -o /usr/share/nmap/scripts/elasticsearch.nse https://raw.githubusercontent.com/theMiddleBlue/nmap-elasticsearch-nse/master/elasticsearch.nse
sudo curl -o /usr/share/nmap/scripts/banner-plus.nse https://gist.githubusercontent.com/littleairmada/b04319742c29efe44d5662d842c20e1c/raw/c500449760e7a97f780d0b3627dac37823168a00/banner-plus.nse
echo 'export LD_LIBRARY_PATH="/usr/share/TensorRT-7.2.3.4/lib:/usr/share/cudnn-linux-x86_64-8.5.0.96_cuda11-archive/lib"' >> /home/$USER/.zshrc
See the project's wiki for installation, usage and changelog.
DeepExploit is a fully automated penetration testing tool linked with Metasploit.
DeepExploit identifies the status of all open ports on the target server and executes exploits with pinpoint accuracy using machine learning. Its key features are as follows.
- Efficient exploit execution.
  DeepExploit can execute exploits with pinpoint accuracy (minimum 1 attempt) using machine learning.
- Deep penetration.
  If DeepExploit successfully exploits the target server, it goes on to execute exploits against other internal servers.
- Self-learning.
  DeepExploit learns how to exploit by itself using reinforcement learning.
  Humans do not need to prepare training data.
- Fast learning.
  Reinforcement learning generally takes a lot of time, so DeepExploit uses distributed learning with multiple agents.
  We adopted an advanced machine learning model called A3C.
- Powerful intelligence gathering.
  Gathering information about the software running on the target server is very important for successful exploitation. DeepExploit can identify product names and versions using the following methods:
  - Port scanning
  - Machine learning (analysis of HTTP responses gathered by web crawling)
  - Contents exploration
The current version of DeepExploit is a beta.
However, it can fully automatically execute the following actions:
- Intelligence gathering.
- Threat modeling.
- Vulnerability analysis.
- Exploitation.
- Post-Exploitation.
- Reporting.
By using DeepExploit, you will benefit from the following.
For pentesters:
(a) They can greatly improve test efficiency.
(b) The more pentesters use DeepExploit, the more it learns about exploitation methods through machine learning. As a result, test accuracy improves.
For Information Security Officers:
(c) They can quickly identify vulnerabilities in their own servers. As a result, they can prevent attackers from exploiting those vulnerabilities and protect their reputation by avoiding negative media coverage after a breach.
Since attack methods against servers evolve day by day, there is no guarantee that yesterday's security countermeasures are safe today. It is necessary to quickly find vulnerabilities and take countermeasures. DeepExploit will contribute greatly to keeping you safe.
Note |
---|
If you are interested, please use DeepExploit in an environment under your control and at your own risk. If you execute DeepExploit on systems that are not under your control, it may be considered an attack and you may be legally liable for your actions. |
DeepExploit consists of the machine learning model (A3C) and Metasploit.
The A3C executes exploits against the target servers via the RPC API.
The A3C is developed with Keras and TensorFlow, well-known Python-based ML frameworks. It is used to self-learn how to exploit using deep reinforcement learning. The results of self-learning are stored as reusable learned data.
Metasploit is the most famous penetration testing tool in the world. It is used to execute exploits against the target servers based on instructions from the A3C.
DeepExploit learns how to exploit by itself using an advanced machine learning model called A3C.
The A3C consists of multiple neural networks.
This model receives training server information such as the OS type, product name, and product version as neural network inputs, and outputs a payload according to that input. The point is that exploitation succeeds when the model outputs the optimal payload for the input information.
In training, this model executes more than 10,000 exploits against the training servers via Metasploit while changing the combination of input information. The model updates the weights of the neural network according to the exploitation results (rewards), which gradually optimizes the network.
After training, this model can output the optimal payload according to the input information.
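The input/output relationship described above can be sketched in a few lines. This is an illustration under stated assumptions, not the project's model: the vocabularies, the normalization scheme, and all function names are hypothetical, the payload names are only examples, and the real A3C network is built with Keras and TensorFlow rather than plain Python.

```python
import math
import random

# Hypothetical vocabularies; the real project defines its own lists.
OS_TYPES = ["linux", "windows", "unix"]
PRODUCTS = ["vsftpd", "ssh", "apache", "tomcat"]
PAYLOADS = [
    "cmd/unix/interact",
    "cmd/unix/reverse",
    "linux/x86/meterpreter/reverse_tcp",
]


def encode_state(os_type, product, version_index, num_versions=10):
    """Map target attributes to a list of floats in [0, 1]."""
    return [
        OS_TYPES.index(os_type) / (len(OS_TYPES) - 1),
        PRODUCTS.index(product) / (len(PRODUCTS) - 1),
        version_index / (num_versions - 1),
    ]


def softmax(logits):
    """Turn raw policy outputs into a probability distribution."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def choose_payload(logits, rng=random):
    """Sample one payload according to the policy probabilities."""
    return rng.choices(PAYLOADS, weights=softmax(logits), k=1)[0]


def advantage(reward, value_estimate):
    """One-step advantage: how much better the outcome was than expected."""
    return reward - value_estimate
```

In an actor-critic method such as A3C, a positive advantage after a successful exploit pushes the policy toward the payload that was chosen, and a negative one pushes it away.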
To shorten the training time, training is executed with multithreading.
By learning from a variety of training servers, DeepExploit can execute accurate exploits in a variety of situations.
For this reason, DeepExploit uses training servers such as metasploitable3, metasploitable2, and owaspbwa for learning.
- ex) Training servers
metasploitable2
metasploitable3
others
DeepExploit gathers target server information such as OS type, open ports, product names, and product versions using Nmap. From the Nmap result, DeepExploit can extract the information below.
- ex) Nmap result.
Idx | OS | Port# | product | version |
---|---|---|---|---|
1 | Linux | 21 | vsftpd | 2.3.4 |
2 | Linux | 22 | ssh | 4.7p1 |
3 | Linux | 23 | telnet | - |
4 | Linux | 25 | postfix | - |
5 | Linux | 53 | bind | 9.4.2 |
6 | Linux | 80 | apache | 2.2.8 |
7 | Linux | 5900 | vnc | 3.3 |
8 | Linux | 6667 | irc | - |
9 | Linux | 8180 | tomcat | - |
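One way to obtain rows like those in the table above is to have Nmap write XML (for example with `nmap -sV -O -oX`) and parse it with the standard library. The sketch below is an assumption about how this could be done, not the project's parser; `parse_nmap_xml` is a hypothetical name and `SAMPLE_XML` is a hand-written, heavily trimmed stand-in for real Nmap output.

```python
import xml.etree.ElementTree as ET

# Hand-written sample that mimics the structure of Nmap's XML output.
SAMPLE_XML = """<nmaprun>
  <host>
    <ports>
      <port protocol="tcp" portid="21">
        <state state="open"/>
        <service name="ftp" product="vsftpd" version="2.3.4"/>
      </port>
      <port protocol="tcp" portid="23">
        <state state="open"/>
        <service name="telnet"/>
      </port>
      <port protocol="tcp" portid="8080">
        <state state="closed"/>
        <service name="http-proxy"/>
      </port>
    </ports>
    <os><osmatch name="Linux 2.6.9 - 2.6.33"/></os>
  </host>
</nmaprun>"""


def parse_nmap_xml(xml_text):
    """Return one dict per open port with os, port, product, and version."""
    rows = []
    root = ET.fromstring(xml_text)
    for host in root.iter("host"):
        osmatch = host.find("os/osmatch")
        os_name = osmatch.get("name") if osmatch is not None else "-"
        for port in host.iter("port"):
            state = port.find("state")
            if state is None or state.get("state") != "open":
                continue  # skip closed and filtered ports
            service = port.find("service")
            attrs = service.attrib if service is not None else {}
            rows.append({
                "os": os_name,
                "port": int(port.get("portid")),
                # Fall back to the service name when no product was detected.
                "product": attrs.get("product") or attrs.get("name", "-"),
                "version": attrs.get("version", "-"),
            })
    return rows
```

Missing versions are reported as `-`, matching the convention used in the table.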
If the Nmap result shows that web ports such as 80 or 8180 are open, DeepExploit performs the examinations below.
- Contents discovery.
By performing contents exploration, DeepExploit can identify web products from the default contents it finds.
ex) Contents exploration result.
Idx | Port# | found content | product |
---|---|---|---|
1 | 80 | /server-status | apache |
2 | 80 | /wp-login.php | wordpress |
3 | 8180 | /core/misc/drupal.init.js | drupal |
4 | 8180 | /CFIDE/ | coldfusion |
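Contents exploration of this kind can be sketched as a lookup from default paths to products. The mapping below is copied from the table above; everything else (the function names, the choice of `urllib`, and treating only HTTP 200 as "found") is an assumption made for illustration rather than the project's behavior.

```python
import urllib.error
import urllib.request

# Default content paths and the products they indicate (from the table above).
DEFAULT_CONTENTS = {
    "/server-status": "apache",
    "/wp-login.php": "wordpress",
    "/core/misc/drupal.init.js": "drupal",
    "/CFIDE/": "coldfusion",
}


def path_exists(base_url, path, timeout=5.0):
    """Return True if a GET for base_url + path answers with HTTP 200."""
    url = base_url.rstrip("/") + path
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # 4xx/5xx responses, refused connections, and timeouts all count as "not found".
        return False


def discover_products(base_url, exists=path_exists):
    """Return the sorted product names whose default content was found."""
    return sorted({
        product
        for path, product in DEFAULT_CONTENTS.items()
        if exists(base_url, path)
    })
```

The `exists` parameter makes the lookup easy to exercise without a network, for example by passing a function that reads from a recorded list of known paths.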
- Analysis of HTTP responses.
DeepExploit gathers numerous HTTP responses from web apps on the web port using Scrapy.
By analyzing the gathered HTTP responses using signatures (string matching patterns) and machine learning, it can identify web products.
A sample HTTP response is below.
HTTP/1.1 200 OK
Date: Tue, 06 Mar 2018 06:56:17 GMT
Server: OpenSSL/1.0.1g
Content-Type: text/html; charset=UTF-8
Set-Cookie: f00e68432b68050dee9abe33c389831e=0eba9cd0f75ca0912b4849777677f587; path=/;
Etag: "409ed-183-53c5f732641c0"
…snip…
<form action="/example/confirm.php">
Using signatures, DeepExploit can easily identify two products: OpenSSL and PHP.
Server: OpenSSL/1.0.1g
confirm.php
It is very easy.
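Signature matching of this kind might look like the following sketch. The two patterns are hypothetical and written only to fit the sample response above; the project ships its own signature files, which are not reproduced here.

```python
import re

# Hypothetical signatures written for the sample response above.
SIGNATURES = {
    "openssl": re.compile(r"Server:.*?OpenSSL/([0-9.]+[a-z]?)", re.IGNORECASE),
    "php": re.compile(r"[\w/.-]+\.php\b", re.IGNORECASE),
}


def identify_products(response):
    """Return {product: version or None} for every signature that matches."""
    found = {}
    for product, pattern in SIGNATURES.items():
        match = pattern.search(response)
        if match:
            # Only signatures with a capture group can report a version.
            found[product] = match.group(1) if pattern.groups else None
    return found


SAMPLE_RESPONSE = (
    "HTTP/1.1 200 OK\n"
    "Server: OpenSSL/1.0.1g\n"
    "Content-Type: text/html; charset=UTF-8\n"
    "\n"
    '<form action="/example/confirm.php">\n'
)
```

Applied to `SAMPLE_RESPONSE`, the OpenSSL signature also yields the version string, while the PHP signature only confirms that a `.php` resource is referenced.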
In addition, using machine learning, DeepExploit can identify more products: Joomla! and Apache.
Set-Cookie: f00e68432b68050dee9abe33c389831e=0eba9cd0f75ca0912b4849777677f587;
This is a feature of Joomla!.
Etag: "409ed-183-53c5f732641c0"
This is a feature of Apache.
Because DeepExploit has learned these features beforehand using machine learning (Naive Bayes), it can identify products that signatures could not.
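For readers unfamiliar with Naive Bayes, the sketch below shows a tiny multinomial classifier with Laplace smoothing applied to response fragments. It is a teaching example only: the tokenizer, the class name, and the four training samples are invented here, and the project's real feature extraction and training data are different.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on anything that is not a letter, digit, or underscore."""
    return [t for t in re.split(r"[^a-z0-9_]+", text.lower()) if t]


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.class_counts = Counter()
        self.token_counts = defaultdict(Counter)
        self.vocabulary = set()

    def fit(self, samples):
        for text, label in samples:
            self.class_counts[label] += 1
            for token in tokenize(text):
                self.token_counts[label][token] += 1
                self.vocabulary.add(token)

    def predict(self, text):
        total = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            # log P(label) + sum of log P(token | label)
            score = math.log(count / total)
            denom = sum(self.token_counts[label].values()) + vocab_size
            for token in tokenize(text):
                score += math.log((self.token_counts[label][token] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented training samples, for illustration only.
TRAINING = [
    ("Set-Cookie: joomla_user_state=logged_in; path=/", "joomla"),
    ('<meta name="generator" content="Joomla! - Open Source Content Management">', "joomla"),
    ("Server: Apache/2.2.8 (Ubuntu)", "apache"),
    ("<address>Apache Server at example.com Port 80</address>", "apache"),
]

model = NaiveBayes()
model.fit(TRAINING)
```

With more representative training data, the same approach can pick up on indirect hints such as cookie or ETag formats that no single signature names explicitly.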
DeepExploit executes exploits against the first target server using the trained data and the identified product information.
It can execute exploits with pinpoint accuracy (minimum 1 attempt).
If the exploitation succeeds, a session is opened between DeepExploit and the first target server.
DeepExploit performs pivoting using the session opened in Step 2.
Afterwards, DeepExploit, which has no direct connection to the internal servers, can execute exploits through the first server (the compromised server). As a result, DeepExploit repeats Step 1 to Step 3 against the internal servers through the compromised server.
DeepExploit generates a scan report that summarizes vulnerabilities.
A sample report is below.
- Hardware
  - OS: Kali Linux 2018.2
  - CPU: Intel(R) Core(TM) i7-6500U 2.50GHz
  - GPU: None
  - Memory: 8.0GB
- Software
  - Metasploit Framework 4.16.48-dev
  - Python 3.6.5rc1
    - beautifulsoup4==4.6.0
    - docopt==0.6.2
    - Jinja2==2.10
    - Keras==2.1.6
    - msgpack-python==0.5.6
    - numpy==1.13.3
    - pandas==0.23.0
    - tensorflow==1.8.0
MBSD Blog
Sorry, currently available in Japanese only.
An English version is coming soon.
Isao Takaesu
takaesu235@gmail.com
https://twitter.com/bbr_bbq