Background: The recent rise in embedded system technologies has instigated a significant increase in the development and deployment of Internet-of-Things (IoT) devices, including routers, webcams, and network printers, which raises security concerns.
DevTag is a tool that identifies IoT devices using two approaches: a rule-based approach and a model-based approach.
The input is the remote host's application-layer protocol banner, and the output is a tag for that host. The tag format is <device_type, vendor, product_info>.
Here is the DevTag web link.
So far, we use three popular sources (listed in Table 1) to generate rules for IoT devices: NMAP, ZTAG, and ARE.
Source | Original Format | How the rules are stored | Protocol |
---|---|---|---|
NMAP | Regex -> Device Tag | File | FTP, HTTP, RTSP, Telnet |
ZTAG | String/Regex -> Device Tag | Script | FTP, HTTP, Telnet |
ARE | String -> Device Tag | File | FTP, HTTP, RTSP, Telnet |
Note that these sources use different formats and naming conventions for IoT devices. To make the rules consistent, we revise the naming conventions of all IoT rules and represent them in a unified format:
<String/Regex> -> <device_type, vendor, product_info>
If the banner of a host matches the <String/Regex> part of a rule, DevTag assigns the corresponding tag to that host.
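The matching step described above can be sketched as follows. This is a minimal illustration, not DevTag's actual implementation: the names `Tag`, `Rule`, `make_rule`, and `tag_banner` are hypothetical, the two rules are invented for the example (they are not taken from NMAP, ZTAG, or ARE), and returning the first matching rule is an assumption.

```python
import re
from typing import List, NamedTuple, Optional


class Tag(NamedTuple):
    device_type: str
    vendor: str
    product_info: str


class Rule(NamedTuple):
    pattern: "re.Pattern"  # compiled <String/Regex> part of the rule
    tag: Tag


def make_rule(pattern: str, tag: Tag, is_regex: bool = True) -> Rule:
    # Plain-string rules are escaped so that they match literally.
    source = pattern if is_regex else re.escape(pattern)
    return Rule(re.compile(source), tag)


def tag_banner(banner: str, rules: List[Rule]) -> Optional[Tag]:
    """Return the tag of the first rule whose pattern occurs in the banner."""
    for rule in rules:
        if rule.pattern.search(banner):
            return rule.tag
    return None  # no rule matched


# Hypothetical rules, for illustration only.
RULES = [
    make_rule(r"ExampleCam/\d+\.\d+",
              Tag("webcam", "ExampleVendor", "ExampleCam")),
    make_rule("ExamplePrint FTP server",
              Tag("printer", "ExampleVendor", "ExamplePrint"),
              is_regex=False),
]

print(tag_banner("220 ExamplePrint FTP server ready", RULES))
```

A banner that matches no rule yields `None`, which a caller could treat as an untagged host.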
python __main__.py -p <protocol> -f <filename> -T <all/part> -dType <device type> -ven <vendor name>
Parameter | Help |
---|---|
protocol | FTP, HTTP, RTSP, Telnet |
filename | JSON file (banners of hosts) |
all/part | whether to use all rules or only part of them |
device type | use rules that belong to this device type |
vendor name | use rules that belong to this vendor |
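To illustrate how the `-T`, `-dType`, and `-ven` parameters could narrow the rule set, here is a small sketch. It is an assumption about the semantics, not DevTag's code: it assumes `all` keeps every rule, `part` keeps only rules matching the given device type and/or vendor, and the comparison is case-insensitive. The `Rule` structure, the `select_rules` helper, and the sample rules are hypothetical.

```python
from typing import List, NamedTuple, Optional


class Rule(NamedTuple):
    pattern: str
    device_type: str
    vendor: str
    product_info: str


def select_rules(rules: List[Rule], mode: str = "all",
                 device_type: Optional[str] = None,
                 vendor: Optional[str] = None) -> List[Rule]:
    """Assumed semantics: 'all' keeps every rule; 'part' filters by type/vendor."""
    if mode == "all":
        return list(rules)
    selected = []
    for rule in rules:
        # Skip rules whose device type differs from the requested one.
        if device_type is not None and rule.device_type.lower() != device_type.lower():
            continue
        # Skip rules whose vendor differs from the requested one.
        if vendor is not None and rule.vendor.lower() != vendor.lower():
            continue
        selected.append(rule)
    return selected


# Hypothetical rules, for illustration only.
SAMPLE_RULES = [
    Rule("ExampleCam", "webcam", "VendorA", "ExampleCam"),
    Rule("ExampleRouter", "router", "VendorA", "ExampleRouter"),
    Rule("OtherCam", "webcam", "VendorB", "OtherCam"),
]

for rule in select_rules(SAMPLE_RULES, mode="part", device_type="webcam", vendor="VendorA"):
    print(rule.pattern)
```

Restricting the rule set this way reduces the number of patterns tried per banner when the device type or vendor of interest is already known.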
The tool is implemented in Python 3. To install the required packages, run:
pip3 install -r requirements.txt
The model-based approach provides the following models: TextCNN, TextRNN, TextRCNN, TextRNN_Att, and DPCNN.
To retrain on the original data set included in this project, you only need to run:
python run.py --type train --model <model name> --embedding <random/pre_trained>
If the data set has been updated, two steps are needed: first extract the pre-trained word vectors, then train the selected model.
python utils.py
python run.py --type train --model <model name> --embedding <random/pre_trained>
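The two training workflows above can also be scripted. The sketch below only assembles the commands listed in this README; the helper names `training_commands` and `run_all` are hypothetical, and it assumes the scripts are run from the directory that contains `utils.py` and `run.py`.

```python
import subprocess
from typing import List

MODELS = ("TextCNN", "TextRNN", "TextRCNN", "TextRNN_Att", "DPCNN")
EMBEDDINGS = ("random", "pre_trained")


def training_commands(model: str, embedding: str,
                      data_updated: bool = False) -> List[List[str]]:
    """Build the command sequence for (re)training one model."""
    if model not in MODELS:
        raise ValueError("unknown model: " + model)
    if embedding not in EMBEDDINGS:
        raise ValueError("unknown embedding: " + embedding)
    commands = []
    if data_updated:
        # An updated data set needs the word vectors extracted first.
        commands.append(["python", "utils.py"])
    commands.append(["python", "run.py", "--type", "train",
                     "--model", model, "--embedding", embedding])
    return commands


def run_all(commands: List[List[str]]) -> None:
    # Stop at the first failing step (check=True raises CalledProcessError).
    for command in commands:
        subprocess.run(command, check=True)


# Print the commands instead of running them, so the sketch has no side effects.
for command in training_commands("TextCNN", "pre_trained", data_updated=True):
    print(" ".join(command))
```

Calling `run_all(training_commands(...))` would execute the steps in order and abort if word-vector extraction fails, so a model is never trained on stale vectors.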
In addition, we provide a method for training word vectors based on train.txt:
python get_wordvector.py
To test a trained model, run:
python run.py --type test --model <model name> --file <path name>
Here, <path name> is the file path of your test data.