/CenProbe

Actively probing censorship devices

Primary LanguageJupyter NotebookGNU General Public License v3.0GPL-3.0

CenProbe

DOI

This repository contains a set of scripts for actively probing and collecting data about particular network devices, including those that perform censorship. In here, we include a variety of scripts for collecting network features and banners from nmap, zgrab2, and performing light clustering and classification against the resulting data.

Requirements

  • Install Python v3.9 or higher.
  • Install Jupyter Notebook, for instance using pip install jupyter-notebook.
  • Install the [zgrab2](go get github.com/zmap/zgrab2) command-line tool.
  • (Optional) Install and configure the censys-python package.

Pipeline overview

# Input: host_list.txt
# Input: analyzed_output.csv

# 1. Nmap hosts
mkdir nmap_results
sudo python3.9 scripts/nmap_hosts.py examples/host_list.txt nmap_results

# 2. Fetch Censys data
python3.9 scripts/fetch_censys_data.py --filename examples/host_list.txt --outfile examples/censys_results.json

# 3. Summarize nmap, CenTrace, and Censys data
python3.9 scripts/nmap_analysis.py --dir nmap_results --censys examples/censys_results.json --outfile examples/nmap_analysis.csv

# 4. Banner grabs using zgrab
python3.9 scripts/prep_zgrab.py --nmap_csv examples/nmap_analysis.csv | zgrab2 multiple -c config/zgrab.ini > examples/zgrab_data.jsonl

First, we identify the IP addresses that we want to probe and collect fingerprints from, for instance, using CenTrace.

1. Nmap hosts

sudo python scripts/nmap_hosts.py <HOST_LIST> <OUTDIR>

Requires sudo. This script wraps around nmap and parallelizes OS fingerprinting for a list of hosts. HOST_LIST should be the name of a file containing newline-separated IP addresses. OUTDIR should be the directory to store each of these results.

2. (Optional) Fetch Censys data

python scripts/fetch_censys_data.py --filename <HOST_LIST> --outfile <OUT_FILENAME>

Requires configuration of censys-python library via censys config. HOST_LIST should be the name of a file containing newline-separated IP addresses. OUT_FILENAME should be the directory to store each of these results. If --outfile is omitted, will write to stdout.

3. Generate Nmap data summary

python scripts/nmap_analysis.py --dir <NMAP_DIR> --censys <CENSYS_FILE> --probes <CENTRACE_CSV> --outfile <OUT_CSV>

Summarizes relevant results from Nmap, CenTrace, and Censys into a single CSV. NMAP_DIR generated from step 1., CENSYS_FILE generated from 2., and CENTRACE_CSV generated from CenTrace. --censys and --probes are optional arguments and can be omitted. If --outfile is omitted, will write to stdout.

4. Banner grabs using Zgrab.

python scripts/prep_zgrab.py --nmap_csv <NMAP_CSV> | zgrab multiple -c zgrab.ini > <ZGRAB_OUT>

Requires zgrab CLI tool to be installed. prep_zgrab.py can also output to a file instead of stdout with --outfile argument.

Clustering

Clustering and analysis are done via Jupyter Notebooks located in notebooks/. These Notebooks expect the data from CenTrace, CenProbe, and CenFuzz aggregated into one large table, joined on the probed IP address. To access the full dataset for this project, please email ramaks@umich.edu or monaw@princeton.edu.

Run jupyter notebook.

The cluster notebook can be run with the resulting .pkl file. It then perform various supervised and unsupervised learning tasks with different subsets of the data.

Disclaimer

Running CenProbe scripts from your machine may place you at risk if you use it within a highly censoring regime. CenProbe takes actions that try to trigger censoring middleboxes multiple times, and try to interfere with the functioning of the middlebox. Therefore, please exercice caution while using the tool, and understand the risks of running CenProbe before using it on your machine. Please refer to our paper for more information.

Data

The banner grabbing and active probing measurement data from the study in our paper can be found here.

Citation

If you use the CenProbe tool or data, please cite the following publication:

@inproceedings{sundararaman2022network,<br>
title = {Network Measurement Methods for Locating and Examining Censorship Devices},<br>
author = {Sundara Raman, Ram and Wang, Mona and Dalek, Jakub and Mayer, Jonathan and Ensafi, Roya},<br>
booktitle={In ACM International Conference on emerging Networking EXperiments and Technologies (CoNEXT)},<br>
year={2022}

Licensing

This repository is released under the GNU General Public License (see LICENSE).

Contact

Email addresses: censoredplanet@umich.edu, ramaks@umich.edu, monaw@princeton.edu, jakub@citizenlab.ca, jonathan.mayer@princeton.edu, and ensafi@umich.edu

Contributors

Mona Wang

Ram Sundara Raman