This tool downloads corpora published by OSS-Fuzz.
The code was tested with Python 3.8.16 under Ubuntu 20.04.
Contributions are welcome :)
- get the code
git clone https://github.com/VoodooChild99/oss-fuzz-crawler.git
- install dependencies
pip install -r requirements.txt
OR
pip install requests toml rich
- run
crawler.py
usage: crawler.py [-h] [-s] -d DIRECTORY [-m MAX_RETRIES] corpuses

OSS-Fuzz Public Corpora Crawler

positional arguments:
  corpuses              The TOML file containing corpuses to download

optional arguments:
  -h, --help            show this help message and exit
  -s, --skip-existed    Download corpuses only when it's not in local
  -d DIRECTORY, --directory DIRECTORY
                        Directory where the corpuses are stored locally
  -m MAX_RETRIES, --max-retries MAX_RETRIES
                        Max retires when downloading corpuses, always retry if not specified
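To make the options above concrete, here is a minimal sketch of an argparse parser with the same interface. It is reconstructed from the help text only and is not necessarily how crawler.py is written; the function name build_parser and the sample paths ./corpora and corpora.toml in the demo call are assumptions for illustration.

```python
import argparse


def build_parser():
    # Hypothetical reconstruction from the help text, not the project's source.
    parser = argparse.ArgumentParser(
        prog="crawler.py",
        description="OSS-Fuzz Public Corpora Crawler",
    )
    parser.add_argument(
        "corpuses",
        help="The TOML file containing corpuses to download",
    )
    parser.add_argument(
        "-s", "--skip-existed",
        action="store_true",
        help="Download corpuses only when it's not in local",
    )
    parser.add_argument(
        "-d", "--directory",
        required=True,
        help="Directory where the corpuses are stored locally",
    )
    parser.add_argument(
        "-m", "--max-retries",
        type=int,
        default=None,  # assumed: None stands for "always retry"
        help="Max retries when downloading corpuses",
    )
    return parser


# Equivalent to: crawler.py -s -d ./corpora -m 3 corpora.toml
args = build_parser().parse_args(["-s", "-d", "./corpora", "-m", "3", "corpora.toml"])
print(args.corpuses, args.directory, args.skip_existed, args.max_retries)
```

Only -d is mandatory among the options. Leaving out -m keeps max_retries at None in this sketch, which matches the "always retry if not specified" behaviour described in the help text.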
corpora.toml already covers several OSS-Fuzz projects used by FuzzBench.
You can add more corpora to corpora.toml as follows:
project = [ "target1", "target2" ]
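As a quick way to check a new entry before running the crawler, the sketch below parses TOML text in the format shown above and lists every (project, target) pair. It is an illustration, not code from this repository: the helper name list_targets is an assumption, and "project", "target1" and "target2" are the placeholders from the example, not real OSS-Fuzz names. It uses the toml package from requirements.txt, or the standard-library tomllib where available (Python 3.11+).

```python
try:
    import tomllib as toml_reader  # standard library, Python 3.11+
except ModuleNotFoundError:
    import toml as toml_reader  # third-party package from requirements.txt

# Placeholder entry in the same shape as the example above.
EXAMPLE = 'project = [ "target1", "target2" ]\n'


def list_targets(toml_text):
    """Return (project, target) pairs from corpora.toml-style text."""
    data = toml_reader.loads(toml_text)
    pairs = []
    for project, targets in data.items():
        # Each top-level key is expected to map to a list of target names.
        if not isinstance(targets, list):
            raise ValueError("expected a list of targets for " + repr(project))
        for target in targets:
            pairs.append((project, target))
    return pairs


for project, target in list_targets(EXAMPLE):
    print(project, target)
```

A malformed entry (for example a bare string instead of a list) raises ValueError here, which surfaces typos before any download is attempted.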