Apache License 2.0 (Apache-2.0)

AlphaDiff Dataset

This is a deep learning dataset for cross-version binary code similarity detection.

Usage

Download

Cloning the repository requires git-lfs. Follow these steps:

  1. Install git-lfs as described at https://www.atlassian.com/git/tutorials/git-lfs#installing-git-lfs

  2. git lfs clone https://github.com/twelveand0/alphadiff-dataset.git

Unzip

  1. On Linux, you can unzip it with the following commands:
>> cd alphadiff-dataset
>> cat dataset.z01 dataset.z02 dataset.z03 dataset.z04 dataset.z05 dataset.z06 dataset.z07 dataset.z08 dataset.z09 dataset.zip > complete.zip
>> unzip complete.zip
>> unzip data.zip

PS: because the original ZIP file is split into multiple parts, you must first concatenate the parts in order.
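The concatenation step above can be sketched as a small shell helper that refuses to run when a part is missing, since a gap would silently produce a corrupt archive. This is only an illustration: the function name `join_parts` is hypothetical and not part of the dataset repository, and the part names in the usage comment are the ones listed in this README.

```shell
# join_parts OUT PART...
# Hypothetical helper: concatenates PART... into OUT in the order given.
# It checks that every part exists before writing anything.
join_parts() {
    jp_out="$1"
    shift
    for jp_part in "$@"; do
        if [ ! -f "$jp_part" ]; then
            echo "missing part: $jp_part" >&2
            return 1
        fi
    done
    # The order of the arguments is the order of the bytes in OUT.
    cat "$@" > "$jp_out"
}

# Example invocation (run inside the cloned alphadiff-dataset directory):
#   join_parts complete.zip dataset.z01 dataset.z02 dataset.z03 dataset.z04 \
#       dataset.z05 dataset.z06 dataset.z07 dataset.z08 dataset.z09 dataset.zip
#   unzip -t complete.zip   # test the joined archive before extracting it
```

Running `unzip -t` before extraction is optional; it only checks the joined archive and does not write any files.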

  2. On Windows, you can just right-click the dataset.zip file and select "Extract...".

  3. On Mac, this has not been tested.

Data Format

coming soon...