Understanding the AI-powered Binary Code Similarity Analysis

In this paper, we perform a systematic evaluation of state-of-the-art AI-powered binary code similarity detection (BinSD) approaches on both general binary diffing and two representative downstream applications. Based on the findings and implications of our study, we shed light on several key real-world research questions in this problem domain. Specifically, we find that BinSD has not yet been well addressed, because binaries change significantly across architectures and optimization levels. Moreover, the use of some embedding neural networks and evaluation methodologies is questionable and needs further improvement. Drawing on comprehensive experimental results and in-depth analysis, we provide several promising future directions for advancing BinSD. We hope that releasing our datasets, benchmarks, and implementation will facilitate the development of BinSD.
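For readers new to the problem, the following is a minimal sketch of how an embedding-based BinSD system is typically queried: each function is mapped to a vector, candidates are ranked by cosine similarity to the query, and a retrieval metric scores the ranking. It is an illustration only. The function names, the toy two-dimensional embeddings, and the use of recall@k are assumptions made for this example, not the implementation or the metrics of any system evaluated here.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat an all-zero embedding as dissimilar to everything
    return dot / (norm_u * norm_v)


def rank_candidates(query, pool):
    """Return candidate names sorted by descending similarity to the query.

    `pool` maps a (hypothetical) function name to its embedding vector.
    """
    scored = [(cosine_similarity(query, emb), name) for name, emb in pool.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored]


def recall_at_k(ranked, ground_truth, k):
    """Fraction of true matches that appear in the top-k ranked candidates."""
    if not ground_truth:
        return 0.0
    hits = sum(1 for name in ranked[:k] if name in ground_truth)
    return hits / len(ground_truth)


# Toy example: one query function and three candidates compiled for
# different architectures and optimization levels (names are made up).
query = [1.0, 0.0]
pool = {
    "memcpy_arm_O2": [0.9, 0.1],
    "strlen_x86_O0": [0.0, 1.0],
    "memcpy_mips_O3": [0.7, 0.7],
}
ranked = rank_candidates(query, pool)
```

In this toy pool, the two `memcpy` variants are the true matches, so `recall_at_k(ranked, {"memcpy_arm_O2", "memcpy_mips_O3"}, 2)` measures whether both are retrieved ahead of the unrelated function.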

Docker images

The evaluated BinSD systems run in two Docker images, which can be downloaded here: image1 and image2.

Dataset

The datasets can be downloaded here: basic-dataset and application-dataset.

How to use

To facilitate reproducibility, we release all datasets, benchmarks, and implementations as Docker images. The evaluated systems are listed below.

Asm2Vec

Asteria

BinaryAI-skipt

BinaryAI-bert2

Focus

Focus-skip

Gemini

Gemini-skip

MGMN

VulSeeker

VulSeeker-skip

SAFE

UFE-mean

UFE-attention

UFE-rnn