RoPGen: Towards Robust Code Authorship Attribution via Automatic Coding Style Transformation

Zhen Li, Guenevere Chen, Chen Chen, Yayi Zou, Shouhuai Xu. RoPGen: Towards Robust Code Authorship Attribution via Automatic Coding Style Transformation. In Proceedings of the 44th International Conference on Software Engineering (ICSE ’22), May 21–29, 2022, Pittsburgh, PA, USA.


We characterize coding style with 23 attributes and propose two automatic attacks: a coding style imitation attack and a coding style hiding attack.

We propose RoPGen, a framework that learns authors’ coding style patterns that are hard for attackers to manipulate. The key idea is to combine data augmentation with gradient augmentation so that the learned coding style patterns are robust.
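As a rough illustration of the data augmentation idea only, the sketch below adds style-transformed copies of each program to a training set while keeping the author label unchanged. It is not the RoPGen implementation: the two transformations, the function names, and the `(code, author)` tuple layout are all assumptions made for this example, and gradient augmentation is not shown.

```python
import re

# Hypothetical coding style transformations (assumptions for illustration,
# not taken from the RoPGen code base). Each one rewrites C-like source text
# into a semantically equivalent form with a different style.

def increment_to_compound(code):
    """Rewrite statements such as `i++;` as `i += 1;`."""
    return re.sub(r"\b(\w+)\+\+\s*;", r"\1 += 1;", code)

def while_true_to_for(code):
    """Rewrite `while (1)` loops as `for (;;)` loops."""
    return code.replace("while (1)", "for (;;)")

TRANSFORMS = [increment_to_compound, while_true_to_for]

def augment(dataset):
    """Return the original samples plus one variant per applicable transform.

    `dataset` is a list of (code, author) tuples. A variant keeps the
    author label of the program it was derived from.
    """
    augmented = list(dataset)
    for code, author in dataset:
        for transform in TRANSFORMS:
            variant = transform(code)
            if variant != code:  # skip transforms that changed nothing
                augmented.append((variant, author))
    return augmented
```

Calling `augment([("while (1) { i++; }", "alice")])` yields the original sample and two restyled variants, all labeled `alice`. A real pipeline would operate on a parsed representation of the program rather than on raw text with regular expressions.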

We use four datasets for evaluation: the first two are used in the literature and the last two are introduced in our work.

(1) GCJ-C++ dataset

(2) GitHub-Java dataset

(3) GitHub-C dataset

(4) GCJ-Java dataset