/sourcerer-malware-detection

Primary language: Python. License: GNU General Public License v3.0 (GPL-3.0).

Sourcerer for malware detection

The code for Sourcerer comes from this GitHub repo.

This repo provides code that lightly modifies Sourcerer in two ways:

  1. It includes a tokenizer for JavaScript files
  2. Instead of detecting clones in general, it is used to detect code that matches known malware signatures
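
To make the two changes concrete, here is a minimal sketch of the idea, not the code in this repo. It assumes a deliberately simplified JavaScript tokenizer (regex and template literals are not handled) and a bag-of-tokens overlap score in the spirit of Sourcerer's clone criterion. The names `tokenize_js`, `overlap_similarity`, `matches_signature`, and the 0.8 threshold are all hypothetical.

```python
import re
from collections import Counter

# Simplified lexer (an assumption, not a full JavaScript grammar):
# comments and strings are matched in one pass so that "//" inside a
# string literal such as a URL is not mistaken for a comment.
TOKEN_RE = re.compile(
    r"""
    (?P<comment>//[^\n]*|/\*.*?\*/)
  | (?P<string>"(?:\\.|[^"\\\n])*"|'(?:\\.|[^'\\\n])*')
  | (?P<word>[A-Za-z_$][A-Za-z0-9_$]*)
  | (?P<number>\d+(?:\.\d+)?)
    """,
    re.VERBOSE | re.DOTALL,
)


def tokenize_js(source: str) -> Counter:
    """Return a bag of tokens (token -> frequency), skipping comments."""
    bag = Counter()
    for match in TOKEN_RE.finditer(source):
        if match.lastgroup != "comment":
            bag[match.group()] += 1
    return bag


def overlap_similarity(a: Counter, b: Counter) -> float:
    """Shared tokens (with multiplicity) divided by the larger bag's size."""
    shared = sum((a & b).values())  # Counter & Counter keeps minimum counts
    largest = max(sum(a.values()), sum(b.values()))
    return shared / largest if largest else 0.0


def matches_signature(source: str, signature: str, threshold: float = 0.8) -> bool:
    """Flag source whose token bag overlaps a malware signature enough."""
    score = overlap_similarity(tokenize_js(source), tokenize_js(signature))
    return score >= threshold


# Hypothetical signature and samples, for illustration only.
signature = 'var h = require("http"); h.get("http://evil.example/x");'
suspicious = '// fetch\nvar h = require("http");\nh.get("http://evil.example/x");'
benign = "function add(a, b) { return a + b; }"

print(matches_signature(suspicious, signature))
print(matches_signature(benign, signature))
```

Because the comparison works on token bags rather than raw text, reformatting or adding comments to a malicious file does not change its score against the signature.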

The use case is to find malicious packages in open-source package registries like npm.

For details on Sourcerer, see the ICSE 2016 paper. For malware samples, have a look at the Backstabber's Knife Collection dataset.