RS-Del: Robustness Certificates for Sequence Classifiers via Randomized Deletion

This repository hosts the implementation of our submission to NeurIPS 2023 titled "RS-Del: Edit Distance Robustness Certificates for Sequence Classifiers via Randomized Deletion".

📂 Directory Structure

├── configs
│   ├── certify-exp                   # Configs for evaluation step
│   ├── models                        # Configs for malware detection models
│   └── repeat-forward-exp            # Configs for sampling step
├── data
│   ├── binaries                      # Executables for training and evaluation
│   └── {test,train,valid}.csv        # CSV files for data partitioning
├── docker                            # Docker deployment files
├── outputs                           # Directory for experimental outputs
├── run_scripts                       # Shell scripts for running experiment steps
└── src                               # Source code directory
    ├── torchmalware                  # Python package with core implementations
    ├──                      # Script for training models
    ├──         # Script for sampling perturbed inputs
    ├──    # Script for computing FPR curve
    └── # Script for computing certified radius

🚀 Getting Started

1. Model Training

  • Train the smoothed model using data augmentation via src/
  • Example: See run_scripts/
python src/ --conf configs/models/sample_config.yaml
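
The augmentation used in this step can be illustrated with a minimal sketch. It assumes that the perturbation deletes each byte of the input independently with a fixed probability. The names `random_deletion` and `p_del` are hypothetical and chosen for illustration; they are not taken from the torchmalware package.

```python
import random


def random_deletion(seq: bytes, p_del: float, rng: random.Random) -> bytes:
    """Return a copy of `seq` with each byte dropped independently with probability `p_del`.

    Hypothetical helper for illustration only; not the repository's implementation.
    """
    if not 0.0 <= p_del < 1.0:
        raise ValueError("p_del must lie in [0, 1)")
    # Keep a byte when the uniform draw is at least p_del, i.e. with probability 1 - p_del.
    return bytes(b for b in seq if rng.random() >= p_del)


rng = random.Random(0)               # fixed seed so the sketch is reproducible
original = bytes(range(256)) * 4     # stand-in for the raw bytes of an executable
augmented = random_deletion(original, p_del=0.5, rng=rng)
print(len(original), len(augmented))
```

During training, a fresh perturbed copy would be drawn every time a sample is loaded, so that the base model sees inputs from the same distribution it is later evaluated on.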

2. Prediction, Certification & Calibration Sampling

  • Save base model confidence scores via src/
  • Example: See run_scripts/
python src/ --conf configs/repeat-forward-exp/sample_config.yaml
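
As a rough sketch of what the sampling step does, the loop below draws several perturbed copies of one input and records the base model's confidence on each. The function `sample_confidences`, the callable `base_model` and the stand-in `toy_model` are hypothetical; the actual script reads its settings from the YAML config and runs a trained network.

```python
import random
from typing import Callable, List


def sample_confidences(
    seq: bytes,
    base_model: Callable[[bytes], float],
    p_del: float,
    n_samples: int,
    seed: int = 0,
) -> List[float]:
    """Score `n_samples` randomly deleted copies of `seq` with `base_model`."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_samples):
        # Assumed perturbation: drop each byte independently with probability p_del.
        perturbed = bytes(b for b in seq if rng.random() >= p_del)
        scores.append(base_model(perturbed))
    return scores


def toy_model(seq: bytes) -> float:
    """Hypothetical stand-in for a trained detector: fraction of 0xFF bytes."""
    return seq.count(0xFF) / len(seq) if seq else 0.0


scores = sample_confidences(b"\xff\x00" * 50, toy_model, p_del=0.1, n_samples=100)
print(len(scores), min(scores), max(scores))
```

Saving the per-sample scores once lets the calibration and certification steps reuse them without repeating the forward passes.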

3. False-Positive Rate Calibration (Optional)

  • Vary the decision threshold and compute the FPR via src/
  • Example: See run_scripts/
python src/ --path model/checkpoint.pth --repeat-conf configs/repeat-forward-exp/sample_config.yaml
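
The idea behind the calibration can be sketched as follows. The sketch assumes that a higher score means "malicious", that a sample is flagged when its score is at least the threshold, and that only scores of benign samples are passed in. All function names are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def false_positive_rate(benign_scores: Sequence[float], threshold: float) -> float:
    """Fraction of benign samples whose score is at or above `threshold`."""
    return sum(s >= threshold for s in benign_scores) / len(benign_scores)


def fpr_curve(
    benign_scores: Sequence[float], thresholds: Sequence[float]
) -> List[Tuple[float, float]]:
    """Pair every candidate threshold with its false-positive rate."""
    return [(t, false_positive_rate(benign_scores, t)) for t in thresholds]


def calibrate_threshold(
    benign_scores: Sequence[float], thresholds: Sequence[float], target_fpr: float
) -> Optional[float]:
    """Smallest candidate threshold whose FPR does not exceed `target_fpr`, if any."""
    for t, fpr in fpr_curve(benign_scores, sorted(thresholds)):
        if fpr <= target_fpr:
            return t
    return None


benign_scores = [0.1, 0.2, 0.3, 0.4, 0.9]   # made-up scores for illustration
candidates = [0.0, 0.25, 0.5, 0.75, 1.0]
chosen = calibrate_threshold(benign_scores, candidates, target_fpr=0.2)
print(fpr_curve(benign_scores, candidates), chosen)
```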

4. Certification

  • Compute the certified radius via src/
  • Example: See run_scripts/
python src/ --repeat-conf configs/repeat-forward-exp/sample_config.yaml --certify-conf configs/certify-exp/sample_config.yaml
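
For intuition, the sketch below turns a lower confidence bound into an edit distance radius. It assumes the closed form r = floor(log(1 + nu - mu) / log(p_del)), where mu is a lower bound on the smoothed confidence for the predicted class, nu is the decision threshold and p_del is the deletion probability. This form is an assumption of the sketch and should be checked against the paper before it is relied on; the function name and arguments are hypothetical.

```python
import math
from typing import Optional


def certified_radius(mu_lower: float, threshold: float, p_del: float) -> Optional[int]:
    """Radius under the assumed closed form, or None when nothing can be certified."""
    if not 0.0 < p_del < 1.0:
        raise ValueError("p_del must lie strictly between 0 and 1")
    if not (0.0 <= mu_lower <= 1.0 and 0.0 <= threshold <= 1.0):
        raise ValueError("mu_lower and threshold must lie in [0, 1]")
    if mu_lower <= threshold:
        return None  # the confidence bound does not clear the threshold
    gap = 1.0 + threshold - mu_lower
    if gap <= 0.0:
        raise ValueError("degenerate case: the assumed formula gives an unbounded radius")
    # Both logarithms are negative here, so the ratio is non-negative.
    return math.floor(math.log(gap) / math.log(p_del))


radius = certified_radius(mu_lower=0.99, threshold=0.5, p_del=0.97)
print(radius)
```

Under this assumed form, a larger p_del or a tighter confidence bound yields a larger radius, which is why the number of samples drawn in step 2 affects the certificates obtained here.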

🐳 Docker Deployment

Run steps 1-4 inside the provided Docker container:

git clone $REPO_NAME $DEST
cd $DEST/run_scripts
chmod +x ./
./ -p $SH_PATH -m $MEM -c $NUM_CORES -g $GPU_ID
  • To run all steps (1-4) sequentially, use run_scripts/ (not recommended because of the long running time).

📊 Reproducing Experiments

To reproduce the experiments on your own dataset, follow the instructions in data/

📄 License

This project is licensed under the MIT License - see the file for details.

Cite us as

  author    = {Huang, Zhuoqun and Marchant, Neil and Lucas, Keane and Bauer, Lujo and Ohrimenko, Olya and Rubinstein, Benjamin I. P.},
  title     = {{RS-Del}: Edit Distance Robustness Certificates for Sequence Classifiers via Randomized Deletion},
  year      = {2023},
  booktitle = {Advances in Neural Information Processing Systems},
  series    = {NeurIPS},