This repo contains implementations of several linguistic steganography methods from the paper "Near-imperceptible Neural Linguistic Steganography via Self-Adjusting Arithmetic Coding", published at EMNLP 2020.
Install all dependencies listed in requirements.txt (for example, with pip install -r requirements.txt). You also need to download the gpt2-medium model (345M parameters) from the transformers library.
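
One way to fetch the model ahead of time is sketched below. This is an illustration, not code from this repo: the helper name load_gpt2_medium and its cache_dir parameter are hypothetical, and the repo's scripts may load the model differently. It assumes the standard from_pretrained API of the transformers library, which downloads the weights on first use and reuses the local cache afterwards.

```python
def load_gpt2_medium(cache_dir=None):
    """Download (on first call) and load gpt2-medium.

    Hypothetical helper for illustration; not part of this repo.
    Returns a (tokenizer, model) pair.
    """
    # Imported inside the function so that defining the helper does not
    # require transformers to be installed yet.
    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2-medium", cache_dir=cache_dir)
    model = GPT2LMHeadModel.from_pretrained("gpt2-medium", cache_dir=cache_dir)
    model.eval()  # inference only: disable dropout
    return tokenizer, model
```

Calling load_gpt2_medium() once while online should populate the local cache, so later runs do not need to download the weights again.
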
All four datasets mentioned in the paper are in the datasets/ folder.

- block_baseline.py: implementation of the baseline method Bin-LM in the paper.
- huffman_baseline.py: implementation of the baseline method RNN-Stega in the paper.
- arithmetic_baseline.py: implementation of the baseline method Arithmetic in the paper.
- saac.py: implementation of our proposed method SAAC in the paper.

You can run all steganography methods in two modes:

- run_single_end2end.py: a script that runs through the entire steganography pipeline (i.e., encryption -> encoding -> decoding -> decryption) on one plaintext.
- run_batch_encode.py: a script that runs the encryption and encoding steps on a batch of plaintexts.

Example commands are included in run_all.sh.
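
To make the four pipeline stages (encryption -> encoding -> decoding -> decryption) concrete, here is a minimal, self-contained toy sketch. It is not the code in run_single_end2end.py and does not use GPT-2. Everything in it is an assumption made for illustration: the eight-word VOCAB, the SHA-256 keystream standing in for the encryption step, and a fixed block-style bin lookup standing in for a language-model-driven encoder. All function names are hypothetical.

```python
import hashlib
from typing import List

# Hypothetical toy vocabulary; the real methods use a language model's vocabulary.
VOCAB = ["the", "a", "cat", "dog", "sat", "ran", "here", "there"]
BLOCK_BITS = 2               # bits hidden per token
NUM_BINS = 2 ** BLOCK_BITS   # token at index i belongs to bin i % NUM_BINS


def text_to_bits(text: str) -> List[int]:
    return [(byte >> i) & 1 for byte in text.encode("utf-8") for i in range(7, -1, -1)]


def bits_to_text(bits: List[int]) -> str:
    chunks = (bits[i:i + 8] for i in range(0, len(bits), 8))
    return bytes(int("".join(map(str, c)), 2) for c in chunks).decode("utf-8")


def keystream(key: bytes, n_bits: int) -> List[int]:
    """Expand a shared key into n_bits bits (toy stand-in for real encryption)."""
    bits: List[int] = []
    counter = 0
    while len(bits) < n_bits:
        digest = hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        for byte in digest:
            bits.extend((byte >> i) & 1 for i in range(7, -1, -1))
        counter += 1
    return bits[:n_bits]


def xor_with_key(bits: List[int], key: bytes) -> List[int]:
    """XOR is its own inverse, so this serves as both encryption and decryption."""
    return [b ^ k for b, k in zip(bits, keystream(key, len(bits)))]


def encode(bits: List[int]) -> List[str]:
    """Each BLOCK_BITS-bit block selects a bin; one token from that bin is emitted."""
    tokens = []
    for pos, i in enumerate(range(0, len(bits), BLOCK_BITS)):
        bin_id = int("".join(map(str, bits[i:i + BLOCK_BITS])), 2)
        # A real encoder would let the language model choose within the bin;
        # this toy alternates between the bin's two tokens.
        tokens.append(VOCAB[bin_id + NUM_BINS * (pos % 2)])
    return tokens


def decode(tokens: List[str]) -> List[int]:
    """Recover each block from the bin that the token belongs to."""
    bits: List[int] = []
    for token in tokens:
        bin_id = VOCAB.index(token) % NUM_BINS
        bits.extend((bin_id >> i) & 1 for i in range(BLOCK_BITS - 1, -1, -1))
    return bits


def hide(plaintext: str, key: bytes) -> List[str]:
    return encode(xor_with_key(text_to_bits(plaintext), key))      # encryption -> encoding


def reveal(tokens: List[str], key: bytes) -> str:
    return bits_to_text(xor_with_key(decode(tokens), key))          # decoding -> decryption
```

With a shared key, reveal(hide("hi", b"secret"), b"secret") recovers the original plaintext. The methods in this repo differ from the toy mainly in the encoding step, where a language model's next-token distribution determines which tokens carry which bits.
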
@inproceedings{Shen2020SAAC,
title={Near-imperceptible Neural Linguistic Steganography via Self-Adjusting Arithmetic Coding},
author={Jiaming Shen and Heng Ji and Jiawei Han},
booktitle={EMNLP},
year={2020}
}