This repo contains implementations of several linguistic steganography methods from the paper "Near-imperceptible Neural Linguistic Steganography via Self-Adjusting Arithmetic Coding", published at EMNLP 2020.
You need to install all dependent libraries listed in requirements.txt. You also need to download the gpt2-medium model (345M parameters) from the transformers library.
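As a rough sketch of the download step, the snippet below fetches and caches gpt2-medium through the transformers library. It assumes the `GPT2LMHeadModel` and `GPT2Tokenizer` classes are available in the transformers version pinned in requirements.txt; the helper name `load_gpt2_medium` is hypothetical and is not part of this repo.

```python
# Minimal sketch, not the repo's own loading code.
MODEL_NAME = "gpt2-medium"  # the 345M-parameter GPT-2 checkpoint


def load_gpt2_medium(model_name=MODEL_NAME):
    """Download (on first call) and load the tokenizer and language model.

    `load_gpt2_medium` is a hypothetical helper used only for illustration.
    """
    # Imported lazily so that defining this helper does not require transformers.
    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained(model_name)
    model = GPT2LMHeadModel.from_pretrained(model_name)
    model.eval()  # inference mode: disables dropout
    return tokenizer, model


# Usage (requires network access on the first call):
# tokenizer, model = load_gpt2_medium()
```

The first call stores the weights in the local transformers cache, so later runs should not need to download them again.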
We put all four datasets mentioned in the paper into the datasets/ folder.
- block_baseline.py: implementation of the baseline method Bin-LM in the paper.
- huffman_baseline.py: implementation of the baseline method RNN-Stega in the paper.
- arithmetic_baseline.py: implementation of the baseline method Arithmetic in the paper.
- saac.py: implementation of our proposed method SAAC in the paper.
You can run all steganography methods in two modes:
- run_single_end2end.py: a script to run through the entire steganography pipeline (i.e., encryption -> encoding -> decoding -> decryption) on one plaintext.
- run_batch_encode.py: a script to run the encryption + encoding steps on a batch of plaintexts.
Example commands are included in run_all.sh.
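To make the four pipeline stages concrete, here is a self-contained toy round trip. It is only an illustration under loud assumptions: the XOR keystream cipher, the 16-word vocabulary, the fixed bins, and every function name are hypothetical stand-ins, not the code in run_single_end2end.py. The encoder is a simple bin-based scheme (each 2-bit block selects a bin of words), with a trivial rule standing in for the language model that the real methods use to pick tokens.

```python
import hashlib
from typing import List, Tuple

# Hypothetical toy vocabulary split into 4 bins; each bin stands for 2 bits.
BINS = [
    ["the", "a", "one", "this"],          # bits 00
    ["cat", "dog", "bird", "fish"],       # bits 01
    ["sees", "likes", "finds", "hears"],  # bits 10
    ["me", "you", "us", "them"],          # bits 11
]
BITS_PER_TOKEN = 2
TOKEN_TO_BIN = {tok: i for i, words in enumerate(BINS) for tok in words}


def keystream(key: bytes, n_bytes: int) -> bytes:
    """Toy keystream from SHA-256 in counter mode (illustration only, not secure)."""
    out, counter = b"", 0
    while len(out) < n_bytes:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:n_bytes]


def encrypt(plaintext: str, key: bytes) -> str:
    """Encryption: plaintext -> ciphertext bit string."""
    data = plaintext.encode("utf-8")
    cipher = bytes(b ^ k for b, k in zip(data, keystream(key, len(data))))
    return "".join(f"{b:08b}" for b in cipher)


def decrypt(bits: str, key: bytes) -> str:
    """Decryption: ciphertext bit string -> plaintext."""
    cipher = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    data = bytes(b ^ k for b, k in zip(cipher, keystream(key, len(cipher))))
    return data.decode("utf-8")


def encode(bits: str) -> List[str]:
    """Encoding: each 2-bit block picks a bin; a fixed rule picks a word in it."""
    tokens = []
    for step, i in enumerate(range(0, len(bits), BITS_PER_TOKEN)):
        candidates = BINS[int(bits[i:i + BITS_PER_TOKEN], 2)]
        # Stand-in for a language model choosing the most likely word in the bin.
        tokens.append(candidates[step % len(candidates)])
    return tokens


def decode(tokens: List[str]) -> str:
    """Decoding: recover each 2-bit block from the bin that contains the word."""
    return "".join(f"{TOKEN_TO_BIN[t]:0{BITS_PER_TOKEN}b}" for t in tokens)


def run_end2end(plaintext: str, key: bytes) -> Tuple[str, str]:
    """encryption -> encoding -> decoding -> decryption on one plaintext."""
    cover_tokens = encode(encrypt(plaintext, key))
    recovered = decrypt(decode(cover_tokens), key)
    return " ".join(cover_tokens), recovered
```

Calling `run_end2end("hi", b"secret")` returns a cover text of eight words (16 ciphertext bits at 2 bits per word) together with the recovered plaintext, which should equal the input. The real methods replace the fixed bins with distributions from gpt2-medium, so their cover text reads far more naturally than this toy output.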
@inproceedings{Shen2020SAAC,
title={Near-imperceptible Neural Linguistic Steganography via Self-Adjusting Arithmetic Coding},
author={Jiaming Shen and Heng Ji and Jiawei Han},
booktitle={EMNLP},
year={2020}
}