Near-imperceptible Neural Linguistic Steganography via Self-Adjusting Arithmetic Coding

Jiaming Shen, Heng Ji, Jiawei Han


Abstract
Linguistic steganography studies how to hide secret messages in natural language cover texts. Traditional methods aim to transform a secret message into an innocent text via lexical substitution or syntactical modification. Recently, advances in neural language models (LMs) enable us to directly generate cover text conditioned on the secret message. In this study, we present a new linguistic steganography method which encodes secret messages using self-adjusting arithmetic coding based on a neural language model. We formally analyze the statistical imperceptibility of this method and empirically show it outperforms the previous state-of-the-art methods on four datasets by 15.3% and 38.9% in terms of bits/word and KL metrics, respectively. Finally, human evaluations show that 51% of generated cover texts can indeed fool eavesdroppers.
Anthology ID:
2020.emnlp-main.22
Volume:
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
Month:
November
Year:
2020
Address:
Online
Editors:
Bonnie Webber, Trevor Cohn, Yulan He, Yang Liu
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
303–313
URL:
https://aclanthology.org/2020.emnlp-main.22
DOI:
10.18653/v1/2020.emnlp-main.22
Cite (ACL):
Jiaming Shen, Heng Ji, and Jiawei Han. 2020. Near-imperceptible Neural Linguistic Steganography via Self-Adjusting Arithmetic Coding. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 303–313, Online. Association for Computational Linguistics.
Cite (Informal):
Near-imperceptible Neural Linguistic Steganography via Self-Adjusting Arithmetic Coding (Shen et al., EMNLP 2020)
PDF:
https://aclanthology.org/2020.emnlp-main.22.pdf
Video:
https://slideslive.com/38938685
Code:
mickeystroller/StegaText