Dataset

Related Work

Ian J Goodfellow, Jonathon Shlens, and Christian Szegedy. Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572, 2014.
Alexey Kurakin, Ian Goodfellow, and Samy Bengio. Adversarial examples in the physical world. arXiv preprint arXiv:1607.02533, 2016.
Nicolas Papernot, Patrick McDaniel, Somesh Jha, Matt Fredrikson, Z Berkay Celik, and Ananthram Swami. The limitations of deep learning in adversarial settings. In Security and Privacy (EuroS&P), 2016 IEEE European Symposium on, pp. 372–387. IEEE, 2016.
Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, and Rob Fergus. Intriguing properties of neural networks. In ICLR 2014, 2014.

Yonatan Belinkov and Yonatan Bisk. 2018. Synthetic and natural noise both break neural machine translation. In Proceedings of ICLR.
Adversarial Examples for Natural Language Classification Problems
HotFlip: White-Box Adversarial Examples for Text Classification
Adversarial Examples for Evaluating Reading Comprehension Systems
Papernot, N., McDaniel, P., Swami, A., & Harang, R. (2016, November). Crafting adversarial input sequences for recurrent neural networks. In Military Communications Conference, MILCOM 2016-2016 IEEE (pp. 49-54). IEEE.
Deceiving Google’s Perspective API Built for Detecting Toxic Comments
Towards Crafting Text Adversarial Samples
Deep Text Classification Can be Fooled
Generating Natural Language Adversarial Examples

Yoon Kim. Convolutional neural networks for sentence classification. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2014.
Sepp Hochreiter and Jürgen Schmidhuber. Long short-term memory. Neural computation, 9(8):1735–1780, 1997.

Accuracy, precision, recall
Loss function analysis
- Increase in loss after the perturbation
- Change in classifier confidence (i.e. the probability given by the sigmoid output)
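
A minimal sketch of how the metrics and the loss analysis listed above could be computed, assuming a binary classifier with a sigmoid output and labels in {0, 1}. The function names and the choice of binary cross-entropy as the loss are assumptions made for illustration; the outline does not fix them.

```python
import math


def accuracy_precision_recall(y_true, y_pred):
    """Compute accuracy, precision, and recall for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


def binary_cross_entropy(prob, label, eps=1e-12):
    """Cross-entropy of one sigmoid output `prob` against a 0/1 `label`."""
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


def loss_and_confidence_change(prob_clean, prob_adv, label):
    """Return (increase in loss, change in confidence for the true class).

    `prob_clean` and `prob_adv` are the sigmoid outputs for the original and
    the perturbed sentence (hypothetical inputs for this sketch).
    """
    loss_increase = binary_cross_entropy(prob_adv, label) - binary_cross_entropy(prob_clean, label)

    def true_class_confidence(prob):
        return prob if label == 1 else 1.0 - prob

    confidence_change = true_class_confidence(prob_adv) - true_class_confidence(prob_clean)
    return loss_increase, confidence_change
```

A successful attack would typically show a positive loss increase and a negative confidence change for the true class; comparing accuracy, precision, and recall on clean versus perturbed inputs gives the aggregate view.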

Imperceptibility analysis
- Human evaluation
- Quantitative measurement: thought vectors
Sentence error analysis
- Syntactic error: the word replacement introduces a grammatical error
- Semantic error: the meaning of the sentence changes after word replacement
- Counterfactual error: a fact stated in the sentence becomes incorrect after word replacement
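
One possible reading of the quantitative imperceptibility measurement above is to embed the original and the perturbed sentence as thought vectors and compare them with cosine similarity. The sketch below covers only the comparison step. It assumes the sentence vectors have already been produced by some encoder, which is not specified here, and the name `cosine_similarity` is illustrative.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length sentence vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen for this sketch: zero vectors match nothing
    return dot / (norm_u * norm_v)
```

A value close to 1 would suggest that the perturbed sentence stays close to the original in the embedding space. Human evaluation remains necessary to confirm that the change is actually imperceptible.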

Research question, data collection, related work (adversarial learning), Experiment (@Erica)
Related work (adversarial learning for NLP x5), Experiment (@Alicia)
Related work (adversarial learning for NLP x3), Evaluation, Experiment, Future work (@Tobey)