My thesis is about: Enhancing transferability to protect against textual adversarial examples.
The steps of my research are as follows:
- Select one NLP task (for example, text classification)
- Select two models for solving the NLP task
- Select one attack method
- Compute the attack success rate for the two models
- Compute the attack transferability from one model to the other
- Enhance the transferability of the attack
- Perform adversarial training with the enhanced transferable attack method
- Selected task: sentiment analysis on the `rotten_tomatoes` dataset
- Selected models: `textattack/xlnet-base-cased-rotten-tomatoes` and `textattack/roberta-base-rotten-tomatoes`
- Selected attack method: `BAEGarg2019`
- Compute attack transferability from `textattack/xlnet-base-cased-rotten-tomatoes` to `textattack/roberta-base-rotten-tomatoes`
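
To make the two measured quantities concrete, here is a minimal sketch of how attack success rate and transferability could be computed from model predictions. It rests on some assumptions that are not stated above: labels and predictions are integer class IDs, an attack is only attempted on inputs the model classifies correctly, and transferability is counted only over adversarial examples that fooled the source model and whose original text the target model classified correctly. The function names `attack_success_rate` and `transfer_rate` are hypothetical and do not come from any library. Other definitions of transferability exist, so this is one possible choice and not a definitive one.

```python
from typing import Sequence


def attack_success_rate(
    labels: Sequence[int],
    clean_preds: Sequence[int],
    adv_preds: Sequence[int],
) -> float:
    """Share of correctly classified inputs whose prediction the attack flips.

    Assumption: adv_preds[i] is the model's prediction on the attacked
    version of input i (or on the original text if the attack failed).
    """
    attempted = 0
    succeeded = 0
    for y, clean, adv in zip(labels, clean_preds, adv_preds):
        if clean != y:
            continue  # already misclassified, so no attack is counted
        attempted += 1
        if adv != y:
            succeeded += 1
    return succeeded / attempted if attempted else 0.0


def transfer_rate(
    labels: Sequence[int],
    source_clean: Sequence[int],
    source_adv: Sequence[int],
    target_clean: Sequence[int],
    target_adv: Sequence[int],
) -> float:
    """Share of source-fooling adversarial examples that also fool the target.

    Assumption: target_adv[i] is the target model's prediction on the
    adversarial text that was crafted against the source model.
    """
    eligible = 0
    transferred = 0
    for y, s_clean, s_adv, t_clean, t_adv in zip(
        labels, source_clean, source_adv, target_clean, target_adv
    ):
        fooled_source = s_clean == y and s_adv != y
        if not fooled_source or t_clean != y:
            continue  # only count examples that are meaningful for transfer
        eligible += 1
        if t_adv != y:
            transferred += 1
    return transferred / eligible if eligible else 0.0
```

In this setup the source predictions would come from `textattack/xlnet-base-cased-rotten-tomatoes` and the target predictions from `textattack/roberta-base-rotten-tomatoes`, with the adversarial texts produced by running the attack against the source model only.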