Defense against Adversarial Attacks in NLP via Dirichlet Neighborhood Ensemble
Abstract:
Although neural networks have achieved prominent performance on many natural language processing (NLP) tasks, they are vulnerable to adversarial examples. In this paper, we propose Dirichlet Neighborhood Ensemble (DNE), a randomized smoothing method for training a robust model to defend against substitution-based attacks. During training, DNE f...