Self-Attention-based Data Augmentation Method for Text Classification.

ICMLC (2023)

Abstract
Text classification, where textual data is analyzed to extract meaningful information, has many applications in information extraction and data management. Recently, deep-learning models have been applied to this problem with success; however, they require sufficient labeled training data to produce a robust model, and performance suffers in low-resource domains where such data is unavailable and collecting or creating labeled examples is costly in money, energy, and time. To address this problem, we propose an effective data augmentation approach for text classification. Our method employs a self-attention mechanism to guide the augmentation: depending on the scenario, we alter and substitute either the words with the highest attention scores or the words with low scores. Experimental results show that our method performs at least as well as current approaches in most scenarios and outperforms them in some cases by as much as seven percent.
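The abstract does not specify the model, scoring details, or substitution strategy, so the following is only a rough sketch of the general idea: compute self-attention over a sentence, rank tokens by the attention they receive, and substitute the top-ranked token. The toy embeddings, the identity query/key projections, and the synonym table are all hypothetical stand-ins, not the paper's actual components.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention_scores(embeddings):
    """Toy single-head self-attention with identity Q/K projections.

    Returns one importance score per token: the average attention
    that token receives from all query positions.
    """
    d = len(embeddings[0])
    rows = []
    for q in embeddings:
        logits = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in embeddings]
        rows.append(softmax(logits))
    n = len(embeddings)
    return [sum(row[j] for row in rows) / n for j in range(n)]

def augment(tokens, embeddings, synonyms, top_k=1):
    """Replace the top_k highest-attention tokens with synonyms.

    `synonyms` is a hypothetical lookup table; tokens without an
    entry are left unchanged. Flipping `reverse=False` would target
    low-attention words instead, the other scenario in the abstract.
    """
    importance = attention_scores(embeddings)
    ranked = sorted(range(len(tokens)),
                    key=lambda i: importance[i], reverse=True)
    out = list(tokens)
    for i in ranked[:top_k]:
        out[i] = synonyms.get(tokens[i], tokens[i])
    return out

# Example: "great" has the largest embedding norm, so it dominates
# the attention it receives and is the token that gets substituted.
tokens = ["the", "movie", "was", "great"]
embeddings = [[0.1, 0.1], [0.2, 0.1], [0.1, 0.2], [3.0, 3.0]]
synonyms = {"great": "excellent"}
print(augment(tokens, embeddings, synonyms))
# → ['the', 'movie', 'was', 'excellent']
```

In a real implementation the embeddings and attention weights would come from a trained encoder rather than toy vectors, and the substitution source would be a thesaurus, masked-language-model predictions, or similar; this sketch only illustrates how attention ranking can select which words to perturb.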