Improving a text classifier using text augmentation: road traffic content from Twitter
2023 20th International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology (ECTI-CON) (2023)
Abstract
This study aims to develop a more effective method for classifying Thai-language tweets about road traffic into five categories. Previous studies have used CNN and BERT for classification but found that good performance requires balanced data. To address this, we propose using BPEmb to augment the data and compute cosine similarity. The augmented data are then used to create a balanced dataset for training a combined CNN and bi-LSTM model for tweet classification. Our experiment demonstrates a significant improvement in tweet classification, with a 14.3% increase in F1-score over the baseline method.
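The abstract does not spell out how BPEmb and cosine similarity are combined, so the following is only a minimal sketch of one plausible reading: replace a token with its nearest neighbour in embedding space when the cosine similarity exceeds a threshold. The names `TOY_EMBEDDINGS`, `most_similar`, and `augment`, the English toy vocabulary, the hand-written 3-dimensional vectors, and the threshold of 0.8 are all assumptions made for illustration; a real pipeline would load Thai subword vectors from BPEmb instead.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_u * norm_v)


def most_similar(token, embeddings, threshold=0.8):
    """Return the closest other token at or above the threshold, else None."""
    if token not in embeddings:
        return None
    best_token, best_score = None, threshold
    for candidate, vector in embeddings.items():
        if candidate == token:
            continue
        score = cosine_similarity(embeddings[token], vector)
        if score >= best_score:
            best_token, best_score = candidate, score
    return best_token


def augment(tokens, embeddings, threshold=0.8):
    """Build a new sample by swapping tokens for close neighbours."""
    return [most_similar(t, embeddings, threshold) or t for t in tokens]


# Hypothetical toy vectors standing in for BPEmb subword embeddings.
TOY_EMBEDDINGS = {
    "crash": [0.9, 0.1, 0.0],
    "collision": [0.85, 0.15, 0.05],
    "jam": [0.1, 0.9, 0.0],
    "congestion": [0.12, 0.88, 0.02],
    "road": [0.0, 0.1, 0.95],
}

augmented = augment(["crash", "on", "road"], TOY_EMBEDDINGS)
print(augmented)
```

Under this reading, augmented copies of tweets from under-represented categories would be added until the five categories are roughly equal in size. Tokens with no sufficiently similar neighbour, or with no embedding at all, are left unchanged.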
Keywords
Deep Learning, Text Augmentation, Twitter Data Analytics, Road Traffic Incident, Text Classification