Entity-Sensitive Attention and Fusion Network for Entity-Level Multimodal Sentiment Classification

IEEE/ACM Transactions on Audio, Speech, and Language Processing (2020)

Citations: 112 | Views: 467
Abstract
Entity-level (aka target-dependent) sentiment analysis of social media posts has recently attracted increasing attention; its goal is to predict the sentiment orientation toward each target entity mentioned in a user's post. Most existing approaches to this task rely primarily on the textual content and fail to consider other important data sources (e.g., images, videos, and user profiles) that could enhance these text-based approaches. Motivated by this observation, we study entity-level multimodal sentiment classification in this article and explore the usefulness of images for entity-level sentiment detection in social media posts. Specifically, we propose an Entity-Sensitive Attention and Fusion Network (ESAFN) for this task. First, to capture the intra-modality dynamics, ESAFN leverages an effective attention mechanism to generate entity-sensitive textual representations, followed by aggregating them with a textual fusion layer. Next, ESAFN learns an entity-sensitive visual representation with an entity-oriented visual attention mechanism, followed by a gated mechanism that eliminates noisy visual context. Moreover, to capture the inter-modality dynamics, ESAFN further fuses the textual and visual representations with a bilinear interaction layer. To evaluate the effectiveness of ESAFN, we manually annotate the sentiment orientation toward each given entity in two recently released multimodal NER datasets, and show that ESAFN significantly outperforms several highly competitive unimodal and multimodal methods.
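To make the described data flow concrete, below is a minimal PyTorch sketch of the pipeline the abstract outlines: entity-sensitive textual attention and fusion, entity-oriented visual attention with a noise gate, and bilinear inter-modality fusion. Every module name, layer size, and attention form here is an illustrative assumption (the abstract gives no implementation details); this is not the authors' actual ESAFN code.

```python
# Minimal sketch of the ESAFN fusion pipeline, assuming additive-style
# attention and common layer sizes; the paper's architecture may differ.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ESAFNSketch(nn.Module):
    def __init__(self, d_text=200, d_vis=2048, d_hid=200, n_classes=3):
        super().__init__()
        # Entity-sensitive textual attention: scores each context word
        # against the entity representation.
        self.txt_attn = nn.Linear(2 * d_text, 1)
        # Textual fusion layer: combines left-context, entity, and
        # right-context vectors into one textual representation.
        self.txt_fuse = nn.Linear(3 * d_text, d_hid)
        # Entity-oriented visual attention over regional image features.
        self.vis_attn = nn.Linear(d_vis + d_hid, 1)
        self.vis_proj = nn.Linear(d_vis, d_hid)
        # Gate that can suppress noisy visual context.
        self.vis_gate = nn.Linear(2 * d_hid, d_hid)
        # Bilinear interaction layer for inter-modality fusion.
        self.bilinear = nn.Bilinear(d_hid, d_hid, d_hid)
        self.classifier = nn.Linear(d_hid, n_classes)

    @staticmethod
    def attend(score_layer, keys, query):
        # Additive-style attention: concatenate the query to every key,
        # score, softmax over positions, and return the weighted sum.
        q = query.unsqueeze(1).expand(-1, keys.size(1), -1)
        scores = score_layer(torch.cat([keys, q], dim=-1)).squeeze(-1)
        weights = F.softmax(scores, dim=-1)                       # (B, N)
        return torch.bmm(weights.unsqueeze(1), keys).squeeze(1)   # (B, d_key)

    def forward(self, left_ctx, right_ctx, entity, img_regions):
        # left_ctx, right_ctx: (B, L, d_text) word vectors around the entity
        # entity:              (B, d_text)   pooled entity representation
        # img_regions:         (B, R, d_vis) regional CNN features of the image
        left = self.attend(self.txt_attn, left_ctx, entity)
        right = self.attend(self.txt_attn, right_ctx, entity)
        h_text = torch.tanh(self.txt_fuse(torch.cat([left, entity, right], dim=-1)))

        # Entity-sensitive visual representation, then gate out noise.
        v = self.attend(self.vis_attn, img_regions, h_text)
        h_vis = torch.tanh(self.vis_proj(v))
        gate = torch.sigmoid(self.vis_gate(torch.cat([h_text, h_vis], dim=-1)))
        h_vis = gate * h_vis

        # Inter-modality dynamics via bilinear interaction, then classify.
        h = torch.tanh(self.bilinear(h_text, h_vis))
        return self.classifier(h)  # sentiment logits, e.g. {neg, neu, pos}

# Shape check on random inputs (B=4, L=10 context words, R=49 regions):
model = ESAFNSketch()
logits = model(torch.randn(4, 10, 200), torch.randn(4, 10, 200),
               torch.randn(4, 200), torch.randn(4, 49, 2048))
print(logits.shape)  # torch.Size([4, 3])
```

The sigmoid gate is the standard way to realize the abstract's "gated mechanism to eliminate the noisy visual context": when the image is irrelevant to the entity, the gate can drive the visual contribution toward zero before the bilinear fusion.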
Keywords
Visualization, Social network services, Sentiment analysis, Task analysis, Context modeling, Neural networks, Speech processing