EmoInt-Trans: A Multimodal Transformer for Identifying Emotions and Intents in Social Conversations

IEEE/ACM Transactions on Audio, Speech, and Language Processing (2023)

Abstract
Open-domain conversational agents, also known as chatbots, are gaining popularity in the natural language processing community. One difficulty is getting them to communicate in an emotionally intelligent manner. Current neural response generation methods depend solely on end-to-end learning from large-scale conversation data to generate dialogues. To address this, we introduce a large-scale multi Emotion and Intent guided Multimodal Dialogue (EmoInt-MD) dataset, comprising 32k dialogues taken from different movie genres and labelled with 32 emotions and 15 empathetic intents. We propose a novel multi-task, multimodal, contextual Transformer framework that simultaneously identifies the emotions and intents in a given utterance, using audio and visual features in addition to textual information. Experimental analysis shows that the proposed framework outperforms several unimodal and multimodal baselines on the EmoInt-MD dataset. The dataset, along with our baseline and proposed framework implementations, will be made publicly available for research purposes.
Keywords
Emotion, empathetic intent, transformers, fusion, context