Multimodal Sentimental Analysis Using Hierarchical Fusion Technique

Tharani A D, Aravinth J

2023 IEEE 4th Annual Flagship India Council International Subsections Conference (INDISCON)

Abstract
Multimodal sentiment analysis encompasses modalities beyond text, including visual and audio data, enabling combinations of two or three modalities. It has diverse applications in domains such as virtual assistants, YouTube movie reviews, news videos, and emotion recognition, including depression monitoring. Sentiment analysis involves extracting opinions from public discourse to enhance various operations. In this paper, a technique for the sentiment analysis of multimodal data is proposed, using the Multimodal Opinion-level Sentiment Intensity (MOSI) dataset, which consists of video, audio, and text data. One of the significant challenges in this analysis is accurately determining the sentiment (optimistic or pessimistic) when the data is presented in the form of a tone. The proposed technique analyzes sentiments by focusing on audio and text data. Feature extraction is performed using Mel-Frequency Cepstral Coefficients (MFCC) for audio and Word2vec for text. These extracted features are combined using hierarchical fusion, and a Convolutional Autoencoder (CAE) is employed to obtain bottleneck features. Experimental results demonstrate an accuracy of 62.45% for audio data, 73% for text data, and 76.09% for the fusion of audio and text using the Long Short-Term Memory (LSTM) classifier.
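
The abstract describes a pipeline of MFCC and Word2vec feature extraction, fusion of the two modalities, a convolutional autoencoder bottleneck, and an LSTM classifier. The following is a minimal Keras sketch of that idea, not the authors' implementation: feature dimensions, layer sizes, and training settings are assumptions, a simple concatenation stands in for the paper's hierarchical fusion step, and random placeholder arrays stand in for the extracted MFCC and Word2vec features.

```python
# Minimal sketch (hypothetical shapes and hyperparameters): fused audio/text
# features are compressed by a convolutional autoencoder, and the bottleneck
# representation is classified with an LSTM.
import numpy as np
from tensorflow.keras import layers, Model

# Placeholder per-utterance features (assumed dimensions):
#   audio_feats: fixed-length MFCC-based vector per segment
#   text_feats:  averaged Word2vec embedding of the transcript
n_samples, d_audio, d_text = 1000, 40, 300
audio_feats = np.random.rand(n_samples, d_audio).astype("float32")
text_feats = np.random.rand(n_samples, d_text).astype("float32")
labels = np.random.randint(0, 2, size=(n_samples,))   # 0 = negative, 1 = positive

# Fusion (sketch): concatenate modality features, then let the autoencoder
# learn a joint bottleneck representation.
fused = np.concatenate([audio_feats, text_feats], axis=1)[..., None]  # (N, 340, 1)

# Convolutional autoencoder; the bottleneck output serves as the fused feature.
inp = layers.Input(shape=(fused.shape[1], 1))
x = layers.Conv1D(16, 3, activation="relu", padding="same")(inp)
x = layers.MaxPooling1D(2, padding="same")(x)
bottleneck = layers.Conv1D(8, 3, activation="relu", padding="same", name="bottleneck")(x)
x = layers.UpSampling1D(2)(bottleneck)
out = layers.Conv1D(1, 3, activation="linear", padding="same")(x)
cae = Model(inp, out)
cae.compile(optimizer="adam", loss="mse")
cae.fit(fused, fused, epochs=5, batch_size=32, verbose=0)

# Extract bottleneck features and classify sentiment with an LSTM.
encoder = Model(inp, bottleneck)
z = encoder.predict(fused, verbose=0)                 # (N, timesteps, channels)
clf_in = layers.Input(shape=z.shape[1:])
h = layers.LSTM(64)(clf_in)
pred = layers.Dense(1, activation="sigmoid")(h)
clf = Model(clf_in, pred)
clf.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
clf.fit(z, labels, epochs=5, batch_size=32, validation_split=0.2, verbose=0)
```

In this sketch the autoencoder is trained to reconstruct the fused feature vector, and only its encoder half is kept to produce the compact representation fed to the LSTM; the binary output corresponds to the optimistic/pessimistic distinction discussed in the abstract.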
Keywords
Sentiment analysis, Mel-frequency cepstral coefficient, Word2vec, Hierarchical fusion, Convolutional autoencoder, Long short-term memory