Enhancing masked facial expression recognition with multimodal deep learning
Multimedia Tools and Applications (2024)
Abstract
Facial expression recognition (FER) is an essential component of intelligent human-computer interaction, but the COVID-19 pandemic made unimodal techniques less effective because masks occlude much of the face. Multimodal approaches, which combine information from multiple modalities, are more robust at recognizing emotion, and the need to accurately recognize human emotions from facial expressions remains significant. This study proposes a multimodal deep-learning methodology for recognizing emotion from masked facial expressions together with vocal expressions. The approach uses two standard datasets, M-LFW-F and CREMA-D, to capture facial and vocal emotional cues, and the combined data were used to train a multimodal neural network with fusion techniques. The proposed approach achieved an accuracy of 79.05%, compared with 68.76% for the unimodal approach, demonstrating that multimodal fusion outperforms unimodal techniques for facial expression recognition under masked conditions and highlighting the potential of multimodal methods for improving FER in challenging scenarios.
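The abstract describes fusing facial and vocal cues before classification. A minimal sketch of one common fusion technique, feature-level (early) fusion by concatenating per-modality embeddings, is shown below; the embedding dimensions and the helper name `fuse_features` are illustrative assumptions, not details from the paper.

```python
import numpy as np

def fuse_features(face_emb: np.ndarray, voice_emb: np.ndarray) -> np.ndarray:
    """Feature-level fusion: concatenate modality embeddings into one vector
    that a downstream classifier head would consume."""
    return np.concatenate([face_emb, voice_emb])

# Hypothetical embedding sizes, for illustration only.
face_emb = np.random.rand(128)   # e.g., CNN features from a masked-face image
voice_emb = np.random.rand(64)   # e.g., features from a speech spectrogram
fused = fuse_features(face_emb, voice_emb)
print(fused.shape)  # (192,)
```

Other fusion strategies (e.g., late fusion, which averages per-modality predictions) follow the same pattern of combining the two streams at a chosen point in the network.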
Keywords
Deep neural network, Multi-modal architecture, Fusion techniques, Facial expression under mask, Masked and vocal expressions