Towards Domain-Specific Cross-Corpus Speech Emotion Recognition Approach
CoRR(2023)
Abstract
Cross-corpus speech emotion recognition (SER) poses a challenge due to
feature distribution mismatch, potentially degrading the performance of
established SER methods. In this paper, we tackle this challenge by proposing a
novel transfer subspace learning method called acoustic knowledge-guided
transfer linear regression (AKTLR). Unlike existing approaches, which often
overlook domain-specific knowledge related to SER and simply treat cross-corpus
SER as a generic transfer learning task, our AKTLR method is built upon a
well-designed acoustic knowledge-guided dual sparsity constraint mechanism.
This mechanism builds on empirically validated acoustic knowledge in SER:
minimalistic acoustic parameter feature sets can alleviate classifier
over-adaptation and therefore generalize better in cross-corpus SER tasks
than large feature sets. Through this
mechanism, we extend a simple transfer linear regression model to AKTLR. This
extension harnesses its full capability to seek emotion-discriminative and
corpus-invariant features from established acoustic parameter feature sets used
for describing speech signals across two scales: contributive acoustic
parameter groups and constituent elements within each contributive group. Our
proposed method is evaluated through extensive cross-corpus SER experiments on
three widely used speech emotion corpora: EmoDB, eNTERFACE, and CASIA. The
results confirm the effectiveness and superior performance of our method,
outperforming recent state-of-the-art transfer subspace learning and deep
transfer learning-based cross-corpus SER methods. Furthermore, our work
provides experimental evidence supporting the feasibility and superiority of
incorporating domain-specific knowledge into the transfer learning model to
address cross-corpus SER tasks.
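
To make the idea of sparsity at two scales concrete, the sketch below shows one common way such a constraint can be written: a group-level norm that can switch off whole acoustic parameter groups, plus an element-level L1 norm that prunes individual features inside the surviving groups. This is an illustration only and not the AKTLR objective, which the abstract does not spell out. The function names, the mean-alignment term, and the weights `mu`, `lam_group`, and `lam_elem` are all assumptions introduced here.

```python
import numpy as np

def dual_sparsity_penalty(W, groups, lam_group=1.0, lam_elem=1.0):
    """Hypothetical two-scale penalty (sparse-group-lasso style).

    W      : (d, c) regression matrix, one row per acoustic feature.
    groups : list of row-index lists, one list per acoustic parameter group.
    """
    # Group scale: Frobenius norm of each group's rows; a zero norm means
    # the whole group is dropped.
    group_term = sum(np.linalg.norm(W[idx, :]) for idx in groups)
    # Element scale: L1 norm over all entries, pruning single features.
    elem_term = np.abs(W).sum()
    return lam_group * group_term + lam_elem * elem_term

def transfer_regression_objective(W, Xs, Ys, Xt, groups,
                                  mu=1.0, lam_group=1.0, lam_elem=1.0):
    """Assumed objective: label fit + mean alignment + dual sparsity.

    Xs : (n_s, d) labeled source-corpus features.
    Ys : (n_s, c) one-hot source labels.
    Xt : (n_t, d) unlabeled target-corpus features.
    """
    # Linear regression from source features to emotion labels.
    fit = np.linalg.norm(Xs @ W - Ys) ** 2
    # Simple stand-in for corpus invariance: match projected feature means.
    gap = np.linalg.norm(Xs.mean(axis=0) @ W - Xt.mean(axis=0) @ W) ** 2
    return fit + mu * gap + dual_sparsity_penalty(W, groups, lam_group, lam_elem)
```

In this form, raising `lam_group` favors keeping fewer parameter groups, while raising `lam_elem` favors fewer active elements within each kept group.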