End-To-End Deep Learning Framework For Speech Paralinguistics Detection Based On Perception Aware Spectrum

18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION(2017)

引用 20|浏览25
暂无评分
摘要
In this paper, we propose an end-to-end deep learning framework to detect speech paralinguistics using perception aware spectrum as input. Existing studies show that speech under cold has distinct variations of energy distribution on low frequency components compared with the speech under 'healthy' condition. This motivates us to use perception aware spectrum as the input to an end-to-end learning framework with small scale dataset. In this work, we try both Constant Q Transform (CQT) spectrum and Gammatone spectrum in different end-to end deep learning networks. where both spectrums are able to closely mimic the human speech perception and transform it into 2D images. Experimental results show the effectiveness of the proposed perception aware spectrum with end-to-end deep learning approach on Interspecch 2017 Computational Paralinguistics Cold sub-Challenge. The final fusion result of our proposed method is 8% better than that of the provided baseline in terms of UAR.
更多
查看译文
关键词
computational paralinguistics, speech under cold, deep learning, perception aware spectrum
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要