BIT-MI Deep Learning-based Model to Non-intrusive Speech Quality Assessment Challenge in Online Conferencing Applications

Miao Liu, Jing Wang, Liang Xu, Jianqian Zhang, Shicong Li, Fei Xiang

Conference of the International Speech Communication Association (INTERSPEECH) (2022)

Abstract

This paper presents the details of the BIT-MI deep learning-based model submitted to the ConferencingSpeech 2022 challenge. Because subjective tests are costly in time and labor, the challenge aims to promote research on non-intrusive objective quality assessment for speech communication and targets effective evaluation of speech quality in online conferencing applications. We propose a novel deep learning-based model comprising a new convolutional neural network (CNN) architecture, a bidirectional long short-term memory (BLSTM) network, average pooling, and a range clipping method. We also construct a two-part target function that combines the mean square error (MSE) and the Pearson correlation coefficient (PCC) between predictions and labels, so that the assessment model is optimized jointly for both criteria. Experimental results show that the proposed model significantly outperforms the official baseline system on both the validation and test sets.
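
The abstract does not say how the MSE and PCC terms are combined or what range the predictions are clipped to, so the following is only a minimal sketch under stated assumptions: the loss is taken as MSE + alpha * (1 - PCC), and clipping is assumed to bound scores to a 1 to 5 MOS scale. The function names, the `alpha` weight, and the clipping bounds are all hypothetical and not taken from the paper. It is written in plain Python for readability; an actual training setup would compute the same quantities on tensors in an autodiff framework.

```python
import math

def mse(preds, labels):
    """Mean square error between predicted and labelled scores."""
    return sum((p - y) ** 2 for p, y in zip(preds, labels)) / len(preds)

def pcc(preds, labels, eps=1e-8):
    """Pearson correlation coefficient; eps guards against division by zero."""
    n = len(preds)
    mean_p = sum(preds) / n
    mean_y = sum(labels) / n
    cov = sum((p - mean_p) * (y - mean_y) for p, y in zip(preds, labels))
    var_p = sum((p - mean_p) ** 2 for p in preds)
    var_y = sum((y - mean_y) ** 2 for y in labels)
    return cov / (math.sqrt(var_p * var_y) + eps)

def combined_loss(preds, labels, alpha=1.0):
    """Assumed two-part target: MSE plus alpha * (1 - PCC).

    The additive form and the weight alpha are illustrative assumptions.
    Lower is better: MSE penalizes absolute error, (1 - PCC) penalizes
    weak linear correlation with the labels.
    """
    return mse(preds, labels) + alpha * (1.0 - pcc(preds, labels))

def clip_range(preds, low=1.0, high=5.0):
    """Assumed range clipping: bound predictions to [low, high]."""
    return [min(max(p, low), high) for p in preds]
```

For example, `clip_range([0.5, 3.0, 5.7])` keeps the middle score and pulls the other two back to the assumed bounds, while `combined_loss` is close to zero when predictions equal the labels and grows when they are both inaccurate and poorly correlated.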

Keywords

speech quality assessment, deep learning, target function
