Synthesis of device-independent noise corpora for speech quality assessment

Hannes Gamper, Lyle Corbin, David Johnston, Ivan J. Tashev

2016 IEEE International Workshop on Acoustic Signal Enhancement (IWAENC) (2016)

Abstract

The perceived quality of speech captured in the presence of background noise is an important performance metric for communication devices, including portable computers and mobile phones. For a realistic evaluation of speech quality, a device under test (DUT) needs to be exposed to a variety of noise conditions either in real noise environments or via noise recordings, typically delivered over a loudspeaker system. However, the test data obtained this way is specific to the DUT and needs to be re-recorded every time the DUT hardware changes. Here we propose an approach that uses device-independent spatial noise recordings to generate device-specific synthetic test data that simulate in-situ recordings. Noise captured using a spherical microphone array is combined with the directivity patterns of the DUT, referred to here as device-related transfer functions (DRTFs), in the spherical harmonics domain. The performance of the proposed method is evaluated in terms of the predicted signal-to-noise ratio (SNR) and the predicted mean opinion score (PMOS) of the DUT under various noise conditions. The root-mean-squared errors (RMSEs) of the predicted SNR and PMOS are on average below 4 dB and 0.28, respectively, across the range of tested SNRs, target source directions, noise types, and spherical harmonics decomposition methods. These experimental results indicate that the proposed method may be suitable for generating device-specific synthetic corpora from device-independent in-situ recordings.
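The combination of spatial noise recordings and DRTFs in the spherical harmonics domain can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a real-valued orthonormal spherical harmonics basis, noise coefficients that describe the incident plane-wave density, and a first-order decomposition chosen only to keep the example short. The function names `real_sh_order1` and `render_device_signal` and the array layouts are hypothetical. Under these assumptions, the signal at the device is the per-frequency inner product of the noise coefficients and the DRTF coefficients.

```python
import numpy as np


def real_sh_order1(direction):
    """Real orthonormal spherical harmonics up to order 1 for a direction (x, y, z).

    Returns the four coefficients in the order [Y00, Y1-1, Y10, Y11].
    """
    x, y, z = np.asarray(direction, dtype=float) / np.linalg.norm(direction)
    c0 = 1.0 / np.sqrt(4.0 * np.pi)
    c1 = np.sqrt(3.0 / (4.0 * np.pi))
    return np.array([c0, c1 * y, c1 * z, c1 * x])


def render_device_signal(noise_sh, drtf_sh):
    """Combine SH-domain noise with SH-domain DRTFs (assumed layout).

    noise_sh: array of shape (n_coeffs, n_freqs, n_frames), noise spectrum
              per SH coefficient.
    drtf_sh:  array of shape (n_coeffs, n_freqs), device directivity per
              SH coefficient.
    Returns an array of shape (n_freqs, n_frames): the simulated spectrum
    at the device microphone.
    """
    # Sum over the SH coefficient axis for every frequency bin and frame.
    return np.einsum("cf,cft->ft", drtf_sh, noise_sh)


# Toy check: a device with directivity D = 0.5 + 0.5 * x (cardioid facing +x),
# identical at all frequencies, exposed to a single plane wave.
n_freqs, n_frames = 3, 2
spectrum = np.arange(n_freqs * n_frames, dtype=float).reshape(n_freqs, n_frames) + 1.0
cardioid = np.array([0.5 * np.sqrt(4.0 * np.pi), 0.0, 0.0, 0.5 * np.sqrt(4.0 * np.pi / 3.0)])
drtf_sh = np.repeat(cardioid[:, None], n_freqs, axis=1)

# A plane wave from direction d has SH coefficients Y(d) times its spectrum.
noise_front = real_sh_order1([1.0, 0.0, 0.0])[:, None, None] * spectrum[None]
noise_back = real_sh_order1([-1.0, 0.0, 0.0])[:, None, None] * spectrum[None]

front_out = render_device_signal(noise_front, drtf_sh)
back_out = render_device_signal(noise_back, drtf_sh)
```

In this toy case the wave arriving from the front passes through unchanged and the wave arriving from the back is cancelled, which matches the assumed cardioid directivity. A real noise recording would replace the single plane wave with coefficients estimated from the spherical microphone array, and measured DRTFs would replace the cardioid.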

Keywords

Speech quality, PMOS, PESQ, DRTF, spherical harmonics, microphone array, noise corpus
