Non linear time compression of clear and normal speech at high rates

Cassia Valentini-Botinhao, Mirjam Wester, Junichi Yamagishi, Markus Toman, Michael Pucher, Dietmar Schabus

arXiv: Audio and Speech Processing (2019)

Abstract

We compare a series of time compression methods applied to normal and clear speech. First, we evaluate a linear (uniform) method applied to these styles as well as to naturally produced fast speech. We found, in line with the literature, that unprocessed fast speech was less intelligible than linearly compressed normal speech. Fast speech was also less intelligible than compressed clear speech, but at the highest rate (three times faster than normal) the advantage of clear over fast speech was lost. To test whether this was due to shorter speech duration, we evaluate, in our second experiment, a range of methods that compress speech and silence at different rates. We found that even when the overall duration of speech and silence is kept the same across styles, compressed normal speech is still more intelligible than compressed clear speech. Compressing silence twice as much as speech improved results further for normal speech at very little additional computational cost.
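
The arithmetic behind compressing silence more than speech while holding the overall duration fixed can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `nonuniform_rates`, its parameters, and the reading of "twice as much" as a silence rate equal to twice the speech rate are all assumptions made here. Given speech duration S, silence duration P, a target overall rate R, and a silence factor k, the speech rate r must satisfy S/r + P/(k r) = (S + P)/R.

```python
def nonuniform_rates(speech_dur, silence_dur, overall_rate, silence_factor=2.0):
    """Return (speech_rate, silence_rate) for non-uniform time compression.

    Hypothetical helper: silence is compressed `silence_factor` times as much
    as speech, and the compressed utterance has the same total duration as
    uniform compression at `overall_rate` would give.
    """
    if speech_dur < 0 or silence_dur < 0 or speech_dur + silence_dur <= 0:
        raise ValueError("durations must be non-negative and not both zero")
    if overall_rate <= 0 or silence_factor <= 0:
        raise ValueError("rates must be positive")

    total = speech_dur + silence_dur
    # Solve S/r + P/(k*r) = (S + P)/R for the speech rate r.
    speech_rate = overall_rate * (speech_dur + silence_dur / silence_factor) / total
    silence_rate = silence_factor * speech_rate
    return speech_rate, silence_rate


if __name__ == "__main__":
    # 2 s of speech and 1 s of silence, compressed to one third overall.
    r_speech, r_silence = nonuniform_rates(2.0, 1.0, overall_rate=3.0)
    compressed = 2.0 / r_speech + 1.0 / r_silence
    print(r_speech, r_silence, compressed)
```

Under these assumptions, speech is compressed less than the overall rate (2.5 rather than 3 in the usage example), because the silence absorbs a larger share of the reduction.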
