Synthetic Speech References For Automatic Pathological Speech Intelligibility Assessment

2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (2020)

Abstract
Automatic pathological speech intelligibility measures are crucial for assisting the clinical diagnosis and treatment of speech disorders. The recently proposed pathological extended short-time objective intelligibility (P-ESTOI) measure has been shown to perform well across several speech pathologies. However, to assess the intelligibility of an utterance from a patient, P-ESTOI requires recordings of the same utterance by several healthy speakers, from which an intelligible reference model is created. Such recordings are not always readily available, which limits the practical applicability of P-ESTOI. To make P-ESTOI usable in such scenarios, this paper proposes creating the intelligible reference model from synthetic speech generated by state-of-the-art, high-quality text-to-speech (TTS) systems. Experimental results on a database of Cerebral Palsy patients show that P-ESTOI with synthetic speech references performs comparably to P-ESTOI with natural speech references. This makes P-ESTOI a flexible measure that does not require healthy speech recordings and that outperforms state-of-the-art pathological speech intelligibility measures.
Keywords
P-ESTOI, TTS, Cerebral Palsy
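
The reference-model idea described in the abstract can be illustrated with a minimal sketch. It is not the published P-ESTOI algorithm: it assumes each utterance has already been converted to a time-frequency feature matrix of shape (frames, bands), it uses plain dynamic time warping (DTW) for alignment, and it scores a test utterance by the mean per-frame spectral correlation with the reference. The function names `dtw_path`, `build_reference`, and `reference_score` are hypothetical. The reference utterances passed to `build_reference` could come from healthy speakers or from TTS output; the synthesis step itself is not shown.

```python
import numpy as np


def dtw_path(a, b):
    """Return the DTW alignment path between feature sequences a (Ta, D) and b (Tb, D)."""
    ta, tb = len(a), len(b)
    # Pairwise Euclidean distances between all frames of a and b.
    dist = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    acc = np.full((ta + 1, tb + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, ta + 1):
        for j in range(1, tb + 1):
            acc[i, j] = dist[i - 1, j - 1] + min(
                acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1]
            )
    # Backtrack from the end to recover the frame-to-frame alignment.
    i, j = ta, tb
    path = []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = int(np.argmin([acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1]]))
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return path[::-1]


def build_reference(utterances):
    """Average several recordings of one utterance on the time axis of the first.

    Assumption: the first recording serves as the alignment anchor.
    """
    anchor = np.asarray(utterances[0], dtype=float)
    sums = anchor.copy()
    counts = np.ones(len(anchor))
    for utt in utterances[1:]:
        utt = np.asarray(utt, dtype=float)
        for i, j in dtw_path(anchor, utt):
            sums[i] += utt[j]
            counts[i] += 1
    return sums / counts[:, None]


def reference_score(test, reference):
    """Mean per-frame spectral correlation after aligning test to reference.

    This is a simplified stand-in for an intelligibility score, not P-ESTOI.
    """
    test = np.asarray(test, dtype=float)
    corrs = []
    for i, j in dtw_path(reference, test):
        r = reference[i] - reference[i].mean()
        t = test[j] - test[j].mean()
        denom = np.linalg.norm(r) * np.linalg.norm(t)
        corrs.append(float(r @ t / denom) if denom > 0 else 0.0)
    return float(np.mean(corrs))
```

In this sketch, swapping natural references for synthetic ones only changes what is passed to `build_reference`; the scoring step is unchanged, which mirrors the flexibility the abstract describes.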