A Generalization of the Chomsky-Halle Phonetic Representation using Real Numbers for Robust Speech Recognition in Noisy Environments.

IDEAS (2023)

Abstract
Speech recognition is difficult when the speech signal is weak or occurs in a noisy environment. This paper presents an efficient and robust method for reconstructing the standard pronunciation of English phonemes and words from a weak or noisy signal. The method recasts reconstruction as a problem of data retrieval from a database, and considers two cases: (1) the phonemes are stored in the database as binary tuples, and the input is also a binary tuple that may have suffered deletion errors; and (2) the phonemes are represented, both in the database and in the input, as tuples of real values between 0 and 1. In case (2), each value in the input phoneme may be either higher or lower than the corresponding value of the standard phoneme that the speaker intended. For case (2), a theorem is proven that states when the data retrieval can be expected to be reliable.
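The two retrieval cases can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the feature names, the phoneme tuples, the reading of a "deletion error" as a 1 that was observed as 0, and the use of Euclidean distance for case (2) are all assumptions made here for illustration. The abstract specifies neither the matching rule nor the condition of the reliability theorem.

```python
from math import dist

# Hypothetical feature order and values, for illustration only;
# the paper's actual feature set is not given in the abstract.
FEATURES = ("consonantal", "voiced", "nasal", "continuant")

BINARY_DB = {
    "p": (1, 0, 0, 0),
    "b": (1, 1, 0, 0),
    "m": (1, 1, 1, 0),
    "z": (1, 1, 0, 1),
}

def retrieve_binary(observed, db=BINARY_DB):
    """Case (1): list stored phonemes consistent with the observed tuple.

    Assumption: a deletion error can only turn a 1 into a 0, so a stored
    phoneme is a candidate when it has a 1 wherever the input has a 1.
    Candidates are ordered by the number of deletions they would require.
    """
    candidates = []
    for name, stored in db.items():
        if all(o <= s for o, s in zip(observed, stored)):
            deletions = sum(s - o for o, s in zip(observed, stored))
            candidates.append((deletions, name))
    return [name for _, name in sorted(candidates)]

# Case (2): the same phonemes stored as real-valued tuples in [0, 1].
REAL_DB = {name: tuple(float(v) for v in vec) for name, vec in BINARY_DB.items()}

def retrieve_real(observed, db=REAL_DB):
    """Case (2): return the stored phoneme nearest to the observed tuple.

    Assumption: nearness is Euclidean distance; input values may lie
    above or below the stored values.
    """
    return min(db, key=lambda name: dist(observed, db[name]))
```

For example, `retrieve_binary((1, 1, 0, 0))` lists `b` first because it needs no deletions, followed by the phonemes that would need one. `retrieve_real((0.9, 0.8, 0.2, 0.1))` selects `b` as the closest stored tuple.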