SPEECH RECOGNITION ENGINEERING ISSUES IN SPEECH TO SPEECH TRANSLATION SYSTEM DESIGN FOR LOW RESOURCE LANGUAGES AND DOMAINS

Shrikanth Narayanan, Panayiotis G. Georgiou, Abhinav Sethy, Dagen Wang, Murtaza Bulut, Shiva Sundaram, Emil Ettelaie, Sankaranarayanan Ananthakrishnan, Horacio Franco, Kristin Precoda, Dimitra Vergyri, Jing Zheng, Wen Wang, Martin Graciarena, Victor Abrash, Michael Frandsen, Colleen Richey, SRI International

ICASSP (2006)

Abstract

Engineering automatic speech recognition (ASR) for speech-to-speech (S2S) translation systems, especially for languages and domains that lack readily available spoken language resources, is immensely challenging for several reasons. In addition to contending with the conventional data-hungry needs of acoustic and language modeling, these designs have to accommodate varying requirements imposed by the domain's needs and characteristics, the target device and usage modality (such as phrase-based or spontaneous free-form interactions, with or without visual feedback), and the huge spoken language variability arising from socio-linguistic and cultural differences among users. Using case studies of creating speech translation systems between English and languages such as Pashto and Farsi, this paper describes some of the practical issues and the solutions that were developed for multilingual ASR development. These include novel acoustic and language modeling strategies such as language-adaptive recognition, active-learning-based language modeling, and class-based language models that can better exploit resource-poor language data; efficient search strategies, including N-best and confidence generation to aid multiple-hypothesis translation; use of dialog information and clever interface choices to facilitate ASR; and audio interface design that meets both usability and robustness requirements.

Keywords

language translation, natural languages, speech recognition, English, Farsi, Pashto, active-learning based language modeling, automatic speech recognition, class-based language models, data-hungry speech acoustic, low resource languages, speech to speech translation system design, spoken language resources, spoken language variability
