GlobalTIMIT: Acoustic-Phonetic Datasets for the World's Languages

19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES (2018)

Abstract
Although the TIMIT acoustic-phonetic dataset ([1], [2]) was created three decades ago, it remains in wide use, with more than 20,000 Google Scholar references, more than 1,000 of them since 2017. Despite TIMIT's age and relatively small size, inspection of these references shows that it is still used in many research areas: speech recognition, speaker recognition, speech synthesis, speech coding, speech enhancement, voice activity detection, speech perception, overlap detection and source separation, diagnosis of speech and language disorders, and linguistic phonetics, among others. Nevertheless, comparable datasets are not available even for other widely studied languages, much less for under-documented languages and varieties. Therefore, we have developed a method for creating TIMIT-like datasets in new languages with modest effort and cost, and we have applied this method to standard Thai, standard Mandarin Chinese, English from Chinese L2 learners, the Guanzhong dialect of Mandarin Chinese, and the Ga language of West Africa. Other collections are planned or underway. The resulting datasets will be published through the LDC, along with instructions and open-source tools for replicating this method in other languages, covering the steps of sentence selection and assignment to speakers, speaker recruiting and recording, proof-listening, and forced alignment.
Keywords
speech datasets, acoustic phonetics