Automatic Detection of Prosodic Focus in American English

INTERSPEECH (2019)

Abstract
Focus, which is usually modulated by prosodic prominence, highlights a particular element within a sentence for emphasis or contrast. Despite its importance in communication, it has received little attention in the field of speech recognition. This paper presents a system for automatically detecting prosodic focus in American English, using telephone-number strings. Our data consisted of 100 ten-digit phone-number strings read by 5 speakers (3 female, 2 male). We extracted 18 prosodic features from each digit within the strings, together with one categorical variable, and trained a Random Forest model to detect the position of the focused digit within a given string. We also compared the model's performance with human judgments from a perception experiment with 67 native speakers of American English. Our final model detects the location of prosodic focus with 92% accuracy, which is slightly lower than human perception (97.2%) but far above chance level (10%). We discuss the predictive features in our model and potential features to add in future work.
Keywords
focus, prosody, machine learning, speech recognition, American English
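
The evaluation described in the abstract (locating one focused digit among ten, scored against a 10% chance level) can be illustrated with a minimal Python sketch. The abstract does not say how the model's output is turned into a position, so the argmax rule below is an assumption, and the function names and the toy scores are hypothetical illustrations rather than anything taken from the paper.

```python
from typing import Sequence

DIGITS_PER_STRING = 10  # from the abstract: 10-digit phone-number strings


def chance_level(n_positions: int = DIGITS_PER_STRING) -> float:
    """Accuracy of guessing the focused position uniformly at random."""
    return 1.0 / n_positions


def predict_focus_position(scores: Sequence[float]) -> int:
    """Return the 0-based index of the digit with the highest focus score.

    `scores` is hypothetical: one number per digit, for example a
    classifier's estimated probability that the digit carries focus.
    Ties are resolved in favour of the earliest position.
    """
    if len(scores) == 0:
        raise ValueError("scores must not be empty")
    return max(range(len(scores)), key=lambda i: scores[i])


def location_accuracy(predicted: Sequence[int], actual: Sequence[int]) -> float:
    """Fraction of strings whose focused position was located correctly."""
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(predicted)


if __name__ == "__main__":
    # Made-up scores for illustration only; not data from the paper.
    toy_scores = [0.02, 0.05, 0.03, 0.71, 0.04, 0.03, 0.02, 0.04, 0.03, 0.03]
    print(predict_focus_position(toy_scores))
    print(chance_level())
```

With ten candidate positions per string, uniform guessing is correct one time in ten, which matches the 10% chance level quoted in the abstract.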