Bottom-Up Unsupervised Word Discovery via Acoustic Units.

Saurabhchand Bhati, Chunxi Liu, Jesús Villalba, Jan Trmal, Sanjeev Khudanpur, Najim Dehak

GlobalSIP (2019)

Abstract

Unsupervised term discovery is the task of identifying and grouping recurring word-like patterns in untranscribed audio. It facilitates unsupervised acoustic model training in zero-resource settings, where little or no transcribed speech is available. In this paper, we investigate two-step bottom-up approaches for the unsupervised discovery of word-like units. The first step discovers phone-like acoustic units from the data, and the second step combines these basic acoustic blocks to identify word-like units. We investigate Embedded Segmental K-means (ESK-Means) and the Nested Hierarchical Pitman-Yor (PYR) model as bottom-up strategies. ESK-Means iteratively selects boundaries from an initial candidate set to arrive at the word boundaries, so its final performance depends critically on the quality of the initial boundaries. We use a segmentation method whose discovered boundaries lie much closer to the actual boundaries. The PYR model has previously been used for word segmentation of text with the spaces removed; here we use it for word discovery from unsupervised acoustic units. Term discovery performance is evaluated on the Zero Resource 2017 challenge dataset, which consists of around 70 hours of unlabelled data. Our systems outperform the baseline systems on all languages without language-specific parameter tuning. We also report comprehensive experiments on how the system parameters affect performance.
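
The idea of selecting word boundaries from an initial candidate set can be illustrated with a small dynamic program. The sketch below is not the authors' ESK-Means implementation: the function names `segment_cost` and `select_boundaries`, the one-dimensional frame values, and the cost (squared deviation from the segment mean plus a fixed per-segment penalty) are all assumptions made for illustration. ESK-Means itself scores segments against K-means cluster centroids over segment embeddings and alternates segmentation with clustering, which is omitted here.

```python
from typing import Callable, List, Sequence


def segment_cost(frames: Sequence[float], start: int, end: int,
                 penalty: float = 1.0) -> float:
    """Hypothetical cost of treating frames[start:end] as one unit.

    Assumption for illustration only: squared deviation from the
    segment mean, plus a fixed penalty that discourages over-splitting.
    """
    chunk = frames[start:end]
    mean = sum(chunk) / len(chunk)
    return sum((x - mean) ** 2 for x in chunk) + penalty


def select_boundaries(
    frames: Sequence[float],
    candidates: List[int],
    cost: Callable[[Sequence[float], int, int], float] = segment_cost,
) -> List[int]:
    """Pick the subset of candidate boundaries with the lowest total cost.

    `candidates` must be sorted, start at 0, and end at len(frames).
    """
    n = len(candidates)
    best = [float("inf")] * n   # best[j]: lowest cost of segmenting up to candidates[j]
    back = [0] * n              # back[j]: index of the previous chosen boundary
    best[0] = 0.0
    for j in range(1, n):
        for i in range(j):
            total = best[i] + cost(frames, candidates[i], candidates[j])
            if total < best[j]:
                best[j] = total
                back[j] = i
    # Trace the chosen boundaries back from the final candidate.
    chosen = [candidates[n - 1]]
    j = n - 1
    while j > 0:
        j = back[j]
        chosen.append(candidates[j])
    return chosen[::-1]


frames = [0, 0, 0, 0, 5, 5, 5, 5]
print(select_boundaries(frames, [0, 2, 4, 6, 8]))  # -> [0, 4, 8]
```

Because only the supplied candidates can ever be chosen, a true boundary that is missing from the initial set cannot be recovered, which is consistent with the abstract's remark that performance depends on the quality of the initial boundaries.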

Keywords

Unsupervised learning, Spoken term discovery, ZeroSpeech challenge, Pitman-Yor model
