Bayesian Joint-Sequence Models For Grapheme-To-Phoneme Conversion

Mirko Hannemann, Jan Trmal, Lucas Ondel, Santosh Kesiraju, Lukás Burget

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2017

Abstract

We describe a fully Bayesian approach to grapheme-to-phoneme conversion based on the joint-sequence model (JSM). Usually, standard smoothed n-gram language models (LMs, e.g. Kneser-Ney) are used with JSMs to model sequences of graphones (joint grapheme-phoneme pairs). We instead take a Bayesian approach using a hierarchical Pitman-Yor-Process LM. This provides an elegant alternative to smoothing techniques for avoiding over-training. Neither held-out sets nor complex parameter tuning are necessary, and several convergence problems encountered in discounted Expectation-Maximization (as used in the smoothed JSMs) are avoided. Every step is modeled by weighted finite state transducers and implemented with standard operations from the OpenFST toolkit. We evaluate our model on a standard data set (CMUdict), where it gives phoneme-error rates comparable to those of previously reported smoothed JSMs while requiring much less training and testing time. Most importantly, our model can be used in a Bayesian framework and for (partly) unsupervised training.
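
To give a rough sense of how a Pitman-Yor prior can stand in for explicit smoothing, the sketch below computes the standard Chinese-restaurant predictive probability for a single Pitman-Yor restaurant over graphones. It is only an illustration under stated assumptions: the paper's model is hierarchical and implemented with weighted finite state transducers, neither of which is shown here, and the function name, the toy graphone inventory, the counts, and the discount and strength values are all hypothetical.

```python
def pitman_yor_predictive(symbol, customers, tables, discount, strength, base_prob):
    """Predictive probability of `symbol` in one Pitman-Yor restaurant.

    customers[s]: number of times s was observed in this context.
    tables[s]:    number of tables serving s (its pseudo-count in the base).
    base_prob:    probability of `symbol` under the base distribution.
    """
    total_customers = sum(customers.values())
    total_tables = sum(tables.values())
    if total_customers == 0:
        # An empty restaurant backs off entirely to the base distribution.
        return base_prob
    seen_mass = max(customers.get(symbol, 0) - discount * tables.get(symbol, 0), 0.0)
    backoff_mass = strength + discount * total_tables
    return (seen_mass + backoff_mass * base_prob) / (strength + total_customers)


# Hypothetical graphone inventory: (grapheme string, phoneme string) pairs.
graphones = [("ph", "F"), ("o", "OW"), ("n", "N"), ("e", "")]
uniform_base = 1.0 / len(graphones)

# Hypothetical seating state: two graphones observed, two never seen.
customers = {("ph", "F"): 3, ("o", "OW"): 1}
tables = {("ph", "F"): 2, ("o", "OW"): 1}

probs = {
    g: pitman_yor_predictive(g, customers, tables,
                             discount=0.5, strength=1.0, base_prob=uniform_base)
    for g in graphones
}
```

In this toy setting the unseen graphones still receive non-zero probability, and the four probabilities sum to one, so the discounting plays the role that an explicit smoothing step would otherwise play. In a hierarchical model, `base_prob` would itself come from the restaurant of the shorter context rather than from a uniform distribution.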

Keywords

Bayesian approach, joint-sequence models, weighted finite state transducers, letter-to-sound, grapheme-to-phoneme conversion, hierarchical Pitman-Yor-Process
