Modeling spectral envelopes using restricted Boltzmann machines for statistical parametric speech synthesis

ICASSP (2013)

Abstract
This paper presents a new spectral modeling method for statistical parametric speech synthesis. In contrast to the conventional methods in which high-level spectral parameters, such as mel-cepstra or line spectral pairs, are adopted as the features for hidden Markov model (HMM) based parametric speech synthesis, our new method directly models the distribution of the lower-level, un-transformed or raw spectral envelopes. Instead of using single Gaussian distributions, we adopt restricted Boltzmann machines (RBM) to represent the distribution of the spectral envelopes at each HMM state. We anticipate these will give superior performance in modeling the joint distribution of high-dimensional stochastic vectors. The spectral parameters are derived from the spectral envelope corresponding to the estimated mode of each context-dependent RBM and act as the Gaussian mean vector in the parameter generation procedure at synthesis time. Our experimental results show that the RBM is able to model the distribution of the spectral envelopes with better accuracy and generalization ability than the Gaussian mixture model. As a result, our proposed method can significantly improve the naturalness of the conventional HMM-based speech synthesis system using mel-cepstra.
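To illustrate the idea of taking the mode of an RBM as a Gaussian mean vector, the following is a minimal sketch and not the authors' implementation. It assumes a Gaussian-Bernoulli RBM with unit-variance visible units and finds a mode by plain gradient descent on the free energy; the abstract specifies neither the RBM variant nor the mode-search procedure. All names (`free_energy`, `estimate_mode`, `W`, `a`, `b`) and the random toy parameters are hypothetical stand-ins for a trained, context-dependent model.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def free_energy(v, W, a, b):
    # Assumed unit-variance Gaussian-Bernoulli RBM:
    # F(v) = 0.5 * ||v - a||^2 - sum_j softplus(b_j + v^T W[:, j])
    # so that p(v) is proportional to exp(-F(v)).
    pre = b + v @ W
    return 0.5 * np.sum((v - a) ** 2) - np.sum(np.logaddexp(0.0, pre))

def free_energy_grad(v, W, a, b):
    # dF/dv = (v - a) - W @ sigmoid(b + W^T v)
    return (v - a) - W @ sigmoid(b + v @ W)

def estimate_mode(v0, W, a, b, step=0.1, n_iter=500):
    # Gradient descent on F(v), i.e. gradient ascent on log p(v).
    # This reaches a local mode only; the result depends on v0.
    v = v0.copy()
    for _ in range(n_iter):
        v -= step * free_energy_grad(v, W, a, b)
    return v

# Toy parameters (random, NOT trained on spectral envelopes).
rng = np.random.default_rng(0)
n_visible, n_hidden = 8, 4          # hypothetical sizes
W = 0.1 * rng.standard_normal((n_visible, n_hidden))
a = rng.standard_normal(n_visible)  # visible biases
b = np.zeros(n_hidden)              # hidden biases

# Start from the visible bias and move to a nearby mode of p(v).
mode = estimate_mode(a.copy(), W, a, b)
grad_norm = np.linalg.norm(free_energy_grad(mode, W, a, b))
```

In a setting like the one the abstract describes, a vector such as `mode` would play the role of the estimated spectral envelope for one HMM state, from which the spectral parameters used as the Gaussian mean are then derived.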
Keywords
restricted boltzmann machine,line spectral pairs,speech synthesis,boltzmann machines,context-dependent rbm,hmm,spectral analysis,statistical parametric speech synthesis,mel-cepstra,parameter generation procedure,restricted boltzmann machines,gaussian mean vector,high-dimensional stochastic vector joint distribution,hidden markov model,spectral envelope modelling method,spectral envelope,high-level spectral parameters,hidden markov models,gaussian mixture model,single gaussian distributions,vectors,context modeling,gaussian distribution,speech