Effectiveness Of Multiscale Fractal Dimension For Improvement Of Frame Classification Rate
2015 23rd European Signal Processing Conference (EUSIPCO), 2015
Abstract
We propose multiscale fractal dimension (FD)-based features for frame-level phoneme classification. During speech production, turbulence is created, and vortices (generated by separated airflow) may travel along the vocal tract and excite the vocal tract resonators. This turbulence, and in effect the embedded features of different phoneme classes, can be captured by the invariant property of the multiscale FD. To capture complementary information, feature-level fusion of the proposed feature with state-of-the-art Mel Frequency Cepstral Coefficients (MFCC) is attempted and found to be effective. In particular, single-hidden-layer neural networks were trained to compute the frame classification rate. The proposed feature reduced the error rate by over 1.6 % compared with MFCC features on the TIMIT database. This is supported by a significant reduction in % EER (i.e., 0.327 % to 4.795 %).
Keywords
fractal dimension,multiscale analysis,phoneme-based frame classification,nonlinearity
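
The abstract does not say which FD estimator is used or how the fusion is implemented, so the following is only an illustrative sketch. It assumes a morphological covering (Minkowski-Bouligand style) estimate, in which the area A(s) between the dilated and eroded frame at scale s is assumed to follow log(A(s)/s^2) ≈ D log(1/s) + c, and D is read off as a local slope over a sliding window of scales. Feature-level fusion is shown as plain concatenation. The function names, `max_scale`, `window`, and the 13-dimensional MFCC placeholder are all hypothetical choices and are not taken from the paper.

```python
import numpy as np

def cover_areas(frame, max_scale):
    """Area between the dilated and eroded frame at scales 1..max_scale."""
    upper = np.asarray(frame, dtype=float).copy()
    lower = upper.copy()
    areas = np.empty(max_scale)
    for s in range(max_scale):
        # Repeating a flat 3-sample dilation/erosion s+1 times is equivalent
        # to using a flat window of half-width s+1.
        pu = np.pad(upper, 1, mode="edge")
        pl = np.pad(lower, 1, mode="edge")
        upper = np.maximum(np.maximum(pu[:-2], pu[1:-1]), pu[2:])
        lower = np.minimum(np.minimum(pl[:-2], pl[1:-1]), pl[2:])
        areas[s] = np.sum(upper - lower)
    return areas

def multiscale_fd(frame, max_scale=10, window=4):
    """Local FD estimates, one per sliding window of scales (assumed scheme)."""
    scales = np.arange(1, max_scale + 1)
    # Clamp to avoid log(0) on perfectly flat frames.
    areas = np.maximum(cover_areas(frame, max_scale), 1e-12)
    x = np.log(1.0 / scales)
    y = np.log(areas / scales**2)
    fds = []
    for start in range(max_scale - window + 1):
        # Slope of the least-squares line over this window of scales.
        slope = np.polyfit(x[start:start + window], y[start:start + window], 1)[0]
        fds.append(slope)
    return np.array(fds)

def fuse_features(mfcc_vector, fd_vector):
    """Feature-level fusion, assumed here to be simple concatenation."""
    return np.concatenate([mfcc_vector, fd_vector])

# Usage on a synthetic frame; real MFCCs would come from a speech front end.
frame = np.random.default_rng(0).standard_normal(400)
fd_features = multiscale_fd(frame)
placeholder_mfcc = np.zeros(13)  # hypothetical stand-in for a real MFCC vector
fused = fuse_features(placeholder_mfcc, fd_features)
```

With these default settings each frame yields seven FD values, so the fused vector has 20 dimensions. A smooth signal such as a linear ramp gives estimates near 1, while a noisy frame gives clearly higher values, which is the kind of contrast the turbulence argument in the abstract relies on.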