Performance Analysis Of Distributed Speech Recognition Using Analysis-By-Synthesis Frame Reduced Front End Under Packet Loss Conditions

2015 IEEE International Conference on Systems, Man, and Cybernetics (2015)

Abstract

We proposed an analysis-by-synthesis (AbS) frame dropping algorithm for the front end of a distributed speech recognition (DSR) system. The algorithm preserves rapidly changing frames, which are more relevant to speech perception, and discards slowly changing frames, which carry little information. When DSR is applied over error-prone packet-switched networks, speech data inevitably suffer from frame loss, since packets may be lost or delayed by congestion at routers. We therefore also employed a model adaptation error concealment decoder at the back end to compensate for the mismatch between the pre-trained models and the test data, which contain missing frames caused both by frame dropping at the front end and by packet loss over the transmission channel. For convenience, this approach is denoted AbS-MA. In the AbS-MA decoding process, the transition probabilities of the hidden Markov models are dynamically adapted according to the time difference between successive observations. Experiments on Mandarin digit recognition were conducted to investigate the effectiveness of the proposed AbS-MA method over a wide range of combinations of frame rates and packet loss conditions. AbS-MA was compared with a baseline approach in which error concealment was implemented at the back end by interpolation, with each missing frame of the received observations replaced by an interpolated estimate. The experimental results show that AbS-MA is not only superior to the baseline in word accuracy but also significantly reduces the computation time.
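The abstract does not state the rule used to adapt the transition probabilities. As a rough illustration only, the sketch below assumes one common choice: when two successive received observations are `gap` original frames apart, the one-step transition matrix is replaced by its `gap`-th power, so the decoder accounts for the state changes that could have happened during the missing frames. The function names and the 3-state left-to-right matrix are hypothetical and are not taken from the paper.

```python
def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def adapted_transitions(trans, gap):
    """Return the gap-step transition matrix (trans raised to the power gap).

    Assumption: a time difference of `gap` frames between successive
    observations is modeled by `gap` consecutive one-step transitions.
    """
    if gap < 1:
        raise ValueError("gap must be at least 1 frame")
    size = len(trans)
    # Start from the identity matrix and apply the one-step matrix gap times.
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(gap):
        result = mat_mul(result, trans)
    return result


# Hypothetical 3-state left-to-right HMM (illustrative values only).
A = [[0.6, 0.4, 0.0],
     [0.0, 0.7, 0.3],
     [0.0, 0.0, 1.0]]

# One frame missing between two received observations: time difference of 2.
A2 = adapted_transitions(A, 2)
```

With a time difference of 2, the entry `A2[0][2]` becomes nonzero even though the one-step model forbids jumping from the first state straight to the third, which is the kind of mismatch the adaptation is meant to absorb. Each row of the adapted matrix still sums to 1.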

Keywords

Distributed speech recognition (DSR), hidden Markov model (HMM), variable frame rate (VFR), frame dropping, packet loss
