Recognition of packet loss speech using the most reliable reduced-frame-rate data

Lee-Min Lee, Fu-Rong Jean, Tan-Hsu Tan, Jen-Hsiang Chou

SMC (2014)

Abstract

In a client-server distributed speech recognition (DSR) application, speech features are extracted and quantized at the client end and sent to a remote back-end server for recognition. Although bandwidth constraints are mostly eliminated, data packets may be lost over error-prone channels. To reduce the performance degradation caused by missing frames, a frequently used error concealment approach is to restore a full-frame-rate (FFR) observation sequence for recognition at the back-end. In this paper, an alternative approach is proposed for handling observations with lost frames. This approach first extracts the most reliable reconstructed reduced-frame-rate (RFR) observation sequence from the received data at the back-end, and then decodes it with an adapted hidden Markov model (HMM) that compensates for the mismatch between the FFR-trained model and the RFR test data. Experimental results show that a DSR system using the proposed method achieves the same level of accuracy as an FFR data reconstruction method while significantly reducing computation time. From the viewpoint of the user capacity of a DSR system, we find that the proposed method can serve many more client users without the extra cost of installing new equipment.
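
The abstract does not say how the RFR sequence is selected or how the HMM is adapted, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes a fixed decimation factor, a per-frame loss mask, a "most reliable" criterion of fewest lost frames among the kept frames, and adaptation of the transition matrix by raising it to the power of the decimation factor. All function and variable names are hypothetical.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def pick_rfr_frames(received: Sequence[bool], factor: int) -> Tuple[int, List[int]]:
    """Choose the decimation offset whose kept frames include the fewest lost frames.

    received[t] is True when frame t arrived intact (assumed loss mask).
    Returns (offset, indices of the frames kept at that offset).
    """
    best_offset, best_lost = 0, None
    for offset in range(factor):
        kept = range(offset, len(received), factor)
        lost = sum(1 for t in kept if not received[t])
        if best_lost is None or lost < best_lost:
            best_offset, best_lost = offset, lost
    return best_offset, list(range(best_offset, len(received), factor))


def mat_mul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two square matrices of the same size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def adapt_transitions(a: Matrix, factor: int) -> Matrix:
    """Assumed adaptation: the factor-step transition matrix A**factor,
    so that one step of the adapted model spans `factor` original frames."""
    result = a
    for _ in range(factor - 1):
        result = mat_mul(result, a)
    return result


# Example: every second frame was lost, so offset 0 keeps only intact frames.
mask = [True, False, True, False, True, False]
offset, kept = pick_rfr_frames(mask, factor=2)

# Two-state left-to-right transition matrix trained at the full frame rate.
A = [[0.5, 0.5],
     [0.0, 1.0]]
A_rfr = adapt_transitions(A, factor=2)
```

In this sketch the decoder would then run on the frames in `kept` with `A_rfr` in place of `A`; how the observation densities are treated, and how lost frames inside the chosen subsequence are reconstructed, is left open here.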

Keywords

hidden Markov model (HMM), data packets, user capacity, frequently-used error concealment approach, FFR trained model, speech feature extraction, speech recognition, automatic speech recognition (ASR), mismatch compensates, performance degradation reduction, FFR data reconstruction method, quantisation (signal), HMM, reliable-reconstructed reduced-frame-rate observation sequence, data compensation, FFR observation sequence restore, computation time, speech feature quantization, feature extraction, RFR test data, bandwidth constrains, full frame rate (FFR) speech, packet loss speech recognition, remote back-end server, client users, reduced frame rate (RFR) speech, distributed speech recognition (DSR), client-server systems, client-server distributed speech recognition application, data decoding, client-server DSR application, full-frame rate observation sequence restoration, hidden Markov models, hidden Markov model, error prone channels, client server
