Query-By-Example Keyword Spotting Using Long Short-Term Memory Networks

Guoguo Chen, Carolina Parada, Tara N. Sainath

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2015)

Abstract

We present a novel approach to query-by-example keyword spotting (KWS) using a long short-term memory (LSTM) recurrent neural network-based feature extractor. In our approach, we represent each keyword using a fixed-length feature vector obtained by running the keyword audio through a word-based LSTM acoustic model. We use the activations prior to the softmax layer of the LSTM as our keyword vector. At runtime, we detect the keyword by extracting the same feature vector from a sliding window and computing a simple similarity score between this test vector and the keyword vector. With clean speech, we achieve an 86% relative reduction in false rejection rate at a 0.5% false alarm rate when compared to a competitive KWS system based on phoneme posteriorgrams with dynamic time warping, while the reduction in the presence of babble noise is 67%. Our system has a small memory footprint, low computational cost, and high precision, making it suitable for on-device applications.
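The runtime detection step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract only says "a simple similarity score", so cosine similarity is an assumed choice here, and `mean_pool_extractor` is a hypothetical stand-in for the LSTM feature extractor (it simply averages frames so that any window maps to a fixed-length vector). The names `detect_keyword`, `window`, `hop`, and `threshold` are likewise illustrative.

```python
import math
from typing import Callable, List, Sequence, Tuple

Frame = Sequence[float]
Vector = List[float]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity (assumed metric); 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_pool_extractor(frames: Sequence[Frame]) -> Vector:
    """Hypothetical stand-in for the LSTM extractor: average the frames over
    time so that input of any length yields one fixed-length vector."""
    dim = len(frames[0])
    return [sum(f[i] for f in frames) / len(frames) for i in range(dim)]


def detect_keyword(
    keyword_vector: Sequence[float],
    frames: Sequence[Frame],
    extractor: Callable[[Sequence[Frame]], Vector],
    window: int,
    threshold: float,
    hop: int = 1,
) -> List[Tuple[int, float]]:
    """Slide a fixed-size window over the stream, extract a test vector from
    each window, and report (start_index, score) where score >= threshold."""
    hits: List[Tuple[int, float]] = []
    for start in range(0, len(frames) - window + 1, hop):
        test_vector = extractor(frames[start:start + window])
        score = cosine_similarity(keyword_vector, test_vector)
        if score >= threshold:
            hits.append((start, score))
    return hits


# Toy usage: enroll a keyword from example frames, then scan a stream.
keyword_vector = mean_pool_extractor([[1.0, 0.0], [1.0, 0.0]])
stream = [[0.0, 1.0], [0.0, 1.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
hits = detect_keyword(keyword_vector, stream, mean_pool_extractor,
                      window=2, threshold=0.99)
```

In this toy run only the window starting at frame 2 matches the enrolled keyword. In a real system the extractor would be replaced by the pre-softmax activations of a trained LSTM, and the threshold would be tuned to trade off false alarms against false rejections.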

Keywords

query-by-example keyword spotting, long short-term memory network, LSTM recurrent neural network-based feature extractor, fixed-length feature vector, word-based LSTM acoustic model, softmax layer, sliding window, relative false rejection rate reduction, dynamic time warping KWS system, babble noise
