Query-By-Example Keyword Spotting Using Long Short-Term Memory Networks

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2015

Abstract
We present a novel approach to query-by-example keyword spotting (KWS) using a long short-term memory (LSTM) recurrent neural network-based feature extractor. In our approach, we represent each keyword using a fixed-length feature vector obtained by running the keyword audio through a word-based LSTM acoustic model. We use the activations prior to the softmax layer of the LSTM as our keyword vector. At runtime, we detect the keyword by extracting the same feature vector from a sliding window and computing a simple similarity score between this test vector and the keyword vector. With clean speech, we achieve an 86% relative reduction in false rejection rate at a 0.5% false alarm rate compared to a competitive phoneme posteriorgram with dynamic time warping KWS system; in the presence of babble noise, the reduction is 67%. Our system has a small memory footprint, low computational cost, and high precision, making it suitable for on-device applications.
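The runtime detection step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract only says "a simple similarity score", so cosine similarity is an assumption here, and `mean_pool_embed` is a hypothetical stand-in for the LSTM feature extractor (it just averages frames over time). The names `detect_keyword`, `window`, `hop`, and `threshold` are likewise illustrative.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Frames = Sequence[Sequence[float]]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity (assumed scoring function); 0.0 if a norm is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_pool_embed(frames: Frames) -> Vector:
    """Hypothetical placeholder embedder: time-average of the frames.

    In the paper this role is played by the pre-softmax activations of a
    word-based LSTM acoustic model; mean pooling is used here only so the
    sketch runs without a trained model.
    """
    n = len(frames)
    dim = len(frames[0])
    return [sum(frame[d] for frame in frames) / n for d in range(dim)]


def detect_keyword(
    frames: Frames,
    keyword_vec: Sequence[float],
    window: int,
    hop: int,
    threshold: float,
    embed: Callable[[Frames], Vector] = mean_pool_embed,
) -> List[Tuple[int, float]]:
    """Slide a fixed-size window over the stream and score each position.

    Returns (start_frame, score) for every window whose similarity to the
    keyword vector reaches the threshold.
    """
    hits = []
    for start in range(0, len(frames) - window + 1, hop):
        test_vec = embed(frames[start:start + window])
        score = cosine_similarity(test_vec, keyword_vec)
        if score >= threshold:
            hits.append((start, score))
    return hits


# Toy example with 2-dimensional "frames" (made-up values).
keyword_vec = mean_pool_embed([[1.0, 0.0], [1.0, 0.0]])
stream = [[0.0, 1.0], [0.0, 1.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
hits = detect_keyword(stream, keyword_vec, window=2, hop=1, threshold=0.9)
print(hits)
```

In this toy stream only the window starting at frame 2 matches the enrolled keyword vector closely enough to pass the threshold. Because the keyword is reduced to one fixed-length vector, each window costs a single embedding pass plus one vector comparison, which is in line with the low computational cost the abstract claims.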
Keywords
query-by-example keyword spotting, long short-term memory network, LSTM recurrent neural network-based feature extractor, fixed-length feature vector, word-based LSTM acoustic model, softmax layer, sliding window, relative false rejection rate reduction, dynamic time warping KWS system, babble noise