TempAdaCos: Learning Temporally Structured Embeddings for Few-Shot Keyword Spotting with Dynamic Time Warping
CoRR(2023)
摘要
Few-shot keyword spotting (KWS) systems often utilize a sliding window of fixed size. Because of the varying lengths of different keywords or their spoken instances, choosing the right window size is a problem: A window should be long enough to contain all necessary information needed to recognize a keyword but a longer window may contain irrelevant information such as multiple words or noise and thus makes it difficult to reliably detect on- and offsets of keywords. In this work, TempAdaCos, an angular margin loss for obtaining embeddings with temporal structure, that can be used to detect keywords with dynamic time warping is proposed. In experiments conducted on KWS-DailyTalk, a few-shot keyword spotting (KWS) dataset presented in this work, it is shown that using these embeddings outperforms using other representations or a sliding window. Furthermore, it is shown that using time-reversed segments of the keywords while training the system improves the performance.
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要