Quefrency Domain Features with Residual Networks for Spoof Speech Detection

2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC (2023)

Abstract
Despite the availability of high-end solutions for the spoof speech detection (SSD) task, generalization to unseen and convolutive complex signals remains a challenge. A solution should handle the signal complexity of both replay and synthetic speech. The objective of this paper is to extract robust quefrency-domain features and use them with residual networks for SSD. In particular, we focus on the replay signal, which carries the characteristics of the intermediate recording and playback devices; hence, the end result is the summation of the complex signal. This makes the signal very challenging to detect, and thus the features must be capable of discriminating the differences. With recent neural network-based approaches, generated synthetic signals are very similar to genuine signals, which makes the SSD task more complicated. We investigate short-time cepstrum features for the SSD task on the ASVspoof 2019 Challenge database. The cepstrogram results on the physical access (PA) task indicate better performance for replay detection. For genuine speech signals, periodic patterns are easily observed in the cepstrogram, whereas they are missing for replay as well as for speech synthesis (SS) and voice conversion (VC) signals. This observation is also reflected in the performance evaluation, which yields an EER of 2.08 % and a t-DCF of 0.04 on the evaluation set of the logical access (LA) task, obtained with LFCC features and a ResNet architecture. For the PA task with cepstrogram features, the lowest EER we achieved is 1.07 %, with a t-DCF of 0.056.
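As an illustration of the short-time cepstrum features mentioned in the abstract, the following is a minimal sketch of a cepstrogram computed with NumPy. It uses the standard real cepstrum (inverse FFT of the log magnitude spectrum) per frame. The function name `cepstrogram`, the Hamming window, the frame length, the hop size, and the `eps` floor are all assumptions chosen for illustration; the abstract does not state the configuration used in the paper.

```python
import numpy as np


def cepstrogram(signal, frame_len=512, hop=256, eps=1e-10):
    """Short-time real cepstrum of a 1-D signal.

    Returns an array of shape (frame_len, n_frames): rows are quefrency
    bins (in samples), columns are frames. All parameter defaults are
    illustrative assumptions, not values taken from the paper.
    """
    signal = np.asarray(signal, dtype=float)
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")

    window = np.hamming(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop

    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])

    # Real cepstrum per frame: IFFT of the log magnitude spectrum.
    # eps avoids log(0) for silent or zero-valued bins.
    log_mag = np.log(np.abs(np.fft.rfft(frames, axis=1)) + eps)
    ceps = np.fft.irfft(log_mag, n=frame_len, axis=1)

    return ceps.T
```

For a periodic input, such as an impulse train with a period of 100 samples, the frame-averaged cepstrum shows a peak at quefrency 100. This is the kind of periodic pattern that the abstract describes as visible for genuine speech.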
Keywords
Automatic Speaker Verification, Spoof, Quefrency, Cepstrogram