Speech Recognition Model Compression
2020 IEEE International Conference on Acoustics, Speech, and Signal Processing (2020)
Abstract
Deep Neural Network-based speech recognition systems are widely used in speech processing applications. To achieve better robustness and accuracy, these networks are built with millions of parameters, making them storage- and compute-intensive. In this paper, we propose Bin & Quant (B&Q), a compression technique with which we reduced the size of the Deep Speech 2 speech recognition model by 7 times for a negligible loss in accuracy. We show that the algorithm is generally beneficial, based on its effectiveness across two other speech recognition models and the VGG16 model. We also show empirically that Recurrent Neural Networks (RNNs) are more sensitive to model parameter perturbation than Convolutional Neural Networks (CNNs), which in turn are more sensitive than fully connected (FC) networks. Using our B&Q technique, we show that parameter sharing can be established across layers rather than only within a particular layer.
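The abstract does not give the details of the B&Q algorithm, but the name and the cross-layer parameter-sharing claim suggest a bin-then-quantize scheme: weights from multiple layers are pooled, assigned to a small number of bins, and each weight is replaced by its bin centroid, so one shared codebook serves all layers. The sketch below illustrates this idea under those assumptions (uniform binning, a hypothetical `bin_and_quant` helper); the paper's actual binning strategy may differ.

```python
import numpy as np

def bin_and_quant(layers, num_bins=16):
    """Toy sketch of cross-layer bin-and-quantize compression.

    Pools weights from all layers, bins them into one shared codebook,
    and replaces each weight with its bin's centroid. Storing only the
    codebook plus per-weight bin indices is what yields the compression.
    """
    flat = np.concatenate([w.ravel() for w in layers])
    # Uniform bin edges over the pooled weight range (an assumption;
    # the paper may use a non-uniform or learned binning).
    edges = np.linspace(flat.min(), flat.max(), num_bins + 1)
    idx = np.clip(np.digitize(flat, edges) - 1, 0, num_bins - 1)
    # Each bin's centroid is the mean of the weights assigned to it.
    centroids = np.array(
        [flat[idx == b].mean() if np.any(idx == b) else 0.0
         for b in range(num_bins)]
    )
    quantized = centroids[idx]
    # Restore the original per-layer shapes.
    out, pos = [], 0
    for w in layers:
        out.append(quantized[pos:pos + w.size].reshape(w.shape))
        pos += w.size
    return out, centroids
```

Because the codebook is built from the pooled weights of every layer, all layers share the same `num_bins` centroid values, which is one way the cross-layer parameter sharing mentioned in the abstract could be realized.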
Keywords
speech recognition, model compression, deep neural networks