Voice Keyword Recognition Based on Spiking Convolutional Neural Network for Human-Machine Interface
2020 The 3rd International Conference on Intelligent Autonomous Systems (ICOIAS'2020)
Abstract
In this paper, a spiking convolutional neural network (SCNN) model for voice keyword recognition is presented. The model consists of an input pre-processing layer, a spiking neural network (SNN) layer with a built-in filter bank, and convolutional neural network (CNN) layers. A 16-channel infinite impulse response (IIR) filter bank with an energy detector extracts power from the voice signal bands, and the SNN layer converts it to spikes. The spiking rate within a defined time window is used as the input to the subsequent CNN layers for classification. The network is trained on a voice digit dataset, and the weights of the convolutional layers are adjusted by training on the spike-integration results obtained from the spiking layer. The model has been implemented for voice keyword recognition and achieves 96.0% accuracy. Combining the SNN and CNN reduces the overall number of layers and neurons in the system without compromising classification accuracy. The model is suitable for low-power hardware implementation in edge devices for human-machine interface (HMI) applications.
Key words
voice recognition, spiking neural network, convolution neural network
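The front end described in the abstract (band-pass IIR filtering, energy detection, spike generation, and spike counting per time window) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The abstract gives only the channel count (16) and the filter type (IIR). Everything else here is an assumption made for illustration: the log-spaced center frequencies from 100 Hz to 4 kHz, the second-order band-pass sections with Q = 4, squaring as the energy detector, the integrate-and-fire neuron with threshold 5.0, the 25 ms window, and the function names. The CNN classifier that consumes the resulting spike-rate map is omitted.

```python
import math


def center_frequencies(n_channels=16, f_lo=100.0, f_hi=4000.0):
    """Log-spaced band centers in Hz (assumed spacing, not from the paper)."""
    ratio = f_hi / f_lo
    return [f_lo * ratio ** (k / (n_channels - 1)) for k in range(n_channels)]


def bandpass_coeffs(f0, fs, q=4.0):
    """Second-order IIR band-pass coefficients, normalized so that a0 = 1."""
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = (alpha / a0, 0.0, -alpha / a0)
    a = (-2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0)
    return b, a


def spike_rate_map(signal, fs, n_channels=16, window_s=0.025, threshold=5.0):
    """Return one list of per-window spike counts for each filter channel."""
    win = round(window_s * fs)          # samples per time window
    n_windows = len(signal) // win      # a trailing partial window is dropped
    rates = []
    for f0 in center_frequencies(n_channels):
        (b0, b1, b2), (a1, a2) = bandpass_coeffs(f0, fs)
        x1 = x2 = y1 = y2 = 0.0         # filter state (previous inputs/outputs)
        v = 0.0                         # membrane potential of the neuron
        counts = [0] * n_windows
        for n in range(n_windows * win):
            x = signal[n]
            # Direct-form I difference equation of the band-pass section.
            y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
            x2, x1 = x1, x
            y2, y1 = y1, y
            # Energy detector (squared output) drives an integrate-and-fire
            # neuron; it emits at most one spike per sample.
            v += y * y
            if v >= threshold:
                v -= threshold
                counts[n // win] += 1
        rates.append(counts)
    return rates


if __name__ == "__main__":
    fs = 16000
    tone = [math.sin(2.0 * math.pi * 1000.0 * n / fs) for n in range(fs // 2)]
    rates = spike_rate_map(tone, fs)
    print(len(rates), "channels x", len(rates[0]), "windows")
```

For a pure 1 kHz tone, the channel whose center frequency lies closest to 1 kHz accumulates the most spikes, so the channels-by-windows map behaves like a coarse spectrogram. A map of this kind is the sort of two-dimensional input a CNN stage could classify.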