Spatial-temporal Data Compression of Dynamic Vision Sensor Output with High Pixel-level Saliency using Low-precision Sparse Autoencoder

2022 56th Asilomar Conference on Signals, Systems, and Computers (2022)

Abstract
Imaging innovations such as the dynamic vision sensor (DVS) can significantly reduce image data volume by tracking only changes in events. However, when the DVS camera itself moves (e.g., on self-driving cars), the DVS output stream is not sparse enough to achieve the desired hardware efficiency. In this work, we investigate designing a compact sparse autoencoder model to substantially compress event-based DVS output. The proposed encoder-decoder autoencoder is a shallow convolutional neural network (CNN) architecture with two convolution and inverse-convolution layers and only $\sim 10\mathrm{k}$ parameters. We apply quantization-aware training to our proposed model to achieve 2-bit and 4-bit precision. Moreover, we apply unstructured pruning to the encoder module to achieve >90% active-pixel compression in the latent space. The proposed autoencoder design has been validated on multiple DVS-based benchmark datasets, including DVS-MNIST, N-Cars, DVS-IBM Gesture, and the Prophesee Automotive Gen1 dataset. We achieve accuracy drops of only 2%, 3%, and 3.8% relative to the uncompressed baseline, with 7.08%, 1.36%, and 5.53% active pixels in the images from the decoder (compression ratios of $13.1\times$, $29.1\times$, and $18.1\times$) for the DVS-MNIST, N-Cars, and DVS-IBM Gesture datasets, respectively. For the Prophesee Automotive Gen1 dataset, we achieve a minimal mAP drop of 0.07 from the baseline with 9% active pixels in the images from the decoder (compression ratio of $11.9\times$).
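The abstract gives enough structural detail for a rough sketch of the pipeline. The PyTorch code below is a minimal illustration, not the authors' implementation: the channel widths, kernel sizes, two polarity input channels, 90% pruning amount, and the symmetric straight-through fake quantizer are all assumptions; only the two-convolution / two-inverse-convolution topology, the ~10k-parameter budget, low-bit quantization-aware training, and unstructured pruning of the encoder come from the abstract.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune


class SparseDVSAutoencoder(nn.Module):
    """Shallow conv/deconv autoencoder in the spirit of the abstract."""

    def __init__(self, in_ch: int = 2):  # 2 polarity channels is an assumption
        super().__init__()
        # Encoder: two strided convolutions compress the event frame.
        self.encoder = nn.Sequential(
            nn.Conv2d(in_ch, 8, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(8, 16, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
        )
        # Decoder: two inverse (transposed) convolutions reconstruct the frame.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(16, 8, kernel_size=3, stride=2,
                               padding=1, output_padding=1),
            nn.ReLU(inplace=True),
            nn.ConvTranspose2d(8, in_ch, kernel_size=3, stride=2,
                               padding=1, output_padding=1),
        )

    def forward(self, x):
        z = self.encoder(x)          # sparse latent representation
        return self.decoder(z), z


def fake_quantize(w: torch.Tensor, bits: int) -> torch.Tensor:
    """Generic symmetric fake quantizer with a straight-through estimator,
    standing in for the paper's (unspecified) 2-/4-bit QAT scheme. During
    training, weights would pass through this, e.g. fake_quantize(w, bits=4).
    """
    qmax = 2 ** (bits - 1) - 1
    scale = w.abs().max().clamp(min=1e-8) / qmax
    q = torch.clamp(torch.round(w / scale), -qmax, qmax) * scale
    return w + (q - w).detach()      # forward: quantized; backward: identity


model = SparseDVSAutoencoder()
print(sum(p.numel() for p in model.parameters()))  # a few thousand, within ~10k

# Unstructured magnitude pruning on the encoder only; the 90% amount mirrors
# the ">90% active-pixel compression" target but is otherwise illustrative.
for m in model.encoder.modules():
    if isinstance(m, nn.Conv2d):
        prune.l1_unstructured(m, name="weight", amount=0.9)
```

With strided convolutions and matching transposed convolutions, an even-sized event frame (e.g., 128x128) is reduced 4x per spatial dimension in the latent space and restored to full resolution by the decoder; the actual layer dimensions used in the paper may differ.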
Keywords
Neural network training, Dynamic vision sensor, Sparse autoencoder, Neural network pruning, Quantization, Object recognition, Detection