High Efficient 3D Convolution Feature Compression

IEEE Transactions on Circuits and Systems for Video Technology (2022)

Abstract
In this paper, a highly efficient 3D convolution feature compression method is proposed. The method compresses the three-dimensional convolutional deep features extracted during video analysis, so that only the compressed features, rather than the full video features, need to be gathered at the cloud server. This addresses the large network bandwidth demanded by data aggregation in video big-data analysis. Quantization is a common approach to feature compression, but existing quantization-based methods usually perform model training and quantization in separate stages, which makes the quantization results less robust. To solve this problem, the proposed method applies the feature quantization operation directly inside the network and trains it jointly with the analysis task, using a parameter iterative optimization method to handle the non-differentiability of the quantization operation. Unlike deep features extracted from images or single objects, the 3D convolution features extracted from video clips contain high time-domain redundancy. By serializing the three-dimensional convolution features, a time-domain predictive coding method is used to remove this redundancy and further improve the feature compression ratio. Experimental results show that the method can represent each element of the three-dimensional convolutional deep feature with only 1 bit. With an analysis accuracy loss of no more than 1%, the feature compression ratio reaches 4500 times relative to the original feature data, and data transmission is reduced by 96%.
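The abstract describes two components: a 1-bit quantizer trained inside the network together with the analysis task, and temporal predictive coding over the serialized 3D convolution features. The sketch below illustrates both ideas in PyTorch under stated assumptions: the paper's "parameter iterative optimization" scheme is not detailed here, so a common straight-through gradient surrogate stands in for the non-differentiable quantizer, and temporal prediction is shown as simple frame-to-frame residuals. The names `BinaryQuantizer` and `temporal_residual_code` are illustrative, not from the paper.

```python
import torch
import torch.nn as nn


class BinaryQuantizer(nn.Module):
    """Quantize each feature element to 1 bit (+1/-1) inside the network.

    Forward uses sign(); backward passes gradients through unchanged
    (a straight-through surrogate, assumed here), so the quantizer can be
    trained jointly with the downstream analysis task.
    """

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        q = torch.sign(x)
        # Straight-through trick: forward returns q, backward behaves like x.
        return x + (q - x).detach()


def temporal_residual_code(features: torch.Tensor) -> torch.Tensor:
    """Serialize 3D conv features along time and keep only residuals.

    features: (T, C, H, W) quantized feature slices for one video clip.
    Slice 0 is kept as the intra-coded reference; slices 1..T-1 become
    differences from the previous slice, which are mostly zero when the
    time-domain redundancy is high.
    """
    residuals = features.clone()
    residuals[1:] = features[1:] - features[:-1]
    return residuals


if __name__ == "__main__":
    quantizer = BinaryQuantizer()
    clip_features = torch.randn(8, 64, 14, 14)   # toy 3D conv feature slices
    q = quantizer(clip_features)                 # 1 bit per element
    coded = temporal_residual_code(q)            # remove temporal redundancy
    print(coded.shape, (coded == 0).float().mean().item())
```

In an end-to-end setting, the quantizer would sit between the feature extractor and the analysis head so that task gradients shape the features toward quantization-friendly values; only the residual-coded bits would then need to be transmitted to the cloud server.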
Keywords
Deep Feature Compression, Quantization, 3D CNN, Action Recognition