Perceptual Coding for Compressed Video Understanding: A New Framework and Benchmark

arXiv (2022)

Abstract
Most video understanding methods are trained on high-quality videos. In most real-world scenarios, however, videos are compressed before transmission and then decompressed for understanding. The decompressed videos have degraded perceptual quality, which may harm downstream tasks. To address this issue, we propose the first coding framework for compressed video understanding, in which an additional learnable perceptual bitstream is introduced and transmitted together with the video bitstream. With a carefully designed optimization target and network architectures, this new stream substantially boosts the perceptual quality of the decoded videos at a small bit cost. Our framework thus enjoys the best of both worlds: (1) the highly efficient content coding of industrial video codecs and (2) the flexible perceptual coding of neural networks (NNs). Finally, we build a rigorous benchmark for compressed video understanding covering four compression levels, six large-scale datasets, and two popular tasks. The proposed Dual-bitstream Perceptual Video Coding framework (Dual-PVC) consistently and significantly outperforms the baseline codec at the same bitrate level.