Nonlinear Transform Coding for VVC Intra Coding
PCS 2024
Abstract
Modern hybrid video codecs like Versatile Video Coding (VVC) rely heavily on transform coding tools. Given a prediction signal at the encoder, the residual is transformed using trigonometric transforms. Rate-distortion-optimized quantization (RDOQ) and entropy coding of the transformed residual are well understood due to the orthogonality and energy compaction of these transforms. Within this setting, secondary orthogonal transforms have been optimized with considerable success; the most prominent example is the Low-Frequency Non-Separable Transform (LFNST) in VVC. However, training nonlinear transforms without redesigning the RDOQ and entropy coding stages is a hard problem. In learned image compression, variational autoencoders have shown impressive results, but they use their own entropy model, remain difficult to train for small blocks, and RDOQ is nontrivial for them. This paper describes a novel design of a nonlinear transform network for block-based video coding. Given a transform block, a fully-connected neural network predicts coefficients from previously reconstructed ones and the adjacent block boundary, such that only the residual coefficients need to be transmitted. Furthermore, another neural network filters the entire transform block before the inverse transform is applied and the intra prediction signal is added. Against the Versatile Video Coding Test Model 14.2 (VTM-14.2), luma bit-rate savings of approximately 1.9% are reported for the All-Intra configuration.
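As a rough illustration of the two networks the abstract describes, the PyTorch sketch below models the coefficient predictor and the block filter as small fully-connected networks. The block size, boundary-sample layout, layer widths, and scan-order handling are assumptions made for illustration only, not the authors' actual architecture.

```python
# Minimal sketch of the two networks described in the abstract.
# All sizes and the input layout are assumptions, not the paper's design.
import torch
import torch.nn as nn

BLOCK = 8                    # assumed transform block size (8x8)
N_COEFF = BLOCK * BLOCK      # coefficients per block
N_BOUNDARY = 2 * BLOCK + 1   # assumed number of reconstructed boundary samples


class CoeffPredictor(nn.Module):
    """Predicts the next coefficient in scan order from previously
    reconstructed coefficients and the adjacent block boundary, so that
    only the prediction residual has to be transmitted."""

    def __init__(self, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(N_COEFF + N_BOUNDARY, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, prev_coeffs, boundary):
        # prev_coeffs: (B, N_COEFF), not-yet-coded positions zeroed out
        # boundary:    (B, N_BOUNDARY), reconstructed neighboring samples
        return self.net(torch.cat([prev_coeffs, boundary], dim=-1))


class BlockFilter(nn.Module):
    """Filters the full coefficient block before the inverse transform
    is applied and the intra prediction signal is added."""

    def __init__(self, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(N_COEFF, hidden),
            nn.ReLU(),
            nn.Linear(hidden, N_COEFF),
        )

    def forward(self, coeffs):
        # Residual connection keeps the filter close to identity.
        return coeffs + self.net(coeffs)


# Shape check with random inputs:
pred, filt = CoeffPredictor(), BlockFilter()
coeffs = torch.randn(1, N_COEFF)
boundary = torch.randn(1, N_BOUNDARY)
next_coeff = pred(coeffs, boundary)   # (1, 1)
filtered = filt(coeffs)               # (1, 64)
```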
Keywords
Versatile Video Coding (VVC), Transform Coding, Learned Video Compression, Machine Learning