LSSVC: A Learned Spatially Scalable Video Coding Scheme.

IEEE Transactions on Image Processing (2024)

Traditional block-based spatially scalable video coding has been studied for over twenty years. While significant advancements have been made, the scope for further improvement in compression performance is limited. Inspired by the success of learned video coding, we propose an end-to-end learned spatially scalable video coding scheme, LSSVC, which provides a new solution for scalable video coding. In LSSVC, we use the motion, texture, and latent information of the base layer (BL) as interlayer information for compressing the enhancement layer (EL). To reduce interlayer redundancy, we design three modules that leverage the upsampled interlayer information. First, we design a contextual motion vector (MV) encoder-decoder, which uses the upsampled BL motion information to help compress high-resolution MVs. Second, we design a hybrid temporal-layer context mining module to learn more accurate contexts from the EL temporal features and the upsampled BL texture information. Third, we use the upsampled BL latent information as an interlayer prior for the entropy model to estimate more accurate probability distribution parameters for the high-resolution latents. Experimental results show that our scheme surpasses the H.265/SHVC reference software by a large margin. Our code is available at
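The third module's idea can be illustrated with a toy sketch: upsample the base-layer latent to the enhancement-layer resolution, fuse it with the EL context, and predict per-element Gaussian entropy parameters. This is a simplified, hypothetical stand-in (the function names, the nearest-neighbor upsampling, and the 1x1 linear fusion are assumptions for illustration), not the learned networks used in LSSVC.

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbor 2x upsampling of a (C, H, W) feature map
    (a crude stand-in for the learned upsampling in the paper)."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def interlayer_prior(bl_latent, el_context, w):
    """Fuse the upsampled base-layer latent with the enhancement-layer
    context and predict per-element entropy parameters (mu, sigma).
    `w` plays the role of a 1x1 convolution mixing channels."""
    up = upsample2x(bl_latent)                         # match EL resolution
    fused = np.concatenate([up, el_context], axis=0)   # (2C, H, W)
    params = np.einsum('oc,chw->ohw', w, fused)        # (2, H, W)
    mu, log_sigma = params[0], params[1]
    return mu, np.exp(log_sigma)                       # sigma kept positive

# Toy shapes: BL latent at half the EL resolution.
rng = np.random.default_rng(0)
C, H, W = 4, 8, 8
bl = rng.standard_normal((C, H // 2, W // 2))
el = rng.standard_normal((C, H, W))
w = rng.standard_normal((2, 2 * C)) * 0.1
mu, sigma = interlayer_prior(bl, el, w)
```

In the actual scheme these parameters would condition an arithmetic coder over the high-resolution latents; here they only demonstrate how the interlayer information changes the predicted distribution.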
Keywords
Learned video coding, spatial scalability, scalable video coding, contextual MV encoder-decoder, hybrid temporal-layer context mining, interlayer prior