Plug-and-Play Versatile Compressed Video Enhancement


CVPR 2025


Huimin Zeng      Jiacheng Li      Zhiwei Xiong     
University of Science and Technology of China    

TL;DR: We present a codec-aware framework for versatile compressed video enhancement, which adaptively enhances input videos of different compression levels and supports a wide range of downstream vision tasks.

Abstract

As a widely adopted technique in data transmission, video compression effectively reduces file size, making real-time cloud computing possible. However, this reduction comes at the cost of visual quality, which challenges the robustness of downstream vision models. In this work, we present a versatile codec-aware enhancement framework that reuses codec information to adaptively enhance videos under different compression settings, assisting various downstream vision tasks without introducing a computational bottleneck. Specifically, the proposed codec-aware framework consists of a compression-aware adaptation (CAA) network that employs a hierarchical adaptation mechanism to estimate the parameters of the frame-wise enhancement network, namely the bitstream-aware enhancement (BAE) network. The BAE network further leverages temporal and spatial priors embedded in the bitstream to effectively improve the quality of compressed input frames. Extensive experimental results demonstrate the superior quality enhancement performance of our framework over existing enhancement methods, as well as its versatility in assisting multiple downstream tasks on compressed videos as a plug-and-play module.

Motivation

We observe that standard video codecs (e.g., H.264) provide rich codec information such as the Constant Rate Factor (CRF), motion vectors, and partition maps. CRF reflects hierarchical quality adjustment at both the sequence and frame levels, allowing our model to dynamically adjust its parameters for inputs of different compression levels. Partition maps indicate spatial complexity, enabling region-aware refinement, while motion vectors provide temporal alignment cues with minimal overhead. By leveraging this codec information, the enhancement framework can flexibly enhance videos across compression levels, support downstream tasks, and remain computationally efficient.
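To make the role of motion vectors concrete, here is a minimal NumPy sketch of block-wise motion compensation, the generic idea behind using decoded motion vectors as cheap temporal alignment cues. It is an illustration only and not the alignment used in our BAE network: the function name `warp_with_block_mvs`, the fixed 8x8 block size, the integer (dy, dx) vector convention, and the border clamping are all assumptions made for this example.

```python
import numpy as np


def warp_with_block_mvs(ref, mvs, block=8):
    """Build a motion-compensated prediction of the current frame.

    ref: (H, W) reference frame, with H and W divisible by `block`.
    mvs: (H // block, W // block, 2) integer motion vectors (dy, dx).
         Each block of the output is copied from `ref` at (y + dy, x + dx).
    Both the layout and the sign convention are assumptions of this sketch.
    """
    h, w = ref.shape
    # Expand block-level vectors to per-pixel displacements.
    dy = np.repeat(np.repeat(mvs[..., 0], block, axis=0), block, axis=1)
    dx = np.repeat(np.repeat(mvs[..., 1], block, axis=0), block, axis=1)
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    # Clamp source coordinates so vectors pointing outside the frame stay valid.
    src_y = np.clip(ys + dy, 0, h - 1)
    src_x = np.clip(xs + dx, 0, w - 1)
    return ref[src_y, src_x]
```

Because the vectors are already stored in the bitstream, this kind of alignment needs only an index lookup, with no optical flow estimation at inference time.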


Framework

As shown above, the proposed method comprises a compression-aware adaptation (CAA) network and a bitstream-aware enhancement (BAE) network. The CAA network employs a hierarchical compression adaptation mechanism to estimate parameters for the frame-adaptive BAE network, which then aggregates intra-frame information and performs region-aware refinement to enhance the input compressed frame.
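The idea of one network estimating parameters for another can be sketched with a simple feature modulation conditioned on CRF. The snippet below is a hypothetical simplification, not the actual CAA or BAE architecture: the names `crf_to_modulation` and `modulate`, the single linear layer, and the normalization by 51 (the upper end of the 0 to 51 CRF range of 8-bit x264) are assumptions chosen only to show how a compression level could steer an enhancement branch.

```python
import numpy as np


def crf_to_modulation(crf, weight, bias):
    """Map a scalar CRF to per-channel (scale, shift) with one linear layer.

    weight: (2 * C, 1) and bias: (2 * C,) are hypothetical learned parameters.
    """
    x = np.array([crf / 51.0])  # normalize, assuming a 0-51 CRF range
    out = weight @ x + bias     # shape (2 * C,)
    scale, shift = np.split(out, 2)
    # Offset the scale by 1 so zero-valued parameters give an identity mapping.
    return 1.0 + scale, shift


def modulate(features, scale, shift):
    """Apply the per-channel affine modulation to (C, H, W) features."""
    return features * scale[:, None, None] + shift[:, None, None]
```

With this layout, a heavily compressed input (high CRF) and a lightly compressed one (low CRF) pass through the same enhancement weights but receive different per-channel scales and shifts.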

Results

Video Enhancement

Downstream Tasks

Video Object Segmentation

Video Super-Resolution

Flow Estimation

BibTeX

@inproceedings{PnP-VCVE,
    author    = {Zeng, Huimin and Li, Jiacheng and Xiong, Zhiwei},
    title     = {Plug-and-Play Versatile Compressed Video Enhancement},
    booktitle = {Conference on Computer Vision and Pattern Recognition (CVPR)},
    year      = {2025},
}

Acknowledgements

We acknowledge funding from the National Natural Science Foundation of China under Grants 62131003 and 62021001.