ISCAS 2022

Abstract

Recently, there has been increasing interest in neural network-based video coding, including end-to-end and hybrid schemes. To foster research in this emerging field and provide a benchmark, we propose this Grand Challenge (GC). In this GC, different neural network-based coding schemes will be evaluated according to their coding efficiency and methodological innovations. Two tracks will be evaluated: hybrid solutions and end-to-end solutions. In the hybrid track, deep network-based coding tools shall be used together with a traditional video coding scheme. In the end-to-end track, the whole video codec shall be built primarily upon deep networks.

Participants shall express their interest in this Grand Challenge by sending an email to the organizer Dr. Yue Li and are invited to submit their proposals as ISCAS papers. The papers will go through the regular review process and, if accepted, must be presented at ISCAS 2022. The submission instructions for Grand Challenge papers will be communicated by the organizers.

Rationale

In recent years, deep learning-based image/video coding schemes have achieved remarkable progress. As two representative approaches, hybrid solutions and end-to-end solutions have both been investigated extensively. The hybrid solution adopts deep network-based coding tools to enhance traditional video coding schemes, while the end-to-end solution builds the whole compression scheme upon deep networks. Despite the great advances made by these solutions, numerous challenges remain to be addressed:

  1. How to harmonize a deep coding tool with a hybrid video codec, for example, how to take compression into consideration when developing a deep tool for pre-processing
  2. How to exploit long-term temporal dependency in an end-to-end framework for video coding
  3. How to leverage automated machine learning-based network architecture optimization for higher coding efficiency
  4. How to perform efficient bit allocation with deep learning frameworks
  5. How to achieve the global minimum in rate-distortion trade-offs, for example, by taking the impact of the current step on later frames into account, possibly using reinforcement learning
  6. How to achieve better complexity-performance trade-offs

In view of these challenges, several activities towards improving deep learning-based image/video coding schemes have been initiated. For example, there was a special section on “Learning-based Image and Video Compression” in TCSVT, July 2020; a special section on “Optimized Image/Video Coding Based on Deep Learning” in OJCAS, December 2021; and the “Challenge on Learned Image Compression (CLIC)” at CVPR, organized annually since 2018. In hopes of encouraging more innovative contributions towards resolving the aforementioned challenges within the ISCAS community, we propose this grand challenge.

Requirements and Evaluation

Training Data Set

It is recommended to use the following training data.

  1. UVG dataset: http://ultravideo.cs.tut.fi/
  2. CDVL dataset: https://cdvl.org/
Additional training data are also allowed, provided that they are described in the submitted document.

Test Specifications

In the test, the proposals will be evaluated on multiple YUV 4:2:0 test sequences at a resolution of 1920x1080. There is no constraint on the reference structure. Note that a neural network must be used in the decoding process.

Evaluation Criteria

The test sequences will be released according to the timeline and the results will be evaluated with the following criteria:

  1. The decoded sequences will be evaluated in 4:2:0 color format.

  2. The weighted PSNR, (6*PSNR_Y + PSNR_U + PSNR_V)/8, will be used to evaluate the distortion of the decoded pictures.

  3. Average Bjøntegaard delta bit-rates (BD-rate), calculated as in [1] over all test sequences, will be used to compare coding efficiency.

  4. An anchor, HM 16.22 [2] coded with QPs = {22, 27, 32, 37} under the random access configuration defined in the HM common test conditions [3], will be provided. The released anchor data will include the bit-rates corresponding to the four QPs for each sequence. The proposed method is required to generate four bit-streams for each sequence, targeting the anchor bit-rates corresponding to the four QPs. Additional constraints are listed as follows:

    A. For each sequence, the four actual bit-rates shall fall within the range [90% of the lowest anchor bit-rate, 110% of the highest anchor bit-rate].
    B. Only one single decoder shall be utilized to decode all the bitstreams.
    C. The intra period in the proposed submission shall be no larger than that used by the anchor in generating the validation and test sequences.
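
As an illustration of criteria 1 and 2, the weighted YUV PSNR can be sketched as below. This is a minimal sketch, not the official evaluation script: the function names are our own, the plane shapes assume 4:2:0 subsampling, and 8-bit samples (peak value 255) are assumed.

```python
import numpy as np

def plane_psnr(ref, rec, peak=255.0):
    # PSNR of a single plane, assuming 8-bit samples by default.
    mse = np.mean((ref.astype(np.float64) - rec.astype(np.float64)) ** 2)
    return 10.0 * np.log10(peak * peak / mse)

def weighted_yuv_psnr(ref_planes, rec_planes):
    # Weighted PSNR (6*PSNR_Y + PSNR_U + PSNR_V) / 8 over (Y, U, V) planes.
    # For 4:2:0 content, U and V have half the width and height of Y.
    p_y, p_u, p_v = (plane_psnr(r, d) for r, d in zip(ref_planes, rec_planes))
    return (6.0 * p_y + p_u + p_v) / 8.0
```

The 6:1:1 weighting reflects that, in 4:2:0 video, luma carries most of the perceptually relevant detail.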
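
The BD-rate of criterion 3 follows the standard procedure of [1]: fit log-rate as a cubic polynomial of PSNR for each curve, integrate both fits over the overlapping PSNR interval, and convert the average log-rate difference back to a percentage. The sketch below is an illustration under our own naming conventions (four rate-distortion points per curve; rate units are arbitrary but must match between curves), not the official evaluation script.

```python
import numpy as np

def bd_rate(anchor_rates, anchor_psnrs, test_rates, test_psnrs):
    # Bjontegaard delta bit-rate (%): negative means the test codec
    # needs fewer bits than the anchor at equal PSNR.
    log_r_a = np.log(anchor_rates)
    log_r_t = np.log(test_rates)
    # Cubic fit of log-rate as a function of PSNR, per curve.
    fit_a = np.polyfit(anchor_psnrs, log_r_a, 3)
    fit_t = np.polyfit(test_psnrs, log_r_t, 3)
    # Integrate both fits over the overlapping PSNR interval.
    lo = max(min(anchor_psnrs), min(test_psnrs))
    hi = min(max(anchor_psnrs), max(test_psnrs))
    int_a = np.polyint(fit_a)
    int_t = np.polyint(fit_t)
    area_a = np.polyval(int_a, hi) - np.polyval(int_a, lo)
    area_t = np.polyval(int_t, hi) - np.polyval(int_t, lo)
    avg_log_diff = (area_t - area_a) / (hi - lo)
    return (np.exp(avg_log_diff) - 1.0) * 100.0
```

For example, a test curve whose rates are uniformly 10% below the anchor at identical PSNR yields a BD-rate of -10%.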

Proposal Documents

A docker container with the executable scheme must be submitted for result generation and cross-checking. Each participant is invited to submit an ISCAS paper, which must describe the following items in detail.

  • The methodology
  • The training data set
  • Detailed rate-distortion data (comparison with the provided anchor is encouraged)

Complexity analysis of the proposed solutions is encouraged for the paper submission.

Important Dates

  • Oct. 08, 2021: The organizers release, to those who have expressed interest, the validation set, the corresponding test information (e.g., frame rates and intra periods), and a template for performance reporting (with rate-distortion points for the validation set)

  • Nov. 08, 2021: Paper submission deadline for participants (to be aligned with Special Sessions in case of extension)

  • Nov. 22, 2021: Participants upload the docker container, in which one single decoder shall be used to decode all the bitstreams

  • Nov. 26, 2021: Organizers release the test sequences (including frame rate, corresponding rate-distortion points, etc.)

  • Dec. 15, 2021: Participants upload compressed bitstreams and decoded YUV files

  • Dec. 21, 2021: Fact sheet submission deadline for participants

  • Jan. 14, 2022: Paper acceptance notification

  • Feb. 05, 2022: Camera-ready paper submission deadline

  • TBA: Paper presentation at ISCAS 2022

  • TBA: Awards announcement (at the ISCAS 2022 banquet)

Awards

ByteDance will sponsor the awards of this grand challenge. Three categories of awards are expected to be presented. Two top-performance awards will be granted according to performance, one for the hybrid track and one for the end-to-end track. In addition, to foster innovation, a top-creativity award will be given to the most inspiring scheme, as recommended by a committee; it is applicable only to participants whose papers are accepted by ISCAS 2022. The winner of each award (if any) will receive a USD 5,000 prize.

References

[1] G. Bjøntegaard, “Calculation of average PSNR differences between RD-curves,” ITU-T SG16/Q6, Doc. VCEG-M33, Austin, Apr. 2001.
[2] https://vcgit.hhi.fraunhofer.de/jvet/HM/-/tree/HM-16.22
[3] “Common Test Conditions and Software Reference Configurations for HM,” JCT-VC, Doc. JCTVC-L1100.