Spatial Scalability with AV1: A Comparison between Scalable AV1 and MPEG-5 LCEVC in Video Quality and Complexity

ABSTRACT 

The rapid evolution of video streaming demands efficient scalable delivery methods. Scalability enables several attractive applications and gives end users flexible experiences, typically reducing the bitrate and storage needs of applications that require multiple instances of the same video. In broadcast, scalability avoids the simulcast of independent channels by serving HD and UHD users with a single stream; in a multi-conferencing scenario, it facilitates support of heterogeneous devices and networks. Other application areas include augmented adaptive streaming, scalable video messaging, low-latency pixel streaming, and cloud gaming.

This paper compares two alternative approaches to scalability for AV1-compressed video, each of which extends a 1080p base layer to UHD through an enhancement layer.

Two enhancement methods are analysed:

Scalable AV1 (SVC AV1): The AV1 compression standard uses its rescaling feature to enable scalable video. The decoded frames of the base layer (BL) are provided to the enhancement layer (EL) as additional reference frames. These frames are upscaled, and the encoder adaptively decides whether to use the upscaled reference frames from the base layer or the previously encoded frames from the enhancement layer. SVC AV1, however, carries the added penalty of requiring the decoder to decode the base layer first and then decode the enhancement layer in a second pass.

MPEG-5 Part 2 LCEVC (Low Complexity Enhancement Video Coding): The standard uses a codec-agnostic EL that is combined with a BL [1], in this case SVT-AV1 [2], yielding an enhanced video stream. The LCEVC-enhanced stream typically improves the complexity vs. quality trade-off of the enhanced single-layer (SL) codec, allowing quality enhancements on LCEVC-supported devices without compromising the experience on non-LCEVC devices.
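
The two mechanisms described above can be contrasted with a toy sketch on one-dimensional sample rows. It is illustrative only and is not taken from either standard: the function names are hypothetical, nearest-neighbour upscaling and a sum-of-absolute-differences (SAD) criterion are assumptions chosen for brevity, and real encoders rely on motion-compensated prediction, rate-distortion decisions, and transformed and quantised residuals.

```python
def upscale2x(row):
    """Nearest-neighbour 2x upscaling of a 1-D row of samples (assumed filter)."""
    return [value for value in row for _ in range(2)]


def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b) for a, b in zip(block_a, block_b))


def pick_reference(target_block, upscaled_bl_block, prev_el_block):
    """SVC-style idea: per block, use whichever reference predicts it better.

    Assumption: plain SAD stands in for the encoder's real decision logic.
    """
    if sad(target_block, upscaled_bl_block) <= sad(target_block, prev_el_block):
        return "base_layer"
    return "enhancement_layer"


def residual_enhance(base_row, residuals):
    """Enhancement-layer idea: upscaled base samples plus decoded residuals."""
    upscaled = upscale2x(base_row)
    if len(upscaled) != len(residuals):
        raise ValueError("residuals must match the upscaled resolution")
    return [u + r for u, r in zip(upscaled, residuals)]
```

In this sketch, `pick_reference` stands for the adaptive choice between the upscaled BL frame and a previously encoded EL frame, while `residual_enhance` stands for an EL that corrects an upscaled base reconstruction regardless of which codec produced the base.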

This research examines the advantages and challenges of these scalable methods. Scalable AV1 video encoding is known to suffer an efficiency loss compared to SL encoding of the full-resolution video. Scalable HD+UHD encoding is expected to require a somewhat higher bitrate than SL UHD encoding. This study provides assessment scores that compare SL UHD AV1 encodings performed with two base codecs (AMD AV1 in hardware and SVT-AV1 in software), along with their respective SL HD renditions (upscaled to UHD [3]), against two scalable HD+UHD AV1 encodings, i.e., SVC AV1 and LCEVC. The primary focus areas of assessment are:

  • Objective quality assessment (VMAF, VMAF_NEG, PSNR [4]) 
  • Rate-distortion (RD) curves 
  • Visual quality assessment, e.g., upscaled base vs. enhanced 
  • Encoding complexity.
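
Of the objective metrics listed above, PSNR is simple enough to illustrate directly. The following is a minimal sketch of the standard definition, PSNR = 10 log10(MAX^2 / MSE), assuming 8-bit samples supplied as flat sequences; the function name and the flat-list input are assumptions made for illustration, and the sketch is not the measurement tooling used in this study.

```python
import math


def psnr(reference, distorted, max_value=255):
    """Peak signal-to-noise ratio in dB between two equally sized sample lists.

    Assumption: samples are given as flat sequences of integers (e.g. one
    luma plane flattened row by row) with peak value `max_value`.
    """
    if len(reference) != len(distorted) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all samples.
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10 * math.log10(max_value ** 2 / mse)
```

For comparisons such as upscaled base vs. enhanced, both renditions would be measured against the same full-resolution source so that the scores are directly comparable.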