Tunable early CU size decision for depth map intra coding in 3D-HEVC using unsupervised learning
Introduction
With the rapid development of three-dimensional (3D) video acquisition and display technologies, 3D videos are increasingly popular with viewers. Because 3D videos provide more immersive and realistic visual experiences, they are widely anticipated in video applications such as free-viewpoint TV [1] and 3D movies. 3D videos are commonly represented in the multi-view video plus depth (MVD) format [2], which includes multiple texture views and their corresponding depth maps. Vivid virtual views can then be synthesized by using Depth-Image-Based Rendering (DIBR) [3]. Since MVD data comprise multiple views and depth maps, highly efficient compression is extremely important for saving storage space and transmission bandwidth. To efficiently encode single-view texture video, the High Efficiency Video Coding (HEVC) standard [4]-[5] was developed by the Joint Collaborative Team on Video Coding (JCT-VC); it achieves higher compression efficiency than the earlier H.264/AVC standard. To encode 3D video data, an advanced 3D video coding standard, which is a 3D extension of HEVC (3D-HEVC) [6]-[7], was developed by the Joint Collaborative Team on 3D Video Coding (JCT-3V).
3D-HEVC inherits some coding tools of HEVC, such as quadtree Coding Tree Unit (CTU) partitioning [8]-[9]. Each coding frame is first divided into CTUs of the same size, 64×64. Each CTU can then be recursively divided into smaller coding units (CUs). Fig. 1 (a) shows an example of the optimal CTU partitioning. The supported CU sizes are 64×64, 32×32, 16×16 and 8×8, and their corresponding CU depths are “0”, “1”, “2”, and “3”, respectively. The 3D video format includes multiple texture views and associated depth maps [10]. A depth map is a grayscale image in which each pixel represents the geometrical information of the scene. Depth maps and texture views differ in nature: the former are composed of large smooth regions with sharp edges, whereas the latter contain complex content. Edge distortions introduced by depth map coding may also degrade the synthesized views. To preserve the quality of sharp edges and further improve coding efficiency, in addition to the 35 intra prediction modes shown in Fig. 1 (b), several new coding techniques are adopted for depth map coding, such as Depth Intra Skip (DIS) [11], Depth Modeling Modes (DMM) [12] and Segment-wise Direct Component Coding (SDC) [13]. In particular, the new intra prediction mode DMM divides a CU block into two non-rectangular regions, each of which is represented by a constant value. DMM supports two partition types, wedgelet segmentation and arbitrary contour segmentation, as shown in Fig. 1 (c) and (d).
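The depth-to-size mapping and the recursive quadtree split described above can be illustrated with a minimal sketch. It is not the HTM encoder logic: the `should_split` callback is a hypothetical stand-in for the encoder's rate-distortion comparison, and the names `cu_size_at_depth` and `partition_ctu` are chosen here only for illustration.

```python
from typing import Callable, List, Tuple

CTU_SIZE = 64    # CTU size stated in the text
MIN_CU_SIZE = 8  # smallest CU size stated in the text

# A leaf CU is described by (x, y, size, depth) inside its CTU.
CU = Tuple[int, int, int, int]


def cu_size_at_depth(depth: int) -> int:
    """Map CU depth 0..3 to CU size 64, 32, 16, 8."""
    if not 0 <= depth <= 3:
        raise ValueError("CU depth must be in 0..3")
    return CTU_SIZE >> depth


def partition_ctu(should_split: Callable[[int, int, int, int], bool],
                  x: int = 0, y: int = 0, depth: int = 0) -> List[CU]:
    """Recursively split a CTU into leaf CUs.

    `should_split(x, y, size, depth)` is a hypothetical decision callback;
    a real encoder would compare rate-distortion costs instead.
    """
    size = cu_size_at_depth(depth)
    # Stop at the minimum CU size or when the callback declines to split.
    if size == MIN_CU_SIZE or not should_split(x, y, size, depth):
        return [(x, y, size, depth)]
    half = size // 2
    leaves: List[CU] = []
    for dy in (0, half):
        for dx in (0, half):
            leaves.extend(partition_ctu(should_split, x + dx, y + dy, depth + 1))
    return leaves
```

For example, `partition_ctu(lambda x, y, size, depth: depth == 0)` splits the CTU once and returns four 32×32 leaves at depth 1. An early CU size decision method aims to answer the `should_split` question without evaluating all deeper levels.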
These advanced coding techniques achieve high compression efficiency for depth map coding in 3D-HEVC, but they also lead to huge computational complexity. This poses enormous challenges for real-time applications, such as ultra HD video on mobile devices with limited computational resources. Thus, it is necessary to investigate faster coding techniques that reduce the encoding complexity of 3D-HEVC while keeping the encoding loss negligible.
Although many fast CU size decision methods for depth map intra coding have been proposed to reduce the encoding complexity of 3D-HEVC [14]-[35], the trade-off between coding loss and coding time saving deserves further investigation for the following reason. Application preferences differ: some scenarios require lower coding complexity, while others need to reduce coding complexity while keeping the coding quality almost lossless. In this work, to meet the user's specific application preference, we propose a tunable early CU size decision scheme for depth map intra coding in 3D-HEVC that achieves different trade-offs between coding loss and coding time saving. The novelties and contributions of the proposed scheme are summarized as follows:
- A simple yet effective clustering-based early CU size decision approach is proposed for 3D-HEVC depth map intra coding, in which only one valid feature is selected.
- It is an unsupervised learning method that adaptively updates the cluster center in each coding frame to adapt to the texture characteristics of different coding frames.
- To meet users' preferences for specific applications, a tunable trade-off between coding time saving and coding loss is achieved by introducing the similarity distance.
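The three contributions above can be pictured with a small, purely illustrative sketch. It is not the authors' algorithm, whose details are not given in this excerpt. It assumes a single scalar feature per CU (the conclusion names the RD cost), two clusters ("split" and "non-split") whose centers are running means reset at every frame, and a hypothetical normalized `similarity_distance` threshold that decides whether a CU is close enough to a center to skip the full search. All class, method and parameter names are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunningCenter:
    """Running mean of the RD costs assigned to one cluster."""
    total: float = 0.0
    count: int = 0

    def add(self, rd_cost: float) -> None:
        self.total += rd_cost
        self.count += 1

    @property
    def center(self) -> Optional[float]:
        return self.total / self.count if self.count else None


class EarlyCuDecider:
    """Hypothetical two-cluster early CU size decision on one feature."""

    def __init__(self, similarity_distance: float, min_samples: int = 2) -> None:
        # Larger similarity_distance -> more early decisions (assumed rule).
        self.similarity_distance = similarity_distance
        self.min_samples = min_samples
        self.start_frame()

    def start_frame(self) -> None:
        """Reset both cluster centers so they are re-learned per frame."""
        self.non_split = RunningCenter()
        self.split = RunningCenter()

    def observe(self, rd_cost: float, was_split: bool) -> None:
        """Update a center with a CU whose split decision came from full search."""
        (self.split if was_split else self.non_split).add(rd_cost)

    def decide(self, rd_cost: float) -> str:
        """Return 'terminate', 'split', or 'full_search' for a new CU."""
        if min(self.non_split.count, self.split.count) < self.min_samples:
            return "full_search"
        c_ns, c_s = self.non_split.center, self.split.center
        gap = abs(c_s - c_ns)
        if gap == 0:
            return "full_search"
        # Distances to each center, normalized by the gap between centers.
        d_ns = abs(rd_cost - c_ns) / gap
        d_s = abs(rd_cost - c_s) / gap
        if d_ns <= d_s and d_ns <= self.similarity_distance:
            return "terminate"
        if d_s < d_ns and d_s <= self.similarity_distance:
            return "split"
        return "full_search"
```

Under these assumptions, raising `similarity_distance` lets more CUs take an early decision (more time saving, higher risk of coding loss), while lowering it sends more CUs to the full search. Since the conclusion mentions separate models for 64×64, 32×32 and 16×16 CUs, one such object per CU size would be a natural arrangement.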
The rest of this paper is organized as follows. Section 2 reviews the related works on coding time reduction for 3D-HEVC depth map intra coding. Section 3 describes the motivation and statistical analyses. Section 4 presents the early intra CU size decision scheme for 3D-HEVC depth map coding. Experimental results are provided in Section 5. Section 6 concludes this paper.
Section snippets
Related works
Many fast intra coding algorithms have been designed for HEVC [36], [37], [38]. Although they can greatly reduce the intra coding complexity of HEVC, they are not appropriate for depth map intra coding in 3D-HEVC. The reasons are twofold. First, texture views and depth maps have different content properties. Second, distinct coding tools such as DMM and SDC are introduced into depth map intra coding. Therefore, to reduce the complexity of depth map intra coding, many researchers
Motivation and statistical analyses
To analyze the characteristics of depth map intra coding in 3D-HEVC, two preliminary experiments are conducted with the unmodified HTM16.0 under the All-Intra (AI) configuration. The test conditions are summarized as follows: four video sequences with different spatial resolutions are used, namely 1024×768 (“Balloons”, “Newspaper”) and 1920×1088 (“Poznan_Hall2”, “Shark”). Four pairs of quantization parameters (QPs) are used: (25, 34), (30, 39), (35, 42) and (40, 45). Note
Proposed early CU size decision algorithm
Early intra CU size decision is usually regarded as a classification problem, and supervised machine learning is used for this purpose. However, few works treat early intra CU size decision as a clustering problem. Actually, clustering can be regarded as an unsupervised machine learning method, which can divide a set of data into one cluster or multiple clusters according to the similarity characteristics of the data. In this work, we treat the early CU size decision as a clustering problem,
Test conditions
To evaluate the proposed tunable CU size decision method for depth map intra coding in 3D-HEVC, the proposed scheme is integrated into the 3D-HEVC reference software HTM16.0. We use the common test conditions (CTC) [42] and the eight test sequences recommended by JCT-3V for the experiments. The executable files can be downloaded via the link.1 The details of the eight test sequences are reported in Table 1. Note that each test sequence contains three texture views and their corresponding depth
Conclusion
In this paper, an unsupervised-learning-based tunable early CU size decision scheme is proposed for depth map intra coding in 3D-HEVC. Three clustering models are proposed for the clustering of 64×64, 32×32 and 16×16 CUs. In the clustering process, only the RD cost is extracted as a feature to represent the characteristics of the data. In addition, the cluster centers are adaptively obtained in each coding frame by using an intra learning method. In order to meet the user's specific application
CRediT authorship contribution statement
Yue Li: Conceptualization, Methodology, Writing-Original draft preparation. Gaobo Yang: Supervision, Writing-Reviewing. Aiping Qu: Data curation. Yapei Zhu: Validation.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China (Nos. 62001209 and 61972143) and the Natural Science Foundation of Hunan Province, China (No. 2020JJ5496).
References (43)
- et al. Complexity reduction in the HEVC/H265 standard based on smooth region classification. Digit. Signal Process. (2018)
- et al. Adaptive CU split prediction and fast mode decision for 3D-HEVC texture coding based on just noticeable difference model. Digit. Signal Process. (2020)
- et al. A reduced computational effort mode-level scheme for 3D-HEVC depth maps intra-frame prediction. J. Vis. Commun. Image Represent. (2018)
- et al. Self-learning residual model for fast intra CU size decision in 3D-HEVC. Signal Process. Image Commun. (2020)
- et al. Fast intra mode decision and fast CU size decision for depth video coding in 3D-HEVC. Signal Process. Image Commun. (2019)
- et al. Sum-of-gradient based fast intra coding in 3D-HEVC for depth map sequence (SOG-FDIC). J. Vis. Commun. Image Represent. (2017)
- et al. Machine learning based video coding optimizations: a survey. Inf. Sci. (2020)
- et al. A free-viewpoint television system for horizontal virtual navigation. IEEE Trans. Multimed. (2018)
- et al. 3D high-efficiency video coding for multi-view video and depth data. IEEE Trans. Image Process. (2013)
- et al. 3D-TV System with Depth-Image-Based Rendering: Architecture, Techniques and Challenges. (2014)
- Overview of the high efficiency video coding (HEVC) standard. IEEE Trans. Circuits Syst. Video Technol.
- Overview of the multiview and 3D extensions of high efficiency video coding. IEEE Trans. Circuits Syst. Video Technol.
- A 3D-HEVC fast mode decision algorithm for real-time applications. ACM Trans. Multimed. Comput. Commun. Appl.
- Frame-level bit allocation optimization based on video content characteristics for HEVC. ACM Trans. Multimed. Comput. Commun. Appl.
- Adaptive inter CU depth decision for HEVC using optimal selection model and encoding parameters. IEEE Trans. Broadcast.
- 3D-CE1: depth intra skip (DIS) mode.
- Depth intra coding for 3D video based on geometric primitives. IEEE Trans. Circuits Syst. Video Technol.
- Generic segment-wise DC for 3D-HEVC depth intra coding.
- Fast mode decision based on gradient information in 3D-HEVC. IEEE Access
- Fast depth modeling mode selection for 3D HEVC depth intra coding.
- Fast intra mode decision for depth map coding in 3D-HEVC. J. Real-Time Image Process.
Yue Li received the M.S. and Ph.D. degrees from Central South University and Hunan University in 2013 and 2018, respectively. He is a lecturer in the Computer School, University of South China. His current research interests include video coding, point cloud compression.
Gaobo Yang is a professor in the School of Information Science and Engineering, Hunan University, China. He obtained his Masters and Ph.D. degrees from East China Jiaotong University and Shanghai University in 2001 and 2004, respectively. From August 2010 to August 2011, he made an academic visit to the University of Surrey, UK. He has published more than 60 papers in international journals. His research interests include video information security and multimedia communication.
Aiping Qu was born in Hunan, China, in 1982. He received the B.S. and M.S. degrees in mathematics science from Hunan University, in 2005 and 2010, respectively, and the Ph.D. degree in computer science from Wuhan University, in 2015. He is currently an Associate Professor of computer science with the University of South China. He has published over 20 papers in journals and conferences. His research interests include optimization, machine learning, and medical image analysis.
Yapei Zhu received the B.S. degree from Hengyang Normal University, China, in 2010 and the M.S. degree from Ningbo University in 2013. She is currently a Lecturer with Hengyang Normal University. Her research interests focus on image/video compression.