BLINC: lightweight bimodal learning for low-complexity VVC intra-coding

  • Original Research Paper
  • Journal of Real-Time Image Processing

Abstract

The latest video coding standard, Versatile Video Coding (VVC), achieves almost twice the coding efficiency of its predecessor, High Efficiency Video Coding (HEVC). However, for intra coding, achieving this efficiency requires 31× the computational complexity of HEVC, which makes VVC challenging for low-power and real-time applications. This paper proposes a novel machine learning approach that employs two modalities of features, both jointly and separately, to simplify the intra coding decision. First, a set of features is extracted using the existing DCT core of VVC to assess texture characteristics; these features form the first modality of data. This produces high-quality features with almost no extra computational overhead. The distribution of intra modes in the neighboring blocks forms the second modality of data, which, unlike the first modality, provides statistical information about the frame. Second, a two-step feature reduction method is designed that reduces the size of the feature set, so that a lightweight model with a limited number of parameters can learn the intra mode decision task. Third, three separate training strategies are proposed: (1) an offline training strategy that uses the first (single) modality of data, (2) an online training strategy that uses the second (single) modality, and (3) a mixed online–offline strategy that uses bimodal learning. Finally, a low-complexity encoding algorithm is proposed based on these learning strategies. Extensive experimental results show that the proposed methods can reduce encoding time by up to 24% with a negligible loss of coding efficiency. Moreover, the results demonstrate how a bimodal learning strategy can boost learning performance. Lastly, the proposed method has a very low computational overhead (0.2%) and uses existing components of a VVC encoder, which makes it much more practical than competing solutions.
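To make the two modalities concrete, the following is a minimal illustrative sketch, not the authors' implementation. The abstract does not specify which DCT-derived features or which histogram layout the method uses, so the directional-energy ratios, the function names (`dct2`, `texture_features`, `neighbor_mode_histogram`, `bimodal_features`), and the plain concatenation of the two feature vectors are all assumptions made for illustration. The only figure taken from the VVC standard is the count of 67 intra prediction modes.

```python
import math

NUM_INTRA_MODES = 67  # VVC: planar, DC, and 65 angular modes


def dct2(block):
    """Naive orthonormal 2-D DCT-II of a square block (list of rows)."""
    n = len(block)
    scale = [math.sqrt(1.0 / n)] + [math.sqrt(2.0 / n)] * (n - 1)
    cos = [[math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n)]
           for k in range(n)]
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):          # vertical frequency index
        for v in range(n):      # horizontal frequency index
            s = 0.0
            for y in range(n):
                for x in range(n):
                    s += block[y][x] * cos[u][y] * cos[v][x]
            out[u][v] = scale[u] * scale[v] * s
    return out


def texture_features(block, eps=1e-9):
    """Hypothetical first-modality features: shares of AC energy that lie in
    purely horizontal, purely vertical, and mixed frequencies."""
    c = dct2(block)
    n = len(c)
    total_ac = sum(c[u][v] ** 2 for u in range(n) for v in range(n)) - c[0][0] ** 2
    if total_ac <= eps:         # flat block: no texture to describe
        return [0.0, 0.0, 0.0]
    e_horizontal = sum(c[0][v] ** 2 for v in range(1, n))  # varies along x
    e_vertical = sum(c[u][0] ** 2 for u in range(1, n))    # varies along y
    e_mixed = total_ac - e_horizontal - e_vertical
    return [e_horizontal / total_ac, e_vertical / total_ac, e_mixed / total_ac]


def neighbor_mode_histogram(neighbor_modes):
    """Hypothetical second-modality features: normalized histogram of the
    intra modes already chosen for neighboring blocks."""
    hist = [0.0] * NUM_INTRA_MODES
    for mode in neighbor_modes:
        hist[mode] += 1.0
    count = len(neighbor_modes)
    return [h / count for h in hist] if count else hist


def bimodal_features(block, neighbor_modes):
    """Assumed fusion: concatenate both modalities into one input vector."""
    return texture_features(block) + neighbor_mode_histogram(neighbor_modes)


if __name__ == "__main__":
    stripes = [[0, 255, 0, 255] for _ in range(4)]  # vertical stripes
    features = bimodal_features(stripes, [0, 0, 50, 18])
    print(len(features), [round(f, 3) for f in features[:3]])
```

In this sketch a block of vertical stripes puts essentially all of its AC energy into the horizontal-frequency coefficients, which is the kind of directional cue an intra mode classifier could exploit. The feature reduction and the lightweight model described in the abstract would operate on a vector like the one returned by `bimodal_features`.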



Author information

Corresponding author

Correspondence to Farhad Pakdaman.

About this article

Cite this article

Pakdaman, F., Adelimanesh, M. & Hashemi, M. BLINC: lightweight bimodal learning for low-complexity VVC intra-coding. J Real-Time Image Proc 19, 791–807 (2022). https://doi.org/10.1007/s11554-022-01223-1

  • DOI: https://doi.org/10.1007/s11554-022-01223-1
