DOI: 10.1145/3595916.3626444
research-article

End-to-End Variable-Rate Image Compression with Bi-Resolution Spatial-Channel Context Aggregation

Published: 1 January 2024

ABSTRACT

Recently, neural network-based image compression techniques have demonstrated remarkable compression performance. Context-adaptive entropy models greatly improve rate-distortion (R-D) performance by effectively capturing spatial redundancy in latent representations. However, latent representations still contain some spatial correlations (e.g., the same spatial structure) that need to be eliminated by further processing. In addition, many compression models are single-rate models, which makes it difficult to cover a wide range of bitrates. To address these issues, we propose a novel variable-rate image compression algorithm that efficiently leverages bi-resolution spatial-channel information through learned mechanisms. We first propose a BRP network that divides the latent representations and side information into high-resolution (HR) and low-resolution (LR) components, eliminating the spatial redundancy at the same location. To combine spatial and channel context, we propose a BSC context model, which includes a decreasing-granularity checkerboard pattern and channel grouping based on a cosine slicing strategy. To cover a wide range of bitrates, we take a weight map as input to control bit allocation, achieving multiple compression rates. Experimental results show that our method provides a better rate-distortion trade-off than BPG, JPEG, and other recent deep learning-based image compression methods.
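As a rough illustration of the checkerboard idea mentioned in the abstract, the sketch below splits a latent grid into two interleaved sets in the style of the plain two-pass checkerboard context model (reference 13). It is not the decreasing-granularity pattern proposed in this paper, whose details are not given in the abstract. The function names and the choice of parity for the "anchor" set are assumptions made for illustration only.

```python
def checkerboard_masks(height, width):
    """Return (anchor, non_anchor) boolean grids for a height x width latent.

    Assumption: positions where (row + col) is even are treated as anchors,
    i.e. the set coded first without spatial context.
    """
    anchor = [[(r + c) % 2 == 0 for c in range(width)] for r in range(height)]
    non_anchor = [[not a for a in row] for row in anchor]
    return anchor, non_anchor


def context_neighbors(r, c, height, width):
    """Return the 4-connected neighbours of (r, c) that lie inside the grid."""
    candidates = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
    return [(i, j) for i, j in candidates if 0 <= i < height and 0 <= j < width]


if __name__ == "__main__":
    anchor, non_anchor = checkerboard_masks(4, 4)
    for row in anchor:
        # Print 'A' for anchor positions and '.' for non-anchor positions.
        print("".join("A" if a else "." for a in row))
```

In this layout every 4-connected neighbour of a non-anchor position is an anchor, so once the anchors are decoded, all non-anchor positions can be predicted from already-available neighbours in a single parallel pass.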

References

  1. 2021. Versatile Video Coding Reference Software Version 12.1 (VTM-12.1). https://vcgit.hhi.fraunhofer.de/jvet/VVCSoftware_VTM/-/tags/VTM-12.1.
  2. Mohammad Akbari, Jie Liang, Jingning Han, and Chengjie Tu. 2021. Learned bi-resolution image coding using generalized octave convolutions. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 35. 6592–6599.
  3. Johannes Ballé, David Minnen, Saurabh Singh, Sung Jin Hwang, and Nick Johnston. 2018. Variational image compression with a scale hyperprior. arXiv preprint arXiv:1802.01436 (2018).
  4. Jean Bégaint, Fabien Racapé, Simon Feltman, and Akshay Pushparaja. 2020. Compressai: a pytorch library and evaluation platform for end-to-end compression research. arXiv preprint arXiv:2011.03029 (2020).
  5. Fabrice Bellard. 2015. BPG image format. URL https://bellard.org/bpg 1, 2 (2015), 1.
  6. Gisle Bjontegaard. 2001. Calculation of average PSNR differences between RD-curves. ITU SG16 Doc. VCEG-M33 (2001).
  7. Yunpeng Chen, Haoqi Fan, Bing Xu, Zhicheng Yan, Yannis Kalantidis, Marcus Rohrbach, Shuicheng Yan, and Jiashi Feng. 2019. Drop an octave: Reducing spatial redundancy in convolutional neural networks with octave convolution. In Proceedings of the IEEE/CVF international conference on computer vision. 3435–3444.
  8. Zhengxue Cheng, Heming Sun, Masaru Takeuchi, and Jiro Katto. 2020. Learned image compression with discretized gaussian mixture likelihoods and attention modules. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 7939–7948.
  9. Rich Franzen. 1999. Kodak lossless true color image suite. source: http://r0k.us/graphics/kodak 4, 2 (1999), 9.
  10. Zongyu Guo, Zhizheng Zhang, Runsen Feng, and Zhibo Chen. 2021. Causal contextual prediction for learned image compression. IEEE Transactions on Circuits and Systems for Video Technology 32, 4 (2021), 2329–2341.
  11. Zongyu Guo, Zhizheng Zhang, Runsen Feng, and Zhibo Chen. 2021. Soft then hard: Rethinking the quantization in neural image compression. In International Conference on Machine Learning. PMLR, 3920–3929.
  12. Dailan He, Ziming Yang, Weikun Peng, Rui Ma, Hongwei Qin, and Yan Wang. 2022. Elic: Efficient learned image compression with unevenly grouped space-channel contextual adaptive coding. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 5718–5727.
  13. Dailan He, Yaoyan Zheng, Baocheng Sun, Yan Wang, and Hongwei Qin. 2021. Checkerboard context model for efficient learned image compression. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 14771–14780.
  14. Diederik P Kingma and Jimmy Ba. 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014).
  15. Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, and C Lawrence Zitnick. 2014. Microsoft coco: Common objects in context. In Computer Vision–ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6-12, 2014, Proceedings, Part V 13. Springer, 740–755.
  16. Ming Lu, Fangdong Chen, Shiliang Pu, and Zhan Ma. 2022. High-efficiency lossy image coding through adaptive neighborhood information aggregation. arXiv preprint arXiv:2204.11448 (2022).
  17. David Minnen, Johannes Ballé, and George D Toderici. 2018. Joint autoregressive and hierarchical priors for learned image compression. Advances in neural information processing systems 31 (2018).
  18. Athanassios Skodras, Charilaos Christopoulos, and Touradj Ebrahimi. 2001. The JPEG 2000 still image compression standard. IEEE Signal processing magazine 18, 5 (2001), 36–58.
  19. Myungseo Song, Jinyoung Choi, and Bohyung Han. 2021. Variable-rate deep image compression through spatially-adaptive feature transform. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 2380–2389.
  20. Gregory K Wallace. 1992. The JPEG still picture compression standard. IEEE transactions on consumer electronics 38, 1 (1992), xviii–xxxiv.
  21. Yiming Wang, Qian Huang, Bin Tang, Huashan Sun, and Xiaotong Guo. 2023. FGC-VC: Flow-Guided Context Video Compression. In 2023 IEEE International Conference on Image Processing (ICIP). IEEE, 3175–3179.
  22. Yibo Yang, Robert Bamler, and Stephan Mandt. 2020. Improving inference for neural image compression. Advances in Neural Information Processing Systems 33 (2020), 573–584.
  23. Jing Zhao, Bin Li, Jiahao Li, Ruiqin Xiong, and Yan Lu. 2021. A universal encoder rate distortion optimization framework for learned compression. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 1880–1884.
  24. Yinhao Zhu, Yang Yang, and Taco Cohen. 2021. Transformer-based transform coding. In International Conference on Learning Representations.

    Published in

      MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia
      December 2023
      745 pages
      ISBN: 9798400702051
      DOI: 10.1145/3595916

      Copyright © 2023 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 1 January 2024


      Qualifiers

      • research-article
      • Research
      • Refereed limited

      Acceptance Rates

      Overall Acceptance Rate: 59 of 204 submissions, 29%
