ABSTRACT
Recently, neural network-based image compression techniques have demonstrated remarkable compression performance. Context-adaptive entropy models greatly enhance rate-distortion (R-D) performance by effectively capturing spatial redundancy in latent representations. However, latent representations still contain some spatial correlations (e.g., the same spatial structure), which need to be eliminated by further processing. Moreover, many compression models are single-rate models, which makes it difficult to cover a wide range of bitrates. To address these issues, we propose a novel variable-rate image compression algorithm that efficiently leverages bi-resolution spatial-channel information through learned mechanisms. We first propose a BRP network that divides the latent representations and side information into high-resolution (HR) and low-resolution (LR) components, eliminating the spatial redundancy at the same location. Combining spatial and channel context, we then propose a BSC context model, which includes a decreasing-granularity checkerboard pattern and channel grouping based on a cosine slicing strategy. To cover a wide range of bitrates, we take a weight map as input to control bit allocation, achieving multiple compression rates. Experimental results show that our method provides a better rate-distortion trade-off than BPG, JPEG, and other recent deep learning-based image compression methods.
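As a rough illustration of the checkerboard idea mentioned above, the following minimal sketch builds the plain two-pass checkerboard split of a latent grid into anchor and non-anchor positions. It is an assumption-laden toy, not the decreasing-granularity pattern or the BSC context model proposed here, whose details the abstract does not give; the function names `checkerboard_masks` and `in_bounds_neighbors` are hypothetical.

```python
def checkerboard_masks(height, width):
    """Split a height x width grid into anchor / non-anchor boolean masks.

    Anchors sit where (row + col) is even. In a plain two-pass scheme the
    anchors are coded first, then the non-anchors are coded using the
    already-decoded anchors as spatial context.
    """
    anchor = [[(r + c) % 2 == 0 for c in range(width)] for r in range(height)]
    non_anchor = [[not is_anchor for is_anchor in row] for row in anchor]
    return anchor, non_anchor


def in_bounds_neighbors(row, col, height, width):
    """Return the up/down/left/right neighbours that lie inside the grid."""
    candidates = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
    return [(r, c) for r, c in candidates if 0 <= r < height and 0 <= c < width]


if __name__ == "__main__":
    # Hypothetical toy grid size, chosen only for display.
    h, w = 4, 6
    anchor, non_anchor = checkerboard_masks(h, w)
    for row in anchor:
        print(" ".join("A" if is_anchor else "." for is_anchor in row))
```

The property that makes this split useful is that every four-connected neighbour of a non-anchor position is an anchor, so the second pass can condition on all of its immediate neighbours while both passes remain parallel over positions.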
Index Terms
- End-to-End Variable-Rate Image Compression with Bi-Resolution Spatial-Channel Context Aggregation