Abstract
Data-free quantization compresses neural networks to low bit-widths without access to the original training data. Most existing data-free quantization methods suffer severe performance degradation due to inaccurate activation clipping ranges and quantization error. In this paper, we present a simple yet effective data-free quantization method with accurate activation clipping and adaptive batch normalization. Accurate activation clipping (AAC) improves model accuracy by exploiting accurate activation information from the full-precision model. Adaptive batch normalization (ABN) is, to our knowledge, the first technique to address the quantization error caused by distribution shift by adaptively updating the batch normalization layers. Extensive experiments demonstrate that the proposed method yields strong performance, achieving 64.33% top-1 accuracy with a 4-bit ResNet18 on the ImageNet dataset, a 3.7% absolute improvement over existing state-of-the-art methods.
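The two ingredients named above can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the percentile-based clipping, the uniform quantizer, and the moving-average update of batch-normalization statistics are generic stand-ins for AAC and ABN, and all function names and parameters (`pct`, `momentum`, etc.) are our own assumptions.

```python
import numpy as np

def accurate_clip_range(activations, pct=99.9):
    # AAC idea (sketch): derive the clipping range from the full-precision
    # model's activation statistics rather than the raw min/max, so rare
    # outliers do not blow up the quantization step size.
    lo = np.percentile(activations, 100.0 - pct)
    hi = np.percentile(activations, pct)
    return lo, hi

def quantize(x, lo, hi, bits=4):
    # Uniform quantization of x to `bits` levels within [lo, hi].
    levels = 2 ** bits - 1
    scale = (hi - lo) / levels
    x = np.clip(x, lo, hi)
    return np.round((x - lo) / scale) * scale + lo

def adapt_bn(running_mean, running_var, batch, momentum=0.1):
    # ABN idea (sketch): refresh the BN running statistics with the
    # statistics of (synthetic) calibration batches, so the distribution
    # shift introduced by quantization is absorbed by the BN layer.
    m, v = batch.mean(axis=0), batch.var(axis=0)
    new_mean = (1 - momentum) * running_mean + momentum * m
    new_var = (1 - momentum) * running_var + momentum * v
    return new_mean, new_var
```

Under this sketch, 4-bit quantization keeps every value within half a quantization step of the clipped input, and repeated `adapt_bn` calls move the BN statistics toward those of the calibration data.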
Funding
This research was funded by Key Research and Development Program of Zhejiang Province of China (2021C02037) and National Key Research and Development Program of China (2022YFC3602601).
Author information
Authors and Affiliations
Contributions
All authors contributed to the study conception and design. Experiments and analysis were performed by YH and LZ. The first draft of the manuscript was written by YH and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.
Corresponding author
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix A: Implementation Details of Toy Experiment
In the toy experiment shown in Fig. 4, we let the model be an identity transform. Given a target label, we compute different loss functions and backpropagate to the input, raising the score of the target class over the iterations. Both experiments run for 300 iterations. The algorithm is summarized below.
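The loop above can be sketched as follows. This is a minimal reconstruction, not the paper's exact code: we use cross-entropy as the loss (the paper compares several losses), plain gradient descent on the input, and hypothetical parameters (`lr`, `num_classes`, `seed`). Since the model is the identity map, the logits are the input itself, and the cross-entropy gradient with respect to the logits is simply softmax(x) minus the one-hot target.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def optimize_input(num_classes=10, target=3, steps=300, lr=0.5, seed=0):
    # The "model" is the identity transform, so logits = input.
    # Gradient descent on the input raises the target-class score.
    rng = np.random.default_rng(seed)
    x = rng.normal(size=num_classes)
    onehot = np.zeros(num_classes)
    onehot[target] = 1.0
    for _ in range(steps):
        p = softmax(x)
        grad = p - onehot        # d(cross-entropy)/d(logits)
        x -= lr * grad
    return softmax(x)

probs = optimize_input()
```

After 300 iterations the probability mass concentrates on the target class, matching the behavior the toy experiment is designed to show.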
Appendix B: Additional Experiment on CIFAR-10
In this section, we demonstrate the performance improvement of our method on the CIFAR-10 dataset with ResNet20. Notably, even without fine-tuning, our 8-bit quantized model achieves accuracy closely comparable to that of the full-precision model; further fine-tuning is therefore unnecessary (Table 6).
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
He, Y., Zhang, L., Wu, W. et al. Data-Free Quantization with Accurate Activation Clipping and Adaptive Batch Normalization. Neural Process Lett 55, 10555–10568 (2023). https://doi.org/10.1007/s11063-023-11338-6