
Local feature semantic alignment network for few-shot image classification

Multimedia Tools and Applications

Abstract

The goal of few-shot learning is to train a machine learning model with only a small number of labeled samples and then classify unlabeled samples. Recent works, especially metric-learning methods based on local feature representations of images, have achieved superior performance by exploiting local invariant features and their rich discriminative information. However, the local features learned by existing methods are not aligned when their similarities are computed, which leads to larger intra-class divergence and smaller inter-class divergence. In fact, the dominant object (local feature) of one image should only be compared with the semantically relevant local features of the other image. To address these issues, this paper proposes a few-shot learning approach (SANet) based on semantic alignment of local features. Specifically, we first obtain the local features of the query and support images with a feature extraction module and then compute the relation matrices of these local features. Based on these relation matrices, we design an intra-class divergence rectification (intraDR) module and an inter-class divergence rectification (interDR) module, respectively, to align the local features and reduce the effect of noisy local features. Experimental results on multiple datasets show that, by aligning local features, the proposed model effectively minimizes intra-class divergence while maximizing inter-class divergence, thus achieving better classification performance. The code for this paper can be accessed via https://github.com/SongQCode/SANet.
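The relation-matrix computation described in the abstract can be illustrated with a minimal sketch. The snippet below is an illustrative assumption on our part, not the authors' released implementation (see the repository above): it treats the spatial positions of a convolutional feature map as local descriptors and builds a cosine-similarity relation matrix between one query image and one support image; names such as `local_relation_matrix` and the final max-based alignment step are hypothetical.

```python
# Minimal PyTorch sketch (assumed, not SANet's official code) of a relation
# matrix between local features of a query image and a support image.
import torch
import torch.nn.functional as F


def local_descriptors(feature_map: torch.Tensor) -> torch.Tensor:
    """Flatten a CNN feature map of shape (C, H, W) into HW local descriptors (HW, C)."""
    c, h, w = feature_map.shape
    return feature_map.reshape(c, h * w).t()


def local_relation_matrix(query_map: torch.Tensor, support_map: torch.Tensor) -> torch.Tensor:
    """Cosine-similarity relation matrix R, where R[i, j] compares the i-th
    query local feature with the j-th support local feature."""
    q = F.normalize(local_descriptors(query_map), dim=-1)    # (HW_q, C)
    s = F.normalize(local_descriptors(support_map), dim=-1)  # (HW_s, C)
    return q @ s.t()                                         # (HW_q, HW_s)


# Random feature maps standing in for backbone outputs.
query = torch.randn(64, 5, 5)    # (C, H, W)
support = torch.randn(64, 5, 5)
R = local_relation_matrix(query, support)
print(R.shape)  # torch.Size([25, 25])

# Hypothetical alignment step: each query local feature keeps only its
# best-matching support local feature, suppressing comparisons with
# semantically irrelevant (noisy) local features.
aligned_scores, _ = R.max(dim=1)          # (HW_q,)
image_to_image_score = aligned_scores.mean()
```

In this sketch, the max-and-average step merely illustrates how semantically matched local pairs could dominate the image-to-image score; the paper's intraDR and interDR modules perform the actual rectification.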


Data availability

The data used in this study are sourced from publicly available datasets. Their download addresses can be obtained through https://github.com/SongQCode/SANet.


Acknowledgements

This work was supported in part by the National Natural Science Foundation of China under grant numbers 62006126 and 61872190, the Natural Science Foundation of Jiangsu Province under grant number BK20200740, the Natural Science Foundation of the Jiangsu Higher Education Institutions of China under grant number 20KJB520004, and the Natural Science Research Start-up Foundation of Recruiting Talents of Nanjing University of Posts and Telecommunications under grant number NY219150.

Author information

Corresponding author

Correspondence to Lei Chen.

Ethics declarations

Conflicts of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Li, P., Song, Q., Chen, L. et al. Local feature semantic alignment network for few-shot image classification. Multimed Tools Appl (2024). https://doi.org/10.1007/s11042-024-18212-0

