Abstract
In real-world industrial applications, learning to recognize novel visual categories from a few samples is challenging and promising. Although some efforts have been made in the academic field for few-shot classification studies, there is still a lack of high-precision fine-grained few-shot classification models in some specific fields, especially in the fine-grained agricultural field. As far as we know, this study is the first work on meta-learning few-shot classification for fine-grained plant disease classification (specific to disease severity). We propose a multi-perspective hybrid attention meta-learning model based on a Batch Nuclear-norm constraint. The model explores discriminative features by focusing on key regions, and the hybrid attention module is divided into two sub-modules, soft attention model and patch-hard attention model. The discriminability and diversity constraint module is introduced in the loss function to constrain the Batch Nuclear-norm of the classification matrix, which improves the discriminative properties of the classification model and increases its diversity at the same time. In this paper, a large number of experiments have been carried out on multiple datasets. The experimental results demonstrate that our work has better performance than state-of-the-art models. It can be said that our work is a valuable supplement to the domain-specific industrial application models.
Similar content being viewed by others
Data availability
Due to the nature of this research, participants of this study did not agree for their data to be shared publicly, so supporting data are not available.
References
Ren S, He K, Girshick R, Sun J (2015) Faster r-cnn: towards real-time object detection with region proposal networks. In: Advances neural information processing systems. 28
Joshi D, Singh TP, Joshi AK (2022) Deep learning-based localization and segmentation of wrist fractures on x-ray radiographs. Neural Comput Appl 34(21):19061–19077
Wadhawan A, Kumar P (2020) Deep learning-based sign language recognition system for static signs. Neural Comput Appl 32:7957–7968
Krizhevsky A, Sutskever I, Hinton GE (2012) Imagenet classification with deep convolutional neural networks. In: Advances in neural information processing systems. 25
Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556
Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V, Rabinovich A (2015) Going deeper with convolutions. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 1–9
Koch G, Zemel R, Salakhutdinov R (2015) Siamese neural networks for one-shot image recognition. In: ICML Deep learning workshop. vol. 2, p. 0. Lille
Snell J, Swersky K, Zemel R (2017) Prototypical networks for few-shot learning. In: Advances in neural information processing systems. 30
Sung F, Yang Y, Zhang L, Xiang T, Torr PH, Hospedales TM (2018) Learning to compare: relation network for few-shot learning. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 1199–1208
Vinyals O, Blundell C, Lillicrap T, Wierstra D et al (2016) Matching networks for one shot learning. In: Advances in neural information processing systems. 29
Zhang B, Li X, Ye Y, Huang Z, Zhang L (2021) Prototype completion with primitive knowledge for few-shot learning. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp. 3754–3762
Liu B, Cao Y, Lin Y, Li Q, Zhang Z, Long M, Hu H (2020) Negative margin matters: Understanding margin in few-shot classification. In: European conference on computer vision. pp. 438–455. Springer
Liu C, Fu Y, Xu C, Yang S, Li J, Wang C, Zhang L (2021) Learning a few-shot embedding model with contrastive learning. In: Proceedings of the AAAI conference on artificial intelligence. 35:8635–8643
Xie J, Long F, Lv J, Wang Q, Li P (2022) Joint distribution matters: Deep brownian distance covariance for few-shot classification. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp. 7972–7981
Wei X-S, Song Y-Z, Mac Aodha O, Wu J, Peng Y, Tang J, Yang J, Belongie S (2021) Fine-grained image analysis with deep learning: a survey. IEEE Trans Pattern Anal Mach Intell 44(12):8927–8948
Li W, Xu J, Huo J, Wang L, Gao Y, Luo J (2019) Distribution consistency based covariance metric networks for few-shot learning. In: Proceedings of the AAAI conference on artificial intelligence. 33:8642–8649
Dong C, Li W, Huo J, Gu Z, Gao Y (2021) Learning task-aware local representations for few-shot learning. In: Proceedings of the twenty-ninth international conference on international joint conferences on artificial intelligence. pp. 716–722
Li W, Wang L, Xu J, Huo J, Gao Y, Luo J (2019) Revisiting local descriptor based image-to-class measure for few-shot learning. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp. 7260–7268
Wertheimer D, Hariharan B (2019) Few-shot learning with localization in realistic settings. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp. 6558–6567
Yan S, Zhang S, He X (2019) A dual attention network with semantic embedding for few-shot learning. In: AAAI, 9079–9086
Behera A, Wharton Z, Hewage PR, Bera A (2021) Context-aware attentional pooling (cap) for fine-grained visual classification. In: Proceedings of the AAAI conference on artificial intelligence. 35:929–937
Sun X, Xv H, Dong J, Zhou H, Chen C, Li Q (2020) Few-shot learning for domain-specific fine-grained image classification. IEEE Trans Ind Electron 68(4):3588–3598
Ferentinos KP (2018) Deep learning models for plant disease detection and diagnosis. Comput Electron Agric 145:311–318
Selvaraj MG, Vergara A, Ruiz H, Safari N, Elayabalan S, Ocimati W, Blomme G (2019) Ai-powered banana diseases and pest detection. Plant Methods 15(1):1–11
Aboneh T, Rorissa A, Srinivasagan R, Gemechu A (2021) Computer vision framework for wheat disease identification and classification using jetson gpu infrastructure. Technologies 9(3):47
Santoro A, Bartunov S, Botvinick M, Wierstra D, Lillicrap T (2016) Meta-learning with memory-augmented neural networks. In: International conference on machine learning. pp. 1842–1850 . PMLR
Finn C, Abbeel P, Levine S (2017) Model-agnostic meta-learning for fast adaptation of deep networks. In: International conference on machine learning. pp. 1126–1135. PMLR
Ravi S, Larochelle H (2016) Optimization as a model for few-shot learning
Abbas M, Xiao Q, Chen L, Chen P-Y, Chen T (2022) Sharp-maml: sharpness-aware model-agnostic meta learning. arXiv preprint arXiv:2206.03996
Chen Y, Wang X, Liu Z, Xu H, Darrell T (2020) A new meta-baseline for few-shot learning
Fu J, Zheng H, Mei T (2017) Look closer to see better: Recurrent attention convolutional neural network for fine-grained image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp. 4438–4446
Leng J, Liu Y, Chen S (2019) Context-aware attention network for image recognition. Neural Comput Appl 31:9295–9305
Zheng H, Fu J, Mei T, Luo J (2017) Learning multi-attention convolutional neural network for fine-grained image recognition. In: Proceedings of the IEEE international conference on computer vision. pp. 5209–5217
Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I (2017) Attention is all you need. In: Advances in neural information processing systems. 30
Wei X-S, Luo J-H, Wu J, Zhou Z-H (2017) Selective convolutional descriptor aggregation for fine-grained image retrieval. IEEE Trans Image Process 26(6):2868–2881
Zhu L, Yang Y (2018) Compound memory networks for few-shot video classification. In: Proceedings of the European conference on computer vision (ECCV). pp. 751–766
He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp. 770–778
Shannon CE (1948) A mathematical theory of communication. The Bell Syst Tech J 27(3):379–423
Cui S, Wang S, Zhuo J, Li L, Huang Q, Tian Q (2020) Towards discriminability and diversity: Batch nuclear-norm maximization under label insufficient situations. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp. 3941–3950
Song J, Shen C, Yang Y, Liu Y, Song M (2018) Transductive unbiased embedding for zero-shot learning. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp. 1024–1033
Zhuo J, Wang S, Cui S, Huang Q (2019) Unsupervised open domain recognition by semantic discrepancy minimization. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp. 750–759
Zou Y, Yu Z, Liu X, Kumar B, Wang J (2019) Confidence regularized self-training. In: Proceedings of the IEEE/CVF international conference on computer vision. pp. 5982–5991
Zou Y, Yu Z, Kumar B, Wang J (2018) Unsupervised domain adaptation for semantic segmentation via class-balanced self-training. In: Proceedings of the European conference on computer vision (ECCV). pp. 289–305
Fazel M (2002) Matrix rank minimization with applications. PhD thesis, PhD thesis, Stanford University
Recht B, Fazel M, Parrilo PA (2010) Guaranteed minimum-rank solutions of linear matrix equations via nuclear norm minimization. SIAM Rev 52(3):471–501
Srebro N, Rennie J, Jaakkola T(2004) Maximum-margin matrix factorization. In: Advances in neural information processing systems. 17
Deng J, Dong W, Socher R, Li L-J, Li K, Fei-Fei L (2009) Imagenet: A large-scale hierarchical image database. In: 2009 IEEE Conference on computer vision and pattern recognition. pp. 248–255. IEEE
Wah C, Branson S, Welinder P, Perona P, Belongie S (2011) The caltech-ucsd birds-200-2011 dataset
Sun Q, Liu Y, Chua T-S, Schiele B (2019) Meta-transfer learning for few-shot learning. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp. 403–412
Liu Y, Schiele B, Sun Q (2020) An ensemble of epoch-wise empirical bayes for few-shot learning. In: Computer Vision–ECCV 2020: 16th European conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XVI 16, pp. 404–421. Springer
Park S-J, Han S, Baek J-W, Kim I, Song J, Lee HB, Han J-J, Hwang SJ (2020) Meta variance transfer: Learning to augment from the others. In: International conference on machine learning, pp. 7510–7520 . PMLR
Chen Z, Fu Y, Zhang Y, Jiang Y-G, Xue X, Sigal L (2019) Multi-level semantic feature augmentation for one-shot learning. IEEE Trans Image Process 28(9):4594–4605
Chen W-Y, Liu Y-C, Kira Z, Wang Y-CF, Huang J-B (2019) A closer look at few-shot classification. arXiv preprint arXiv:1904.04232
Ye H-J, Hu H, Zhan D-C, Sha F (2020) Few-shot learning via embedding adaptation with set-to-set functions. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp. 8808–8817
Zhang C, Cai Y, Lin G, Shen C (2020) Deepemd: Few-shot image classification with differentiable earth mover’s distance and structured classifiers. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp. 12203–12213
Acknowledgements
This work was supported by the National Key R &D Program of China (No. 2021ZD0110901).
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
The authors declared that they have no conflicts of interest to this work. We declare that we do not have any commercial or associative interest that represents a conflict of interest in connection with the work submitted.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Li, M., Yao, H. & Wang, Y. Focus nuance and toward diversity: exploring domain-specific fine-grained few-shot recognition. Neural Comput & Applic 35, 21275–21290 (2023). https://doi.org/10.1007/s00521-023-08787-4
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00521-023-08787-4