Focus nuance and toward diversity: exploring domain-specific fine-grained few-shot recognition

  • Original Article
  • Neural Computing and Applications

Abstract

Learning to recognize novel visual categories from only a few samples is both challenging and promising for real-world industrial applications. Although few-shot classification has received considerable academic attention, high-precision fine-grained few-shot classification models are still lacking in some specific fields, particularly agriculture. To the best of our knowledge, this study is the first to apply meta-learning-based few-shot classification to fine-grained plant disease classification, where classes are distinguished down to disease severity. We propose a multi-perspective hybrid attention meta-learning model with a Batch Nuclear-norm constraint. The model extracts discriminative features by focusing on key regions through a hybrid attention module composed of two sub-modules: a soft attention module and a patch-hard attention module. A discriminability and diversity constraint module is added to the loss function to constrain the Batch Nuclear-norm of the classification matrix, which improves both the discriminability and the diversity of the model's predictions. Extensive experiments on multiple datasets show that our model outperforms state-of-the-art models, making it a valuable addition to domain-specific models for industrial applications.
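The abstract does not give the exact form of the Batch Nuclear-norm constraint, so the following is only a minimal sketch of the general idea. It assumes the commonly used formulation in which the nuclear norm (the sum of singular values) of a batch's softmax prediction matrix is rewarded in the loss. The function names `batch_nuclear_norm` and `bnm_regularized_loss`, the `weight` parameter, and the combination with cross-entropy are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def batch_nuclear_norm(probs):
    """Nuclear norm of a (batch, classes) prediction matrix: sum of singular values."""
    return np.linalg.svd(probs, compute_uv=False).sum()


def bnm_regularized_loss(logits, labels, weight=1.0):
    """Cross-entropy minus a weighted, batch-normalized nuclear-norm term.

    Assumption: the nuclear norm is maximized, so it enters with a minus sign.
    """
    probs = softmax(logits)
    batch_size = probs.shape[0]
    cross_entropy = -np.log(probs[np.arange(batch_size), labels] + 1e-12).mean()
    bnm_term = batch_nuclear_norm(probs) / batch_size
    return cross_entropy - weight * bnm_term


# Three toy 4-sample, 4-class prediction matrices.
confident_diverse = np.eye(4)                               # each sample -> its own class
collapsed = np.tile(np.array([[1.0, 0.0, 0.0, 0.0]]), (4, 1))  # all samples -> class 0
uniform = np.full((4, 4), 0.25)                             # maximally uncertain

for name, matrix in [("confident_diverse", confident_diverse),
                     ("collapsed", collapsed),
                     ("uniform", uniform)]:
    # Nuclear norms are approximately 4, 2 and 1 respectively.
    print(name, round(float(batch_nuclear_norm(matrix)), 6))
```

In this toy comparison the nuclear norm is largest when predictions are both confident and spread across classes, smaller when every sample collapses onto one class, and smallest when predictions are uniform. This is the sense in which constraining it can encourage discriminability and diversity at the same time.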

Data availability

Due to the nature of this research, participants of this study did not agree for their data to be shared publicly, so supporting data are not available.

Acknowledgements

This work was supported by the National Key R&D Program of China (No. 2021ZD0110901).

Author information

Corresponding author

Correspondence to Hongxun Yao.

Ethics declarations

Conflict of interest

The authors declare that they have no conflicts of interest related to this work, and no commercial or associative interest that represents a conflict of interest in connection with the submitted work.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Li, M., Yao, H. & Wang, Y. Focus nuance and toward diversity: exploring domain-specific fine-grained few-shot recognition. Neural Comput & Applic 35, 21275–21290 (2023). https://doi.org/10.1007/s00521-023-08787-4

  • DOI: https://doi.org/10.1007/s00521-023-08787-4
