
An improved model based on YOLOX for detection of tea sprouts in natural environment

  • Original Paper
  • Published in: Evolving Systems

Abstract

The tea industry occupies a pivotal position among China's import and export commodities. As living standards rise, demand for famous tea sprouts keeps increasing. Manual picking, however, is inefficient and costly, and although mechanical picking harvests tea sprouts efficiently, it lacks selectivity, which increases the workload of post-harvest screening for superior tea leaves. To address this, this paper establishes a dataset of tea sprouts in natural environments and proposes YOLOX-ST, an improved YOLOX detection model based on the Swin Transformer. The model adopts the Swin Transformer as its backbone network to improve overall detection accuracy, introduces the CBAM attention mechanism to reduce missed and false detections in complex environments, and adds a small-target detection layer to compensate for the incomplete tea-sprout features learned from deep feature maps. To address sample imbalance, we introduce the EIoU loss function and apply Focal Loss to the confidence branch. Experimental results show that the proposed model achieves an accuracy of 95.45%, which is 5.73% higher than the original YOLOX model, outperforms other YOLO-series models in accuracy, and reaches a faster detection speed of 93.2 FPS.
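The box-regression loss named in the abstract can be sketched in a few lines. This is a minimal illustration for a single pair of corner-format (x1, y1, x2, y2) boxes, following the Focal-EIoU formulation of Zhang et al. (2022); the function name, the default gamma = 0.5, and the epsilon guard are our assumptions, not details from the paper, and the classification-side Focal Loss on the confidence branch is not covered here.

```python
def focal_eiou_loss(pred, target, gamma=0.5, eps=1e-7):
    """Focal-EIoU sketch for one pair of (x1, y1, x2, y2) boxes.

    EIoU = 1 - IoU + center-distance penalty + width gap + height gap,
    each penalty normalised by the smallest enclosing box; the final
    IoU**gamma factor down-weights low-overlap boxes, which is the
    focal behaviour used against sample imbalance.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Intersection-over-union of the two boxes
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    union = (px2 - px1) * (py2 - py1) + (tx2 - tx1) * (ty2 - ty1) - inter
    iou = inter / (union + eps)

    # Smallest box enclosing both; it normalises all three penalties
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Squared distance between the two box centers
    dist = ((px1 + px2) / 2 - (tx1 + tx2) / 2) ** 2 \
         + ((py1 + py2) / 2 - (ty1 + ty2) / 2) ** 2

    # EIoU splits CIoU's aspect-ratio term into separate width and
    # height gaps, which regress small targets more directly
    w_gap = ((px2 - px1) - (tx2 - tx1)) ** 2 / (cw * cw + eps)
    h_gap = ((py2 - py1) - (ty2 - ty1)) ** 2 / (ch * ch + eps)

    eiou = 1.0 - iou + dist / (cw * cw + ch * ch + eps) + w_gap + h_gap
    return (iou ** gamma) * eiou
```

For identical boxes every penalty vanishes and the loss is (numerically) zero; as overlap drops, the EIoU term grows while the IoU**gamma factor shrinks, rebalancing how much easy and hard samples contribute to the gradient.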


Data availability

The data supporting this study’s findings are available from the corresponding author upon reasonable request.

References

  • Cao ML, Fu H, Zhu JY, Cai CG (2022) Lightweight tea bud recognition network integrating GhostNet and YOLOv5. Math Biosci Eng 19(12):12897–12914. https://doi.org/10.3934/mbe.2022602

  • Chen B, Yan JL, Wang K, Matušů R (2021) Fresh tea sprouts detection via image enhancement and fusion SSD. J Control Sci Eng 2021:1–11. https://doi.org/10.1155/2021/6614672

  • Chen CL, Lu JZ, Zhou MC, Yi J, Liao M, Gao ZM (2022) A YOLOv3-based computer vision system for identification of tea buds and the picking point. Comput Electron Agric 198:107116. https://doi.org/10.1016/j.compag.2022.107116

  • Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20:273–297. https://doi.org/10.1007/BF00994018

  • Devlin J, Chang MW, Lee K, Toutanova K (2019) BERT: pre-training of deep bidirectional transformers for language understanding. Preprint https://doi.org/10.48550/arXiv.1810.04805

  • Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai XH, Unterthiner T, Dehghani M, Minderer M, Heigold G, Gelly S, Uszkoreit J, Houlsby N (2021) An image is worth 16 × 16 words: transformers for image recognition at scale. Preprint http://arxiv.org/abs/2010.11929

  • Fu HX, Song GP, Wang YC (2021) Improved YOLOv4 marine target detection combined with CBAM. Symmetry 13(4):623. https://doi.org/10.3390/sym13040623

  • Girshick R (2015) Fast R-CNN. In: Proceedings of the IEEE international conference on computer vision, pp 1440–1448

  • Girshick R, Donahue J, Darrell T, Malik J (2014) Rich feature hierarchies for accurate object detection and semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 580–587

  • Gui ZY, Chen JN, Li Y, Chen ZW, Wu CY, Dong CW (2023) A lightweight tea bud detection model based on Yolov5. Comput Electron Agric 205:107636. https://doi.org/10.1016/j.compag.2023.107636

  • Jian W, Li SG, Yang C (2019) Fast segmentation of tea flowers based on color and region growth. In: 11th international conference on digital image processing (ICDIP 2019), p 111790R. https://doi.org/10.1117/12.2539682

  • Karunasena GMKB, Priyankara H (2020) Tea bud leaf identification by using machine learning and image processing techniques. Int J Sci Eng Res 11(8):624–628. https://doi.org/10.14299/ijser.2020.08.02

  • Lin TY, Goyal P, Girshick R, He KM, Dollár P (2017a) Focal loss for dense object detection. In: 2017 IEEE international conference on computer vision (ICCV), pp 2999–3007. https://doi.org/10.1109/ICCV.2017.324

  • Lin TY, Dollár P, Girshick R, He KM, Hariharan B, Belongie S (2017b) Feature pyramid networks for object detection. In: 2017 IEEE conference on computer vision and pattern recognition (CVPR), pp 936–944. https://doi.org/10.1109/CVPR.2017.106

  • Liu W, Anguelov D, Erhan D, Szegedy C, Reed S, Fu CY, Berg AC (2016) SSD: single shot multibox detector. In: Leibe B, Matas J, Sebe N, Welling M (eds) Computer vision—ECCV 2016, lecture notes in computer science, pp 21–37. https://doi.org/10.1007/978-3-319-46448-0_2

  • Liu Z, Lin YT, Cao Y, Hu H, Wei YX, Zhang Z, Lin S, Guo B (2021) Swin Transformer: hierarchical vision transformer using shifted windows. In: 2021 IEEE/CVF international conference on computer vision (ICCV), pp 9992–10002. https://doi.org/10.1109/ICCV48922.2021.00986

  • Ma B, Wang XR, Zhang H, Li Fu, Dan JW, Sun XM, Pan ZQ (2019) CBAM-GAN: generative adversarial networks based on convolutional block attention module. In: Artificial intelligence and security, pp 227–236. https://doi.org/10.1007/978-3-030-24274-9_20

  • Paranavithana IR, Kalansuriya VR (2021) Deep convolutional neural network model for tea bud(s) classification. IAENG Int J Comput Sci 48(3):599–604

  • Redmon J, Farhadi A (2016) YOLO9000: better, faster, stronger. Preprint https://doi.org/10.48550/arXiv.1612.08242

  • Redmon J, Farhadi A (2018) YOLOv3: an incremental improvement. Preprint https://doi.org/10.48550/arXiv.1804.02767

  • Redmon J, Divvala S, Girshick R, Farhadi A (2016) You only look once: unified, real-time object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 779–788

  • Ren SQ, He KM, Girshick R, Sun J (2015) Faster R-CNN: towards real-time object detection with region proposal networks. In: Advances in neural information processing systems, vol 28

  • Rezatofighi H, Tsoi N, Gwak J, Sadeghian A, Reid I, Savarese S (2019) Generalized intersection over union: a metric and a loss for bounding box regression. Preprint http://arxiv.org/abs/1902.09630

  • Shao PD, Wu MH, Wang XW, Zhou J, Liu S (2018) Research on the tea bud recognition based on improved k-means algorithm. In: Proceedings of 2018 2nd international conference on electronic information technology and computer engineering (EITCE 2018), pp 846–850

  • Wang CY, Bochkovskiy A, Liao HYM (2023) YOLOv7: trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. In: 2023 IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 7464–7475. https://doi.org/10.1109/CVPR52729.2023.00721

  • Woo SH, Park JC, Lee JY, Kweon IS (2018) CBAM: convolutional block attention module. Preprint http://arxiv.org/abs/1807.06521

  • Xu WK, Zhao LG, Li J, Shang SQ, Ding XP, Wang TW (2022) Detection and classification of tea buds based on deep learning. Comput Electron Agric 192:106547. https://doi.org/10.1016/j.compag.2021.106547

  • Yang HL, Chen L, Chen MT, Ma ZB, Deng F, Li MZ, Li XR (2019) Tender tea shoots recognition and positioning for picking robot using improved YOLO-V3 model. IEEE Access 7:180998–181011. https://doi.org/10.1109/ACCESS.2019.2958614

  • Zhang L, Zou L, Wu CY, Jia JM, Chen JN (2021a) Method of famous tea sprout identification and segmentation based on improved watershed algorithm. Comput Electron Agric 184:106108. https://doi.org/10.1016/j.compag.2021.106108

  • Zhang G, Liu ST, Wang F, Li ZM, Sun J (2021b) YOLOX: exceeding YOLO series in 2021. Preprint https://doi.org/10.48550/arXiv.2107.08430

  • Zhang YF, Ren WQ, Zhang Z, Jia Z, Wang L, Tan T (2022) Focal and efficient IOU loss for accurate bounding box regression. Neurocomputing 506:146–157. https://doi.org/10.1016/j.neucom.2022.07.042

Acknowledgements

This work was supported by the National Key Research and Development Program (No. 2016YFD0201305-07), the Guizhou Provincial Basic Research Program (Natural Science) (No. ZK[2023]060), and the Open Fund Project of the Semiconductor Power Device Reliability Engineering Center of the Ministry of Education (No. ERCMEKFJJ2019-06). We thank the State Key Laboratory of Public Big Data, Guizhou University, for its computing support.

Author information

Corresponding author

Correspondence to Benliang Xie.

Ethics declarations

Conflict of interest

The authors declare no potential conflicts of interest with respect to the research, authorship, or publication of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Li, X., Liu, R., Li, Y. et al. An improved model based on YOLOX for detection of tea sprouts in natural environment. Evolving Systems (2024). https://doi.org/10.1007/s12530-024-09589-2
