Abstract
Semantic segmentation is a core problem in machine vision, and methods based on convolutional neural networks (CNNs) perform well on it. Existing semantic segmentation models, however, tend to focus on improving segmentation accuracy and pay little attention to making the network lightweight or to exploiting multi-scale feature information. To address these problems, we design a multi-scale feature fusion lightweight semantic segmentation network (MFFLNet), which consists of two parts: a deep feature extraction module (DFEM) and a multi-scale feature extraction module (MFEM). First, the DFEM uses deconvolution layers in place of convolution layers, which avoids the loss of feature information caused by cropping the feature map during feature fusion. In addition, a 1 × 1 convolutional layer after each upsampling layer compresses the channel dimensionality of the feature map, which reduces the number of model parameters. Then, the MFEM applies multiple dilated (atrous) convolutions with different dilation rates to extract feature information at multiple scales. Finally, the deep features and multi-scale features extracted by the two modules are fused to produce the semantic segmentation of the image. Experiments show that the proposed MFFLNet outperforms mainstream methods on two datasets, PASCAL VOC 2012 and Cityscapes, reaching mIoU of 71.23% and 79.24%, respectively, an improvement of 5.8% and 8.8% over the state-of-the-art DeepLab V3+ model.
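The two efficiency ideas in the abstract can be illustrated with simple arithmetic. The sketch below is not the authors' implementation. It only shows, under assumed values, how a dilated 3 × 3 kernel covers a wider area at no extra parameter cost, and how a 1 × 1 channel compression placed before a 3 × 3 convolution lowers the parameter count. The dilation rates and channel widths are hypothetical, since the abstract does not state the ones used in MFFLNet.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Side length spanned by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def conv_params(in_channels: int, out_channels: int, kernel_size: int,
                bias: bool = True) -> int:
    """Learnable parameters in a standard 2D convolution layer.

    The count does not depend on the dilation rate.
    """
    weights = in_channels * out_channels * kernel_size * kernel_size
    return weights + (out_channels if bias else 0)


# Hypothetical dilation rates (assumption, not taken from the paper).
ASSUMED_RATES = (1, 2, 4, 8)

for rate in ASSUMED_RATES:
    span = effective_kernel_size(3, rate)
    print(f"dilation {rate}: a 3x3 kernel spans {span}x{span} pixels, "
          f"still {conv_params(1, 1, 3, bias=False)} weights per channel pair")

# Hypothetical channel widths (assumption): 256 channels compressed to 64.
direct = conv_params(256, 256, 3)
compressed = conv_params(256, 64, 1) + conv_params(64, 256, 3)

print(f"3x3 conv on 256 channels:          {direct} parameters")
print(f"1x1 compression, then 3x3 conv:    {compressed} parameters")
```

With these assumed widths, the compressed path needs roughly a quarter of the parameters of the direct 3 × 3 convolution, which is the kind of saving that a 1 × 1 compression after each upsampling layer targets.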
Data availability
Data sharing not applicable to this article as no datasets were generated or analysed during the current study.
Code availability
Not applicable.
Ethics declarations
Conflict of interest
All authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest or non-financial interest in the subject matter or materials discussed in this manuscript.
Ethics approval
Not applicable.
About this article
Cite this article
Depeng, W., Huabin, W. MFFLNet: lightweight semantic segmentation network based on multi-scale feature fusion. Multimed Tools Appl 83, 30073–30093 (2024). https://doi.org/10.1007/s11042-023-16782-z
DOI: https://doi.org/10.1007/s11042-023-16782-z