
MFFLNet: lightweight semantic segmentation network based on multi-scale feature fusion

Multimedia Tools and Applications

Abstract

Semantic segmentation is a classic problem in machine vision, and methods based on convolutional neural networks (CNNs) perform well on it. However, existing semantic segmentation models tend to focus only on improving segmentation accuracy, with little attention to making the network lightweight or to exploiting multi-scale feature information. To address this, we design a multi-scale feature fusion lightweight semantic segmentation network (MFFLNet), which consists of two parts: a deep feature extraction module (DFEM) and a multi-scale feature extraction module (MFEM). First, the DFEM replaces convolution layers with deconvolution layers, which avoids the loss of feature information caused by cropping the feature map during feature fusion. Meanwhile, a 1 × 1 convolutional layer after each upsampling layer compresses the channel dimensionality of the feature map, which effectively reduces the number of model parameters. Then, the MFEM applies multiple dilated (atrous) convolutions with different dilation rates to extract feature information at multiple scales. Finally, the deep features and multi-scale features extracted by the two modules are fused to produce the semantic segmentation of the image. Experiments on two datasets, PASCAL VOC 2012 and Cityscapes, show that the proposed MFFLNet outperforms mainstream semantic segmentation methods, reaching mIoU of 71.23% and 79.24%, respectively, an improvement of 5.8% and 8.8% over the state-of-the-art DeepLab V3+ model.
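
The abstract leans on three quantities that are easy to check by hand: the parameter saving from a 1 × 1 channel-compression layer, the spatial extent covered by a dilated (multi-rate) convolution, and the mIoU metric. The following is a minimal sketch of that arithmetic in plain Python. It is not the authors' implementation: the channel widths (256 and 64), the dilation rates (1, 2, 4, 8), and the toy confusion matrix are illustrative assumptions that do not come from the paper.

```python
def conv_params(c_in, c_out, k, bias=True):
    """Learnable parameters of a standard k x k 2-D convolution."""
    return c_in * c_out * k * k + (c_out if bias else 0)


def effective_kernel(k, dilation):
    """Side length of the window covered by a k x k kernel at a given dilation rate."""
    return k + (k - 1) * (dilation - 1)


def mean_iou(confusion):
    """Mean intersection-over-union from a square confusion matrix.

    Rows are ground-truth classes, columns are predicted classes.
    Classes absent from both ground truth and prediction are skipped.
    """
    n = len(confusion)
    ious = []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp
        fp = sum(confusion[r][c] for r in range(n)) - tp
        denom = tp + fp + fn
        if denom:
            ious.append(tp / denom)
    return sum(ious) / len(ious)


# Hypothetical widths: a 3x3 conv on 256 channels versus first
# compressing 256 -> 64 channels with a 1x1 conv, then a 3x3 conv.
direct = conv_params(256, 256, 3)
compressed = conv_params(256, 64, 1) + conv_params(64, 256, 3)

# Hypothetical dilation rates for a 3x3 kernel.
extents = [effective_kernel(3, d) for d in (1, 2, 4, 8)]

# Toy two-class confusion matrix (pixel counts).
toy_miou = mean_iou([[3, 1], [1, 5]])

print(direct, compressed, extents, round(toy_miou, 4))
```

With these assumed widths, the compressed path needs 164,160 parameters against 590,080 for the direct 3 × 3 layer, and the same 3 × 3 kernel covers windows of 3, 5, 9, and 17 pixels as the dilation rate grows, without adding any parameters.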

Data availability

Data sharing is not applicable to this article, as no datasets were generated or analysed during the current study.

Code availability

Not applicable.

Author information

Corresponding author

Correspondence to Wei Depeng.

Ethics declarations

Conflict of interest

All authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest or non-financial interest in the subject matter or materials discussed in this manuscript.

Ethics approval

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Depeng, W., Huabin, W. MFFLNet: lightweight semantic segmentation network based on multi-scale feature fusion. Multimed Tools Appl 83, 30073–30093 (2024). https://doi.org/10.1007/s11042-023-16782-z

  • DOI: https://doi.org/10.1007/s11042-023-16782-z