Abstract
Signals from different modalities each have their own combination algebra, which affects how they should be sampled and processed. RGB is mostly linear; depth is a geometric signal that follows the operations of mathematical morphology. If a network that receives RGB-D input has both kinds of operators available in its layers, it should be able to produce effective output with fewer parameters. In this paper, morphological elements are combined with more familiar linear modules to construct a mixed linear-morphological network called HaarNet. This is the first large-scale linear-morphological hybrid, evaluated on a set of sizeable real-world datasets. In the network, morphological Haar sampling is applied to both feature channels in several layers; it separates extreme values from high-frequency information so that each can be processed to the benefit of both modalities. Moreover, a morphologically parameterised ReLU is used, and morphologically sound up-sampling is applied to obtain a full-resolution output. Experiments show that HaarNet is competitive with a state-of-the-art CNN, suggesting that morphological networks are a promising research direction for geometry-based learning tasks.
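To make the idea of morphological Haar sampling concrete, the following is a minimal sketch of one possible max-based variant on a single 2-D map. It is not taken from the paper: the function names `morph_haar_down` and `morph_haar_up`, the choice of the maximum as the extreme value, and the encoding of the detail signal as non-negative offsets from that maximum are all assumptions made for illustration. Each 2x2 block is split into its extreme value (the coarse signal) and the residual offsets (the high-frequency part), and the inverse step restores the full-resolution map exactly.

```python
def morph_haar_down(x):
    """Split a 2-D map (list of rows, even height and width) into a
    half-resolution map of block maxima and per-block detail offsets.
    Illustrative variant only; not the paper's exact operator."""
    h, w = len(x), len(x[0])
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    approx, details = [], []
    for i in range(0, h, 2):
        a_row, d_row = [], []
        for j in range(0, w, 2):
            block = (x[i][j], x[i][j + 1], x[i + 1][j], x[i + 1][j + 1])
            m = max(block)                    # extreme value of the block
            a_row.append(m)
            d_row.append(tuple(m - v for v in block))  # offsets, all >= 0
        approx.append(a_row)
        details.append(d_row)
    return approx, details


def morph_haar_up(approx, details):
    """Invert morph_haar_down: rebuild the full-resolution map."""
    h, w = len(approx), len(approx[0])
    x = [[0] * (2 * w) for _ in range(2 * h)]
    for i in range(h):
        for j in range(w):
            m, d = approx[i][j], details[i][j]
            x[2 * i][2 * j] = m - d[0]
            x[2 * i][2 * j + 1] = m - d[1]
            x[2 * i + 1][2 * j] = m - d[2]
            x[2 * i + 1][2 * j + 1] = m - d[3]
    return x


# Hypothetical 4x4 depth patch used only to exercise the functions.
depth = [
    [1, 2, 5, 5],
    [3, 4, 5, 9],
    [0, 0, 7, 6],
    [0, 1, 8, 8],
]
approx, details = morph_haar_down(depth)
restored = morph_haar_up(approx, details)
```

In this sketch the coarse map keeps only the block maxima, while the offsets carry everything needed for exact reconstruction, so the two parts can be processed separately before being recombined.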
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Groenendijk, R., Dorst, L., Gevers, T. (2024). HaarNet: Large-Scale Linear-Morphological Hybrid Network for RGB-D Semantic Segmentation. In: Brunetti, S., Frosini, A., Rinaldi, S. (eds) Discrete Geometry and Mathematical Morphology. DGMM 2024. Lecture Notes in Computer Science, vol 14605. Springer, Cham. https://doi.org/10.1007/978-3-031-57793-2_19
DOI: https://doi.org/10.1007/978-3-031-57793-2_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-57792-5
Online ISBN: 978-3-031-57793-2
eBook Packages: Computer Science (R0)