
CateNorm: Categorical Normalization for Robust Medical Image Segmentation

  • Conference paper

Domain Adaptation and Representation Transfer (DART 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13542)

Abstract

Batch normalization (BN) uniformly shifts and scales activations based on the statistics of a batch of images. However, the intensity distribution of background pixels often dominates the BN statistics because the background accounts for a large proportion of the image. This paper focuses on enhancing BN with the intensity distribution of foreground pixels, the one that really matters for image segmentation. We propose a new normalization strategy, named categorical normalization (CateNorm), which normalizes activations according to categorical statistics. These categorical statistics are obtained by dynamically modulating the specific regions of an image that belong to the foreground. CateNorm yields precise and robust segmentation results across five public datasets obtained from different domains, covering complex and variable data distributions. This robustness is attributable to the ability of CateNorm to capture domain-invariant information from multiple domains (institutions) of medical data.

Code is available at https://github.com/lambert-x/CateNorm.


References

  1. Ba, J.L., Kiros, J.R., Hinton, G.E.: Layer normalization. arXiv preprint (2016)

  2. Bloch, N., et al.: NCI-ISBI 2013 challenge: automated segmentation of prostate structures. The Cancer Imaging Archive (2015). https://doi.org/10.7937/K9/TCIA.2015.zF0vlOPv

  3. Chang, W.G., You, T., Seo, S., Kwak, S., Han, B.: Domain-specific batch normalization for unsupervised domain adaptation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7354–7362 (2019)

  4. Chen, J., et al.: TransUNet: transformers make strong encoders for medical image segmentation. arXiv preprint (2021)

  5. Chen, L.-C., Zhu, Y., Papandreou, G., Schroff, F., Adam, H.: Encoder-decoder with atrous separable convolution for semantic image segmentation. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11211, pp. 833–851. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01234-2_49

  6. Clark, K., et al.: The cancer imaging archive (TCIA): maintaining and operating a public information repository. J. Digit. Imaging 26(6), 1045–1057 (2013)

  7. Dumoulin, V., Shlens, J., Kudlur, M.: A learned representation for artistic style. In: 5th International Conference on Learning Representations, ICLR 2017, Toulon, France, 24–26 April 2017, Conference Track Proceedings. OpenReview.net (2017). https://openreview.net/forum?id=BJO-BuT1g

  8. Fu, S., et al.: Domain adaptive relational reasoning for 3D multi-organ segmentation. In: Martel, A.L., et al. (eds.) MICCAI 2020. LNCS, vol. 12261, pp. 656–666. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-59710-8_64

  9. Gibson, E., et al.: Multi-organ abdominal CT reference standard segmentations, February 2018. https://doi.org/10.5281/zenodo.1169361

  10. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)

  11. Hu, S., Yuan, J., Wang, S.: Cross-modality synthesis from MRI to PET using adversarial U-Net with different normalization. In: 2019 International Conference on Medical Imaging Physics and Engineering (ICMIPE), pp. 1–5. IEEE (2019)

  12. Huang, X., Belongie, S.: Arbitrary style transfer in real-time with adaptive instance normalization. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1501–1510 (2017)

  13. Ioffe, S., Szegedy, C.: Batch normalization: accelerating deep network training by reducing internal covariate shift. In: International Conference on Machine Learning, pp. 448–456. PMLR (2015)

  14. Isensee, F., Kickingereder, P., Wick, W., Bendszus, M., Maier-Hein, K.H.: No new-net. In: Crimi, A., Bakas, S., Kuijf, H., Keyvan, F., Reyes, M., van Walsum, T. (eds.) BrainLes 2018. LNCS, vol. 11384, pp. 234–244. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-11726-9_21

  15. Kao, P.-Y., Ngo, T., Zhang, A., Chen, J.W., Manjunath, B.S.: Brain tumor segmentation and tractographic feature extraction from structural MR images for overall survival prediction. In: Crimi, A., Bakas, S., Kuijf, H., Keyvan, F., Reyes, M., van Walsum, T. (eds.) BrainLes 2018. LNCS, vol. 11384, pp. 128–141. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-11726-9_12

  16. Landman, B., Xu, Z., Igelsias, J., Styner, M., Langerak, T., Klein, A.: 2015 MICCAI multi-atlas labeling beyond the cranial vault workshop and challenge (2015). https://doi.org/10.7303/syn3193805

  17. Lemaître, G., Martí, R., Freixenet, J., Vilanova, J.C., Walker, P.M., Meriaudeau, F.: Computer-aided detection and diagnosis for prostate cancer based on mono and multi-parametric MRI: a review. Comput. Biol. Med. 60, 8–31 (2015)

  18. Li, Y., Wang, N., Shi, J., Liu, J., Hou, X.: Revisiting batch normalization for practical domain adaptation. arXiv preprint (2016)

  19. Liu, Q., Dou, Q., Yu, L., Heng, P.A.: MS-Net: multi-site network for improving prostate segmentation with heterogeneous MRI data. IEEE Trans. Med. Imaging 39(9), 2713–2724 (2020)

  20. Park, T., Liu, M.Y., Wang, T.C., Zhu, J.Y.: Semantic image synthesis with spatially-adaptive normalization. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2337–2346 (2019)

  21. Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4_28

  22. Roth, H., Farag, A., Turkbey, E.B., Lu, L., Liu, J., Summers, R.M.: Data from Pancreas-CT. The Cancer Imaging Archive (2016). https://doi.org/10.7937/K9/TCIA.2016.tNB1kqBU

  23. Roth, H.R., et al.: DeepOrgan: multi-level deep convolutional networks for automated pancreas segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9349, pp. 556–564. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24553-9_68

  24. Ulyanov, D., Vedaldi, A., Lempitsky, V.: Instance normalization: the missing ingredient for fast stylization. arXiv preprint (2017)

  25. Wu, Y., He, K.: Group normalization. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11217, pp. 3–19. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01261-8_1

  26. Xia, X., Kulis, B.: W-Net: a deep model for fully unsupervised image segmentation. arXiv preprint (2017)

  27. Yu, L., Yang, X., Chen, H., Qin, J., Heng, P.A.: Volumetric ConvNets with mixed residual connections for automated prostate segmentation from 3D MR images. In: Thirty-First AAAI Conference on Artificial Intelligence (2017)

  28. Zhou, X.Y., Yang, G.Z.: Normalization in training U-Net for 2-D biomedical semantic segmentation. IEEE Robot. Autom. Lett. 4(2), 1792–1799 (2019)


Acknowledgments

This work was supported by the Lustgarten Foundation for Pancreatic Cancer Research. We also thank Quande Liu for the discussion.

Author information

Correspondence to Junfei Xiao.

Appendices

A Details of Aligning Input Distribution Algorithm

Assume that we have N source domains \(S_1, S_2, S_3, ..., S_N\) with \(M_1, M_2, M_3, ..., M_N\) examples respectively, where the i-th source domain \(S_i\) consists of an image set \(\{\textbf{x}_{i,j}\in \mathbb {R}^{D_{i,j}}\}_{j=1,...,M_i}\) together with the associated annotations. Our goal is to align the image distributions of these source domains with the target domain T based on class-wise (region-wise) statistics. The algorithm proceeds in the following steps:

Step 1: Calculate class-wise statistics of each case

First, we calculate the class-wise mean and standard deviation of each case in both the source domains and the target domain.

$$\begin{aligned} \mu _{i,j}^{c}&= \frac{\sum _{k=1}^{|D^c_{i,j}|}\textbf{x}^c_{i,j,k}}{|D^c_{i,j}|} , \end{aligned}$$
(6)
$$\begin{aligned} \sigma _{i,j}^{c}&= \sqrt{\frac{1}{ |D^{c}_{i,j}|} \sum _{k=1}^{|D^{c}_{i,j}|}(\textbf{x}^c_{i,j,k} - \mu _{i,j}^{c})^{2}}, \end{aligned}$$
(7)

where \(\textbf{x}_{i,j}^{c}\) denotes the pixels which belong to the c-th class (region) in image \(\textbf{x}_{i,j}\), with the number of pixels denoted as \(|D^c_{i,j}|\). As a special case, \(i=T\) indicates the target domain.

Step 2: Estimate aligned (new) class-wise statistics

Next, we calculate the mean of the statistics over all examples obtained in each domain as follows:

$$\begin{aligned} \bar{\mu }_{i}^{c}&=\frac{\sum _{j=1}^{M_i}\mu _{i,j}^{c}}{M_i}, \end{aligned}$$
(8)
$$\begin{aligned} \bar{\sigma }_{i}^{c}&=\frac{\sum _{j=1}^{M_i}\sigma _{i,j}^{c}}{M_i}. \end{aligned}$$
(9)

Based on the \(\bar{\mu }_{i}^{c}\), we now estimate the new class-wise mean \(\tilde{\mu }_{i,j}\) for each case of the source domain \(S_i\) as follows:

$$\begin{aligned} \begin{aligned} \tilde{\mu }_{i,j}^{c}&=\frac{\mu _{i,j}^{c} - \bar{\mu }_{i}^{c}}{\sqrt{\frac{\sum _{j=1}^{M_i}(\mu _{i,j}^{c} - \bar{\mu }_{i}^{c})^2}{M_i}}} \cdot \sqrt{\frac{\sum _{j=1}^{M_T}(\mu _{T,j}^{c} - \bar{\mu }_{T}^{c})^2}{M_T}} + \bar{\mu }_{T}^{c}, \\ \end{aligned} \end{aligned}$$
(10)

where \(M_T\) denotes the number of cases in the target domain T. Similarly, the new standard deviation \(\tilde{\sigma }_{i,j}\) can be computed by:

$$\begin{aligned} \begin{aligned} \tilde{\sigma }_{i,j}^{c}&=\frac{\sigma _{i,j}^{c} - \bar{\sigma }_{i}^{c}}{\sqrt{\frac{\sum _{j=1}^{M_i}(\sigma _{i,j}^{c} - \bar{\sigma }_{i}^{c})^2}{M_i}}} \cdot \sqrt{\frac{\sum _{j=1}^{M_T}(\sigma _{T,j}^{c} - \bar{\sigma }_{T}^{c})^2}{M_T}} + \bar{\sigma }_{T}^{c}. \\ \end{aligned} \end{aligned}$$
(11)

Step 3: Align each case with the estimated statistics

Based on the computed new mean and standard deviation \(\tilde{\mu }_{i,j}\), \(\tilde{\sigma }_{i,j}\), the aligned image \(\tilde{\textbf{x}}_{i,j}\) can be computed as:

$$\begin{aligned} \tilde{\textbf{x}}_{i,j}^{c}&=\frac{\textbf{x}_{i,j}^{c} - \mu _{i,j}^{c}}{\sigma _{i,j}^{c}} \cdot \tilde{\sigma }_{i,j}^{c} + \tilde{\mu }_{i,j}^{c}. \end{aligned}$$
(12)
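The three steps above can be sketched in NumPy as follows. Function and variable names are ours, and the small `eps` guards against zero variance are an added assumption for numerical safety; the paper's equations assume nonzero variances.

```python
import numpy as np

def classwise_stats(image, mask, num_classes):
    """Step 1: per-class mean and standard deviation of one case (Eqs. 6-7)."""
    mu = np.zeros(num_classes)
    sigma = np.zeros(num_classes)
    for c in range(num_classes):
        pixels = image[mask == c]
        mu[c] = pixels.mean()
        sigma[c] = pixels.std()
    return mu, sigma

def align_domain(src_images, src_masks, tgt_images, tgt_masks, num_classes):
    """Steps 2-3: shift each source case's class-wise statistics toward the
    target domain (Eqs. 8-12) and re-normalize its pixels accordingly."""
    eps = 1e-8  # added guard against zero variance; not in the paper's equations
    src_stats = [classwise_stats(x, m, num_classes)
                 for x, m in zip(src_images, src_masks)]
    tgt_stats = [classwise_stats(x, m, num_classes)
                 for x, m in zip(tgt_images, tgt_masks)]
    src_mu = np.stack([s[0] for s in src_stats])   # shape (M_i, num_classes)
    src_sd = np.stack([s[1] for s in src_stats])
    tgt_mu = np.stack([s[0] for s in tgt_stats])   # shape (M_T, num_classes)
    tgt_sd = np.stack([s[1] for s in tgt_stats])

    aligned = []
    for (mu, sd), x, m in zip(src_stats, src_images, src_masks):
        out = x.astype(np.float64).copy()
        for c in range(num_classes):
            # Eq. 10: standardize the case mean within its source domain,
            # then rescale/shift it to the target domain's spread of means.
            new_mu = ((mu[c] - src_mu[:, c].mean()) / (src_mu[:, c].std() + eps)
                      * tgt_mu[:, c].std() + tgt_mu[:, c].mean())
            # Eq. 11: the same transform applied to the standard deviation.
            new_sd = ((sd[c] - src_sd[:, c].mean()) / (src_sd[:, c].std() + eps)
                      * tgt_sd[:, c].std() + tgt_sd[:, c].mean())
            # Eq. 12: re-normalize the class-c pixels with the aligned statistics.
            sel = m == c
            out[sel] = (x[sel] - mu[c]) / (sd[c] + eps) * new_sd + new_mu
        aligned.append(out)
    return aligned
```

Note that Eq. 12 makes each aligned case's class-c mean exactly \(\tilde{\mu}_{i,j}^{c}\), so averaged over the source cases the class-wise means match the target domain's average class-wise means.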

B Implementation Details

(See Tables 4, 5).

Table 4. Data preprocessing.
Table 5. Experimental setting.

C Details of the Prostate Datasets

(See Table 6).

Table 6. Details of the 3 prostate segmentation datasets.

D Training Procedure of CateNorm

[Algorithm figure: training procedure of CateNorm]

E Average Surface Distance (ASD) Comparison

The detailed average surface distance results of both prostate segmentation and abdominal segmentation tasks can be found in Tables 7 and 8. The proposed CateNorm achieves the lowest average ASD on both tasks, even under the more challenging multi-domain setting (Tables 9, 10, 12).
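For reference, the ASD metric reported in these tables can be sketched as follows. This is a 2-D, 4-connectivity sketch with unit pixel spacing and brute-force nearest-neighbor search; the paper's evaluation details (voxel spacing, connectivity, implementation) are not specified here.

```python
import numpy as np

def surface_points(mask):
    """Coordinates of boundary pixels: foreground pixels with at least one
    background 4-neighbor (computed via a padded erosion)."""
    m = np.pad(mask.astype(bool), 1)
    interior = (m[1:-1, 1:-1] & m[:-2, 1:-1] & m[2:, 1:-1]
                & m[1:-1, :-2] & m[1:-1, 2:])
    return np.argwhere(mask.astype(bool) & ~interior)

def average_surface_distance(pred, gt):
    """Symmetric ASD: distances from each predicted surface point to the
    nearest ground-truth surface point, and vice versa, averaged together."""
    p, g = surface_points(pred), surface_points(gt)
    # Pairwise Euclidean distances between the two surface point sets.
    d = np.linalg.norm(p[:, None, :] - g[None, :, :], axis=-1)
    return (d.min(axis=1).sum() + d.min(axis=0).sum()) / (len(p) + len(g))
```

A perfect prediction gives ASD 0; any boundary deviation increases it, measured in pixels here (mm in the tables once scaled by spacing).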

Table 7. ASD comparison on the abdominal datasets under the multi-domain setting (in mm). Compared with the baseline and other competitive methods, the proposed CateNorm achieves the lowest average ASD.
Table 8. ASD comparison on prostate segmentation datasets under the multi-domain setting (in mm). Compared with the baseline and other competitive methods, the proposed CateNorm achieves the lowest average ASD.
Fig. 3.

Performance gain under partial annotation. We compare our method with the baseline when fewer classes are annotated (i.e., 3 of 5). Across different numbers of regions into which the images are partitioned, CateNorm consistently achieves better results than BN for all tested organs, suggesting that our algorithm is not sensitive to the number of regions.

Fig. 4.

Qualitative results comparison. We compare our method against the baseline and other state-of-the-art methods under the multi-domain setting on prostate segmentation and abdominal multi-organ segmentation. The first three rows clearly show that our method outperforms the others, whose predictions are fragmented and incomplete where the prostate boundary is faint. The last two rows show that, by incorporating both general and categorical statistics, our method better suppresses inconsistent class predictions inside a closed segmented region (e.g., reducing false positives inside the stomach) and segments difficult organs such as the pancreas more accurately.

Table 9. Comparison on the multi-organ segmentation dataset (BTCV) with single-domain setting (Dice Score in %).
Table 10. Organ-wise results on the multi-organ segmentation datasets under the multi-domain setting (Dice Score in %).
Table 11. CateNorm is compatible with other segmentation models. This table compares performance on multi-domain multi-organ and prostate segmentation with the DeepLabv3+ [5] architecture. Our CateNorm consistently outperforms BN.
Table 12. CateNorm is not sensitive to the warmup length. This table reports the average accuracy (%) of CateNorm when initialized from pretrained models with fewer pretraining iterations. We reduce the warmup to 450, 1440, and 9000 iterations in the multi-domain prostate segmentation experiments to investigate how CateNorm performs when warmed up for fewer iterations.
Fig. 5.

Set CateNorm block(s) early. This figure compares performance with a single CateNorm block placed at different positions. Adding CateNorm to the encoder (block indices 1–5) always yields better performance than adding it to the decoder (block indices 6–10). In general, performance decreases as the block index increases. We believe this is because the earlier layers in the encoder extract lower-level features that are less discriminative than the decoder features.

Fig. 6.

CateNorm does normalize with semantic information. This figure visualizes the learned \(\gamma ^{CateNorm }\) (first row) and \(\beta ^{CateNorm }\) (second row) of an intermediate CateNorm layer on different channels during the second forward pass. With prior class information as guidance, CateNorm modulates spatially-adaptive parameters. Such spatial modulation is complementary to the channel-wise modulation performed by BN and derives more discriminative features that benefit segmentation.
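The modulation described above can be illustrated with a minimal NumPy sketch of a spatially-adaptive normalization step in the spirit of CateNorm and SPADE [20]: features are first normalized channel-wise, then scaled and shifted by \(\gamma\)/\(\beta\) values that vary spatially according to a class prior map. The per-class lookup-table parameterization and all names here are our simplifying assumptions, not the paper's exact implementation, which derives the modulation maps from learned layers guided by prior class predictions.

```python
import numpy as np

def spatially_adaptive_norm(x, class_map, gamma_table, beta_table, eps=1e-5):
    """Normalize a feature map, then modulate it with per-class gamma/beta.

    x           : (C, H, W) feature map.
    class_map   : (H, W) integer map of prior class labels.
    gamma_table : (num_classes, C) per-class scale parameters.
    beta_table  : (num_classes, C) per-class shift parameters.
    """
    # Channel-wise normalization (a single-sample stand-in for BN statistics).
    mu = x.mean(axis=(1, 2), keepdims=True)
    sd = x.std(axis=(1, 2), keepdims=True)
    x_hat = (x - mu) / (sd + eps)
    # Look up spatially-varying modulation from the class map: (H, W, C) -> (C, H, W).
    gamma = gamma_table[class_map].transpose(2, 0, 1)
    beta = beta_table[class_map].transpose(2, 0, 1)
    return gamma * x_hat + beta
```

Unlike plain BN, where \(\gamma\) and \(\beta\) are constant per channel, here two pixels assigned different prior classes receive different scale/shift values, which is the spatial modulation the figure visualizes.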


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Xiao, J. et al. (2022). CateNorm: Categorical Normalization for Robust Medical Image Segmentation. In: Kamnitsas, K., et al. Domain Adaptation and Representation Transfer. DART 2022. Lecture Notes in Computer Science, vol 13542. Springer, Cham. https://doi.org/10.1007/978-3-031-16852-9_13

  • DOI: https://doi.org/10.1007/978-3-031-16852-9_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-16851-2

  • Online ISBN: 978-3-031-16852-9

  • eBook Packages: Computer Science (R0)
