
CateNorm: Categorical Normalization for Robust Medical Image Segmentation

  • Conference paper

Domain Adaptation and Representation Transfer (DART 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13542)

Abstract

Batch normalization (BN) uniformly shifts and scales activations based on the statistics of a batch of images. However, the intensity distribution of background pixels often dominates the BN statistics because the background accounts for a large proportion of the image. This paper focuses on enhancing BN with the intensity distribution of foreground pixels, the one that really matters for image segmentation. We propose a new normalization strategy, named categorical normalization (CateNorm), which normalizes activations according to categorical statistics. These categorical statistics are obtained by dynamically modulating the specific regions of an image that belong to the foreground. CateNorm yields precise and robust segmentation results across five public datasets obtained from different domains, covering complex and variable data distributions. This robustness is attributable to the ability of CateNorm to capture domain-invariant information from multiple domains (institutions) of medical data.

Code is available at https://github.com/lambert-x/CateNorm.


References

  1. Ba, J.L., Kiros, J.R., Hinton, G.E.: Layer normalization. arXiv preprint (2016)

  2. Bloch, N., et al.: NCI-ISBI 2013 challenge: automated segmentation of prostate structures. The Cancer Imaging Archive (2015). https://doi.org/10.7937/K9/TCIA.2015.zF0vlOPv

  3. Chang, W.G., You, T., Seo, S., Kwak, S., Han, B.: Domain-specific batch normalization for unsupervised domain adaptation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7354–7362 (2019)

  4. Chen, J., et al.: TransUNet: transformers make strong encoders for medical image segmentation. arXiv preprint (2021)

  5. Chen, L.-C., Zhu, Y., Papandreou, G., Schroff, F., Adam, H.: Encoder-decoder with atrous separable convolution for semantic image segmentation. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11211, pp. 833–851. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01234-2_49

  6. Clark, K., et al.: The cancer imaging archive (TCIA): maintaining and operating a public information repository. J. Digit. Imaging 26(6), 1045–1057 (2013)

  7. Dumoulin, V., Shlens, J., Kudlur, M.: A learned representation for artistic style. In: 5th International Conference on Learning Representations, ICLR 2017, Toulon, France, 24–26 April 2017, Conference Track Proceedings. OpenReview.net (2017). https://openreview.net/forum?id=BJO-BuT1g

  8. Fu, S., et al.: Domain adaptive relational reasoning for 3D multi-organ segmentation. In: Martel, A.L., et al. (eds.) MICCAI 2020. LNCS, vol. 12261, pp. 656–666. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-59710-8_64

  9. Gibson, E., et al.: Multi-organ abdominal CT reference standard segmentations, February 2018. https://doi.org/10.5281/zenodo.1169361

  10. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)

  11. Hu, S., Yuan, J., Wang, S.: Cross-modality synthesis from MRI to PET using adversarial U-Net with different normalization. In: 2019 International Conference on Medical Imaging Physics and Engineering (ICMIPE), pp. 1–5. IEEE (2019)

  12. Huang, X., Belongie, S.: Arbitrary style transfer in real-time with adaptive instance normalization. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1501–1510 (2017)

  13. Ioffe, S., Szegedy, C.: Batch normalization: accelerating deep network training by reducing internal covariate shift. In: International Conference on Machine Learning, pp. 448–456. PMLR (2015)

  14. Isensee, F., Kickingereder, P., Wick, W., Bendszus, M., Maier-Hein, K.H.: No new-net. In: Crimi, A., Bakas, S., Kuijf, H., Keyvan, F., Reyes, M., van Walsum, T. (eds.) BrainLes 2018. LNCS, vol. 11384, pp. 234–244. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-11726-9_21

  15. Kao, P.-Y., Ngo, T., Zhang, A., Chen, J.W., Manjunath, B.S.: Brain tumor segmentation and tractographic feature extraction from structural MR images for overall survival prediction. In: Crimi, A., Bakas, S., Kuijf, H., Keyvan, F., Reyes, M., van Walsum, T. (eds.) BrainLes 2018. LNCS, vol. 11384, pp. 128–141. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-11726-9_12

  16. Landman, B., Xu, Z., Igelsias, J., Styner, M., Langerak, T., Klein, A.: 2015 MICCAI multi-atlas labeling beyond the cranial vault workshop and challenge (2015). https://doi.org/10.7303/syn3193805

  17. Lemaître, G., Martí, R., Freixenet, J., Vilanova, J.C., Walker, P.M., Meriaudeau, F.: Computer-aided detection and diagnosis for prostate cancer based on mono and multi-parametric MRI: a review. Comput. Biol. Med. 60, 8–31 (2015)

  18. Li, Y., Wang, N., Shi, J., Liu, J., Hou, X.: Revisiting batch normalization for practical domain adaptation. arXiv preprint (2016)

  19. Liu, Q., Dou, Q., Yu, L., Heng, P.A.: MS-Net: multi-site network for improving prostate segmentation with heterogeneous MRI data. IEEE Trans. Med. Imaging 39(9), 2713–2724 (2020)

  20. Park, T., Liu, M.Y., Wang, T.C., Zhu, J.Y.: Semantic image synthesis with spatially-adaptive normalization. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2337–2346 (2019)

  21. Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4_28

  22. Roth, H., Farag, A., Turkbey, E.B., Lu, L., Liu, J., Summers, R.M.: Data from Pancreas-CT. The Cancer Imaging Archive (2016). https://doi.org/10.7937/K9/TCIA.2016.tNB1kqBU

  23. Roth, H.R., et al.: DeepOrgan: multi-level deep convolutional networks for automated pancreas segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9349, pp. 556–564. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24553-9_68

  24. Ulyanov, D., Vedaldi, A., Lempitsky, V.: Instance normalization: the missing ingredient for fast stylization. arXiv preprint (2017)

  25. Wu, Y., He, K.: Group normalization. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11217, pp. 3–19. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01261-8_1

  26. Xia, X., Kulis, B.: W-Net: a deep model for fully unsupervised image segmentation. arXiv preprint (2017)

  27. Yu, L., Yang, X., Chen, H., Qin, J., Heng, P.A.: Volumetric ConvNets with mixed residual connections for automated prostate segmentation from 3D MR images. In: Thirty-First AAAI Conference on Artificial Intelligence (2017)

  28. Zhou, X.Y., Yang, G.Z.: Normalization in training U-Net for 2-D biomedical semantic segmentation. IEEE Robot. Autom. Lett. 4(2), 1792–1799 (2019)


Acknowledgments

This work was supported by the Lustgarten Foundation for Pancreatic Cancer Research. We also thank Quande Liu for the discussion.

Author information

Correspondence to Junfei Xiao.

Appendices

A Details of Aligning Input Distribution Algorithm

Assume that we have N source domains \(S_1, S_2, S_3, ..., S_N\) with \(M_1, M_2, M_3, ..., M_N\) examples respectively, where the i-th source domain \(S_i\) consists of an image set \(\{\textbf{x}_{i,j}\in \mathbb {R}^{D_{i,j}}\}_{j=1,...,M_i}\) together with the associated annotations. Our goal is to align the image distributions of these source domains with the target domain T based on class-wise (region-wise) statistics. The algorithm proceeds in the following steps:

Step 1: Calculate class-wise statistics of each case

First, we calculate the class-wise mean and standard deviation of each case in both the source domains and the target domain.

$$\begin{aligned} \mu _{i,j}^{c}&= \frac{\sum _{k=1}^{|D^c_{i,j}|}\textbf{x}^c_{i,j,k}}{|D^c_{i,j}|} , \end{aligned}$$
(6)
$$\begin{aligned} \sigma _{i,j}^{c}&= \sqrt{\frac{1}{ |D^{c}_{i,j}|} \sum _{k=1}^{|D^{c}_{i,j}|}(\textbf{x}^c_{i,j,k} - \mu _{i,j}^{c})^{2}}, \end{aligned}$$
(7)

where \(\textbf{x}_{i,j}^{c}\) denotes the pixels which belong to the c-th class (region) in image \(\textbf{x}_{i,j}\), with the number of pixels denoted as \(|D^c_{i,j}|\). As a special case, \(i=T\) indicates the target domain.

Step 2: Estimate aligned (new) class-wise statistics

Next, we calculate the mean of the statistics over all examples obtained in each domain as follows:

$$\begin{aligned} \bar{\mu }_{i}^{c}&=\frac{\sum _{j=1}^{M_i}\mu _{i,j}^{c}}{M_i}, \end{aligned}$$
(8)
$$\begin{aligned} \bar{\sigma }_{i}^{c}&=\frac{\sum _{j=1}^{M_i}\sigma _{i,j}^{c}}{M_i}. \end{aligned}$$
(9)

Based on the \(\bar{\mu }_{i}^{c}\), we now estimate the new class-wise mean \(\tilde{\mu }_{i,j}\) for each case of the source domain \(S_i\) as follows:

$$\begin{aligned} \begin{aligned} \tilde{\mu }_{i,j}^{c}&=\frac{\mu _{i,j}^{c} - \bar{\mu }_{i}^{c}}{\sqrt{\frac{\sum _{j=1}^{M_i}(\mu _{i,j}^{c} - \bar{\mu }_{i}^{c})^2}{M_i}}} \cdot \sqrt{\frac{\sum _{j=1}^{M_T}(\mu _{T,j}^{c} - \bar{\mu }_{T}^{c})^2}{M_T}} + \bar{\mu }_{T}^{c}, \\ \end{aligned} \end{aligned}$$
(10)

where \(M_T\) denotes the number of cases in the target domain T. Similarly, the new standard deviation \(\tilde{\sigma }_{i,j}\) can be computed by:

$$\begin{aligned} \begin{aligned} \tilde{\sigma }_{i,j}^{c}&=\frac{\sigma _{i,j}^{c} - \bar{\sigma }_{i}^{c}}{\sqrt{\frac{\sum _{j=1}^{M_i}(\sigma _{i,j}^{c} - \bar{\sigma }_{i}^{c})^2}{M_i}}} \cdot \sqrt{\frac{\sum _{j=1}^{M_T}(\sigma _{T,j}^{c} - \bar{\sigma }_{T}^{c})^2}{M_T}} + \bar{\sigma }_{T}^{c}. \\ \end{aligned} \end{aligned}$$
(11)

Step 3: Align each case with the estimated statistics

Based on the computed new mean and standard deviation \(\tilde{\mu }_{i,j}\), \(\tilde{\sigma }_{i,j}\), the aligned image \(\tilde{\textbf{x}}_{i,j}\) can be computed as:

$$\begin{aligned} \tilde{\textbf{x}}_{i,j}^{c}&=\frac{\textbf{x}_{i,j}^{c} - \mu _{i,j}^{c}}{\sigma _{i,j}^{c}} \cdot \tilde{\sigma }_{i,j}^{c} + \tilde{\mu }_{i,j}^{c}. \end{aligned}$$
(12)
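The three steps above can be sketched in NumPy as follows. Function and variable names are ours, and the small `eps` guards against zero variance are an added assumption for numerical safety; the paper's equations assume nonzero variances.

```python
import numpy as np

def classwise_stats(image, mask, num_classes):
    """Step 1: per-class mean and standard deviation of one case (Eqs. 6-7)."""
    mu = np.zeros(num_classes)
    sigma = np.zeros(num_classes)
    for c in range(num_classes):
        pixels = image[mask == c]
        mu[c] = pixels.mean()
        sigma[c] = pixels.std()
    return mu, sigma

def align_domain(src_images, src_masks, tgt_images, tgt_masks, num_classes):
    """Steps 2-3: shift each source case's class-wise statistics toward the
    target domain (Eqs. 8-12) and re-normalize its pixels accordingly."""
    eps = 1e-8  # added guard against zero variance; not in the paper's equations
    src_stats = [classwise_stats(x, m, num_classes)
                 for x, m in zip(src_images, src_masks)]
    tgt_stats = [classwise_stats(x, m, num_classes)
                 for x, m in zip(tgt_images, tgt_masks)]
    src_mu = np.stack([s[0] for s in src_stats])   # shape (M_i, num_classes)
    src_sd = np.stack([s[1] for s in src_stats])
    tgt_mu = np.stack([s[0] for s in tgt_stats])   # shape (M_T, num_classes)
    tgt_sd = np.stack([s[1] for s in tgt_stats])

    aligned = []
    for (mu, sd), x, m in zip(src_stats, src_images, src_masks):
        out = x.astype(np.float64).copy()
        for c in range(num_classes):
            # Eq. 10: standardize the case mean within its source domain,
            # then rescale/shift it to the target domain's spread of means.
            new_mu = ((mu[c] - src_mu[:, c].mean()) / (src_mu[:, c].std() + eps)
                      * tgt_mu[:, c].std() + tgt_mu[:, c].mean())
            # Eq. 11: the same transform applied to the standard deviation.
            new_sd = ((sd[c] - src_sd[:, c].mean()) / (src_sd[:, c].std() + eps)
                      * tgt_sd[:, c].std() + tgt_sd[:, c].mean())
            # Eq. 12: re-normalize the class-c pixels with the aligned statistics.
            sel = m == c
            out[sel] = (x[sel] - mu[c]) / (sd[c] + eps) * new_sd + new_mu
        aligned.append(out)
    return aligned
```

Note that Eq. 12 makes each aligned case's class-c mean exactly \(\tilde{\mu}_{i,j}^{c}\), so averaged over the source cases the class-wise means match the target domain's average class-wise means.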

B Implementation Details

(See Tables 4, 5).

Table 4. Data preprocessing.
Table 5. Experimental setting.

C Details of the Prostate Datasets

(See Table 6).

Table 6. Details of the 3 prostate segmentation datasets.

D Training Procedure of CateNorm

[Algorithm figure: training procedure of CateNorm]

E Average Surface Distance (ASD) Comparison

The detailed average surface distance results of both prostate segmentation and abdominal segmentation tasks can be found in Tables 7 and 8. The proposed CateNorm achieves the lowest average ASD on both tasks, even under the more challenging multi-domain setting (Tables 9, 10, 12).
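For reference, the ASD metric reported in these tables can be sketched as follows. This is a 2-D, 4-connectivity sketch with unit pixel spacing and brute-force nearest-neighbor search; the paper's evaluation details (voxel spacing, connectivity, implementation) are not specified here.

```python
import numpy as np

def surface_points(mask):
    """Coordinates of boundary pixels: foreground pixels with at least one
    background 4-neighbor (computed via a padded erosion)."""
    m = np.pad(mask.astype(bool), 1)
    interior = (m[1:-1, 1:-1] & m[:-2, 1:-1] & m[2:, 1:-1]
                & m[1:-1, :-2] & m[1:-1, 2:])
    return np.argwhere(mask.astype(bool) & ~interior)

def average_surface_distance(pred, gt):
    """Symmetric ASD: distances from each predicted surface point to the
    nearest ground-truth surface point, and vice versa, averaged together."""
    p, g = surface_points(pred), surface_points(gt)
    # Pairwise Euclidean distances between the two surface point sets.
    d = np.linalg.norm(p[:, None, :] - g[None, :, :], axis=-1)
    return (d.min(axis=1).sum() + d.min(axis=0).sum()) / (len(p) + len(g))
```

A perfect prediction gives ASD 0; any boundary deviation increases it, measured in pixels here (mm in the tables once scaled by spacing).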

Table 7. ASD comparison on the abdominal datasets under the multi-domain setting (in mm). Compared with the baseline and other competitive methods, the proposed CateNorm achieves the lowest average ASD.
Table 8. ASD comparison on prostate segmentation datasets under the multi-domain setting (in mm). Compared with the baseline and other competitive methods, the proposed CateNorm achieves the lowest average ASD.
Fig. 3.

Performance gain under partial annotation. We compare our method with the baseline when fewer classes are annotated (i.e., 3 of 5). Across different numbers of regions into which the images are partitioned, CateNorm consistently achieves better results than BN for all tested organs, suggesting that our algorithm is not sensitive to the number of regions.

Fig. 4.

Qualitative results comparison. We compare our method against the baseline and other state-of-the-art methods under the multi-domain setting on prostate segmentation and abdominal multi-organ segmentation. The first three rows clearly show that our method outperforms the others, whose predictions are fragmented and incomplete where the prostate boundary is faint. The last two rows show that, by incorporating both general and categorical statistics, our method better suppresses inconsistent class predictions inside a closed segmented region (e.g., reducing false positives inside the stomach) and segments difficult organs such as the pancreas more accurately.

Table 9. Comparison on the multi-organ segmentation dataset (BTCV) with single-domain setting (Dice Score in %).
Table 10. Organ-wise results on the multi-organ segmentation datasets under the multi-domain setting (Dice Score in %).
Table 11. CateNorm is compatible with other segmentation models. This table compares performance on multi-domain multi-organ and prostate segmentation with the DeepLabv3+ [5] architecture. Our CateNorm consistently outperforms BN.
Table 12. CateNorm is not sensitive to the warmup length. This table reports the average accuracy (%) of CateNorm when initialized from pretrained models with fewer pretraining iterations. We reduce the warmup to 450, 1440, and 9000 iterations in the multi-domain prostate segmentation experiments to investigate how CateNorm performs when warmed up for fewer iterations.
Fig. 5.

Set CateNorm block(s) early. This figure compares performance with a single CateNorm block placed at different positions. Adding CateNorm to the encoder (block indices 1–5) always yields better performance than adding it to the decoder (block indices 6–10). In general, performance decreases as the block index increases. We believe this is because the earlier layers in the encoder extract lower-level features that are less discriminative than the decoder features.

Fig. 6.

CateNorm does normalize with semantic information. This figure visualizes the learned \(\gamma ^{CateNorm }\) (first row) and \(\beta ^{CateNorm }\) (second row) of an intermediate CateNorm layer on different channels during the second forward pass. With prior class information as guidance, CateNorm modulates spatially-adaptive parameters. Such spatial modulation is complementary to the channel-wise modulation performed by BN and derives more discriminative features that benefit segmentation.
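The modulation described above can be illustrated with a minimal NumPy sketch of a spatially-adaptive normalization step in the spirit of CateNorm and SPADE [20]: features are first normalized channel-wise, then scaled and shifted by \(\gamma\)/\(\beta\) values that vary spatially according to a class prior map. The per-class lookup-table parameterization and all names here are our simplifying assumptions, not the paper's exact implementation, which derives the modulation maps from learned layers guided by prior class predictions.

```python
import numpy as np

def spatially_adaptive_norm(x, class_map, gamma_table, beta_table, eps=1e-5):
    """Normalize a feature map, then modulate it with per-class gamma/beta.

    x           : (C, H, W) feature map.
    class_map   : (H, W) integer map of prior class labels.
    gamma_table : (num_classes, C) per-class scale parameters.
    beta_table  : (num_classes, C) per-class shift parameters.
    """
    # Channel-wise normalization (a single-sample stand-in for BN statistics).
    mu = x.mean(axis=(1, 2), keepdims=True)
    sd = x.std(axis=(1, 2), keepdims=True)
    x_hat = (x - mu) / (sd + eps)
    # Look up spatially-varying modulation from the class map: (H, W, C) -> (C, H, W).
    gamma = gamma_table[class_map].transpose(2, 0, 1)
    beta = beta_table[class_map].transpose(2, 0, 1)
    return gamma * x_hat + beta
```

Unlike plain BN, where \(\gamma\) and \(\beta\) are constant per channel, here two pixels assigned different prior classes receive different scale/shift values, which is the spatial modulation the figure visualizes.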


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Xiao, J. et al. (2022). CateNorm: Categorical Normalization for Robust Medical Image Segmentation. In: Kamnitsas, K., et al. Domain Adaptation and Representation Transfer. DART 2022. Lecture Notes in Computer Science, vol 13542. Springer, Cham. https://doi.org/10.1007/978-3-031-16852-9_13

  • DOI: https://doi.org/10.1007/978-3-031-16852-9_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-16851-2

  • Online ISBN: 978-3-031-16852-9

  • eBook Packages: Computer Science (R0)
