Brain segmentation in patients with perinatal arterial ischemic stroke

Background: Perinatal arterial ischemic stroke (PAIS) is associated with adverse neurological outcomes. Quantification of ischemic lesions and consequent brain development in newborn infants relies on labor-intensive manual assessment of brain tissues and ischemic lesions. Hence, we propose an automatic method utilizing convolutional neural networks (CNNs) to segment brain tissues and ischemic lesions in MRI scans of infants suffering from PAIS.
Materials and Methods: This single-center retrospective study included 115 patients with PAIS who underwent MRI after the stroke onset (baseline) and after three months (follow-up). Nine baseline and 12 follow-up MRI scans were manually annotated to provide reference segmentations (white matter, gray matter, basal ganglia and thalami, brainstem, ventricles, extra-ventricular cerebrospinal fluid, and cerebellum, and additionally the ischemic lesions on the baseline scans). Two CNNs were trained to perform automatic segmentation on the baseline and follow-up MRIs, respectively. Automatic segmentations were quantitatively evaluated using the Dice coefficient (DC) and the mean surface distance (MSD). Volumetric agreement between the manually and automatically obtained segmentations was computed. Moreover, two experts qualitatively evaluated the automatic segmentations in a larger set of MRIs without manual annotation. In addition, the scan quality was qualitatively evaluated in these scans to establish its impact on the automatic segmentation performance.
Results: Automatic brain tissue segmentation led to a DC of 0.78–0.92 and an MSD of 0.18–1.08 mm for baseline scans, and to a DC of 0.88–0.95 and an MSD of 0.10–0.58 mm for follow-up scans. For the ischemic lesions at baseline, the DC and MSD were 0.72–0.86 and 1.23–2.18 mm, respectively. Volumetric measurements indicated limited oversegmentation of the extra-ventricular cerebrospinal fluid in both the follow-up and baseline scans, oversegmentation of the ischemic lesions in the left hemisphere, and undersegmentation of the ischemic lesions in the right hemisphere. In scans without imaging artifacts, brain tissue segmentation was graded as excellent in more than 85% and 91% of cases for the baseline and follow-up scans, respectively. For the ischemic lesions at baseline, this was the case in 61% of cases.
Conclusions: Automatic segmentation of brain tissue and ischemic lesions in MRI scans of patients with PAIS is feasible. The method may allow evaluation of brain development and of the efficacy of treatment in large datasets.


Introduction
Perinatal arterial ischemic stroke (PAIS) has an incidence rate of 1 in 5000 live births (Nelson and Lynch, 2004; Laugesaar et al., 2007; Chabrier et al., 2011; Sorg et al., 2021; Gale et al., 2018) and is associated with adverse motor and cognitive outcomes (Schulzke et al., 2005; Lee et al., 2005). Infants suffering from PAIS often present with hemi-convulsions and subsequently undergo neonatal brain MRI (baseline) to diagnose PAIS (Nelson and Lynch, 2004; Chabrier et al., 2011). A second scan (follow-up), acquired weeks to months later, allows evaluation of residual damage and may improve prediction of outcome. Currently, stroke size and location on the neonatal MRI scan and the effect of the stroke on brain development are assessed by qualitative visual evaluation of the MRI. However, qualitative evaluation is subjective and prone to intra- and inter-observer variability. Hence, quantitative evaluation would be preferred for the assessment of the ischemic lesion and the brain tissue classes affected and unaffected by stroke. Moreover, quantitative analysis would allow evaluation of the effects of neuro-regenerative interventions such as recombinant human erythropoietin (rhEPO) and mesenchymal stromal cells and might improve long-term outcome prediction (Bava et al., 2007; Baak et al., 2022). For this quantitative analysis, accurate segmentation of the brain tissue classes and ischemic lesions in each hemisphere in baseline and follow-up MRI acquisitions is needed.
Given the complexity of the task, manual segmentation of brain tissue classes is practically infeasible in the clinical routine as well as in large studies. Hence, automatic segmentation would be required. In prior research, methods for automated segmentation of brain tissue classes in MRI scans of infants without large pathology have been developed (Išgum et al., 2015; Moeskops et al., 2016; Makropoulos et al., 2014; Grigorescu et al., 2021; Ding et al., 2020; Fan et al., 2022). However, stroke impacts the appearance and shape of brain tissues, and therefore methods developed for analysis of the brain without substantial pathology are not directly applicable to segmentation of MRIs presenting with stroke. After stroke onset, i.e. in the acute period, ischemic areas are clearly visible on diffusion-weighted imaging (DWI), whilst on T2-weighted scans the ischemic lesions become visible after a few days (Dudink et al., 2009). Furthermore, T2-weighted MRI best shows the different brain tissue classes. Therefore, several methods have been developed to segment ischemic lesions in neonatal MRI scans that analyzed one or both of these MRI sequences. For example, Murphy et al. developed a method to segment hypoxic-ischemic brain tissue in newborn infants with hypoxic-ischemic encephalopathy on DWI (Murphy et al., 2017). This method classified ischemic voxels based on their spatial and intensity features with a random forest classifier. Ghosh et al. developed a method to segment the ischemic lesion using DWI and apparent diffusion coefficient (ADC) maps (Ghosh et al., 2014). Their method used hierarchical region splitting and symmetry-based region growing to segment the ischemic lesion. Išgum et al. segmented the ischemic lesions using DWI and ADC maps. Initial segmentations were made by comparing the ADC maps of the patients to ADC maps of control subjects. Subsequently, these segmentations were refined by using spatial and texture features along with a linear discriminant classifier (Išgum et al., 2011).
To the best of our knowledge, deep learning methods for ischemic stroke segmentation have been applied only to brain MR scans in adults. Zhang et al. proposed a 3D convolutional neural network (CNN) to segment ischemic stroke in adult DWI (Zhang et al., 2018). Thereafter, Praveen et al. used an auto-encoder in conjunction with a support vector machine to discriminate between normal and ischemic tissue on coregistered T1-weighted, T2-weighted, DWI and FLAIR MR scans (Praveen et al., 2018). All of the aforementioned methods segment brain tissue affected by ischemia but do not segment unaffected brain tissue classes. However, to quantify the treatment effect or improve outcome prediction after stroke, quantitative analysis of these tissues may be important as well. Given that PAIS typically affects one hemisphere, analysis of brain tissue classes in the contra- and ipsilesional hemispheres would allow comparison and quantification of the stroke damage.
Hence, in this study, we propose an automatic method utilizing convolutional neural networks (CNNs) to segment brain tissues (white matter (WM), gray matter (GM), basal ganglia and thalami (BGT), brainstem (BST), ventricular cerebrospinal fluid (vCSF), extra-ventricular cerebrospinal fluid (eCSF), and cerebellum (CB)) and ischemic lesions in MRI scans of infants suffering from PAIS. Given the differences in the visibility of the stroke and brain tissues caused by brain maturation in newborns, we train two age-specific instances of the network architecture: one for the segmentation of baseline (baseline network) and another for the follow-up (follow-up network) brain MRI scans. Brain tissues are best visible on T2-weighted MRI scans, while the early ischemic changes result in diffusion restriction that is best visualized with DWI within the first week after the insult. Therefore, the baseline network analyzes both DWI and T2-weighted MRI scans. On follow-up scans, the diffusion restriction on DWI is no longer present; several weeks after the ischemic event the damaged tissue degenerates and the resulting space fills with eCSF (Dudink et al., 2009). Hence, only the T2-weighted scan is analyzed by our follow-up network. We develop and evaluate our method with a set of baseline and follow-up MRI scans of 115 infants suspected of PAIS.

Patients and images
The study includes 115 newborn infants admitted to the Neonatal Intensive Care Unit, University Medical Center Utrecht, the Netherlands, who suffered from PAIS, which was confirmed on MRI. The patient characteristics are shown in Table 1. Clinical and imaging data of stroke patients are collected in the Neonatal Stroke Registry, which was approved by the IRB of UMC Utrecht (IRB number 21-845). Parental consent was given for the collection of their infant's data.
The patients in this cohort underwent a baseline MRI, a follow-up MRI, or both. The baseline MRI of the brain was usually performed within one week after birth. Three scanners were used: a 1.5T Philips Achieva, a 3T Philips Achieva, and a 3T Philips Ingenia Elition X scanner. All scans were acquired in the axial plane. The image acquisition parameters are listed in Table 2.
Follow-up MRI scans were performed at two to three months of age. Patients were scanned with a 1.5T or a 3T Philips Achieva scanner. The image acquisition parameters are shown in Table 2. Note that 17 patients included in our study had only a baseline scan and 11 patients had only the follow-up scan. The scans were missing due to medical reasons, patient withdrawal from the study, or application of a different image acquisition protocol.

Table 1
Patient characteristics, ischemic lesion hemisphere, and vascular territory of the scans acquired at baseline and follow-up. The vascular territories reported are the middle cerebral artery (MCA), posterior cerebral artery (PCA), and the anterior cerebral artery (ACA). *IQR is reported in days.

Reference annotations
To train and evaluate the automatic segmentation, manual expert annotations providing the reference segmentations of brain tissue classes and ischemic lesions were made in 9 baseline and 12 follow-up MR scans. These scans contained no image artifacts. Manual annotations of brain tissue classes (WM, GM, BGT, BST, vCSF, eCSF, CB) were made in T2-weighted MRI at both baseline and follow-up following the definition described in Išgum et al. (2015). Manual annotation of the ischemic lesions was made using the T2-weighted and DWI MR scans available at baseline. Given the very extensive workload of creating the manual annotations, brain tissue classes were first segmented automatically using a previously described algorithm and subsequently manually corrected by one of two trained experts (LB and NW). The manual corrections were made using in-house developed annotation software. Subsequently, to divide the brain into two hemispheres, a reference midline of the brain was annotated for both the baseline and follow-up T2-weighted scans in ImageJ (Rueden et al., 2017). This midline was used to separate the segmentation of the brain tissue classes into tissue classes per hemisphere, resulting in a total of 14 brain tissue classes. At baseline, the ischemic lesion was segmented in addition to the brain tissue classes. For this, the DWI (B800 or B1000) was registered to the T2-weighted MRI. Hence, the baseline and follow-up segmentation consisted of 16 and 14 classes, respectively. Examples of reference segmentations of the baseline and follow-up scans are shown in Fig. 1.

Method
Our method consisted of two main stages. In the preprocessing stage, T2-weighted and DWI scans were normalized, resampled and aligned to allow joint analysis. Thereafter, convolutional neural networks were used to segment brain tissue classes per hemisphere on baseline and follow-up T2-weighted MRI scans, and the ischemic lesions in the baseline using T2-weighted and DWI scans.

Preprocessing
To focus the analysis on the brain only, the T2-weighted scans were skull-stripped prior to segmentation (Smith, 2002). Given that the ischemic lesion is best visible on DWI scans and brain tissue classes on the T2-weighted scans, the DWI and T2-weighted scans were aligned by rigid and thereafter by deformable registration. For this, the DWI was resampled to the spatial resolution of the T2-weighted scan. Registration was performed using SimpleITK. The Mattes mutual information with 64 bins was used as the similarity metric (loss function) for both types of registration. For the rigid registration, the gradient descent optimizer was used with a learning rate of 1, and optimization was performed for 150 iterations. For the deformable registration, the spacing between the control points was set to 25 mm and an L-BFGS optimizer (Liu and Nocedal, 1989) was applied for 100 iterations.
The intensities of MRI voxels vary per acquisition protocol and scanner. Hence, all voxel intensities were clamped between zero and the 99th percentile of the voxel intensity values in each scan. The voxel values were subsequently normalized between − 1 and 1, and each axial slice was resampled to 512 × 512 voxels.
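The clamping and rescaling step can be illustrated with a minimal sketch. It assumes the scan is available as a NumPy array; the function name `normalize_intensities` is hypothetical and the in-plane resampling to 512 × 512 voxels is deliberately left out.

```python
import numpy as np


def normalize_intensities(scan: np.ndarray) -> np.ndarray:
    """Clamp a scan to [0, 99th percentile] and rescale linearly to [-1, 1].

    Illustrative sketch only; resampling of the axial slices is not shown.
    """
    upper = np.percentile(scan, 99)
    clamped = np.clip(scan.astype(np.float64), 0.0, upper)
    if upper <= 0:
        # Degenerate scan: every voxel clamps to zero, so map to the lower bound.
        return np.full_like(clamped, -1.0)
    return 2.0 * clamped / upper - 1.0


# Usage on a synthetic volume with intensities 0..999
volume = np.arange(1000, dtype=np.float64).reshape(10, 10, 10)
normalized = normalize_intensities(volume)
```

Because the percentile is computed per scan, the same code applies regardless of scanner or acquisition protocol.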

Automatic segmentation
Currently, UNet-like architectures have been shown to be most successful at segmentation of medical images. Their success is attributed to the combination of coarse features from the down-sampling path with fine-grained features from the up-sampling path via skip connections. For our research, we have chosen the UNet++ architecture (Fig. 2) because it uses deep supervision to co-train an ensemble of sub-networks of varying depth within the architecture. Prior literature has shown that this outperforms the standard UNet architecture (Zhou et al., 2020). Each node in UNet++ consisted of two 3 × 3 convolutions, followed by batch normalization and the ReLU activation function. The features were down-sampled by max-pooling with a stride of two, up-sampled by a transposed convolution with a stride of two, or passed to the next layer. Each node was connected to all previous nodes in the same row via skip connections. To successfully co-train the nested ensembles that UNet++ consists of, deep supervision was applied to the three nodes before the output node. Before the deep supervision loss functions were applied, a one by one convolution was used to reduce the number of feature maps to one. The final loss was calculated by taking a weighted average of the deep supervision losses and the output loss. The output loss had a weight that was seven times greater than that of each individual deep supervision loss. In total, the baseline and follow-up networks segmented 16 and 14 classes, respectively. During training, the weighted cross-entropy loss function was used. To remedy the class imbalance caused by the differing number of voxels per class, the tissue classes were assigned a weight ten times greater than the background class. Multiscale analysis has been shown to improve segmentation performance (Zhou et al., 2018). To allow multiscale analysis during inference, the output of each deep supervision layer was averaged to create the final prediction. The UNet++ was implemented using PyTorch 1.5.1 (Paszke et al., 2019).
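The loss weighting can be written out as a small sketch. It uses plain Python numbers in place of tensors, and it assumes that the background is an extra class at index 0 in addition to the 16 or 14 segmented classes; the text does not state how the background is indexed. The function names are hypothetical.

```python
def combine_losses(output_loss, deep_supervision_losses, output_weight=7.0):
    """Weighted average in which the output loss counts `output_weight`
    times as much as each individual deep supervision loss."""
    weights = [output_weight] + [1.0] * len(deep_supervision_losses)
    losses = [output_loss] + list(deep_supervision_losses)
    return sum(w * l for w, l in zip(weights, losses)) / sum(weights)


def class_weights(num_foreground_classes, tissue_weight=10.0):
    """Per-class weights for a weighted cross-entropy loss.

    Assumption: index 0 is the background (weight 1); every other class
    receives a weight ten times greater.
    """
    return [1.0] + [tissue_weight] * num_foreground_classes


# Three deep supervision nodes, as described for the UNet++ above
total = combine_losses(0.40, [0.55, 0.50, 0.45])
baseline_weights = class_weights(16)   # baseline network
followup_weights = class_weights(14)   # follow-up network
```

With three deep supervision nodes the normalized weights are 0.7 for the output loss and 0.1 for each deep supervision loss.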
Fig. 2. The UNet++ architecture used to segment the brain tissue classes and ischemic lesion for each brain hemisphere. The input consisted of three consecutive axial slices from each MRI sequence. The baseline DWI and T2-weighted MRI scans were interleaved. For the baseline scans, each T2-weighted slice was concatenated to the anatomically corresponding DWI slice. Each node outside of the down-sampling path was connected to all the nodes that preceded it horizontally via skip connections and to the node that was one horizontal and diagonal index lower. Deep supervision was applied to all the nodes in the top horizontal layer, other than the input node.
The UNet++ was randomly initialized and trained on stacks of three consecutive axial slices. The segmentation label was predicted for the axial slice at the center of the stack. The slices adjacent to the slice of interest were provided as additional spatial context. To allow analysis using the information from both the baseline T2-weighted and DWI images, each axial slice of the T2-weighted scan was followed by its anatomically corresponding axial slice of the DWI scan. Hence, corresponding slices from each sequence were concatenated in the channel dimension.
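As an illustration of the input construction, the sketch below builds one network input from volumes stored as NumPy arrays of shape (slices, height, width). The function name `build_input_stack` is hypothetical, and repeating the edge slice at the first and last axial position is an assumption; the text does not describe how the volume borders were handled.

```python
import numpy as np


def build_input_stack(t2, dwi=None, center=0):
    """Return a (channels, H, W) array for the slice at index `center`.

    Follow-up (dwi is None): 3 channels  -> T2[c-1], T2[c], T2[c+1]
    Baseline (dwi given):    6 channels  -> T2[c-1], DWI[c-1], T2[c], DWI[c], ...
    Assumption: indices outside the volume are clamped to the edge slice.
    """
    n_slices = t2.shape[0]
    indices = [min(max(center + offset, 0), n_slices - 1) for offset in (-1, 0, 1)]
    channels = []
    for i in indices:
        channels.append(t2[i])
        if dwi is not None:
            # Each T2-weighted slice is followed by its corresponding DWI slice
            channels.append(dwi[i])
    return np.stack(channels, axis=0)


# Synthetic example: slice i of the T2 volume is filled with the value i
t2_volume = np.arange(5, dtype=float)[:, None, None] * np.ones((5, 2, 2))
dwi_volume = t2_volume + 100.0
baseline_input = build_input_stack(t2_volume, dwi_volume, center=2)
followup_input = build_input_stack(t2_volume, center=2)
```

The label predicted from this input would correspond to the slice at `center` only.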
Given that voxel-wise segmentation is applied, the segmentation result may contain small isolated clusters of voxels which would not be physiologically possible. Therefore, the tissue classes within the largest connected component of the binarized segmentation map were retained. Similarly, to ensure that no gaps exist due to an incorrect hemisphere segmentation, morphological closing with a 2D diamond shaped structuring element with a connectivity of 1 and a rank of 2 was applied to each hemisphere segmentation.
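The first post-processing step, retaining only the largest connected component of the binarized label map, can be sketched as follows. The sketch uses a breadth-first search with face connectivity, which is an assumption, since the connectivity used for this step is not stated. The function name is hypothetical, and the morphological closing is not shown. For full-size volumes a library routine such as `scipy.ndimage.label` would be far faster than this pure-Python loop.

```python
from collections import deque

import numpy as np


def keep_largest_component(labels: np.ndarray) -> np.ndarray:
    """Zero out every labelled voxel outside the largest connected
    component of the binarized map (labels > 0)."""
    foreground = labels > 0
    visited = np.zeros(labels.shape, dtype=bool)
    # Face-connected neighbours: one step along a single axis
    offsets = []
    for axis in range(labels.ndim):
        for step in (-1, 1):
            offset = [0] * labels.ndim
            offset[axis] = step
            offsets.append(tuple(offset))
    largest = []
    for start in zip(*np.nonzero(foreground)):
        if visited[start]:
            continue
        visited[start] = True
        queue, component = deque([start]), [start]
        while queue:
            voxel = queue.popleft()
            for offset in offsets:
                nb = tuple(v + o for v, o in zip(voxel, offset))
                inside = all(0 <= c < s for c, s in zip(nb, labels.shape))
                if inside and foreground[nb] and not visited[nb]:
                    visited[nb] = True
                    queue.append(nb)
                    component.append(nb)
        if len(component) > len(largest):
            largest = component
    cleaned = np.zeros_like(labels)
    if largest:
        index = tuple(np.array(largest).T)
        cleaned[index] = labels[index]  # keep the original class labels
    return cleaned
```

Note that the class labels inside the retained component are preserved; only isolated clusters are removed.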

Evaluation
Quantitative evaluation of the brain tissue classes and the ischemic lesion segmentation was performed in scans with manual reference annotations. The evaluation was performed in two ways. First, the results were analyzed per hemisphere, i.e. the results of the left hemisphere were compared to those of the right hemisphere. Second, the results were analyzed by whether the hemisphere was affected or unaffected by ischemia. To evaluate the overlap between the reference and automatic segmentation the Dice coefficient was computed. To evaluate segmentation agreement along the tissue boundary, the mean surface distance (MSD) between the automatic and reference segmentation was calculated. Moreover, the bias and limits of agreement were calculated for the brain tissue and ischemic lesion volumes obtained from automatic and reference segmentations.
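Two of these measures are simple enough to sketch: the Dice coefficient for one class and the bias with limits of agreement for paired volumes. The sign convention (automatic minus reference) and the 1.96 standard deviation limits follow common Bland-Altman practice and are assumptions rather than details given in the text. The function names are hypothetical and the MSD is not shown.

```python
import numpy as np


def dice_coefficient(reference, automatic, label):
    """Dice overlap for one class label between two label maps."""
    ref = reference == label
    auto = automatic == label
    denominator = ref.sum() + auto.sum()
    if denominator == 0:
        return float("nan")  # class absent from both segmentations
    return 2.0 * np.logical_and(ref, auto).sum() / denominator


def bias_and_limits(reference_volumes, automatic_volumes):
    """Mean difference (bias) and limits of agreement for paired volumes.

    Assumption: difference = automatic - reference, so a positive bias
    corresponds to oversegmentation by the automatic method.
    """
    diff = np.asarray(automatic_volumes, float) - np.asarray(reference_volumes, float)
    bias = diff.mean()
    spread = 1.96 * diff.std(ddof=1)
    return bias, bias - spread, bias + spread


# Toy label maps: class 1 is partly missed, class 2 matches exactly
reference_map = np.array([[1, 1, 0], [0, 2, 2]])
automatic_map = np.array([[1, 0, 0], [0, 2, 2]])
dice_class_1 = dice_coefficient(reference_map, automatic_map, label=1)
dice_class_2 = dice_coefficient(reference_map, automatic_map, label=2)
```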
In scans without manual reference annotations, the automatic segmentation was qualitatively evaluated. In addition, to assess whether image quality impacted the automatic segmentation, presence of image corruption caused by imaging artifacts was rated (None, Mild, Moderate, Severe) as listed in Table 4. Images graded as severely affected by artifacts (Severe) were excluded from the qualitative evaluation of the automatic segmentation. The automatic segmentation was evaluated qualitatively by two expert observers, NW (Observer 1) and LB (Observer 2), on a three-point scale (Excellent, Moderate, Poor) as listed in Table 3. Additionally, in the follow-up images the quality of the segmentation of the brain tissues surrounding the location of the former ischemic lesion was also rated on the same three-point scale. However, if no tissue damage was visible, surrounding tissue was rated as Invisible.
The presented automatic method segments brain tissues and ischemic lesions per hemisphere. However, rating automatic segmentations in 16 and 14 classes in each baseline and follow-up MRI, respectively, is an extremely time-consuming task, especially in a larger set of images. Therefore, to make the qualitative evaluation feasible, rating was performed per tissue class as described above, and hemisphere separation was graded separately for the whole brain. The intra- and inter-rater agreement of the qualitative evaluation of the scan quality and automatic segmentation was evaluated by the accuracy between the ratings.
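Accuracy between two sets of ratings is taken here to mean the fraction of scans that received the identical grade, which is an interpretation of the text rather than a stated definition. A minimal sketch with a hypothetical function name:

```python
def rating_accuracy(ratings_a, ratings_b):
    """Fraction of scans for which two ratings are identical.

    Works for inter-rater agreement (two observers) and intra-rater
    agreement (two rating sessions of the same observer).
    """
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return matches / len(ratings_a)


# Hypothetical ratings of one tissue class on four scans
observer_1 = ["Excellent", "Moderate", "Poor", "Excellent"]
observer_2 = ["Excellent", "Poor", "Poor", "Excellent"]
agreement = rating_accuracy(observer_1, observer_2)  # 3 of 4 grades match
```

Unlike chance-corrected statistics, this measure does not account for agreement expected by chance, which matters when one grade dominates.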

Experimental setup
Given that scans of 9 patients made at baseline were manually annotated and available for training and quantitative evaluation, automatic segmentation was performed in nine-fold cross-validation experiments. This means that in each experiment, 8 scans were available for training and 1 scan for testing. In case of the follow-up data, scans of 12 patients were manually annotated. Hence, automatic segmentation was performed in six-fold cross-validation experiments, where every time 10 training scans and 2 test scans were utilized.
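The two cross-validation schemes can be sketched as below. Assigning consecutive scans to the same test fold is an assumption made for illustration, since the text does not describe how scans were allocated to folds. The function name is hypothetical.

```python
def cross_validation_folds(scan_ids, n_folds):
    """Split scan identifiers into (train, test) pairs with equal-sized test sets."""
    if len(scan_ids) % n_folds != 0:
        raise ValueError("number of scans must be divisible by the number of folds")
    fold_size = len(scan_ids) // n_folds
    folds = []
    for k in range(n_folds):
        test = scan_ids[k * fold_size:(k + 1) * fold_size]
        train = scan_ids[:k * fold_size] + scan_ids[(k + 1) * fold_size:]
        folds.append((train, test))
    return folds


# Baseline: nine-fold, 8 training scans and 1 test scan per experiment
baseline_folds = cross_validation_folds(list(range(9)), n_folds=9)
# Follow-up: six-fold, 10 training scans and 2 test scans per experiment
followup_folds = cross_validation_folds(list(range(12)), n_folds=6)
```

Each scan appears in exactly one test set, so every annotated scan contributes to the quantitative evaluation once.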
To optimize the networks, an Adam optimizer (Kingma and Ba, 2015), a weight decay of 5e-5 and a batch size of 16 were used. The UNet++ was optimized for 300 epochs using a cyclical learning rate schedule (Smith, 2017) with a cycle length of 1,000 iterations. The minimum learning rate was 1e-4 and the maximum learning rate was 1.1e-3.
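A cyclical learning rate with these bounds can be sketched as follows. The triangular shape (linear increase for half a cycle, then linear decrease) is an assumption; the text gives only the cycle length and the minimum and maximum learning rates. The function name is hypothetical.

```python
def cyclical_learning_rate(iteration, min_lr=1e-4, max_lr=1.1e-3, cycle_length=1000):
    """Triangular cyclical learning rate.

    Assumption: a full cycle of `cycle_length` iterations rises linearly
    from min_lr to max_lr and falls back to min_lr.
    """
    half_cycle = cycle_length / 2
    position = iteration % cycle_length
    fraction = 1.0 - abs(position - half_cycle) / half_cycle
    return min_lr + (max_lr - min_lr) * fraction


# Learning rate at the start, quarter, and middle of the first cycle
schedule = [cyclical_learning_rate(i) for i in (0, 250, 500)]
```

Under this assumption the rate starts at the minimum, reaches the maximum at iteration 500, and returns to the minimum at iteration 1,000.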
Automatic segmentation including preprocessing and segmentation took under 4 min per scan.

Quantitative evaluation
Quantitative results for the baseline and follow-up scans, analyzed as the left and right hemisphere, are listed in Table 5. The results show that segmentation performance for the brain tissue classes on the baseline scans was similar for the left and the right hemisphere, except for the right BGT. Visual analysis of the segmentations revealed that in several cases ischemic lesions were erroneously segmented as BGT which likely caused the asymmetry in the segmentation of BGT. On the follow-up scans, the performance of the segmentation was similar in the left and the right hemisphere for all tissue classes. Examples of segmentations for the baseline and follow-up scans are respectively shown in Fig. 3 and Fig. 4.
Quantitative results for the baseline and follow-up scans, analyzed for the hemispheres affected by ischemia and not affected by ischemia, are listed in Table 6. The results show that on the baseline scans the performance for the hemispheres affected by ischemia is slightly lower than for the hemispheres not affected by ischemia. On the follow-up scans the performance on the hemispheres that were and were not affected by ischemia is similar.
The median and interquartile range of the brain tissue and ischemic lesion volumes obtained from the reference segmentations in the left and right hemispheres are shown in Fig. 5 for both the baseline and the follow-up scans. Moreover, the median and interquartile range of the brain tissue and ischemic lesion volumes obtained from the reference segmentations in the hemispheres that are and are not affected by ischemia are shown in Fig. 6 for both baseline and follow-up scans. The bias and limits of agreement between the reference and automatically quantified volumes for both the baseline and follow-up scans in the left and right hemispheres are shown in Fig. 7. For the baseline scans, eCSF volumes in both hemispheres had a bias greater than the other tissue classes, indicating limited oversegmentation. The left and right ischemic lesion respectively had a bias greater and smaller than zero, indicating limited under- and over-segmentation. On the follow-up scans the eCSF in both hemispheres had a greater bias than the other tissue classes, indicating oversegmentation. The bias and limits of agreement between the reference and automatically quantified volumes for both the baseline and follow-up scans in the hemispheres that are affected and not affected by ischemia are shown in Fig. 8.

Table 3
Grading the quality of brain tissue segmentation, ischemic lesion segmentation, hemisphere separation, and the quality of the segmentation surrounding the former location of the ischemic lesion.
Excellent: Segmentation clinically usable after no or minor manual corrections
Moderate: Segmentation clinically usable after major manual corrections
Poor: Segmentation clinically unusable

Table 4
Grading of the degree to which an image is corrupted by image artifacts.
None: No artifacts visible in the scan
Mild: Differentiation between tissue classes is clear despite mild artifacts being present
Moderate: Differentiation between tissue classes is visible, but with limited uncertainty
Severe: Differentiation between tissue classes is not visible due to artifacts

Qualitative evaluation
The ratings of the baseline image quality with respect to presence of imaging artifacts and automatic segmentation performance by Observer 1 are shown in Fig. 9. In total, 74, 19, 8 and 5 scans were rated as having None, Mild, Moderate or Severe artifacts, respectively.
Specifically, the results show that the segmentation of brain tissue classes is accurate on images that have no visible artifacts, with the majority of the ratings being Excellent (usable after no or minor correction). Hemisphere separation is also accurate, with most segmentations being rated as Excellent. The results also show that the segmentation of the ischemic lesion is the most challenging, with 61% of ischemic lesion segmentations on images without artifacts being rated as Excellent and 22.6% being rated as Moderate.
The ratings of the follow-up image quality with respect to presence of imaging artifacts and automatic segmentation performance by Observer 2 are shown in Fig. 10. In total, Observer 2 graded 66, 18, 13 and 0 images as having None, Mild, Moderate or Severe imaging artifacts, respectively.
The segmentation of the brain tissues was mostly rated as Excellent in images without artifacts. Furthermore, on scans with no artifacts, the tissue surrounding the former ischemic lesion and the hemisphere separation were rated as Excellent in 65% and 100% of cases, respectively.
The inter- and intra-rater accuracy was calculated on a subset of 30 randomly selected baseline scans. The inter-rater accuracy was also calculated on a subset of 30 randomly selected follow-up scans. The inter- and intra-rater accuracy for the brain tissue segmentation, ischemic lesion segmentation, tissue surrounding the formerly ischemic lesion, and hemisphere separation are shown in Table 7.

Table 5
Median and [interquartile range] of the Dice coefficient (Dice) and mean surface distance (MSD) in mm between the reference and automatic segmentation. Results are displayed for each brain tissue type in the left and right hemisphere: white matter (WM), gray matter (GM), basal ganglia and thalamus (BGT), brainstem (BST), the ventricular cerebrospinal fluid (vCSF), extra-ventricular cerebrospinal fluid (eCSF), and the cerebellum (CB). In addition, the ischemic lesion (IL) is reported for the baseline scans. The mean Dice coefficient over all brain tissue types and the IL is listed. Among the baseline scans four had an ischemic lesion in the left hemisphere and five in the right hemisphere.

Fig. 3. Examples of brain tissue and ischemic lesion segmentations on the baseline scans for two patients (top and bottom row). From left to right: the T2-weighted scan, DWI scan, reference segmentation, and automatic segmentation. The top row shows an example on which the network successfully segmented the hemispheres and the brain tissues. The bottom row shows a DWI scan on which the ischemic lesion is undersegmented and the basal ganglia are oversegmented by the automatic method.

Discussion
We have presented a deep learning method for segmentation of brain tissue classes and the ischemic lesions in both hemispheres of brain MR scans in infants with PAIS. The segmentation was applied to scans acquired at two time points, i.e. to scans made after the onset of stroke and to scans made at 3-months follow-up. Quantitative evaluation of the automatic segmentation showed that on average the brain tissue classes and the ischemic lesions had good spatial and volumetric agreement with the manual expert segmentations. The segmentation performance of our method was slightly better in the hemisphere that was not affected by ischemia than in the hemisphere that was affected by ischemia on baseline scans and performance was similar on follow-up scans. Furthermore, the qualitative analysis on a larger clinically representative set of MRIs showed that in most of the cases automatic brain tissue segmentations obtained on the baseline (85%) and followup scans (91%) were clinically usable after no or minor correction. However, segmentation of the ischemic lesions on the baseline scans was more challenging and thus required manual correction more often. Segmentation of the brain tissue and especially ischemic lesions was Fig. 4. Example of brain tissue segmentations on follow-up scans for two patients (top and bottom row). From left to right: the T2-weighted scan, reference segmentation,and automatic segmentation. The top row shows a scan which was accurately segmented. The bottom row shows a scan on which the left cerebellum is incorrectly segmented.

Table 6
Median and [interquartile range] for the Dice coefficient (Dice) and mean surface distance (MSD) in mm between the reference and automatic segmentation. Results are displayed for each brain tissue type in the hemispheres affected and not affected by ischemia: white matter (WM), gray matter (GM), basal ganglia and thalamus (BGT), brainstem (BST), the ventricular cerebrospinal fluid (vCSF), extra-ventricular cerebrospinal fluid (eCSF), and the cerebellum (CB). In addition, the ischemic lesion (IL) is reported for the baseline scans. The mean Dice coefficient over all brain tissue types and the IL is listed. Moreover, 2 patients in the follow-up scans had lesions in both hemispheres. Hence, both hemispheres were considered affected by ischemia for these patients. compromised in scans corrupted by artifacts. Hence, additional scrutiny of the segmentations should be applied when using scans corrupted by artifacts. Several aspects may have contributed to compromised segmentation results. First, contrast between the ischemic lesions and the unaffected regions on DWI varied greatly between patients. This may have been due to differences in the timing of MRI. After a peak in signal intensity on DWI, the contrast slowly normalizes around day 7, so-called pseudonormalization (van der Aa et al., 2013). This diversity in contrast differences may not have been represented in our training data. Second, small false positive segmentations occurred for the ischemic lesion segmentation, but rarely for the brain tissue classes. While challenging for the automatic segmentation, this type of error is easily manually corrected. Third, ischemic lesions in the left hemisphere were often oversegmented. Although these false positive segmentations and oversegmentation may lead to an inaccurate estimation of ischemic volume, Fig. 5. Boxplots of the reference brain tissue volumes in mL in the baseline (n = 9) and follow-up scans (n = 12). 
Results are displayed for the left and right hemisphere in each plot: white matter (WM), gray matter (GM), basal ganglia and thalamus (BGT), brainstem (BST), the ventricular cerebrospinal fluid (vCSF), extra-ventricular cerebrospinal fluid (eCSF), and the cerebellum (CB). In addition, the ischemic lesion (IL) is reported for the baseline images. Among the baseline scans four showed ischemia in the left and five in the right hemisphere. Outliers on the follow-up scans are due to cerebrospinal fluid replacing brain tissue. Fig. 6. Boxplots of the reference brain tissue volumes in mL in the baseline (n = 9) and follow-up scans (n = 12). Results are displayed for the hemispheres affected by and not affected by ischemia in each plot: white matter (WM), gray matter (GM), basal ganglia and thalamus (BGT), brainstem (BST), the ventricular cerebrospinal fluid (vCSF), extra-ventricular cerebrospinal fluid (eCSF), and the cerebellum (CB). In addition, the ischemic lesion (IL) is reported for the baseline images. Among the baseline scans four showed ischemia in the left and five in the right hemisphere. Outliers on the follow-up scans are due to cerebrospinal fluid replacing brain tissue. Furthermore, two follow-up scans had lesions in both hemispheres. Hence, both hemispheres were considered affected by ischemia in these patients.
it takes little effort to manually correct.
Previous automatic segmentation methods reported accurate segmentations, but neither differentiated brain tissues per hemisphere. The study by Moeskops et al. (2016) segmented the same brain tissue classes and unmyelinated white matter in preterm infants imaged at termequivalent age. Given that Moeskops et al. used nearly the same definition of the brain tissue classes and the age of the infants was similar, this set enables comparison with performance on the baseline scans in the current study (Moeskops et al., 2016). Despite the presence of pathology in the images in our study and more complex hemispherewise analysis, we achieved comparable performance on the segmentation of the baseline scans (Average Dice coefficients reported in (Moeskops et al., 2016)     , and the cerebellum (CB),the ischemic lesion (IL) and, the hemisphere separation (HS) by Observer 1. The ratings of each tissue type are divided by the severity of the artifacts that afflicted them (None, Mild and Moderate). Note that scans with severe artifacts were considered unsuitable for automatic analysis and were therefore not rated. Results are given in percentages.  (None, Mild and, Moderate). Note that scans with severe artifacts were considered not suitable for automatic analysis and were therefore not rated. Results are given in percentages.

Table 7
For the baseline and follow-up scans, the accuracy between the ratings of Observer 1 and Observer 2 is shown on a randomly selected subset of 30 scans. The classes for which the accuracy is calculated are: white matter (WM), gray matter (GM), basal ganglia and thalamus (BGT), brainstem (BST), the ventricular cerebrospinal fluid (vCSF), extra-ventricular cerebrospinal fluid (eCSF), the cerebellum (CB), the area surrounding the formerly ischemic lesion (ST), the ischemic lesion (IL), and the hemisphere separation (HS). For the baseline scans, the accuracy between two ratings of the same observer (Observer 1) is additionally shown.

The study by Ding et al. segmented three brain tissue classes (GM, WM, CSF) (Ding et al., 2020). Furthermore, this study used more data for training, and their gray matter class incorporated brain tissues other than cortical gray matter, such as the brainstem, cerebellum, amygdala, and hippocampus. Due to the differences in tissue definitions and patient populations, quantitative results are not comparable. Our method is the first deep learning-based method that segments ischemic lesions on DWI scans in a large unique dataset of perinatal ischemic stroke patients. Other work by Išgum et al. used a linear discriminant classifier to segment the ischemic lesion. However, the method neither segmented other brain tissue classes nor indicated the hemisphere in which the lesion was located. Unfortunately, this work did not report overlap or boundary metrics (Isgum et al., 2011).
The most comparable work on stroke lesion segmentation consists of methods that use deep learning to segment ischemic lesions on DWI scans in adult patients (Zhang et al., 2018; Chen et al., 2017; Woo et al., 2019). We observe that our method achieved comparable performance (Dice coefficient: 0.79 in our study vs. 0.79 (Zhang et al., 2018), 0.67 (Chen et al., 2017), and 0.85 for a single U-Net (Woo et al., 2019)) despite our network being trained on two orders of magnitude less data. However, none of the methods that were developed for scans of adult patients additionally provide brain tissue segmentations, nor do they indicate which hemisphere is affected by the ischemia.
The automatic analysis of brain tissue volumes and the ischemic lesion in follow-up and baseline scans would allow neuroregenerative treatments to be evaluated (Wagenaar et al., 2018). For example, treatment of PAIS by intranasal administration of mesenchymal stromal cells has recently been shown to be feasible and safe (Baak et al., 2022). However, large-scale placebo-controlled trials still need to be conducted. Given that our automatic method (which includes pre- and postprocessing and inference by multiple networks) analyzes a scan in less than 4 min, the method would substantially reduce the analysis burden and could facilitate large-scale studies.
Despite the accurate segmentations provided by our method, our study has several limitations. First, our study used a limited amount of training and quantitative testing data with manual reference annotations at baseline (n = 9) and follow-up (n = 12). Furthermore, these scans did not contain imaging artifacts. Hence, we have not been able to quantitatively evaluate performance on a large set of scans or on scans that contain imaging artifacts. To mitigate this limitation in our evaluation, we conducted a qualitative evaluation on a large set of MRIs, which showed that most brain tissue class segmentations on scans without artifacts could be used either without correction or after minor manual correction.
Second, our network instances were trained on scans acquired under a limited range of conditions. For example, the scans were acquired on three types of MRI scanners: a 1.5T Philips Achieva scanner, a 3T Philips Achieva scanner, and a 3T Philips Ingenia Elition X scanner. Because only data from these three scanners were available for training, the performance of our networks may not generalize to data acquired by different scanner types from different manufacturers. Furthermore, the scans were made at specific time points; the baseline scan was usually made in the week after birth and the follow-up scan was made two to three months after birth. During the first year of life the brain develops rapidly (Stiles and Jernigan, 2010). Hence, because our networks were trained only on images acquired at specific time points, performance may not generalize to MRI scans made at other infant ages. Future research should include scans from various stages of brain development and from a greater variety of scan types and scanner vendors.
Third, our results indicated that our method segmented the ischemic lesion in the right hemisphere more accurately than in the left hemisphere. Using the Dice coefficient to evaluate the segmentation, we observe lower performance for the left (0.72) than for the right hemisphere (0.86). Analysis of the results revealed that this is likely caused by the size of the ischemic lesions, which are smaller in the left hemisphere (19 vs 93 mL). In smaller lesions, small variations in the segmentation have a larger impact on the Dice score (Taha and Hanbury, 2015). On a larger test set, lesions with more variation in location and size would likely show less discrepancy in the performance between the left and right hemisphere.
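The effect of lesion size on the Dice coefficient can be illustrated with a simple idealized calculation. The sketch below is not part of the study's evaluation: it assumes spherical lesions and a hypothetical uniform boundary error of 1 mm, with the automatic segmentation modeled as a concentric sphere whose radius is slightly too large. Only the two volumes (19 and 93 mL) are taken from the text; the function names are introduced for illustration.

```python
import math


def sphere_radius_mm(volume_ml: float) -> float:
    """Radius in mm of a sphere with the given volume (1 mL = 1000 mm^3)."""
    volume_mm3 = volume_ml * 1000.0
    return (3.0 * volume_mm3 / (4.0 * math.pi)) ** (1.0 / 3.0)


def dice_with_boundary_error(volume_ml: float, error_mm: float) -> float:
    """Dice between a spherical reference lesion and a concentric spherical
    segmentation whose radius differs from the reference by error_mm.

    Assumption (illustrative only): real lesions are not spheres and real
    boundary errors are not uniform.
    """
    r_ref = sphere_radius_mm(volume_ml)
    r_seg = r_ref + error_mm
    # The common factor 4/3 * pi cancels in the Dice ratio, so cubed radii suffice.
    vol_ref = r_ref ** 3
    vol_seg = r_seg ** 3
    # For concentric spheres the overlap is simply the smaller sphere.
    overlap = min(vol_ref, vol_seg)
    return 2.0 * overlap / (vol_ref + vol_seg)


if __name__ == "__main__":
    for volume in (19.0, 93.0):  # mean lesion volumes in mL quoted in the text
        dice = dice_with_boundary_error(volume, error_mm=1.0)
        print(f"{volume:5.1f} mL lesion, 1 mm boundary error: Dice = {dice:.3f}")
```

Under these assumptions, the same absolute boundary error yields a lower Dice coefficient for the smaller lesion, which is consistent with the explanation given above. The absolute values produced by this toy model should not be compared with the Dice coefficients reported in the study.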
Fourth, the intra- and interrater reliability was assessed using accuracy. This metric was used because the data were skewed, with the majority of the automatic brain tissue segmentations being rated as Excellent. However, caution must be applied when interpreting the accuracy alone because it does not correct for agreement that occurs by chance.
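To make the caveat about chance agreement concrete, the following sketch compares raw accuracy with Cohen's kappa, a chance-corrected agreement statistic that was not used in this study. The ratings are entirely hypothetical: 30 scans are assumed to match the size of the rated subset, and the category name "Good" is a placeholder, since only "Excellent" is named in the text.

```python
from collections import Counter


def accuracy(ratings_a, ratings_b):
    """Fraction of items on which the two raters give the same rating."""
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return matches / len(ratings_a)


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e), where p_o is the observed
    agreement and p_e the agreement expected by chance from the marginals."""
    n = len(ratings_a)
    p_o = accuracy(ratings_a, ratings_b)
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    categories = set(counts_a) | set(counts_b)
    p_e = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if p_e == 1.0:
        return float("nan")  # kappa is undefined when chance agreement is perfect
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical, heavily skewed ratings for 30 scans (not data from the study).
observer_1 = ["Excellent"] * 27 + ["Good"] * 3
observer_2 = ["Excellent"] * 26 + ["Good"] + ["Excellent"] * 2 + ["Good"]

if __name__ == "__main__":
    print(f"accuracy = {accuracy(observer_1, observer_2):.2f}")
    print(f"kappa    = {cohens_kappa(observer_1, observer_2):.2f}")
```

In this constructed example the raters agree on 27 of 30 scans, yet kappa is much lower than the accuracy, because most of that agreement is already expected from both raters choosing "Excellent" almost always. This is the behavior that motivates interpreting accuracy on skewed ratings with caution.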
Fifth, although the Brain Extraction Tool (BET) has mostly been used on adult data, it has been evaluated on MRI scans of patients with perinatal ischemic stroke. In this study a Dice coefficient of 0.95 and an MSD of 1.33 were obtained. Our retrospective analysis of the results revealed that BET never led to undersegmentation and occasionally led to minor oversegmentation of the brain. The oversegmentation did not hamper subsequent brain tissue segmentation.
To conclude, we presented a method for automated segmentation of brain tissue and ischemic lesions in each brain hemisphere. The automatic segmentation method may allow evaluation of brain development and the efficacy of treatment methods in large datasets that involve MR imaging of infants affected by perinatal arterial ischemic stroke.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability
The data that has been used is confidential.