Neural networks for classification of strokes in electrical impedance tomography on a 3D head model

We consider the problem of the detection of brain hemorrhages from three dimensional (3D) electrical impedance tomography (EIT) measurements. This is a condition requiring urgent treatment for which EIT might provide a portable and quick diagnosis. We employ two neural network architectures -- a fully connected and a convolutional one -- for the classification of hemorrhagic and ischemic strokes. The networks are trained on a dataset with $40\,000$ samples of synthetic electrode measurements generated with the complete electrode model on realistic heads with a 3-layer structure. We consider changes in head anatomy and layers, electrode position, measurement noise and conductivity values. We then test the networks on several datasets of unseen EIT data, with more complex stroke modeling (different shapes and volumes), higher levels of noise and different amounts of electrode misplacement. On most test datasets we achieve $\geq 90\%$ average accuracy with fully connected neural networks, while the convolutional ones display an average accuracy $\geq 80\%$. Despite the use of simple neural network architectures, the results obtained are very promising and motivate the application of EIT-based classification methods on real phantoms and ultimately on human patients.


Introduction
Electrical impedance tomography (EIT) is a noninvasive imaging modality for recovering information about the electrical conductivity inside a physical body from boundary measurements of current and potential. In practice, a set of contact electrodes is employed to drive current patterns into the object and the resulting electric potential is measured at (some of) the electrodes. The reconstruction process of EIT requires the solution of a highly nonlinear inverse problem on noisy data. This problem is typically ill-conditioned [1,2,3] and solution algorithms need either simplifying assumptions or regularization strategies based on a priori knowledge.
In recent years, machine learning has emerged as a data-driven alternative that has proven remarkably effective at mitigating the ill-posedness of several inverse problems [4,5,6]. It has already been successfully applied to EIT imaging [7,8,9,10]. The purpose of this work is to apply machine learning to the problem of classification of brain strokes from EIT data.
Stroke, a serious and acute cerebrovascular disease and a leading cause of death, can be of two types: hemorrhagic, caused by bleeding into the brain tissue through a ruptured intracranial vessel, and ischemic, caused by vascular occlusion in the brain due to a blood clot (thrombosis). Visible symptoms are virtually identical in both cases, which makes it extremely difficult to differentiate the two types without advanced imaging modalities. Ischemic stroke can be treated with thrombolytic (clot-dissolving) agents within the first few hours [11], since human nervous tissue is rapidly lost as stroke progresses. On the other hand, thrombolytics are harmful, or even potentially fatal, to patients suffering from hemorrhagic stroke, so it is essential to distinguish between the two types. A rapid and accurate diagnosis is crucial to triage patients [12,13], to speed up clinical decisions concerning treatments and to improve the recovery of patients suffering from acute stroke [14].
In this work we study how to accelerate the diagnosis of acute stroke using EIT. Currently, stroke can be classified only by using expensive hospital equipment such as X-ray CT. On the contrary, an EIT device is cheap, compact and could be carried in an ambulance (even though measurements would need to be taken while the patient is not moving). The main challenge, for emergency use, is that data are collected at a single time frame: this excludes time-difference imaging [15] and leaves absolute and frequency-difference [16] imaging as the only options. Another important application, where measurements at different times are available, is bedside real-time monitoring of patients after the acute stage of stroke. In both scenarios, getting a full image reconstruction with the existing inversion algorithms is computationally heavy and time-consuming, and thus machine learning techniques can be used to expedite the process. In this work we focus on the case of absolute imaging, i.e., where EIT measurements are available at a single time frame and at a single frequency, which can be considered the most challenging scenario.
Although EIT for brain imaging has been studied for decades [17,16,18,19,20,21,22,23], there are only few recent results that employ machine learning algorithms for stroke classification. The work [24] proposes the use of both Support Vector Machines (SVM) and Neural Networks (NN) for detecting brain hemorrhages using EIT measurements in a 2-layer model of the head. The main weakness of the model, however, is that it does not take into account the highly resistive skull layer, which is known to have a shielding effect when it comes to EIT measurements. Also, only a finite set of head shapes is considered and the model lacks the ability to generalize to new sample heads. The more recent work [25] considers a 4-layer model for the heads and uses data from 18 human patients [26] that are classified using SVM. The main difference with our method is that our classification is made directly from raw electrode data, while in [25] a preprocessing step involving a precise knowledge of the anatomy of the patient's head is required. Moreover, only strokes of size 20 ml or 50 ml are considered in four specific locations, while our datasets include strokes with volume as small as 1.5 ml located anywhere within the brain tissue. Another methodology is shown in [27], where first Virtual Hybrid Edge Detection (VHED) [28] is used to extract specific features of the conductivity, then neural networks are trained to identify the stroke type. This approach is very promising but currently limited to a 2D model. Applications of deep neural networks to EIT have been also considered in [29,30,31,7,8,9]. One could also use some machine learning techniques to form a model for the head shapes: see [10] for an approach that could potentially be applicable to head imaging.
In this work we consider two different types of NN, a fully connected and a convolutional NN, that we feed with absolute EIT measurements and produce a binary output. These measurements, which form the training and test datasets, are simulated by using the so-called complete electrode model (CEM) [32,33] on a computational 3-layer head model, where each layer corresponds to a different head anatomical region: scalp, skull and brain tissue.
The training and test datasets are made of pairs of simulated electrode measurements at a single time frame and a label which indicates whether the data are associated with a hemorrhagic stroke or not. More precisely, label 1 is meant to indicate a hemorrhagic stroke, while label 0 stands for either an ischemic stroke or no stroke. This is motivated by the fact that detecting the presence or absence of hemorrhage may be sufficient to initiate appropriate treatment. Specifically, the datasets contain EIT measurements in the following proportions: 50% hemorrhages, 25% ischemic strokes and 25% healthy brains. We chose to include a large number of healthy patients in order to cover a broader range of potential applications and not restrict ourselves only to the emergency setting. The measurements in the training and test datasets are generated by varying the conductivity distribution, the electrode positions and noise, and the shape of the scalp, the skull and the brain tissue. We model a hemorrhage as a volume of the brain with higher conductivity values with respect to the brain tissue, and an ischemic stroke with lower values, based on the available medical literature [34,35]. In the training dataset the strokes are modeled as a single ball inclusion of higher or lower conductivity, while in the test datasets we consider different shapes and multiple inclusions. Concerning the variations in the geometry, a joint principal component model for the variations in the anatomy of the human head [36] is considered, so that we are able to generate realistic EIT datasets for brain stroke classification on a 3D finite element (FE) head model. The training dataset is made of 40 000 pairs of electrode data and labels, while every test dataset is made of approximately 5 000 samples. No validation was used in the training of the fully connected network, while for the convolutional one the training set was randomly split (83% training, 17% validation).
These test datasets take into account a variety of possible errors in the measurement setup: slight variations in the background conductivity and in the contact impedances are considered, along with misplacement of electrodes and mismodeling of the head shape. The functionality of the chosen methods is demonstrated via the measures of accuracy, sensitivity and specificity of the networks trained with noisy EIT data.
Our numerical tests show that the probability of detecting hemorrhagic strokes is reasonably high, even when the electrodes are misplaced with respect to their intended location and the geometric model for the head is inaccurate. We find that both fully connected neural networks and convolutional neural networks are efficient tools for the described classification. More precisely, in our experiments we observe that a shallow fully connected neural network generalizes better to the test datasets than a convolutional one.
This paper is organized as follows. In Section 2 we recall the CEM and the parametrized head model with the workflow for mesh generation. Neural networks and their specifications are introduced in Section 3. Section 4 presents the experiment settings, while numerical results are described in Section 5. Finally, Section 6 lists the concluding remarks.
where $\nu \in L^\infty(\partial\Omega, \mathbb{R}^3)$ denotes the exterior unit normal of $\partial\Omega$. Moreover, the isotropic conductivity distribution $\sigma$ describing the electric properties of $\Omega$ is assumed to belong to $L^\infty_+(\Omega) = \{\varsigma \in L^\infty(\Omega) \mid \operatorname{ess\,inf} \varsigma > 0\}$. A physical justification of (1) can be found in [32]. Given an input current pattern $I \in \mathbb{R}^M_\diamond$, a conductivity $\sigma$ and contact impedances $z$ with the properties described above, the pair $(u, U) \in H^1(\Omega) \oplus \mathbb{R}^M_\diamond$ is the unique solution of the elliptic boundary value problem (1) according to [32,33]. Here, $\mathbb{R}^M_\diamond = \{V \in \mathbb{R}^M \mid \sum_{m=1}^M V_m = 0\}$; the use of $\mathbb{R}^M_\diamond$ corresponds to systematically choosing the ground level of potential so that the mean of the electrode potentials is zero. The measurement, or current-to-voltage map of the CEM is defined as the mapping $I \mapsto U$, from $\mathbb{R}^M_\diamond$ to $\mathbb{R}^M_\diamond$.
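The grounding convention can be illustrated with a short numerical sketch (the potential values below are hypothetical): subtracting the mean of the electrode potentials projects the data onto the mean-free subspace.

```python
import numpy as np

M = 32  # number of electrodes

# Raw electrode potentials in an arbitrary reference (hypothetical values).
U_raw = np.random.default_rng(0).normal(size=M)

# Fix the ground level of potential so that the mean of the electrode
# potentials is zero, i.e., project onto the mean-free subspace of R^M.
U = U_raw - U_raw.mean()
```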
2.2. Head model. The head model used in this work follows the same approach as in [36], though slightly modifying and upgrading the setting to a three-layers model. We define a layer for each one of the anatomical structures that we are considering for this particular head model. There are L = 3 different layers: the scalp layer, i.e., the outer one corresponding to the skin, the resistive skull layer and the interior brain layer (see Figure 1).
For each layer, the library of $n = 50$ heads from [37] is used to build the model for the variations in the shape and size of the human head. We can represent the crown of the $l$th layer in the $j$th head, for $l = 1, 2, 3$ and $j = 1, \dots, n$, as the graph of a function $S^l_j : S_+ \to \mathbb{R}^3$, $S^l_j(\hat{x}) = r^l_j(\hat{x})\,\hat{x}$, where $S_+ = \{x \in \mathbb{R}^3 \mid |x| = 1,\ x_3 > 0\}$ is the upper unit hemisphere and $r^l_j : S_+ \to \mathbb{R}_+$ gives the distance from the origin to the surface of the $l$th layer of the $j$th head as a function of the direction $\hat{x} \in S_+$, where the origin is set at approximately the center of mass of each bottom face of the heads (see Figure 1).

Figure 1.
Top row: two different head models, oriented with the forehead on the left and the back of the head on the right. The 32 electrodes are at their intended positions, with the FE mesh associated to the head appropriately refined around them. Bottom row: the corresponding bottom face of each head model, where the three layers associated to scalp, skull and brain tissues are visible. Note that the first model has a more flattened forehead (top row) and it is narrower in the coronal plane (bottom row, y direction), corresponding to a shorter distance between the ears. Also, the thickness of the scalp layer is clearly different. The unit of length is meter.
Then, for each layer $l$, we introduce the average head and the corresponding perturbations
$$\bar{r}^l = \frac{1}{n}\sum_{j=1}^n r^l_j \quad \text{and} \quad \rho^l_j = r^l_j - \bar{r}^l, \qquad j = 1, \dots, n, \ l = 1, 2, 3,$$
where $\bar{r}^l$ describes the $l$th layer of the average head and $\rho^l_1, \dots, \rho^l_n$ are the corresponding perturbations that define the employed library of heads. We assume the functions $\rho^l_1, \dots, \rho^l_n$ belong to $H^1(S_+)$ and are linearly independent for every $l$.
Mimicking the formulation in [36], we introduce a joint principal component model involving all the three layers. A single head in the library defines an object in the space $[H^1(S_+)]^3$, that is, a three dimensional vector whose components are in the space $H^1(S_+)$ and define the three layers. The reason for choosing $H^1(S_+)$ is two-fold. First of all, according to the numerical tests in [36], the simplest option $s = 0$ in $H^s(S_+)$ leads to undesired cusps in the optimal basis $\hat{\rho}_1, \dots, \hat{\rho}_n$ defined below. On the other hand, $s \in \mathbb{N} \setminus \{1\}$ would require the use of higher order conforming finite elements, and the implementation of the needed inner products on $S_+$ for $s \in \mathbb{R}_+ \setminus \mathbb{N}$ would lead to unnecessary technical considerations. The construction of the new principal component model for the head library is performed as in [36], with the exception that the inner product between two elements of $[H^1(S_+)]^3$, say, $v$ and $w$, is defined as
$$\langle v, w \rangle = \sum_{l=1}^3 \langle v^l, w^l \rangle_{H^1(S_+)}.$$
Our aim is to look for an $\tilde{n}$-dimensional subspace $V_{\tilde{n}}$ of $[H^1(S_+)]^3$ that minimizes
$$\sum_{j=1}^n \min_{w \in W} \| \rho_j - w \|^2_{[H^1(S_+)]^3}, \qquad \rho_j = (\rho^1_j, \rho^2_j, \rho^3_j),$$
over all $\tilde{n}$-dimensional subspaces $W$ of the Sobolev space $[H^1(S_+)]^3$. In other words, the purpose is to find a low dimensional subspace that on average contains the best approximations for the perturbations, where the quality of the fit is measured by the squared norm of $[H^1(S_+)]^3$. Following the approach in [36], we define the matrix $R \in \mathbb{R}^{n \times n}$ that takes into account the variations in every layer:
$$R_{jk} = \langle \rho_j, \rho_k \rangle = \sum_{l=1}^3 \langle \rho^l_j, \rho^l_k \rangle_{H^1(S_+)}, \qquad j, k = 1, \dots, n.$$
By applying Lemma 3.1 from [36] with minor modifications, we obtain the following set of orthonormal basis functions for $V_{\tilde{n}}$:
$$\hat{\rho}^l_k = \frac{1}{\sqrt{\lambda_k}}\, v_k^T \rho^l, \qquad k = 1, \dots, \tilde{n}, \ l = 1, 2, 3, \tag{7}$$
where $\lambda_k, v_k$ are eigenvalues and orthonormal eigenvectors of $R \in \mathbb{R}^{n \times n}$ and we have defined $\rho^l = [\rho^l_1, \dots, \rho^l_n]^T : S_+ \to \mathbb{R}^n$, for $l = 1, 2, 3$. The positive eigenvalues $\lambda_k$ are listed in descending order, and the same $l$-independent eigenvectors $v_k$ are employed in the definition of $\hat{\rho}^l_k$ for all $l = 1, 2, 3$.
The parametrization for the $l$th layer in our head model can then be written as
$$r^l = \bar{r}^l + \sum_{k=1}^{\tilde{n}} \alpha_k \hat{\rho}^l_k, \qquad l = 1, 2, 3, \tag{9}$$
where $\hat{\rho}^l_k$ are defined as in (7), $\alpha_k$ are free shape coefficients and $1 \leq \tilde{n} \leq n$ is chosen appropriately (cf. [36]). When generating random head structures for one numerical experiment, the vector of shape coefficients $\alpha \in \mathbb{R}^{\tilde{n}}$ is drawn from a Gaussian distribution $\mathcal{N}(0, \Gamma_\alpha)$, with the covariance matrix
$$\Gamma_\alpha = \operatorname{diag}(\lambda_1, \dots, \lambda_{\tilde{n}}); \tag{10}$$
see once again [36] for further details.
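The construction of the joint principal components can be sketched numerically. The snippet below is a minimal illustration, in which a discrete $\ell^2$ inner product on sample points of $S_+$ stands in for the $H^1(S_+)$ inner product and the layer perturbations are random placeholders; it verifies that the resulting basis is orthonormal with respect to the joint inner product.

```python
import numpy as np

rng = np.random.default_rng(1)
n, n_tilde = 50, 10      # library size and number of retained components
n_pts = 200              # sample points discretizing S_+

# Hypothetical perturbations rho[l][j] for the three layers, sampled on a
# common point set; a discrete l2 inner product replaces the H^1 one.
rho = rng.normal(size=(3, n, n_pts))

# Matrix R accumulating the inner products over all three layers.
R = sum(rho[l] @ rho[l].T for l in range(3))   # shape (n, n)

# Eigendecomposition; reorder eigenvalues into descending order.
lam, V = np.linalg.eigh(R)
lam, V = lam[::-1], V[:, ::-1]

# Basis functions rho_hat[l][k] = lam_k^{-1/2} * v_k^T rho^l, as in (7).
rho_hat = np.stack([
    (V[:, :n_tilde].T @ rho[l]) / np.sqrt(lam[:n_tilde])[:, None]
    for l in range(3)
])  # shape (3, n_tilde, n_pts)

# Gram matrix in the joint inner product: should be the identity.
G = sum(rho_hat[l] @ rho_hat[l].T for l in range(3))
```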

2.3. Generating the FEM mesh. Our workflow for generating a tetrahedral mesh for the head model consists of three steps: generation of an initial surface mesh, insertion of electrodes and tetrahedral mesh generation. The initial surface mesh is constructed by subdividing $k$ times a coarse surface partition consisting of four triangles, where $k \in \mathbb{N}$ can be chosen by the operator of the algorithm; then $M$ electrodes are inserted in the surface mesh following the process described in [36]. A (dense) mesh $T_m$ for the resulting polygonal domain is generated using the Triangle software [38]. After inserting all the electrodes, the process is completed by generating a tetrahedral partition for the whole volume with TetGen [39], starting from the formed surface mesh.
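The subdivision step can be sketched as follows. This is a simplified illustration (midpoint subdivision followed by projection onto the unit sphere, with a hypothetical four-triangle starting partition), not the actual implementation used with Triangle and TetGen; each subdivision multiplies the triangle count by four.

```python
import numpy as np

def subdivide(vertices, triangles):
    """Split every triangle into four by inserting edge midpoints,
    then project all vertices back onto the unit sphere."""
    verts = list(map(tuple, vertices))
    index = {v: i for i, v in enumerate(verts)}
    def midpoint(a, b):
        m = tuple((np.array(verts[a]) + np.array(verts[b])) / 2.0)
        if m not in index:
            index[m] = len(verts)
            verts.append(m)
        return index[m]
    new_tris = []
    for i, j, k in triangles:
        a, b, c = midpoint(i, j), midpoint(j, k), midpoint(k, i)
        new_tris += [(i, a, c), (a, j, b), (c, b, k), (a, b, c)]
    out = np.array(verts)
    out /= np.linalg.norm(out, axis=1, keepdims=True)
    return out, new_tris

# Hypothetical coarse partition of the upper hemisphere into four triangles.
V0 = np.array([[1., 0, 0], [0, 1., 0], [-1., 0, 0], [0, -1., 0], [0, 0, 1.]])
T0 = [(0, 1, 4), (1, 2, 4), (2, 3, 4), (3, 0, 4)]

V, T = V0, T0
for _ in range(3):   # k = 3 subdivision steps: 4 * 4^3 = 256 triangles
    V, T = subdivide(V, T)
```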
With the current head library, the average head size is approximately 20 × 16 cm in the axial plane, while its height is about 9 cm. The head shapes obtained by changing the shape parameters in (9) are variations from this average, up to a maximum of approximately 2 cm difference in each dimension. The thickness of the scalp varies within the range 10–20 mm, while the skull is about 2–15 mm thick. Throughout all the experiments, the number of electrodes is chosen to be M = 32 and we select $\tilde{n} = 10$.
Remark 2.1. Despite being an upgrade with respect to the computational head model used in [36], this three-layer model is clearly still a simplified version of the true head anatomy. In particular, it does not take into account the shunting effect of the highly conductive cerebrospinal fluid (CSF) layer inside the skull. The CSF is known to represent a major challenge in EIT brain imaging, for it is extremely difficult to distinguish it from a bleed.

Neural networks
There is a wide range of machine learning classification algorithms available in the literature. We chose to use neural networks (NN) since they consistently outperformed kernel methods for this specific kind of nonlinear dataset in our preliminary numerical tests.
We consider a 2-layer fully connected network and a convolutional neural network for the classification of brain strokes from electrode data. In both cases, as detailed in Section 4, the input is a vector of size M(M − 1), which represents a single set of electrode measurements, where M is the number of electrodes. The output is a single scalar value between 0 and 1, which is then rounded to obtain a binary value for the classification. Since in our experiments M = 32, the input layer is composed of 992 neurons.
3.1. Fully connected neural networks. We consider a fully connected neural network (FCNN) that takes as input electrode data generated as discussed in Section 2 and gives a binary output: 1 for hemorrhage, 0 for no hemorrhage.
Our FCNN has two layers with weights, as shown in Figure 2. The input layer has 992 neurons, while the second and the third layer have 7 and 1 neurons, respectively. The size of the hidden layer has been chosen on the basis of the results obtained in preliminary tests on smaller datasets. The network can be represented by the following real-valued function:
$$f(x, \theta) = s\big(W_2\, g(W_1 x + b_1) + b_2\big), \tag{11}$$
where $g$ denotes the activation function of the hidden layer, $s$ the output activation mapping to $(0, 1)$, and we denoted by $\theta = \{W_1, W_2, b_1, b_2\}$ the set of weights and biases:
• $W_1 : \mathbb{R}^{992} \to \mathbb{R}^7$, $b_1 \in \mathbb{R}^7$ are the weights and the bias of the first layer,
• $W_2 : \mathbb{R}^7 \to \mathbb{R}$, $b_2 \in \mathbb{R}$ are the weights and the bias of the second layer.

Figure 2.
An illustration of the architecture of our FCNN, with one input layer with 992 nodes, one hidden layer with 7 nodes and one output layer with a single node for the binary classification.
The network $f(x, \theta)$ is trained by minimizing the cross-entropy loss
$$\mathcal{L}(\theta) = -\sum_j \big[ y_j \log f(x_j, \theta) + (1 - y_j) \log\big(1 - f(x_j, \theta)\big) \big], \tag{12}$$
where the sum is over the samples $x_j$ in the training set, with $y_j$ being the corresponding true binary label.
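A minimal numpy sketch of the forward pass and the loss, assuming a ReLU hidden activation and a sigmoid output (the source does not pin down the hidden activation) and randomly initialized, untrained weights:

```python
import numpy as np

rng = np.random.default_rng(2)
d_in, d_hidden = 992, 7   # input 32*31 measurements, 7 hidden neurons

# Hypothetical weights; in practice these are learned by minimizing the loss.
W1, b1 = rng.normal(scale=0.05, size=(d_hidden, d_in)), np.zeros(d_hidden)
W2, b2 = rng.normal(scale=0.05, size=d_hidden), 0.0

def fcnn(x):
    """f(x, theta): ReLU hidden layer, sigmoid output in (0, 1)."""
    h = np.maximum(0.0, W1 @ x + b1)
    return 1.0 / (1.0 + np.exp(-(W2 @ h + b2)))

def cross_entropy(probs, labels):
    """Binary cross-entropy loss averaged over a batch of samples."""
    probs = np.clip(probs, 1e-12, 1 - 1e-12)
    return -np.mean(labels * np.log(probs) + (1 - labels) * np.log(1 - probs))

x = rng.normal(size=d_in)              # one set of electrode measurements
prediction = int(fcnn(x) >= 0.5)       # rounded output: 1 = hemorrhage
```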
3.2. Convolutional neural networks. We also consider a convolutional neural network (CNN) for our classification task. We refer to [40] for more details on the architecture of a CNN. As depicted in Figure 3, our network has six layers with weights. The first two are convolutional and the last four are fully-connected. As with the FCNN, our CNN is trained by minimizing the cross-entropy loss (12). The architecture was motivated by similar CNNs used in image classification [40], and the hyperparameters have been chosen after preliminary tests on smaller datasets. Regarding the two convolutional layers, we chose to use 1D kernels. This choice might not be optimal since we are losing some geometric information about the electrode configuration. On the other hand, even though a single set of electrode measurements can be represented as a matrix of size 32 × 31, there is no obvious advantage in considering it as an image.
In the first layer of our CNN, the 992 × 1 input vector is filtered with 6 kernels of size 3 × 1 with stride 1 and zero padding. Then a max-pooling layer, with kernel and stride of size 2 and zero padding, is applied to each output channel of the first layer. This is then filtered, in the second convolutional layer, with 16 kernels of size 3 × 1 (with stride 1 and zero padding). Then another max-pooling layer, with kernel and stride of size 2 and zero padding, is applied to the output. The third, fourth, fifth and sixth layers are fully connected and have sizes 240, 120, 84 and 1, respectively (see Section 3.1 for more details). The rectified linear unit (ReLU) activation function $g(x) = \max(0, x)$ is applied to the output of each fully connected layer, except the last one.

Experiments

The measurements are simulated with difference current patterns of the form $I^{(k)} = e_p - e_k$, $k \in \{1, \dots, M\} \setminus \{p\}$, where $p \in \{1, \dots, M\}$ is the label of the so-called current-feeding electrode and $e_k$ denotes a standard basis vector. Such current patterns have been used in [36] and with real-world data in [41] and [42]. In our tests, the current-feeding electrode is always the frontal one on the top belt of electrodes (cf. Figure 4). The potential measurements are stacked into a single vector and we analogously introduce the stacked forward map $\mathcal{U}(\sigma, z, \alpha, \theta, \phi) \in \mathbb{R}^{M(M-1)}$. Here, the conductivity $\sigma \in \mathbb{R}^N_+$ is identified with the $N \in \mathbb{N}$ degrees of freedom used to parametrize it, i.e., the number of nodes in the mesh, the contact impedances are identified with the vector $z = [z_1, \dots, z_M] \in \mathbb{R}^M_+$, $\alpha \in \mathbb{R}^{\tilde{n}}$ is the parameter vector in (9) determining the shape of the computational head model, and $\theta \in (0, \pi/2)^M$ and $\phi \in [0, 2\pi)^M$ define the polar and azimuthal angles of the electrode center points, respectively.
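The construction of the current patterns and the length of the stacked measurement vector can be sketched as follows, assuming pair-drive patterns of the form $I^{(k)} = e_p - e_k$ with a fixed current-feeding electrode $p$ (one convention consistent with the description above):

```python
import numpy as np

M, p = 32, 27   # number of electrodes; current-feeding electrode (1-based)

def e(k):
    """Standard basis vector e_k of R^M (1-based indexing)."""
    v = np.zeros(M)
    v[k - 1] = 1.0
    return v

# Current patterns I^(k) = e_p - e_k for k != p: current is driven in
# through electrode p and out through electrode k.
patterns = [e(p) - e(k) for k in range(1, M + 1) if k != p]

# Each of the M - 1 patterns yields M electrode potentials, so the
# stacked measurement vector has length M(M - 1) = 992.
stacked_length = len(patterns) * M
```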
For each forward measurement, the parameters $\alpha \in \mathbb{R}^{\tilde{n}}$ defining the shape of the head are drawn from the distribution $\mathcal{N}(0, \Gamma_\alpha)$, where $\Gamma_\alpha \in \mathbb{R}^{\tilde{n} \times \tilde{n}}$ is the diagonal covariance matrix defined componentwise by (10).

Figure 4.
Counting upwards from the bottom belt, there are altogether M = 16 + 10 + 6 = 32 electrodes of radius R = 0.75 cm. The current-feeding electrode is p = 27, i.e., the frontal one on the top belt of electrodes, highlighted in yellow. The FE mesh is refined appropriately around the electrodes. The unit of length is meter.
Every set of measurements is performed with M = 32 electrodes of radius R = 0.75 cm, organized in three belts around the head (see Figure 4). The expected values for the polar and azimuthal angles of the electrode centers, $\bar{\theta}$ and $\bar{\phi}$, correspond to the correct angular positions of the electrodes, i.e., the positions where one originally aims to attach the electrode patches. The actual central angles of the target electrodes, i.e., $\theta$ and $\phi$, are then drawn from the distributions $\mathcal{N}(\bar{\theta}, \Gamma_\theta)$ and $\mathcal{N}(\bar{\phi}, \Gamma_\phi)$, where
$$\Gamma_\theta = \varsigma_\theta^2 I \quad \text{and} \quad \Gamma_\phi = \varsigma_\phi^2 I. \tag{14}$$
Here, $I \in \mathbb{R}^{M \times M}$ is the identity matrix and $\varsigma_\theta, \varsigma_\phi > 0$ determine the standard deviations in the two angular directions. Notice that $\varsigma_\theta$ and $\varsigma_\phi$ must be chosen small enough that the electrodes are not at risk of overlapping or moving outside the crown of the computational head. The relative contact impedances $z_m \in \mathbb{R}_+$, $m = 1, \dots, M$, are independently drawn from $\mathcal{N}(\bar{z}, \varsigma_z^2)$, where $\bar{z} > 0$ is chosen so much larger than $\varsigma_z > 0$ that negative contact impedances never occur in practice.
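The electrode misplacement model can be sketched as below; the intended angles are hypothetical placeholders (a single belt at fixed polar angle), not the actual three-belt configuration.

```python
import numpy as np

rng = np.random.default_rng(6)
M = 32
sigma_theta = sigma_phi = 0.03        # rad, the largest value used in the paper

# Hypothetical intended electrode angles: one belt at polar angle pi/3,
# azimuths equally spaced around the head.
theta_bar = np.full(M, np.pi / 3)
phi_bar = np.linspace(0, 2 * np.pi, M, endpoint=False)

# Perturbed electrode center angles, drawn from N(mean, sigma^2 I).
theta = rng.normal(theta_bar, sigma_theta)
phi = rng.normal(phi_bar, sigma_phi)
```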
Finally, the conductivity distributions are defined as follows. For each layer (scalp, skull and brain) and stroke type (hemorrhagic or ischemic) we draw parameters $\sigma_{\mathrm{scalp}}$, $\sigma_{\mathrm{skull}}$, $\sigma_{\mathrm{brain}}$ and $\sigma_h$ or $\sigma_i$ from Gaussian distributions of the form $\mathcal{N}(\bar{\sigma}, \varsigma_\sigma^2)$ (see Table 1). With these parameters we construct a conductivity that is constant on each layer and in the stroke region (if present). The final conductivity $\sigma \in \mathbb{R}^N_+$ corresponds to a piecewise linear parametrization on a dense FE mesh with $N \approx 20\,000$ nodes and about $85\,000$ tetrahedra, associated to the head defined by the shape parameters $\alpha$ and refined appropriately around the electrodes, whose positions are determined by $\theta$ and $\phi$. The number of principal components for the shape parameters $\alpha_1, \dots, \alpha_{\tilde{n}}$ (see Section 2.2) is chosen to be $\tilde{n} = 10$ in all the experiments. We approximate the ideal data $\mathcal{U}(\sigma, z, \alpha, \theta, \phi)$ by FEM with piecewise linear basis functions and denote the resulting (almost) noiseless data by $\mathcal{U} \in \mathbb{R}^{M(M-1)}$. The actual noisy data is then formed as
$$\mathcal{V} = \mathcal{U} + \eta, \tag{15}$$
where $\eta \in \mathbb{R}^{M(M-1)}$ is a realization of a zero-mean Gaussian with the diagonal covariance matrix
$$\Gamma_\eta = \varsigma_\eta^2 \big(\max_m \mathcal{U}_m - \min_m \mathcal{U}_m\big)^2 I, \tag{16}$$
with $I \in \mathbb{R}^{M(M-1) \times M(M-1)}$ the identity matrix. The free parameter $\varsigma_\eta > 0$ can be tuned to set the relative noise level. Such a noise model has been used with real-world data, e.g., in [41]. Each sample is labeled 1 if it contains a hemorrhage and 0 otherwise. This choice has been motivated by practitioners, since ruling out the presence of a hemorrhage allows them to start treating the patient immediately with blood-thinning medications. The simulated stroke is represented as a single ball with varying location inside the brain tissue, varying volume and varying conductivity levels drawn from Gaussian distributions. The chosen parameters $\bar{\sigma}$ and $\varsigma_\sigma$ for all conductivity values are in line with conductivity levels reported in the medical literature [43,44,34,35,45] and are displayed in detail in Table 1.
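The sampling of a piecewise constant conductivity can be sketched as below. The means and standard deviations are hypothetical placeholders, not the values of the paper's Table 1.

```python
import random

rng = random.Random(7)

# Hypothetical means and standard deviations (S/m) for each tissue;
# the actual values used in the paper are listed in its Table 1.
params = {
    "scalp": (0.33, 0.02),
    "skull": (0.015, 0.002),
    "brain": (0.2, 0.02),
    "hemorrhage": (0.7, 0.05),
    "ischemia": (0.1, 0.01),
}

# Draw one constant conductivity value per layer, plus one for the
# stroke region if present (hemorrhage or ischemia).
sample = {tissue: rng.gauss(mu, s) for tissue, (mu, s) in params.items()}
```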
The radius of the ball defining the stroke is drawn from a uniform distribution r ∼ U(r min , r max ), with r min = 0.7 cm and r max = 2.3 cm, which corresponds to volumes ranging from 1.5 ml to 50 ml. The inclusion center (x c , y c , z c ) is chosen randomly under the condition that the whole inclusion is contained in the brain tissue.
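The stated volume range follows directly from the radius bounds via the volume of a ball:

```python
from math import pi

r_min, r_max = 0.7, 2.3   # cm

def ball_volume_ml(r_cm):
    """Volume of a ball in millilitres (1 cm^3 = 1 ml)."""
    return 4.0 / 3.0 * pi * r_cm**3

v_min = ball_volume_ml(r_min)   # about 1.44 ml (~1.5 ml as stated)
v_max = ball_volume_ml(r_max)   # about 51 ml (~50 ml as stated)
```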
For the azimuthal and polar angles, we have drawn different standard deviations ς θ and ς φ (cf. Eq (14)) uniformly from the set {0.01, 0.02, 0.03} (radians) for each forward computation, in order to take into account different levels of electrode movement. These might depend on, e.g., the initial misplacement of the electrode helmet on a patient's head, the differences between the geometry of the patient's head and of the helmet, as well as the overall movement of the patient during the examination. For a better understanding of how the selected standard deviation affects the electrode positions, see Figure 4, where both ς θ and ς φ are chosen to be 0.03. In particular, this value is the highest standard deviation one could use in our computational head model before the electrode patches start overlapping, especially on the lower belt, where they are closer together.
The relative noise level in (16) is set to ς η = 10 −3 (cf. [41], where such a noise level has been used with real-world data). A complete summary of the parameter values used in the random generation of the training data is reported in Table 1.
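The noise addition can be sketched as follows, under the assumption that the noise standard deviation equals the relative noise level times the dynamical range of the noiseless data; the data vector itself is a random placeholder.

```python
import numpy as np

rng = np.random.default_rng(3)
varsigma_eta = 1e-3                      # relative noise level

U = rng.normal(size=992)                 # hypothetical noiseless data

# Zero-mean Gaussian noise whose standard deviation is the relative
# noise level times the dynamical range of the noiseless data
# (assumed scaling for the covariance matrix in the text).
scale = varsigma_eta * (U.max() - U.min())
V = U + rng.normal(scale=scale, size=U.shape)
```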
The training dataset contains around 40 000 samples, approximately split into 50% conductive inclusions (hemorrhage), 25% resistive inclusions (ischemia), 25% no inclusions (healthy). Let us emphasize that all samples, including those associated with no inclusions (strokes) correspond to different head and inner layer shapes, electrode positions and measurement noise realizations. This same statement applies also to all test datasets employed in assessing the performance of the neural networks.
The computations presented were performed with a MATLAB implementation using computer resources within the Aalto University School of Science "Science-IT" project. Measurements were computed in parallel over 200 different nodes on Triton [46], the Aalto University high-performance computing cluster, and the overall computation time for generating the training data did not exceed two hours.

Test datasets.
While training is performed on the large dataset introduced above, testing the accuracy of the classifiers is realized on independent test sets, based on three different models for the stroke geometries. More precisely, we constructed 14 test sets that originate from the parameter ranges of Table 1 and four different variations thereof, over three families of geometric models for the strokes.
The three models chosen for the test strokes are the following (cf. Figure 5).
(1). A single ball: a test conductivity sample has one ball-shaped inclusion or no inclusion. The corresponding label is equal to 1 if the inclusion corresponds to a hemorrhagic stroke, 0 otherwise. We approximately have 50% conductive inclusions, 25% resistive inclusions, 25% no inclusion.
(2). A single ball or cylinder: a test conductivity sample still exhibits one or no inclusion, but different shapes are considered: a single ball or a cylinder-shaped stroke. The height of the cylinders is drawn from the uniform distribution h ∼ U(h min , h max ), with h min = 1 cm and h max = 3 cm, while the radius is the same as for the corresponding balls, with the values reported in Table 1. This again corresponds to volumes ranging from 1.5 ml to 50 ml. The same labels as in the first case are used. As before, we have 50% conductive inclusions, 25% resistive inclusions and 25% healthy cases; the inclusion shape can be either a ball or a cylinder with a 50% chance.
(3). Balls or cylinders: we consider cases with zero, one or two inclusions of different shapes (balls and cylinders). The label is equal to 1 if there is at least one hemorrhagic inclusion, and 0 if the nature of the inclusion(s) is resistive or the brain is healthy. Again, approximately half of the cases have at least one conductive inclusion, 25% of the cases have one or two resistive inclusions and 25% have no inclusion. In particular, label 1 may correspond to one hemorrhagic and one resistive inclusion (cf. Figure 5), a case not encountered in the training data.
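The labeling rule for the third stroke model can be sketched as a toy sampler; the inclusion representation is purely illustrative and only reproduces the stated proportions and the "at least one hemorrhage" rule.

```python
import random

def sample_case(rng):
    """Draw a hypothetical test case for stroke model (3)."""
    u = rng.random()
    if u < 0.25:
        inclusions = []                               # healthy brain
    elif u < 0.75:
        # hemorrhagic case: one conductive inclusion, possibly accompanied
        # by a second inclusion of either type
        inclusions = ["hemorrhagic"]
        if rng.random() < 0.5:
            inclusions.append(rng.choice(["hemorrhagic", "ischemic"]))
    else:
        # ischemic case: one or two resistive inclusions
        inclusions = ["ischemic"] * rng.choice([1, 2])
    # Label 1 iff at least one hemorrhagic inclusion is present.
    label = int("hemorrhagic" in inclusions)
    return inclusions, label

rng = random.Random(4)
cases = [sample_case(rng) for _ in range(10000)]
```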
For each stroke model (1)-(3), we constructed 5 datasets for which we considered:
• standard parameters, i.e., the ones listed in Table 1,
• two different ranges for the expected inclusion radius,
• a lower level of electrode misplacement,
• a higher amount of relative noise added to the data.
See Section 5 for more details. These alterations are to be understood in comparison to the random parameter models listed in Table 1 for the generation of the training data. This results in 14 different test sets with 5 000 samples each.

In this setting, strokes are represented by a well-defined ball or cylinder of a constant conductivity value embedded in a homogeneous background, whereas cases of nested inclusions are not considered. In fact, the presence of a penumbra or of hypodense tissue would have a huge impact on the performance metrics. A penumbra is a region of normal to high blood volume that surrounds an ischemic stroke as the brain tries to balance the net blood pressure and flow. On the other hand, hypodense tissue forms around a hemorrhage where there is a shortage of blood, corresponding to a lower conductivity. Both situations are critical and detecting them is of vital importance, but we leave their investigation for future studies.

Results
We start by first reviewing the computational details about the training and testing process. Next, we present and discuss the classification performance on each dataset.
For each classification learner we evaluate the results in terms of standard performance metrics such as sensitivity, specificity and accuracy on the test dataset. Sensitivity, or true positive rate, measures the proportion of actual positives that are correctly identified as such, while specificity, or true negative rate, quantifies the proportion of correctly classified actual negatives. Finally, accuracy is the sum of true positives and true negatives over the total test population, that is, the fraction of correctly classified cases.
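These metrics follow directly from the confusion-matrix counts; the counts below are hypothetical numbers for illustration only, not results from the paper.

```python
def metrics(tp, fp, tn, fn):
    """Sensitivity, specificity and accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)              # true positive rate
    specificity = tn / (tn + fp)              # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, specificity, accuracy

# Hypothetical counts on a test set of 5000 samples.
sens, spec, acc = metrics(tp=2300, fp=150, tn=2350, fn=200)
```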

5.1. Training the networks. The training dataset (see Section 4.2) was normalized by subtracting its mean and scaling by its standard deviation before the actual training. The mean and the standard deviation were stored and used to normalize the test datasets in the evaluation phase.
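A sketch of this normalization, assuming global (rather than per-feature) statistics, which the text leaves unspecified; the data are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(5)
train = rng.normal(loc=3.0, scale=2.0, size=(1000, 992))   # placeholder data
test = rng.normal(loc=3.0, scale=2.0, size=(200, 992))

# Normalize the training set and store its statistics.
mu, sigma = train.mean(), train.std()
train_n = (train - mu) / sigma

# The *stored* training statistics are reused on the test data,
# so that no information from the test sets leaks into training.
test_n = (test - mu) / sigma
```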
We trained our FCNN and CNN using the Adam optimizer [47] with batches of size 256 and learning rate 0.001. The FCNN was trained for 1500 epochs on the full training set with no validation; this choice was motivated by preliminary tests with validation showing that our FCNN was not overfitting the training data. The CNN was trained for 120 epochs on an 83%/17% random training/validation split of the original dataset. In order to show the stability of the classification, the performance metrics for the FCNN and the CNN are reported as the average value computed over 10 different network trainings.
Training and tests were performed with a Python implementation on a laptop with 8 GB RAM and an Intel CPU with a clock speed of 2.3 GHz. The overall training times were 30 and 40 minutes for the FCNN and the CNN, respectively.

5.2.
Results with fully connected neural network. In Tables 2 and 5, the results on the training data are separated from those on the test cases by a thicker line. Overall, the leading trend is that the network performs considerably well in the case of a single ball-like inclusion, while accuracy somewhat degrades in the other two cases, especially for datasets with one or two inclusions of potentially different shapes and types.
The top row of Table 2 corresponds to testing the network on the training data and thus the resulting accuracy of 0.9673 is expected to give an upper limit for the performance of the FCNN. However, as shown below, we find that the network performs slightly better on some test datasets. On the top rows of Tables 3 and 4 we present the results for the FCNN on the test sets generated with the same parameters as the training set, but with different geometric models for the stroke. In every table, the results on the other rows are obtained by testing the trained network on four different test sets, where each time some parameters are altered from the values in Table 1.
The second rows of all three tables correspond to testing our classifier with strokes of larger volumes, where the lower bound for the ball radius is increased to $r_{\min} = 1$ cm, leading to the inclusion radius being drawn from $r \sim \mathcal{U}(1, 2.3)$ cm (4.20 ml to 50 ml for ball-shaped inclusions). In the case of a cylindrical inclusion, its radius is modified accordingly, while its height remains in the range $[1,3]$ cm, as in Table 1 (9.40 ml to 50 ml for cylinder-shaped inclusions). This choice of parameters leads to a better performance of the network, due to the better average visibility of the test inclusions in the EIT measurements. Accuracy ranges from 91% in case (3), with one or more inclusions of different shapes, up to 97% in test (1), with only one ball inclusion, which is actually higher than the accuracy on the training set.
Conversely, as shown on the third rows of each table, when only smaller inclusions are considered, the overall performance metrics decrease, with the lowest accuracy of 87% reached for the last geometric stroke model (3). Again, the height of the possible cylinder-shaped inclusions remains as in Table 1. It should also be noted that in Table 2, as well as in all our other tests, the test datasets with larger and smaller inclusions are not simply subsets of the training set: they have instead been simulated with random choices for the remaining parameters (cf. Table 1), hence providing new and unseen data for the neural network.
In the experiments documented on the fourth rows, we considered a lower degree of electrode misplacement, that is, a lower level of inaccuracy in the electrode positions when a helmet of electrodes is placed on a patient's head. This translates to selecting a lower standard deviation in formula (14) when simulating the test data, with $\varsigma_\theta, \varsigma_\varphi$ drawn from $\{0.005, 0.01, 0.02\}$ radians. Results in this case are notably better than those obtained with the baseline dataset for each of the three random stroke models (1)-(3), with the accuracy ranging from 91% to almost 97%. This confirms that electrode positioning indeed significantly affects the classification accuracy.
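A minimal sketch of this kind of angular electrode misplacement (the precise model is formula (14); the names below are illustrative): each electrode's spherical angles are perturbed by zero-mean Gaussian noise whose standard deviation is drawn from the stated set.

```python
# Illustrative sketch of simulated electrode misplacement: Gaussian
# perturbation of each electrode's spherical angles (theta, phi).
# Not the paper's actual code; the exact model is formula (14).
import math
import random

def perturb_electrodes(angles, sigma_theta, sigma_phi, rng):
    """Add zero-mean Gaussian misplacement to (theta, phi) electrode angles."""
    return [(theta + rng.gauss(0.0, sigma_theta),
             phi + rng.gauss(0.0, sigma_phi))
            for theta, phi in angles]

rng = random.Random(0)
# lower-misplacement setting: deviations drawn from {0.005, 0.01, 0.02} rad
sigma = rng.choice([0.005, 0.01, 0.02])
# hypothetical ring of 16 electrodes on the equator of a spherical head
electrodes = [(math.pi / 2, 2 * math.pi * k / 16) for k in range(16)]
perturbed = perturb_electrodes(electrodes, sigma, sigma, rng)
```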
Finally, to test the network in a more realistic setup, we increased the relative noise level in the measurements in (16) from $\varsigma_\eta = 10^{-3}$ to $\varsigma_\eta = 10^{-2}$ in the test data. The resulting performance indicators are listed on the last rows of Tables 2-4. Despite the (inevitable) decrease in performance, the accuracy was still over 90% for all three models for test stroke generation (1)-(3).

5.3.
Results with convolutional neural network. Following the same workflow as in Section 5.2, Tables 5-7 present the analogous results for the CNN trained on the training dataset and tested in the three cases corresponding to the different geometric models for the inclusions. The overall accuracy is clearly lower than that of the FCNN. This is arguably because our CNNs tend to overfit the data, resulting in a high accuracy on the training set (first row of Table 5) that degrades significantly in the other test cases. With the exception of the smaller strokes, the overall accuracy remains steadily above 80% and roughly follows the same trends as for the FCNN. Tests with larger inclusion volumes have by far the best accuracy, ranging from 84% to almost 89%, while the least accurate classification is indeed obtained for the strokes with smaller volumes. The mean accuracy with a lower level of electrode uncertainty and increased relative noise lies between 81% and 86%, depending on the model for generating the test strokes.

Conclusions
This work applies neural networks to the detection of brain hemorrhages from simulated absolute EIT data on a 3D head model. We developed large datasets based on realistic anatomies that we used to train and test a fully connected neural network and a convolutional neural network. Our classification tests show encouraging results for further development of these techniques, with fully connected neural networks achieving an average accuracy higher than 90% on most test datasets.
Since the datasets included a large proportion of healthy patients, this work could motivate the development of a new, cheap, non-invasive screening test for brain hemorrhages.
The results demonstrate that the classification performance is affected by many factors, including the size of the stroke, the mismodeling of the electrode positions and the noise in the data. Among these, the size of the stroke is the one most significantly altering the performance, which indicates that the method is probably unable to detect very small hemorrhages; note that in the dataset generation we considered strokes with volumes as small as 1.5 ml. Further, the FCNN, trained on single ball-shaped strokes, was able to generalize to data generated from multiple strokes of different shapes with little loss in accuracy. We also want to stress that in every dataset the position of the stroke was chosen uniformly at random within the brain tissue. This means that, on the one hand, we did not restrict or assume to know in which brain hemisphere the stroke was located. On the other hand, we considered strokes located potentially very close to the skull layer or very deep inside the brain tissue, a situation which is very challenging to analyze from EIT data.
The simulated datasets nonetheless have several limitations. The three-layer head model is a simplified version of the true head anatomy; in particular, it does not take into account the cerebrospinal fluid and other fine anatomical structures. Moreover, the considered conductivity distributions have an average skull-to-brain ratio of 3 : 10, which has been shown to be a difficult quantity to estimate [44,45] and appears to affect the performance metrics. Finally, a brain stroke is a complex phenomenon that evolves with time, and it is so far unclear how to precisely model its conductivity values, both in the hemorrhagic and in the ischemic case.
This work leaves many open directions for future studies. A natural one is to consider more realistic datasets. To our knowledge, the only publicly available dataset with real EIT data on human patients has been released by University College London [26]. The dataset comprises data from only 18 patients, which is far from the size of our training set (40 000 samples), but could be used as a test set after training on synthetic data. Datasets from phantom data would also be an important intermediate step to validate the generation of synthetic measurements. On a related note, cases of nested inclusions should also be taken into account.
Another important direction is to implement more refined machine learning algorithms. We found that a simple 2-layer FCNN outperforms a more sophisticated CNN, though we did not investigate in depth the problem of designing an optimal network architecture for our EIT data. This certainly leaves room for improvements in the performance.
Finally, bedside real-time monitoring of stroke patients could become another valuable application of the presented techniques. The aim would be to predict whether a hemorrhage is increasing in volume over time or remains stable. This could be studied with a similar approach, based on machine learning algorithms trained on datasets with several measurements taken at different times.