A bi-atrial statistical shape model for large-scale in silico studies of human atria: model development and application to ECG simulations

Large-scale electrophysiological simulations to obtain electrocardiograms (ECG) carry the potential to produce extensive datasets for training of machine learning classifiers to, e.g., discriminate between different cardiac pathologies. The adoption of simulations for these purposes is limited due to a lack of ready-to-use models covering atrial anatomical variability. We built a bi-atrial statistical shape model (SSM) of the endocardial wall based on 47 segmented human CT and MRI datasets using Gaussian process morphable models. Generalization, specificity, and compactness metrics were evaluated. The SSM was applied to simulate atrial ECGs in 100 random volumetric instances. The first eigenmode of our SSM reflects a change of the total volume of both atria, the second the asymmetry between left vs. right atrial volume, the third a change in the prominence of the atrial appendages. The SSM is capable of generalizing well to unseen geometries and 95% of the total shape variance is covered by its first 23 eigenvectors. The P waves in the 12-lead ECG of 100 random instances showed a duration of 104ms in accordance with large cohort studies. The novel bi-atrial SSM itself as well as 100 exemplary instances with rule-based augmentation of atrial wall thickness, fiber orientation, inter-atrial bridges and tags for anatomical structures have been made publicly available. The novel, openly available bi-atrial SSM can in future be employed to generate large sets of realistic atrial geometries as a basis for in silico big data approaches.


Introduction
A wide range of machine learning approaches have already been proposed for classifying cardiovascular pathologies based on the 12-lead electrocardiogram (ECG) (Hannun et al., 2019;Perez Alday et al., 2020;Strodthoff et al., 2020). Since the ECG is a cost-effective, non-invasive and commonly available tool in clinical practice, it is particularly desirable to identify and diagnose cardiac pathologies only based on the ECG and without the need of further expensive imaging techniques or invasive procedures. However, the training of such classifiers requires a large, balanced, and reliably labeled dataset. Oftentimes, not all of these prerequisites are met when using clinically recorded data. Additionally, expert annotations are commonly relied upon to generate the ground truth labels describing the underlying pathologies for clinical datasets coming along with inter-and intra-observer variabilities significantly affecting the reliability of the ground truth labels (Hannun et al., 2019).
These limitations call for simulated synthetic ECG as a source for large, representative and well controlled datasets. These datasets can be used to directly deduce diagnostic criteria visually (Andlauer et al., 2018) or to train machine learning classifiers to discriminate between different cardiac diseases and healthy individuals (Andlauer et al., 2018). The advantage of using simulated over clinical data lies not only in the precisely known ground truth of the underlying pathology that was defined for the simulation, but also in the possibility to generate a virtually infinite amount of signals for each pathology class.
Nevertheless, atrial, ventricular and thoracic geometrical models are needed for conducting electrophysiological simulations to obtain the 12-lead ECG. In this regard, statistical shape models (SSM) allow to compile a wide range of realistic geometries that represent the variability observed in the cohort used to build the SSM. While SSMs of the human ventricles (Bai et al., 2015) and torsos (Pishchulin et al., 2017) exist and are publicly available, an open shape model covering both atria covering all relevant anatomical structures for EP simulations (atrial body, appendages, PVs) is still lacking to the best of our knowledge. Different statistical atlases of the human whole heart anatomy (Ecabert et al., 2008;Hoogendoorn et al., 2013;Lötjönen et al., 2004;Ordas et al., 2007;Unberath et al., 2015;Zhuang et al., 2010) have been constructed for segmentation of magnetic resonance (MR) or computed tomography (CT) images by means of active shape modeling approaches. However, those models are usually built based on a small number of sample segmentations or were not made publicly available. Furthermore, different SSMs of only the left atrium (LA) have been presented in various studies either for the purpose of simulations  or for characterizing changes in shape of the LA (Bieging et al., 2018;Cates et al., 2014;Depa et al., 2010;Varela et al., 2017) in patients with atrial fibrillation. These LA models are built based on a solid number of instances, but lack the right atrium (RA) and often also the left atrial appendage (LAA). However, these anatomical structures are not only indispensable for the use case of ECG simulations. They are also of particular importance when investigating the mechanisms of typical atrial flutter in the RA or bi-atrial flutter and fibrillation. Additionally, the LAA is highly relevant for studies examining blood clot formation causing stroke (Masci et al., 2019), LAA occlusion as potential therapy (Aguado et al., 2019) and the role of the LAA in the onset and maintenance of atrial fibrillation (Nishimura et al., 2019). Due to the lack of ready-to-use models of the atria, a bi-atrial SSM would cater the need of generating geometrical atrial models representing inter-subject anatomical variability. These could be employed to gain a comprehensive understanding of e.g. the underlying mechanisms of the onset and perpetuation of re-entrant activity during atrial flutter and fibrillation not only in personalized computer models but also in a large patient cohort. Thus, including the shape of the RA in the model as well as making the bi-atrial SSM available to the community, enables large-scale simulation of atrial signals. Although the focus of this paper is the application of the SSM for ECG simulations, its field of application is not limited to this particular use case. The bi-atrial model can also be exploited for other in silico approaches like continuum-mechanics and fluid simulations. Furthermore, active shape modeling approaches using the novel bi-atrial SSM could facilitate automated segmentation of the atria from CT or MRI datasets.
In this work, we built an SSM of both atria from manual segmentations of 47 MR and CT scans. Furthermore, we propose a workflow to generate a volumetric atrial model based on instances of the SSM. We also provide the bi-atrial SSM under the creative commons license CC-BY 4.0 together with 100 exemplary volumetric models derived from it including fiber orientation, inter-atrial bridges and material tags (Nagel et al., 2020).

Material and methods
The geometric representation as well as the variation in shape among a set of individual three dimensional objects can be described by SSMs. Point distribution models (Cootes et al., 1995) are the most common subclass of SSMs and require a vectorized point-based representation sn of any individual geometry Γn comprising M consistently sampled surface points [xn, yn, zn] T : sn = [xn,1, yn,1, zn,1, . . . , xn,M , yn,M , zn,M ] T . (1) Assuming that the spatial variations of the surface points follow a multivariate normal distribution, a compact representation of the mean and covariance matrix describing the shape variations can be obtained by applying a principal component analysis (PCA) to the observations s. In this way, all N individual shapes Γ can be represented by a linear combination of N − 1 basis functions v: with s being the mean shape vector as well as σ and v representing the eigenvalues and eigenvectors of the covariance matrix, respectively. r n,k represent the weighting coefficients for the individual eigenvectors. To obtain this parametric representation of the shape variations from clinical MRI or CT data, a number of preprocessing steps have to be performed: i) segmenting the images, preferably in a (semi-)automatic manner, ii) rigidly aligning the resulting shapes in space, iii) establishing a dense correspondence between the individual shapes to obtain the shape vectors s C that were then subject to PCA.

Dataset
Three independent multi-center, multi-vendor databases (Karim et al., 2018,1;Tobon-Gomez et al., 2015) were used to build the SSM. Their properties are summarized in Table 1. The images originate either from healthy subjects or from patients suffering from atrial fibrillation. Some of the available images were excluded due to low signal to noise ratio or incomplete capture of the inferior right atrial body. Only subjects with 4 pulmonary veins (PVs) were considered because of the limited ability of SSMs to capture only continuous changes of vertex locations.

Segmentation
In order to obtain the individual instances of both atria, a semi-automatic segmentation of the blood pool representing the endocardial surface of the left and the right atrium from MR and CT images was performed Left atrial wall thickness challenge 9 CT 0.8 x 1 x 0.4 mm (Karim et al., 2018) Figure 1: Segmentation inaccuracy due to different automated segmentation methods. The different rows represent the axial, sagittal and the coronal plane, respectively. The images in the left, middle and right column show the segmentation errors due to region growing, 3D interpolation and a partly excluded LAA in the ground truth data colored in red, respectively. The green contours mark the manual corrections tailored to the correction of inaccurately found automated segmentations.
using CemrgApp . 2D region growing in several selected slices as well as 3D interpolation were applied to each image stack. To reduce the impact of noise or image artefacts on the segmentation outcome, details were manually corrected. Fig. 1 shows examples of incorrect segmentation results due to insufficient region growing performance (left column) and 3D interpolation (middle column) that made a manual correction indispensable. Automated segmentation with region growing especially failed in 2D planes where both, the LA and the left ventricle are visible, because the mitral valve shows the same image intensity as the LA and the left ventricle. Therefore, a cutting plane between atrium and ventricle was inserted manually. The drawbacks of 3D interpolation are particularly affecting the areas around the PVs where the interpolated surface tends to close small gaps between the PVs and the atrial body. For 20 images in dataset 2, a segmentation of the LA was provided. However, the LAA was truncated in close proximity to the left atrial body (Tobon-Gomez et al., 2015) in these segmentations. Since we aimed at incorporating the variability of the LAA shape in our model, the LA segmentations of dataset 2 were adapted to include the full volume of the LAA as shown in Fig. 1 (right column). The resulting segmentations were exported as triangular meshes.

Rigid Alignment
After segmenting the individual instances Γn of the atria, the different shapes were rigidly aligned in space to avoid a representation of translation and rotation parameters in the eigenmodes of the SSM. This was performed automatically by means of the iterative closest point (ICP) algorithm that provides a solution to the orthogonal Procrustes problem (Chen and Medioni, 1992). In each iteration i, candidate correspondences [xn,ŷn,ẑn] T R,i between a target mesh Γn and the reference mesh ΓR were found by attributing to each node in Γn the point with the the smallest Euclidean distance in ΓR. Procrustes analysis was then used to estimate the linear transformation Ti -consisting of rotation and translation -which yields the best match of the candidate correspondence points Figure 2: Gaussian process morphable model for establishing correspondence at the PVs. The arrows indicate the movement of the respective vertices in the model for a variation of the four leading eigenvectors by −3σ and +3σ.
[xn,ŷn,ẑn] T R,i with the reference points [xR, yR, zR] T . After applying the estimated transformation Ti to the points in mesh Γn: new candidate correspondences [xn,ŷn,ẑn] T R,i+1 are recursively found between the transformed mesh Γn,i+1 and ΓR at each iteration i and used for solving the Procrustes problem. If after several recursive calls of the function, either the maximum number of 150 iterations is exceeded or the convergence criterion is fulfilled, an optimal transformation matrix for the alignment of both shapes Γn and ΓR is found resulting in a set of N rigidly aligned shapes Γ A .

Establishment of Correspondence
Each aligned individual instance n comprises Mn surface points [x A n,1 , y A n,1 , z A n,1 , . . . , x A n,Mn , y A n,Mn , z A n,Mn ] T . In order to describe the variations in shape of the aligned instances Γ A by means of a point distribution model, correspondence between the vertex IDs among all individual shapes have to be established. Establishing correspondence requires to retrieve concordant points in all shapes Γ A , so that the N aligned shapes are not only represented by the same amount of surface points M but also that each point [x A n,m , y A n,m , z A n,m ] with a specific ID m represents the same anatomical landmark in any arbitrary shape n. For this purpose, we used Gaussian process morphable models (GPMM) (Luthi et al., 2018) and ScalismoLab (Bouabene et al., 2020) to subject a reference shape Γ A R to a generic deformation in such a way that the deformed shape Γ A R,n matches the individual aligned shape Γ A n in the best possible way. This process then yields a set Γ C of aligned shapes that are characterized by homologous, corresponding surface points. For this process, we defined three independent generic deformations by Gaussian process (GP) models. Gaussian kernels described the similarity between two distinct points x and x : were approximated by the leading eigenfunctions of their Karhunen-Loève expansion as described in Luthi et al. (2018). They were further employed to fit the orientation of the left and right pulmonary veins (LPV, RPV), the atrial body, as well as the left and the right atrial appendages (LAA, RAA). The separation into three different models (atrial body, appendages, PVs) served two different purposes. On the one hand, we were able to account for the high anatomical variability of the appendages by allowing smaller inter-dependencies spanning between the points located on the appendages. On the other hand, the generic model varying the points located on the PVs was designed such that only the orientation of the PV ostia but not their length was affected. The optimization problem of fitting the GP model Γ A R,n to the individual aligned shapes Γ A n was solved with an L-BFGS optimization minimizing the mean squared error between the vertex coordinates of the deformed model Γ A R,n and the target shape Γ A n . To accurately fit the PVs, a kernel representing the orientation of the four PVs in anterior-posterior direction in its first four eigenvectors was constructed (Fig. 2). Only the orientation of the PVs and not their length were fitted since the latter highly depends on the segmentation approach and does therefore not represent a proper observed anatomical variation considering the heterogeneous input data used in this study. A low rank approximation of a GP model with a variance of s b = 50, l b = 40 at points representing the general Figure 3: Example of one randomly generated instance with a homogeneous wall thickness from two different views. The models on the left represent the anatomical labels and the inserted inter-atrial bridges. The fiber orientations are visualized on the models in the right column.
atrial body (b) and sa = 20, la = 20 at the appendages was used to account for the higher anatomical variability in the appendage (a) regions.

Shape Model Construction
The vertices at the distal parts of the superior and inferior caval veins, the coronary sinus, the PVs, as well as the mitral valve and the tricuspid valve were discarded from the N aligned shape vectors in correspondence s C to limit the model creation to atrial components relevant for electrophysiological simulations. Applying a PCA to these cut shape vectors in correspondence s C yields their mean shape s and N − 1 basis functions v along with their respective variances σ 2 . In this way, an exact reconstruction of any individual shape instance Γ C n is feasible by determining the coefficients rn in Eq. 2 with a standard least squares estimation. Furthermore, additional arbitrary variation shapes Γvar in the span of the N − 1 basis vectors v can be derived by varying r. Under the assumption that the values of r are normally distributed among the observed instances Γ C n , keeping r in the interval [−3, +3] N −1 yields realistic artificially generated shapes within the empirically observed variability.

Construction of Volumetric Meshes
The bi-atrial SSM represents the mean shape of the atrial endocardial surface and the variation of all point coordinates in space so that any arbitrary variation mesh Γvar can be derived from it. However, a volumetric model of the atria, including inter-atrial bridges, anatomical labels and fiber orientations is required to perform electrophysiological simulations and to obtain realistic body surface P waves. Since a segmentation of the epicardial surface from conventional MR images is usually not feasible due to an insufficient spatial resolution and a limited signal to noise ratio, the epicardial surface was augmented in a postprocessing step assuming a homogeneous atrial wall thickness. To approximate the epicardial surface, the endocardial surface of the variation mesh Γvar was dilated by 3 mm at each point along the normal direction calculated as the mean of the adjacent triangle normals. Both surfaces were merged and intersections and holes between epi-and endocardium were corrected and closed automatically using the iso2mesh toolbox (Tran et al., 2020). The closed surface was afterwards remeshed using Instant Meshes (Jakob et al., 2015) and transformed into a volumetric tetrahedral mesh with an average edge length of 1 mm using Gmsh (Geuzaine and Remacle, 2009). The algorithms described by Loewe (2016); Wachter et al. (2015) were used to automatically augment the models with Bachmann's bundle, a coronary sinus and an upper and middle posterior inter-atrial connection between the LA and RA as well as myocardial fiber orientation and anatomical labels. The augmented anatomical structures are visualized in Fig. 3.

Parameterization of Electrophysiological Simulations
100 random instances were derived from the bi-atrial SSM by drawing the eigenvector coefficients r of Eq. 2 from a uniform distribution in the range of the minimum and maximum value found in the dataset used to build the SSM. A fast marching (Loewe et al., 2019;Sethian, 1996) simulation was carried out for solving the Eikonal equation on these 100 geometries to obtain the spread of electrical activation and derive local electrical activation times (LATs) for each node. This sinus rhythm activation was initiated from a sinus node exit site located at the junction of the superior caval vein and the RAA . The atrial wall was separated into seven different anatomical regions: regular right and left atrium, inter-atrial connections, valve rings, pectinate muscles, crista terminalis and inferior isthmus. The conduction velocities along the fiber directions and the anisotropy ratios in the different regions were chosen as reported previously  and are given in in Table 2. The transmembrane voltage distribution on the atrial surfaces was obtained by shifting a pre-computed Courtemanche  Figure 4: Eigenmodes of the bi-atrial shape model. The three leading eigenmodes are displayed in different rows. Columns 1-3 represent the variation of the eigenvector coefficients. In the fourth column, the anatomical view used to capture the respective eigenmode is depicted. et al. (1998) action potential in time according to the calculated LATs as proposed by (Kahlmann et al., 2017). The ECG forward problem to derive the body surface potentials from the transmembrane distribution on the heart was solved by means of the boundary element method (Stenroos et al., 2007). Considering computational cost, the surface bounding the heart was sampled at a resolution of 3 mm. Laplacian blurring with optimal blurring parameters as described in Schuler et al. (2019) was applied to the source distribution. The mean shape of the human body SSM developed by Pishchulin et al. (2017) served as a reference shape of the torso. The P wave of the 12-lead ECG signal was extracted from the body surface potentials at the standardized electrode positions.

Eigenmodes of the Bi-atrial Shape Model
Applying a PCA to the cut shape vectors in correspondence s C as described in section 2.5 yields their mean shape s comprising 95.048 triangular cells and 47.528 vertices with an average edge length of 0.862 mm. Furthermore, the eigenvectors and -values of the bi-atrial model were obtained. Fig. 4 shows the shape changes caused by varying the coefficients of the first three eigenvectors. The first eigenmode represents a change in the total volume and size of both atria simultaneously (Fig. 4, first row). The second mode reflects the asymmetry of the LA vs. the RA volume, i.e., the increase of the LA volume and the concurrent decrease of the RA volume. The prominence of the right and left appendage are encoded in the third eigenmode (Fig. 4, third row). The orientation of the PVs to one another are represented -among other aspects -in the fifth, sixth, and eighth eigenmodes of the SSM.

Evaluation of the Bi-atrial Shape Model
The quality of the bi-atrial SSM was evaluated by first assessing the mean vertex to vertex distances between the meshes in correspondence and their respective original locations from the dataset. For the 47 meshes used to build the SSM, this distance was 1.33 ± 0.25 mm. Furthermore, three standard evaluation criteria for evaluating the SSM quality proposed by Huw Davies (2002) were considered: generalization, specificity, and compactness. The generalization metric addressed in section 3.2.1 refers to the ability of the SSM to recreate an instance whose shape vector was excluded from the dataset used to create the SSM. The specificity metric (section 3.2.2) assesses the goodness of the model in terms of generating realistic shapes. Furthermore, the compactness (section 3.2.3) metric of the model increases the more the set of eigenvectors can be reduced while still being able to describe the majority of the total shape variance present in the dataset.

Generalization
To evaluate the generalization quality of the SSM, we used leave-one-out cross-validation and defined N datasets with N − 1 meshes each by leaving out the final observation. Each of those was used to compute a reduced SSM. The excluded shape was reconstructed with the reduced SSM by determining the eigenvector coefficients using ordinary least squares. The similarity between the original excluded shape and the approximated one was assessed in terms of the Euclidean distance between the corresponding vertices. Fig. 5 shows the distribution of this error metric for all instances in the dataset. The median of the vertex to vertex distance was below 2.3 mm among all shapes which compares to the order of magnitude of the MRI cross-plane resolution (0.4 mm − 2.7 mm). Instance 4 holds the lowest Euclidean distances between the vertices of its original and reconstructed shape, whereas instance 37 is characterized by considerably high error values. Especially the 95th percentile bounds and the outlier values comprise large vertex to vertex distances. Fig. 6 depicts the approximated atria with the reduced SSM for instance 37. The vertex color represents the Euclidean distance to the corresponding vertex in the original shape instance 37. Vertices showing larger deviations were located predominantly on the distal part of the right inferior pulmonary vein (RIPV). The same phenomenon was observed also for instance 47.

Specificity
The specificity of the bi-atrial model was evaluated by generating 1000 random shapes according to Eq. 2 by uniformly sampling r in the interval [−3, 3]. The similarity between these random shapes and the respective closest shapes in the underlying dataset Γ C used to build the SSM was assessed in terms of the root mean square error (rmse) of all vertex-to-vertex distances between the randomly generated and the original shape. The rmse ranged from 5.29 to 11.73 mm among all 1000 random instances with a mean ± standard deviation of 7.82 ± 1.04 mm. Fig. 7 shows one randomly generated shape with an rmse of 7.58 mm in yellow together with its most similar instance from the dataset in blue. This case approximately represents the mean rmse value among all 1000 random shapes.

P wave Simulations
Calculating the P wave on the mean shape of the proposed SSM as described in section 2.7 yields the signals for the Einthoven, Goldberger and Wilson leads shown in Fig. 9. The P wave duration was extracted for each of the 12-lead ECGs simulated on 100 random instances by considering the time difference between the latest detection of the P wave offset and the earliest P wave onset across all 12 leads. The values of the P wave durations across all 100 instances ranged from 80 ms to 118 ms with a mean and standard deviation of 104.8 ± 8.5 ms.

Openly Available Dataset
The bi-atrial SSM is provided under the Creative Commons license CC-BY 4.0 together with 100 exemplary volumetric models derived from it including fiber orientation, inter-atrial bridges and anatomical labels (Nagel et al., 2020). The SSM is available as an h5 file encoding information about the mean shape's spatial vertex locations and their triangulation. Also the eigenvectors and -values resulting from the applying the PCA are included. The 100 geometries were generated by varying the eigenvector coefficients r in the [−3, +3]σ range. Figure 9: P waves of the 12-lead ECG calculated on the mean shape of the proposed bi-atrial shape model. The red markers indicate the earliest onset and the latest offset of the P wave.
These anatomical models are provided in VTK file format including fiber orientation as 3D vectors and material tags as scalar values in the cell data section.

Discussion
The main result of this study is a point distribution model incorporating the shape variations of the right and the left atrium as well as their appendages and the PVs. Moreover, we presented a workflow for building a volumetric atrial model from an endocardial surface derived from the SSM. Together with 100 example volumetric geometries generated by varying the coefficients of the principal components uniformly in the [−3, +3]σ range including fiber orientation, inter-atrial bridges and anatomical labels, the SSM is openly available (Nagel et al., 2020). Electrophysiological simulations covering atrial excitation spread and propagation of electrical potentials to the body surface were conducted on these 100 example shapes. The resulting P wave durations obtained with the proposed SSM of 104.8 ± 8.5 ms are in agreement with the P wave durations of 100-105 ms reported for individuals with a low atrial fibrillation risk in an extensive cohort study based on 285,933 ECGs (Nielsen et al., 2015). On the one hand, this suggests that our model is capable of producing a large variety of variation shapes leading to realistic ECG feature values when compared to clinical recordings. On the other hand, it implies that the additional P wave duration variability observed in individuals with increased atrial fibrillation risk (92-116 ms range covering 20-80% percentiles in Nielsen et al. (2015)) is either due to pathological anatomical variability not represented in the healthy dataset used to build this SSM or due to non-anatomical, functional changes such as conduction velocity slowing due to fibrotic infiltration of the atrial tissue (Caixal et al., 2020).
In our model, 23 eigenvectors are necessary to explain 95% of the total variance of the dataset. In the LA-only SSM built by Corrado et al. (2020), the first 15 eigenmodes cover 95% of the entire shape variance. Cates et al. (2014) reported that only 8 eigenvectors account for 95% of the total variance. However, these two models do neither consider the LAA nor the RA which explains the higher complexity of our model and in turn the need to include more eigenvectors to cover the majority of the shape variance. By allowing only a variation of the PVs' orientation in anterior-posterior direction during correspondence retrieval, we prevent changes in the PV lengths and diameters to be reflected in the model's eigenmodes, which was reported as a possible limitation of the model by Cates et al. (2014).
Varying the first eigenvector in the LA SSM published by Varela et al. (2017) causes a variation of the total LA volume as it is also the case in our model (Fig. 4). Also Corrado et al. (2020) and Cates et al. (2014) report a change of the total LA size in the first eigenmode. In the latter, the first mode represents a dilation of the LA mainly in anterior-posterior direction, which is also the case for the third eigenmode of the SSM built by Corrado et al. (2020), where the first eigenmode rather represents an elongation of the LA in medial-lateral direction. Cates et al. (2014) also constructed a separate SSM of the LAA and found that its first shape parameter corresponds to a change in the LAA length which is as well described by the third eigenmode of our model.
The shape variations encoded in the leading eigenmodes of our novel bi-atrial SSM are consistent with previously published LA-only SSMs as far as the LA is concerned. This demonstrates that our model is able to reflect the same main shape variations even though it is based on a dataset of less than half the size compared to the other models. The second eigenmode of our model represents the asymmetry between right and left atrial volume. This further manifests the novel insights into inter-subject atrial geometry variations revealing with our model since this shape change cannot be captured with two separate RA and LA SSMs.
The generalization results demonstrate that our model is able to accurately predict the shape of a previously unseen atrial geometry. Only the reconstruction of the distal part of the PVs caused vertex-to-vertex distances to the original instances of > 1 cm in two particular cases. The specificity results of 7.82 ± 1.04 mm leave room for improvement. However, the low specificity scores do not result from unanatomical characteristics of the generated shapes. They occur rather due to the small dataset of 47 instances available for selecting the closest shape during evaluation. Considering the MR slice thickness of predominantly 2.7 mm (Tab. 1), a specificity rmse of 7.82 mm is in the range of less than 2 voxel diameters with segmentation uncertainty adding to it (Karim et al., 2018). The specificity evaluation of our model therefore indicates that randomly generated shapes produce valid shapes with an accuracy in the range of the error susceptibility during segmentation.
The atria segmented for this study originate from datasets comprising images of not only healthy subjects but also patients with a known history of atrial fibrillation. LA enlargement has been linked with an increased risk for this arrhythmia (Andlauer et al., 2018;Broughton et al., 2016;Hamatani et al., 2016). To ensure that our model is not based on a biased dataset with predominantly enlarged left atria, the LA volume (excluding LAA and PV ostia) of the N segmented geometries (82.16 ± 19.16 ml) was compared to reference values. Pritchett et al. (2003) considered all age and BMI groups in healthy individuals. Translating their 2D measurements to 3D volumes as suggested by Al-Mohaissen et al. (2013) yields a [−3, +3]σ range of 10-130 ml with the largest LA volume in our dataset (122 ml) being within that range.
To showcase a potential application, we conducted multi-scale electrophysiological simulations on 100 instances of the shape model. This proof of concept was deliberately based on a simple model not considering locally heterogeneous atrial wall thickness (Azzolin et al., 2020;Karim et al., 2018), disease-induced remodeling of membrane dynamics (Loewe et al., 2014) or diffusive aspects during cardiac depolarization. Also only one torso shape and no rotation and translation of the atria within the torso are considered. The pipeline to generate volumetric models and simulation setups from the SSM is prepared for such extensions, though. Future studies focusing on repolarization could replace the simplistic Eikonal coupling employed here with a reaction-Eikonal scheme as suggested by Neic et al. (2017).
The main advantage of the novel bi-atrial SSM consists in the automated generation of a large number of atrial geometries. In this way, the cumbersome and time-consuming process of anatomical model generation involving MRI segmentation and defining bundles and fiber orientation can be facilitated and expedited. Potse et al. (2018) examined the influence of electrical and structural remodeling on the maintenance of complex reentrant activitiy. With our proposed bi-atrial SSM, also the influence of the general atrial anatomy on the perpetuation and initialization of atrial fibrillation can be quantified. Saha et al. (2016) investigated the effect of endo-epicardial activation delay on the P wave morphology. However, only one atrial geometry was used to deduce models of different atrial wall thicknesses in this study and the authors state the lack of using different geometries as a limitation of their work. The same limitation is listed in the study of Pezzuto et al. (2018) aiming at quantifying the beat-to-beat variability of P waves in patients with atrial fibrillation. With our SSM, a larger number of different volumetric atrial models is easily acquirable.
By means of our bi-atrial SSM, scale-large cohort studies using computer models for simulating atrial activity become feasible. Luongo et al. (2020a,2) found a significant influence of the number of atrial anatomies included on the classification of different types of atrial flutter with a machine learning approach. With the proposed SSM, a large number of geometries can be deduced and used as a basis for in silico big data approaches such as to produce extensive datasets for machine learning applications. The provided instances are ready to be used off the shelf in available computational simulation environments such as openCARP for electrophysiology (Sánchez et al., 2020), openFOAM for fluid dynamics (Jasak, 2009) or FEniCS for continuum-mechanics (Alnaes et al., 2015).

Conclusions
To the best of our knowledge, we built the first SSM incorporating both atria, their appendages and the orientation of the PVs. The model itself and 100 random volumetric atrial geometries including rule-based fiber orientation and anatomical labelling were made publicly available to the community. These models are ready to be used off-the-shelf for electrophysiological simulations. Established quality criteria indicate that the novel SSM can be reduced to a set of 23 eigenvectors and is capable of generalizing well to unseen geometries. P waves simulated on 100 random instances derived from the SSM reproduce the P wave distribution observed in clinical ECGs of healthy individuals. As such, the SSM is suitable to generate comprehensive model cohorts covering the relevant anatomical variability as a basis for large-scale in silico simulations including, but not limited to, ECG simulations for machine learning applications.

Funding Statement
This work was supported by the EMPIR programme co-financed by the participating states and from the European Union's Horizon 2020 research and innovation programme under grant MedalCare 18HLT07.