Anatomically-adaptive multi-modal image registration for image-guided external-beam radiotherapy

Image-guided radiotherapy (IGRT) allows observation of the location and shape of the tumor and organs-at-risk (OAR) over the course of a radiation cancer treatment. Such information may in turn be used for reducing geometric uncertainties during therapeutic planning, dose delivery and response assessment. However, given the multiple imaging modalities and/or contrasts potentially included within the imaging protocol over the course of the treatment, the current manual approach to determining tissue displacement may become time-consuming and error prone. In this context, variational multi-modal deformable image registration (DIR) algorithms allow automatic estimation of tumor and OAR deformations across the acquired images. In addition, they require short computational times and a low number of input parameters, which is particularly beneficial for online adaptive applications, which require on-the-fly adaptions with the patient on the treatment table. However, the majority of such DIR algorithms assume that all structures across the entire field-of-view (FOV) undergo a similar deformation pattern. Given that various anatomical structures may behave considerably different, this may lead to the estimation of anatomically implausible deformations at some locations, thus limiting their validity. Therefore, in this paper we propose an anatomically-adaptive variational multi-modal DIR algorithm, which employs a regionalized registration model in accordance with the local underlying anatomy. The algorithm was compared against two existing methods which employ global assumptions on the estimated deformations patterns. Compared to the existing approaches, the proposed method has demonstrated an improved anatomical plausibility of the estimated deformations over the entire FOV as well as displaying overall higher accuracy. Moreover, despite the more complex registration model, the proposed approach is very fast and thus suitable for online scenarios. Therefore, future adaptive IGRT workflows may benefit from an anatomically-adaptive registration model for precise contour propagation and dose accumulation, in areas showcasing considerable variations in anatomical properties.


Introduction
One of the major challenges during external-beam radiotherapy (EBRT) is addressing the geometrical uncertainties introduced by the changes in shape and location of the tumor and the organs-at-risk (OARs) over the course of the treatment (Roach et al 2011). In case such uncertainties are not taken into consideration during the planning and delivery of EBRT, there is a high risk of under-dosage to the tumor, while at the same time over-irradiating adjacent healthy tissues (Chavaudra and Bridier 2001, Roach et al 2011, Jaffray 2012). However, the recent integration of on-board imagers within the radiotherapy delivery systems has allowed visualizing the treated area and its surroundings during all phases of an EBRT work-flow: planning, delivery and response assessment (Guckenberger 2011. This allows clinicians to identify the anatomical areas of interest on the acquired images and in turn, reduce the impact of geometric shifts and deformations on the overall treatment. Changes in the shape and location of the tumor and the OARs over the course of image-guided radiotherapy (IGRT) are typically determined manually by experienced physicians (Eisenhauer et al 2009, Mundt andRoeske 2011). However, given the current tendencies towards imaging protocols which may include several imaging modalities and contrasts over the full course of the treatment, this manual process can become severely time-consuming and error-prone. In addition, the time required for manual contouring may render a smooth clinical work-flow in an online setting unfeasible, in particular for on-the-fly correction strategies with the patient already on the treatment table. Therefore, an automatic solution would be preferred instead.
A feasible solution for automatic estimation of changes in the shape and location of organ and pathological tissue boundaries over the course of the treatment is multi-modal deformable image registration (DIR) (Hill et al 2001, Mani and Arivazhagan 2013, Sotiras et al 2013. Such methods have the capability to estimate voxel-wise deformations across images acquired either with the same or a different modality and/or contrast. In effect, the patient anatomy can be practically tracked in an automatic manner within each acquired image, over the entire duration of the treatment. Moreover, the estimated deformations allow the up-stream mapping into the reference space of the therapy planning image of the dose delivered by each radiation fraction, allowing thus dose accumulation in a spatially consistent manner. This, in turn, gives way to potential adaptations of the therapeutic plan over the course of the therapy (Kontaxis et al 2017).
A particularly attractive type of multi-modal DIR methods for IGRT are the so-called variational approaches (Weickert et al 2003). Due to their high accuracy and precision, low number input parameters and rapid convergence, such algorithms were demonstrated to be especially beneficial in applications demanding on-the-fly corrections or even real-time performance (e.g. during therapy delivery), where short computational latencies are paramount (Ries et al 2010, Glitzner et al 2015, Zachiu et al 2017b, Lafitte et al 2018. As a functioning principle, such methods estimate deformations between two or more images as the minimizer of a cost function comprising two terms: a data fidelity term and a regularization term. The data fidelity term quantifies the similarity between the images to be registered and decreases as the alignment of the images improves. Data fidelity terms for variational multi-modal DIR methods have been built around concepts such as mutual information (Pluim et al 2003, Maes et al 2003, modality-independent descriptors (Heinrich et al 2012, Reaungamornrat et al 2016 and normalized gradient fields (Denis de Senneville et al 2016, Spahr et al 2018). However, the minimization of the data fidelity term alone usually involves solving an under-determinate system of equations, which therefore leads to a non-unique solution or divergence of the algorithm. Therefore, the regularization term of the cost function adds further constraints on the system, rendering it over-determinate and implicitly the minimization problem becomes well-posed. From a practical perspective, the regularization term limits the degrees of freedom of the estimated deformations and, for anatomically meaningful deformations, should be chosen according to the deformation patterns and physical characteristics of the observed anatomy . Multiple previous studies have chosen spatial smoothness as a suitable constraint on the estimated deformations (Viergever et al 2016). While in many cases this has proven to be a reasonable choice, particularly for aligning organ boundaries, it has been nevertheless demonstrated that within boundaries of elastic soft tissues this type of regularization may result in anatomically implausible deformations (Zachiu et al , 2020. More precisely, smoothness as the regularization constraint frequently leads to considerable compressions and expansions within the deformation vector field, which are physically implausible in such near-incompressible tissue structures. This, in turn, may have a direct impact on the precision and accuracy of the radiation dose mapping and accumulation process. In order to address this, previous studies have suggested replacing the smoothness constraint with a penalty on the Jacobian determinant of the deformations (Rohlfing et al 2003, Haber and Modersitzki 2004, Haber and Modersitzki 2006. While this has led to acceptable boundary alignment and more anatomically meaningful deformations, it does so only for near-incompressible anatomical regions. For areas which do undergo volumetric changes, incompressibility as a motion constraint frequently leads to misregistrations, in particular in purely gas/liquid-filled compartments of the human anatomy. Since variational algorithms evaluate their respective cost function over the entire field-of-view (FOV), the resulting misregistration of such problematic regions may affect the estimation accuracy in adjacent incompressible regions, or worseshould such a problematic region dominate the cost function -even prevent convergence entirely. Thus, a regularization term which takes into consideration the local properties of the observed anatomy would be preferred.
The idea of introducing local constraints on the estimated deformations has been previously explored in the context of mono-modal registration of CT images depicting bony anatomies aside elastic soft tissues. Given that bones do not undergo deformations during typical anatomical motion, previous studies suggest fitting the parameters of a rigid transformation to the deformations estimated within the bones by a smoothness-regularized DIR algorithm (Ruan et al 2006, Reaungamornrat et al 2014, König et al 2016. Thus, the displacement of the bones is determined by the fitted rigid transformation whereas the deformations estimated within the elastic soft tissues are constrained to be spatially smooth. A different approach to anatomically-adaptive DIR employs the minimization of the elastic potential induced by the estimated deformations within the observed tissues (Franz et al 2007, Modersitzki 2007. Depending on the elastic modulus and Poisson ratio of each individual tissue, the deformations estimated in different anatomical areas will have a particular contribution to the total elastic potential and in turn they will be regularized to different extents. While promising, such an approach is complicated by the requirement of having precise knowledge of the individual biomechanical tissue properties. In effect, the aforementioned studies were conducted on synthetic data and 2D slices of MR and CT images, employing distinct mechanical parameters for the bone and muscle tissues. Moreover, the complex numerical optimization scheme required for the minimization of the total elastic potential can lead to convergence times which may impede a smooth IGRT workflow. Previous studies have also proposed spatial adaptation of the regularization by the means of anisotropic diffusion, for improved motion estimation in the proximity of sliding interfaces within CT/CBCT and MR images of the lung and/or liver (Pace et al 2013, Papież et al 2018. In the current work we propose an anatomically-adaptive multi-modal variational DIR algorithm, which locally employs a smoothness or an incompressibility penalty, depending on the anatomical properties of the observed anatomy. A test bench is also constructed, which demonstrates the benefits of such an approach compared to previously proposed registration models which employ a smoothness or incompressibility regularization on a global level. Moreover, despite the more complex nature of an anatomically adaptive approach, the numerical implementation has been optimized such that the performance remains entirely compatible with the requirements of online IGRT.

DIR algorithms employing a global regularization
In the scope of this work we have selected as a point-of-comparison to our anatomically-adaptive approach (which is described in detail in the subsequent paragraph), two previously-proposed algorithms which respectively also employ smoothness and incompressibility as a regularization constraint but, contrary to the new approach, assume validity of their respective regularization over the entire field-of-view (FOV). Both methods were previously validated for specific scenarios within the context of adaptive image-guided radiotherapy (Zachiu et al , 2020. (a) The EVolution registration algorithm (abbreviated EVO in the scope of this manuscript), initially proposed in (Denis de Senneville et al 2016), estimates the deformation between two images I 1 and I 2 as the minimizer of the following functional: where: with ⃗ u = (u 1 , u 2 , u 3 ) being the 3D displacement, Ω the image domain,⃗ r the voxel position, ⃗ ∇ is the spatial gradient operator, ∥ · ∥ 2 is the Euclidean norm, α is a parameter linking the two terms of the functional and Γ is a symmetric cubic neighborhood around ⃗ r. The data fidelity term (i.e. the first integrand) of the cost function in equation (1) favors the alignment of strong gradients (e.g. edges) present in both I 1 and I 2 . The regularization term (i.e. the second integrand) addresses the ill-posedness of minimizing the data fidelity term alone, by assuming that the underlying anatomical deformations are smooth/differentiable. The regularization parameter α controls the amount of smoothness that the estimated deformations should showcase. Additional details on the numerical minimization of equation (1)  (b) The incompressibility-regularized EVolution algorithm (referred to as EVI for the remainder of the manuscript), is a variation of EVO, initially designed as an improved solution for estimating anatomically meaningful deformations within elastic soft-tissues . Similar to EVO, it is a multimodal variational image registration algorithm, with the deformations being estimated as the minimizer of: being the Jacobian determinant of the deformations. The rest of the terms maintain their meaning from equation (2). The regularization term in equation (3) penalizes the occurrence of compressions and expansions within the estimated deformations and has been demonstrated to be a better choice within near-incompressible anatomical structures compared to the smoothness regularization of EVO (Zachiu et al , 2020. Note however, that the penalty on the Jacobian determinant only penalizes deviations from one but does not guarantee a value of exactly one. Depending on the value of data fidelity term and the regularization parameter β, deviations from one can occur within the estimated deformations.

Proposed anatomically adaptive DIR solution
In the scope of this paper we propose replacing the global regularization employed by EVO and EVI, with a localized regularization which depends on the specific anatomical and physical properties of the observed anatomical structures. In effect, this new registration model which we will call adaptive-EVolution (AEVO) provides the deformations between two images as the minimizer of the following functional: where M S and M I are binary 3D matrices, defining the regions where the observed anatomy undergoes smooth or incompressible deformations, respectively. Similar to equation (2) and (3), the regularization parameters γ and δ control the amount of smoothness and incompressibility within their respective image regions. The two binary masks M S and M I were chosen here to be complementary (i.e. M I = 1 − M S ). Alternatively, both regularizations could be employed simultaneously (i.e. M S = M I = 1) and allow parameters γ and δ to be spatially variant instead. The amount of smoothness and incompressibility of the estimated deformations could then be weighted in accordance with the biomechanical properties of the observed tissues. The minimization of the cost function in equation (5) was performed by running an iterative fixed-point scheme on the resulting Euler-Lagrange equations, similar to the approach described in (Denis de Senneville et al 2016) and  for EVO and EVI. The registration process was performed in a coarse-to-fine manner, by iterating it from a 16-fold downsampled version of the images up to their original resolution, with an upsampling factor of two. Together with the images, the binary masks M S and M I were also downsampled accordingly, using an averaging downsampling kernel. At each resolution level, we have also employed iterative refinement of the deformations, which implied restarting the registration process several times at the same resolution, using as an initial value the deformations from the previous refinement iteration. Further details related to a coarse-to-fine strategy with iterative refinement of the deformations for estimating large displacements can be found in (Brox et al 2004). The registration process at each resolution level was stopped when the relative average difference between the motion fields estimated by two successive iterations was lower than 5% (chosen empirically). The stopping criterion was evaluated individually for both the compressible and incompressible areas. Note that the EVO and EVI cost functions were minimized using the approach described in their original manuscripts (Denis de Senneville et al 2016, Zachiu et al 2018).

Experimental setup
The AEVO, EVO and EVI registration methods were evaluated within the context of two clinically relevant applications of IGRT: CT/conebeam CT(CBCT)-guided radiotherapy for lung cancer (Hardcastle et al 2013, Samavati et al 2015, Cole et al 2018, Rigaud et al 2019 and MR-guided radiotherapy of prostate cancer (Dang et al 2018, in Kim et al 2019, Dunlop et al 2020. The two anatomical sites were also chosen due to the considerably different deformation patterns showcased by the respective structures present within the field-of-view, i.e. compressible areas (lungs, bladder and rectum) situated in the immediate vicinity of near-incompressible structures (such as the pathological tissue itself). In the following we will describe the medical image data used for this purpose, together with the employed evaluation methodology for the algorithms.

Selection of the medical image datasets.
For CT/CBCT-guided radiotherapy for lung cancer, we have selected a 4D-CT image sequence from five patients, with each sequence sampling the 3D anatomical changes over the course of their respiratory cycle (from 0 to 90 %, with an increment of 10%). The data was downloaded from 'The Cancer Imaging Archive' , originally collected in the scope of the following studies (Roman et al 2012, Clark et al 2013, Balik et al 2013, Hugo et al 2016, Hugo et al 2017. For each dynamic image volume of the selected 4D-CT sequences, the curators of the database have also provided contours for the lungs, the tumor and several other structures. Out of each 4D-CT sequence we have only selected image pairs which showcase the largest anatomical differences between them. In effect, the full-inhalation image (the 0% phase) was established as reference image for the registration process, whereas the 40 − 70 % images were considered to be due for registration. The resulting dataset was then used to evaluate the performance and behavior of EVO, EVI and AEVO for CT to CT registration. Note that for the AEVO method, the lungs, the heart and the area outside the body were considered to be compressible, while the rest of the structures present in the FOV were deemed near-incompressible.
In order to evaluate the performance of the algorithms for CT to CBCT registration (potentially beneficial for applications described in detail in (Zachiu et al 2017b)), a set of CBCT images were synthesized from the 40-70% CT images mentioned above. This was achieved by a three-fold down-sampling of their sinogram and subsequent reconstruction via the TIGRE toolbox (Biguri et al 2016). As showcased in one of our previous studies, this results in a set of synthetic CBCT images with a level of streaking artifacts similar to that provided by a regular CBCT imaging device (we refer the interested reader to (Zachiu et al 2017b) for further details). Thus, registering the synthetic 40-70% CBCT images to the original 0% CT via the three algorithms, allowed comparing their performance for CT-CBCT registration. In terms of acquisition parameters, the original CT images were acquired on a Varian Trilogy scanner, in-plane image size 512 × 512 and a variable number of slices, voxel size 0.97 × 0.97 × 3.0 mm 3 .
Additionally, we have also conducted an evaluation of the accuracy of the three algorithms for CT -CT registration, by their employment on the ten image pairs available from the DIR-Lab database (Castillo et al 2009, Castillo et al 2009a. Aside the images, the curators of the database have also made available the location of 300 landmarks for each image pair, allowing the evaluation of the estimated landmark displacements relative to the ones extracted from the database. For further details on the characteristics and format of the data, we refer the interested reader to the database hosting website 1 . The evaluation of the algorithms for MR-guided radiotherapy for prostate patients was performed on image data acquired as part of a cohort study on the Unity MR-Linac installed at the UMC Utrecht, Utrecht, The Netherlands. In-line with the potential requirements of adaptive IGRT on the MR-Linac, we have evaluated the algorithms for: 1) Mono-modal MR image registration; 2) Multi-modal/cross-contrast MR image registration and 3) CT to MR image registration. For each of these instances, we have selected image pairs from five patients, with contours for the prostate, bladder, rectum, femurs and the pelvis being available for each individual image. The contours were validated for accuracy by two experienced radiation oncologists. For registration using the proposed AEVO method, the prostate and the bony structures (i.e. the pelvis and the femurs) were considered to be near-incompressible, while the remainder of the areas (including the region outside the body) were allowed to smoothly deform. The acquisition parameters for the reference and the moving images in the three datasets are reported in table 1. Note that all MR images were acquired on a 1.5 T Philips Achieva scanner, Philips Healthcare, Best, The Netherlands, while the CT prostate cancer images were acquired on a Philips Brilliance Big Bore scanner, Philips Healthcare, Best, The Netherlands. For instances in which the reference and the moving image had a different FOV and/or voxel size, the moving image was initially re-sampled on the grid of the reference. This was subsequently followed by resizing both images to 256 × 256 × 256, for computational purposes. For the CT -MR dataset, we have also performed an approximate rigid manual realignment of the images prior to applying deformable registration.

Algorithm evaluation criteria
The three algorithms were comparatively evaluated against three criteria: contour propagation performance, anatomical plausibility of the estimated deformations and accuracy of the estimated deformations.
(a) In order to determine the comparative performance of the algorithms for contour propagation, we employed the dice similarity coefficient (DSC) before and after registration (Dice 1945): where A and B are two contours whose overlap is determined and | · | corresponds to the number of voxels within a contour. (b) For anatomical plausibility, the voxel-wise Jacobian determinant of the estimated deformations was evaluated. It is known from continuum mechanics that deformations of near-incompressible materials have a Jacobian determinant that is close to one. Similarly, within anatomical areas which are known to undergo volumetric changes (such as the lung, bladder and rectum), a value which is close to one may be indicative of misregistrations. For the mathematical expression of the Jacobian determinant of a 3D deformation field we refer the reader to equation (4). (c) For the comparative evaluation of the accuracy of the algorithms, we have initially evaluated the target registration error (TRE) between the estimated and the known displacements of the landmarks extracted from the DIR-Lab database. Mathematically, the TRE was calculated as: where (x I , y I , z I ) and (x J , y J , z J ) are the known landmark coordinates in the reference and the moving image, and (u 1 (⃗ r I ), u 2 (⃗ r I ), u 3 (⃗ r I )) is the estimated 3D displacement for the landmark. As a second experiment in this sense, we have also applied a set of known deformations on one of the CT lung cancer images and one of the MR prostate cancer images. In this manner, two pairs of synthetically deformed images were generated and subsequently registered to one-another via the investigated algorithms. The voxel-wise root square error between the known and the estimated deformations was then calculated and used as an evaluation criterion: where ⃗ u est and ⃗ u known are the estimated and the known deformations, respectively, ∥ · ∥ 2 is the Euclidean norm and⃗ r is the voxel position. The known deformations were generated via a finite element modeling (FEM) of the organ displacements in each of the two images. In summary, the available organ contours were initially imported into the FEM software PreView v2.1.4 (Maas et al 2012), which allowed establishing the physical properties of the tissues and the motion actuators. The latter implied a lung inflation for the CT data and an increase of the bladder volume and rectum for the MRI. This was followed by the effective finite element simulation of the resulting deformations via the FEBio v2.9.1 software (Maas et al 2012).
Finally, the simulated deformations were inspected using PostView v2.4.4 (Maas et al 2012) in order to confirm their similarity to the typical deformation patterns occurring in these anatomical regions. The resulting vector fields were then used to deform the original CT and MR image, respectively. For modeling the organs in Preview v2.1.4 we used a tetrahedral mesh with an average element volume of ∼ 15 mm 3 . For the lungs, lung tumor, bladder and rectum a neo-hookean material model has been employed, while the rest of the anatomical structures were assumed isotropic elastic. During the simulation, sliding contacts were implemented for lung -body, liver -body and spleen -body interfaces, with the rest of the anatomical structures situated in-contact with one-another (e.g. prostate -rectum, prostate -bladder) being connected using layers of fat. As boundary conditions, the displacements for the spine (for the CT data) and for the pelvic bones and femurs (for the MR data) were set to zero. The employed elastic modulus (E) and the Poisson ratio (ν) for the tissues of interest were adopted from several previous studies (Qiao et

Algorithm implementation and calibration
All algorithms were implemented using the Compute Unified Device Architecture (CUDA) and executed on a Nvidia TITAN V graphics card. The regularization parameters α, β, γ and δ (see equation (1), (3) and (5)

Performance of the investigated algorithms for organ boundary alignment
3.1.1. CT/CBCT-based deformation estimation in lung cancer patients. Figure 1 provides a visualization of the typical registration results for the CT -CT and CT -CBCT registration of lung cancer data (see section 2.2.1). It can be observed that, to different extents, EVO, EVI and AEVO improve the alignment between the registered images. However, while the alignment of the tumor appears to be similar for the three algorithms, EVI showcases a considerable post-registration misalignment in the caudal part of the lungs.
These observations are re-confirmed by the results illustrated in figures 2(a) and (b), which show separately for the CT-CT and CT-CBCT datasets, the statistical distribution of the pre-and post-registration Dice Similarity Coefficient (DSC) (see section 2.2.1). Note that individually for the CT-CT and CT-CBCT image datasets, the DSC values were pooled together from the five patients, for each of the anatomical structures-of-interest. The box limits of each boxplot are represented by the 25th and 75th percentiles, while the whiskers are approximately the 1st and the 99th percentiles, respectively. While all three algorithms lead to improvements of the post-registration DSC, there is a noticeable tendency of the EVI algorithm to under-perform for the lungs, compared to both EVO and AEVO. On the other hand, there are no considerable differences between the algorithms for the tumor itself (assumed near-incompressible). Figure 3 displays the alignment before and after registration, for one of the image pairs selected from each of the monomodal MR, multimodal MR and CT-MR prostate cancer datasets (see section 2.2.1 for details). It can be noticed that in all cases, the three evaluated algorithms lead to a similar post-registration alignment for the prostate. On the other hand, we can observe that EVI showcases particular difficulties in aligning the bladder and the rectum boundaries.

MR-based deformation estimation in prostate cancer patients.
For a more quantitative comparative evaluation of the boundary alignment capabilities of the three algorithms for these datasets, table 3 reports the DSC before and after registration. The values in the table represent the DSC averaged over the five image pairs, reported individually for the bladder, prostate and rectum. Similar to the CT/CBCT data, all three algorithms have led to a notable improvement of the DSC in  all cases, compared to pre-registration. On the other hand, while for the prostate itself all three algorithms perform similarly, for the bladder and rectum, the EVI algorithm has a tendency to misregister the images in these areas, with differences of the average DSC of up to 7% compared to EVO and AEVO. The latter two algorithms, however, showcase no notable differences in terms of the DSC. Figure 4 illustrates the voxel-wise Jacobian determinant of the deformations estimated by the evaluated algorithms on one of the image pairs from each of the five datasets. It can be observed that in all cases, the EVO algorithm displays moderate to high deviations of the Jacobian determinant from one, regardless whether the underlying anatomy is near-incompressible or not. On the other hand, as the EVI algorithm penalizes deviations of the Jacobian determinant from one, such deviations are considerably reduced over the entire field-of-view. This also includes anatomical structures which are expected to change volume such as the lung and the rectum. The AEVO method, on the other hand, maintains a Jacobian determinant close to one for structures which were considered to be near-incompressible in the scope of this study (enclosed by the red dashed lines), while for the remainder of the structures moderate to high deviations can be observed. Another interesting observation is that for the illustrated CT-CBCT case, the AEVO algorithm showcases a more uniform spatial distribution of the Jacobian determinant within the lung, compared to the EVO method.

Anatomical plausibility of the estimated deformations
To support the observations stemming from figure 4 and 5 shows the statistical distribution of the Jacobian determinant provided within incompressible structures by the three registration algorithms. The CBCT-based registration. The two images showcase the statistical distribution of the DSC for the lungs and the tumor before and after registration using the three algorithms. The lungs were considered to be compressible, while the tumor was considered near-incompressible. illustrated boxplots were generated by pooling together the Jacobian determinant of the deformations within incompressible structures from all registered image pairs in each of the five datasets. It can be observed that, to different extents, the EVO algorithm leads in all cases to considerably larger variations of the Jacobian determinant compared to EVI and AEVO. On the other hand, within incompressible structures, AEVO demonstrates similar ranges of the Jacobian determinant to EVI. Table 4 reports the mean and standard deviation of the target registration error (TRE) for the evaluated algorithms, when employed for the registration of the ten DIR-Lab image pairs (see sections 2. 2.1 and 2.2.2). The values reported in the second column (i.e. 'No registration'), correspond to replacing u 1 , u 2 and u 3 in equation (7) with zero, indicating the misalignment between the landmarks in the absence of registration. It can be observed that for all ten of the datasets, all three algorithms lead to an improvement of the TRE, compared to when no registration is performed. However, we can observe that as the deformation between the registered images increases (corresponding to larger lung volumetric changes), the accuracy of EVI deteriorates considerably compared to EVO and AEVO, which perform similarly for all datasets. Particularly Figure 3. Example of pre and post-registration image alignment for the mono-modal MR, multi-modal MR and CT -MR prostate datasets. The images showcase the color-coded overlap, in a sagittal slice, between the reference and the moving image (first column) and the images registered by the three evaluated registration algorithms (columns two to four). Table 4. Accuracy of EVO, EVI and AEVO following the registration of the ten DIR-Lab image pairs. The results are reported under the shape of mean ± standard deviation of the TRE for the 300 landmarks per dataset.

Dataset
No registration ( problematic for all three algorithms is dataset #8, for which the average and standard deviation of the TRE is notably higher compared to the rest of the datasets. Figure 6 showcases a selected slice from the anatomical images used in the FEM experiment (see section 2.2.2). More precisely, the images illustrate a coronal (figures 6(a)-(c)) and a sagittal (figures 6(d)-(f)) slice selected from the reference image, the moving image and their color-coded fusion, associated to the lung CT and prostate MR FEM simulation. Within the color-coded fusion, the magenta channel corresponds to the moving image, while the green channel is the reference. As intended, it can be observed that the anatomical differences between the images stem from an inflation of the lungs for the CT data and a volumetric increase of the bladder and rectum for the prostate cancer data. figure 7 illustrates, for several structures-of-interest, the spatial and the statistical distribution of the magnitude of the FEM-generated deformations for the two Figure 4. Anatomical plausibility of the deformations estimated by the three algorithms on the five datasets -example. The first column showcases a slice from the reference image of one of the registered image pairs from each of the five datasets. Columns two, three and four display the spatial distribution of the Jacobian determinant of the deformations estimated by EVO, EVI and AEVO, respectively. The red dashed lines indicate the anatomical areas which were considered to be incompressible in the scope of this work. The blue dashed lines within the first two rows enclose the lungs, which were considered compressible. datasets. For the lung dataset, the magnitude of the deformations increases in the cranio-caudal direction, locally exceeding 18 mm in the caudal part of the lungs, whereas for the lung tumor, the displacements remain in the vicinity of 8 mm. For the FEM-generated prostate MR data, the joint bladder/rectum deformations, which were designed to undergo volumetric changes during the FEM simulation, reach up to ∼ 18 mm, with the larger displacements occurring within the anterior part of the bladder. In turn, the volumetric changes of the bladder and rectum lead to displacements of the prostate of up to 6 mm.
The spatial distribution of the Jacobian determinant of the FEM gold standard deformations and the ones estimated by the three algorithms on the FEM-generated data is illustrated in figure 8. The results are consistent with the observations made for the clinical data: the EVO and AEVO algorithms showcase moderate to high spatially inhomogeneous deviations from one within compressible structures such as the lungs, bladder and rectum, whereas the values for EVI remain in the proximity of one. On the other hand, within incompressible anatomies, the EVO method showcases moderate to high deviations from one, while AEVO and EVI provide similar values. This is again re-confirmed by the statistical analysis of the Jacobian determinant within incompressible structures, showcased within figure 9. Figures 10 and 11 illustrate the spatial and the statistical distribution of the RSE within several structures of interest, between the FEM-generated deformations and the deformations estimated by the three evaluated algorithms. We observe that, while EVO tends to provide low-to-moderate errors towards the outer boundaries of the lung and the majority of the bladder and the rectum, within low signal areas of the lung, the RSE locally increases to above-average values of up to ∼ 4 mm. Such areas are in good correspondence with the location of inhomogeneities within the Jacobian determinant map from figure 8. For the bladder and the rectum on the other hand, the EVO RSE values remain under 2 mm. EVO also showcases moderate-to-high RSE values within near-incompressible structures such as the lung tumor (of up to 3 mm), the liver (> 6 mm) and the prostate (of up to ∼ 2 mm) which are, to different extents, in spatial correspondence with deviations of the Jacobian determinant from one in figure 8. EVI on the other hand, leads to sub-millimeter errors within the near-incompressible lung tumor and the prostate. However, the RSE is increasing considerably towards the caudal part of the lungs and within the bladder, exceeding 7 mm for the former and 10 mm for the latter. The proposed AEVO algorithm demonstrates overall low-to-moderate RSE values of under 2 mm for the lungs and mostly under 1 mm for the bladder and rectum, while for the near-incompressible lung tumor and prostate, the RSE remains under 1 mm and 2 mm, respectively. Table 5 reports the average computational time required by the three evaluated registration algorithms, for each of the five registered image modalities. Recall that all images were re-sampled on a 256 × 256 × 256 grid prior to registration (see section 2.2.1).

Computational time of the proposed registration algorithm
It can be observed that in all instances, the average convergence time remains under 15 s, with the EVO algorithm converging notably faster compared to the other two methods.

Discussion
Several applications within image-guided radiotherapy such as daily positioning, contour propagation, dose accumulation and adaptive re-planning may potentially benefit from the inclusion of DIR algorithms in their workflows, with the aim of improving the geometric accuracy of treatment planning, delivery and response assessment , Paganelli et al 2018. Since current IGRT may include multiple imaging modalities and contrasts over the course of the treatment, multi-modal DIR methods are of particular interest in this context. Additionally, in case such methods are employed for applications requiring on-fly-processing, for example if the patient is already positioned on/in the therapy system, computational times and robust convergence become an important factor. In this sense, variational multi-modal DIR  Here, we propose a variational multi-modal registration algorithm, which adapts its registration model in accordance with the observed underlying anatomy (AEVO). In order to determine the benefits such an approach may provide, we have comparatively evaluated it against two existing solutions: one with a regularization term stemming from the field of digital image processing (EVO) (Denis de Senneville et al 2016) and its variation which has been adapted for estimating elastic soft tissue deformations (EVI) . The rationale is hereby that the majority of the existing cross-modality variational registration algorithms such as EVO or EVI employ a spatially invariant regularization across the entire field of view. The underlying assumption is hereby, that (a) either the overall tissue structure is sufficiently homogenous with respect to their biomechanical properties to justify this generalized choice of the regularization, or (b) that the data fidelity term of the variational is strong enough "to pull" the algorithm even in those regions to the anatomically correct result, where assumption (a) is (partially) violated. As our results and the results of numerous papers have shown, this assumption is generally well fulfilled for the evaluated EVO and EVI algorithms and can furthermore be addressed by choosing the appropriate regularization parameter. However, in the direct vicinity of biomechanically heterogeneous regions, and in particular for imaging modalities (or anatomical regions) associated with poorer soft tissue contrast, these assumptions become problematic due to the limited contribution of the data fidelity term to the variational in these regions. As it has been shown in previous studies for finite element-based registration algorithms (Balik et al 2001, Brock et al 2005, Zhong et al 2012, Velec et al 2017, taking local biomechanical properties as prior knowledge into account can significantly improve the result of the registration process. Unfortunately, for most biomedical image modalities such as MRI, CT or CBCT it is generally not possible to derive precise local knowledge of all biomechanical parameters on an inter-individual basis from the images themselves (Franz et al 2007, Modersitzki 2007. Nevertheless, almost all these imaging modalities allow a coarse classification of imaged tissue into the categories air, liquid, soft-tissue and bone. This, in-turn, unlocks the possibility to employ variational algorithms with a locally variant regularization, which matches locally best for a given type of tissue.
An additional point to be made is that all three algorithms employ the same data fidelity term, allowing thus to study the impact the different regularization terms have on the estimated deformations.

Algorithm performance for contour propagation
The contour propagation capabilities of all three algorithms was evaluated in the context of registering multiple imaging modalities which may be involved in an IGRT workflow: CT -CT, CT -CBCT, same-contrast MR, multi-contrast MR and CT -MR. The images were acquired as part of a protocol for lung and prostate cancer patients, respectively. Both of these anatomical sites contain neighboring regions with considerably different deformation patterns. For example, while the lungs can undergo significant volumetric changes over the respiratory cycle, the surrounding tissues are only displaced/transported from one part of the FOV to another, with limited changes in volume. A similar situation arises in the prostate, which may be displaced under the effect of volumetric changes within the bladder and/or rectum.
In order to compare the contour propagation performance of the three algorithms, we have evaluated the pre-and post-registration dice similarity coefficient (DSC) for several structures of interest in each of the five datasets. As shown in figure 2 and table 3, all algorithms have led to a notable improvement of the DSC compared to the pre-registration case. However, for anatomies which can undergo volumetric changes (such as the lungs, bladder and rectum), the EVI algorithm leads to systematically larger errors in terms of contour alignment, compared to EVO and AEVO. This behavior can also be observed in the registration results exemplified by figures 1 and 3, where it can be seen that EVI leads to a notable misalignment of the lung, bladder and/or rectum boundaries. Such an outcome is, nevertheless, in line with the design of EVI, which globally penalizes deviations of the Jacobian determinant from one and implicitly volumetric changes. Concerning AEVO, while there is a tendency to provide slightly lower values for the DSC compared to EVO, this situation only occurs within isolated cases, with limited differences of 1-2 %.
On the other hand, for near-incompressible structures such as the pathological area, all three algorithms provided similar DSC values in both the lung and the prostate cancer cases. It is worth noting that for EVO and EVI this is in good correspondence with our previous findings in (Zachiu et al , 2020.

Anatomical plausibility of the estimated deformations
As previously discussed, the Jacobian determinant of near-incompressible anatomies should be close to one. Thus, large deviations from this value in such areas are physically implausible and indicative of misregistrations (Schreibmann et al 2012. In turn, this may lead to mapping errors of quantitative information such as Houndsfield units, radiation dose and/or diffusion/perfusion values, as discussed and demonstrated by previous studies (Zachiu et al , 2020. In effect, we have evaluated here the spatial and statistical distribution of the Jacobian determinant for all three of the algorithms. What is already apparent from a visual analysis of figures 4 and 8 is that both EVO and EVI provide a spatial distribution of the Jacobian determinant with consistent characteristics over the entire FOV. While EVO showcases moderate to high deviations from one within both compressible and near-incompressible structures, EVI overall penalizes such deviations, including within structures which typically undergo volumetric changes. Particularly for the FEM experiments (see figure 8), where large volumetric changes were simulated for the lungs (especially in the caudal area) and the bladder, a Jacobian determinant close to one within these structures is an indication of misregistration. This is further confirmed by the spatial distribution of the RSE illustrated in figure 10. AEVO, on the other hand, limits deviations of the Jacobian determinant from one solely within structures which were considered to be near-incompressible, with the areas outside these structures showcasing volumetric changes to approximately the same extent as EVO. A noteworthy observation is that, within the lungs, EVO showcases considerable spatial inhomogeneities of the Jacobian determinant within areas of low image intensity, indicative of non-uniform volumetric changes (see figures 4 and 8). From a clinical point-of-view, spatially-inhomogeneous volumetric changes are physiologically plausible for lung cancer patients. As demonstrated by previous studies, tumors and fibrotic tissue may end-up changing the local mechanical properties of the lung tissue, leading to a spatially-variant ventialation capacity (Kipritidis et al 2019). However, in the case of EVO, the volumetric inhomogeneities indicate spatial transitions from expansions to compressions, which is an unexpected deformation pattern in between two images acquired at two different phases of the respiratory cycle. The implausibility of such deformation patterns is also re-confirmed by the EVO RSE map illustrated in figure 10, which showcases peaks in areas with large inhomogeneities of the Jacobian determinant. By comparison, the AEVO algorithm leads to smoother spatial variations of the Jacobian determinant, with sudden expansion-compression transitions being dampened. We hypothesize that this stems from the multi-resolution scheme used to minimize the AEVO variational (see section 2.1.2). At the lower resolution levels, both the smoothness and incompressible regularizations may end-up operating on the same voxels. This can lead to an incompressible deformation pattern, partially being present within smooth regions of the motion field used as an initializer for the higher resolutions. In turn, this leads to a more uniform spatial distribution of the Jacobian determinant within compressible areas. Overall however, this may allude to the fact that the smoothness regularization may not be a suitable choice when estimating lung deformations during respiration and that alternatives might have to be considered.
Within near-incompressible areas on the other hand, it can be seen in figures 5 and 9 that while AEVO and EVI provide values within a similar range, EVO leads to a systematic ∼ 20-40% higher range of the Jacobian determinant. In effect, the deformations provided by AEVO and EVI are more anatomically plausible in near-incompressible anatomical structures, compared to EVO. Note however, that both EVI and AEVO explicitly penalize (in a global and local manner, respectively) deviations of the Jacobian determinant from one. Thus, the anatomical plausibility of the deformations with respect to this criterion is to be expected, once the algorithms converge.

Accuracy of the algorithms
The experiment conducted on the DIR-Lab dataset provided an indication of the accuracy of the three algorithms in high-contrast areas of the lungs, represented by anatomically identifiable landmarks (see table 4). For datasets #1-#3 (showcasing the lowest amount of deformations), all three algorithms performed similarly, with AEVO having only a slight tendency to outperform the other two. Particularly for EVI, this would indicate that as long as volumetric changes remain under a particular threshold (determined by the regularization parameter β in equation (3)), the algorithm manages to estimate deformations just as well as the smoothness-regularized algorithms. This is also evident from figures 8 and 10, where the EVI algorithm showcases sub-millimeter RSE values in lung areas in which the FEM-generated deformations have low deviations of the Jacobian determinant from one (e.g. in the cranial and mid part of the lungs). However, as the magnitude of the deformations increases and implicitly that of the volumetric changes (particularly for DIR-Lab datasets #6 -#9 and the caudal part of the lungs in the CT FEM experiment), the EVO and AEVO algorithms visibly outperform EVI. It is also worth noting that for all DIR-Lab datasets EVO and AEVO perform similarly, with AEVO providing marginally better results. The most problematic among the ten DIR-Lab datasets remains #8, for which while EVO and AEVO outperform EVI, their accuracy is considerably worse relative to the rest of the datasets. In this case we hypothesize that the landmark displacements between the image pairs were so large that they exceed the estimation capabilities of the coarse-to-fine strategy used in the registration process (see section 2.1.2). In effect the data fidelity term most likely converged to a strong local optimum situated far from the global one.
The FEM experiment allowed the analysis of the accuracy of the algorithms not only in high contrast image areas such as organ boundaries and/or anatomical landmarks, but also within homogeneous and/or low-signal/contrast image areas (e.g. the tumor, liver, spleen and low-signal areas in the CT data and the prostate in the MR data). Similar to the DIR-Lab dataset, all three algorithms improve to different extents the alignment between the registered images, given that the range of the RSE values reported in figures 10 and 11 are below the magnitude of the known FEM deformations (illustrated in figure 7). Nevertheless, figures 10 and 11 re-confirm that, within the near-incompressible lung tumor and prostate, the EVO algorithm is outperformed by EVI and AEVO, whereas within compressible anatomical areas, EVI is outperformed by EVO and AEVO. It is worth mentioning however, that for the FEM-generated CT dataset, while the liver is near-incompressible from a physical point-of-view, the EVI algorithm leads to large errors (see figure 10). The reason behind this lies in its limited ability to re-align the caudal part of the lungs which, since the lung-liver interface is the main high-contrast feature in that particular region, has an impact on the accuracy of the deformations estimated within the liver. At the same time, while EVO demonstrates low RSE values in the caudal part of the lung and the cranial part of the liver, it estimates non-physiological deviations of the Jacobian determinant within the contrast-devoid areas of the liver. In turn, this leads to a corresponding increase of the RSE to over 6 mm as shown in figure 10. By comparison, the AEVO method demonstrates notably lower errors, the majority of which are under 3 mm, within the regions immediately caudal to the lungs. Thus, together with the results obtained from the analysis on the DIR-Lab images, figures 10 and 11 indicate that AEVO has the better overall accuracy among the three.
It has to be taken into consideration however, that the accuracy of the evaluated algorithms may be dependent on image acquisition parameters such as image resolution and/or signal-to-noise ratio (SNR). It is expected that, as the image resolution decreases, the accuracy of the algorithms will also deteriorate, due to the loss of high-contrast details, necessary to guide the registration process (Glitzner et al 2015). However, from our experience with variational registration algorithms, as long as the image resolution roughly over-samples the deformations by a factor of two, the resolution is not the limiting factor of the registration accuracy. Since image resolution and SNR, which also influences the registration performance (Zachiu et al 2015a), are coupled, this becomes a more complex analysis which requires a completely separate set of experimental test benches and thus will be the topic of future studies.

Overall algorithm performance
Depending on the specific particularities and requirements of the application at hand, a registration algorithm may or may not be suitable for the task. In case contour propagation within (near-incompressible) elastic soft tissues is of interest, the results in figure 2 and table 3 indicate that either of the three evaluate algorithms would provide satisfactory results, according to the quality assurance guidelines of DIR algorithms in radiotherapy proposed by AAPM Task Group 132 and related studies , Paganelli et al 2018. For anatomical structures which undergo volumetric changes however, the EVI method is by design unsuitable to handle such deformation patterns.
On the other hand, in case the application requires the mapping of quantitative information (such as radiation dose, Houndsfield units, diffusion/perfusion values, etc) within elastic soft tissues, the EVO algorithm may lead to mapping errors due to its tendency to provide moderate to high deviations of the Jacobian determinant from one. As we have demonstrated in our previous studies (Zachiu et al , 2020, particularly for mapping radiation dose in the scope of spatially consistent dose accumulation, this can lead to unrealistic cold and hot spots within the mapped radiation dose. In the eventuality that such information is used for further adaptations of the therapeutic plan, this can end-up impacting the therapeutic end-point with serious potential consequences for the patient. As shown in figures 5 and 9, variations of the Jacobian determinant from one within incompressible areas are considerably dampened by AEVO and EVI. Nevertheless, within compressible anatomical structures, EVI is unsuitable for mapping quantitative information as well, due to its limited ability to register volumetric changes. In effect, the proposed anatomically-adaptive approach appears to be the optimal choice among the three: it is able to accurately track the boundaries of both compressible and incompressible anatomical areas, while at the same time maintaining the anatomical plausibility of the estimated deformations within elastic soft tissues. This assertion is also supported by the DIR-Lab data analysis and the two FEM experiments, which have demonstrated that the AEVO algorithm overall outperforms in terms of accuracy (to different extents) the EVO and EVI methods (as discussed in section 4.3).
Compared to EVO and EVI, however, the AEVO algorithm requires a map of compressible/ incompressible regions as an input. This basically provides the registration method with prior information regarding the deformation characteristics of different anatomical regions present within the FOV (Franz et al 2007, Modersitzki 2007, Garau et al 2019. While this may seem as a limitation of the approach, contours of the tumor and the organs-at-risk (allowing the definition of such areas) are readily available from the therapy planning step of the IGRT. Additionally, the recent developments within the domain of deep learning-based segmentation, present the potential to automate the generation of the required maps.
Another distinguishable difference between AEVO and the two other methods is the requirement to calibrate two regularization parameters instead of one (see equation (1), (3) and (5)). While this leads to longer calibration times, we have observed that from a practical point-of-view, this only needs to be performed once per registration modality. In effect, similar to EVO and EVI, the regularization parameters can be prospectively optimized and, as long as the MR/CT acquisition parameters do not change significantly, the same values can be maintained over practically any number of cases. Nevertheless, for future studies we intend to investigate methodologies which allow the automatic calibration of the proposed AEVO method.
In terms of computational requirements, the average convergence time for AEVO remained under 15 s for all registered image modalities (see table 5). While the overall computational requirements are comparable to those of EVI, there is a notable time penalty relative to EVO. We hypothesize that this is due to the increased numerical complexity associated with the minimization of the AEVO cost function, compared to the single smoothness or incompressibility regularization. Nevertheless, the convergence time of the proposed AEVO algorithm remains compatible with clinical workflows in an online setting. Naturally, the computational requirements of all three of the evaluated methods may scale with the size of the images, the employed hardware and magnitude of the deformations.
An additional aspect worth taking into consideration is the suitability of the incompressibility constraint incorporated by the EVI and AEVO algorithms. In the current study we have investigated scenarios in which the pathological area did not undergo large volumetric changes and thus it was assumed near-incompressible. Naturally, over the course of radiotherapy, the tumor may change its volume as response to radiotherapy, beyond the estimation capabilities of an incompressibility-regularized algorithm. Similarly, the patient may suffer weight loss during therapy. Therefore, the incompressibility regularization is most suitable for intra-fraction scenarios, whereas for inter-fraction adaptive radiotherapy it should be employed selectively, depending on the expected deformation characteristics of the tracked anatomical structures. For AEVO however, this would simply imply adjusting the masks M S and M I in equation (5). For the interested reader, further details related to the suitability of an incompressibility regularization term for variational registration methods employed in IGRT can be found in (Zachiu et al , 2020.

Conclusion
The present study proposes a novel multi-modal variational DIR algorithm, which adapts its registration model in accordance with the local properties of the observed anatomy. The method was comparatively evaluated against two existing DIR methods EVO and EVI, which employ a smoothness and an incompressible regularization over the entire field-of-view, respectively. In terms of contour propagation, both the proposed AEVO and the existing EVO have shown a similar performance, whereas EVI was demonstrated to be sub-optimal for areas which undergo volumetric changes. On the other hand, while the EVO algorithm can lead to anatomically implausible deformations within elastic biological soft-tissues, the proposed AEVO method addresses this shortcoming, leading to a local Jacobian determinant similar to that of EVI. As demonstrated in previous studies, this can lead to a more precise and accurate mapping of quantitative information such as the delivered radiation dose. The FEM experiments have also demonstrated that overall, the AEVO algorithm has the potential to estimate deformations with a higher accuracy, compared to the existing methods. Moreover, despite the more complex registration model and numerical implementation, the proposed anatomically adaptive approach leads to computational times which remain compatible with clinical scenarios in an online setting. Therefore, we can conclude that, compared to the methods employing a global regularization, the proposed AEVO algorithm showcases better potential benefits for future adaptive IGRT workflows.