The application of metal artifact reduction methods on computed tomography scans for radiotherapy applications: A literature review

Abstract Metal artifact reduction (MAR) methods are used to reduce artifacts from metals or metal components in computed tomography (CT). In radiotherapy (RT), CT is the most used imaging modality for planning, and its quality is often affected by metal artifacts. The aim of this study is to systematically review the impact of MAR methods on CT Hounsfield Unit values, contouring of regions of interest, and dose calculation for RT applications. This systematic review was performed in accordance with the PRISMA guidelines; the PubMed and Web of Science databases were searched using the main keywords “metal artifact reduction”, “computed tomography” and “radiotherapy”. A total of 382 publications were identified, of which 40 (including one review article) met the inclusion criteria and were included in this review. The selected publications (except for the review article) were grouped into two main categories: commercial MAR methods and research‐based MAR methods. Conclusion: The application of MAR methods on CT scans can improve treatment planning quality in RT. However, none of the investigated or proposed MAR methods was completely satisfactory for RT applications because of limitations such as the introduction of other errors (e.g., other artifacts) or image quality degradation (e.g., blurring), and further research is still necessary to overcome these challenges.


1 | INTRODUCTION
Radiotherapy (RT) is one of the primary curative treatment options for different types of cancer, for example, cancers of head and neck, prostate, cervix, breast, as well as sarcomas. RT aims to deliver therapeutic ionizing radiation dose to a treatment target, while sparing healthy organs at risk (OARs) as much as possible. The therapeutic radiation dose is delivered to the target using beams produced by a clinical linear accelerator (LINAC) in external beam radiation therapy (EBRT), while in brachytherapy radioactive sources invasively placed near or inside the target are used. Typically, RT workflows include a simulation stage and a treatment delivery stage. During the treatment planning process at the simulation stage, computed tomography (CT) scans serve as a primary source of anatomical information to identify and delineate the target and OARs. In addition, they are used to calculate the electron densities which are derived from the Hounsfield unit (HU) values of those CT scans. This electron density information in combination with the delineations of the anatomical structures is used to calculate the therapeutic radiation dose.
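The step from HU values to electron densities described above can be illustrated with a minimal sketch. It assumes a piecewise-linear calibration curve; the calibration points and the function name below are hypothetical and chosen only for illustration, since real curves are measured per scanner with a density phantom.

```python
import numpy as np

# Hypothetical calibration points: (HU, electron density relative to water).
# Real values are scanner specific and come from a phantom measurement.
CALIBRATION_HU = np.array([-1000.0, 0.0, 1000.0, 3000.0])
CALIBRATION_RED = np.array([0.0, 1.0, 1.6, 2.6])


def hu_to_relative_electron_density(hu):
    """Map HU values to relative electron density by piecewise-linear interpolation."""
    hu = np.asarray(hu, dtype=float)
    # np.interp clamps values outside the calibration range to the end points.
    return np.interp(hu, CALIBRATION_HU, CALIBRATION_RED)


# Toy 2 x 2 "CT slice": air, water, a bone-like voxel, and a metal-like voxel.
ct_slice = np.array([[-1000.0, 0.0], [500.0, 4000.0]])
red = hu_to_relative_electron_density(ct_slice)
print(red)
```

With these assumed points, a dark streak that lowers a water voxel from 0 HU to −200 HU would be assigned a relative electron density of 0.8 instead of 1.0, which is how an HU error propagates into the dose calculation.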
CT scans with insufficient quality may greatly affect the treatment planning process, potentially resulting in the target receiving an insufficient dose and/or extra toxicity to the OARs. Metal implants or metal components inside the body of the patient can induce errors during the CT reconstruction, which appear as artifacts on the resulting CT scans. These metal artifacts are typically bright and/or dark streaks (see Fig. 1) and are produced by beam hardening, photon starvation, edge gradient effect, scatter, or their combination. 1,2 The degree of metal artifacts mainly depends on the atomic number, density, size, and shape of the metal component as well as its orientation with respect to the CT scan plane. 3,4 Among others, dental implants or dental fillings in the head and neck (H&N) area, bilateral or unilateral metal prostheses in the hip region, and metal screws in the spine produce a large amount of metal artifacts and, thus, significantly deteriorate the quality of CT scans. 5-7 Dark streaks near the metal components result from beam hardening: the polychromatic x-ray beam is strongly attenuated by the metal and thereby becomes harder. 8 Because of the high attenuation, insufficient photons reach the CT detectors (photon starvation), resulting in large statistical errors in data acquisition, which induce fine bright and dark streaks along the direction of highest attenuation. 7,8 As a consequence, the appearance of these streak-shaped metal artifacts adversely affects the accuracy of organ contouring and the electron density calculation. This can eventually result in errors in planned radiation dose distributions and particle range measurements for photon and particle beams, respectively. 9,10 In the literature, several papers have been published on algorithms which perform metal artifact reduction (MAR) on CT scans.
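Beam hardening can be made concrete with a toy calculation. The sketch below assumes a beam with only two energy bins and hypothetical attenuation coefficients (none of these numbers come from the reviewed studies); it shows that the effective attenuation coefficient seen by the detector drops as the path through the material gets longer, which is the inconsistency that reconstruction turns into dark streaks.

```python
import numpy as np


def effective_attenuation(thickness_cm, weights, mu_per_cm):
    """Effective attenuation coefficient of a polychromatic beam after a given thickness.

    weights: relative photon fluence per energy bin (normalized inside the function).
    mu_per_cm: linear attenuation coefficient per energy bin (hypothetical values).
    """
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    mu = np.asarray(mu_per_cm, dtype=float)
    # Beer-Lambert law applied per energy bin, then summed at the detector.
    transmitted_fraction = np.sum(weights * np.exp(-mu * thickness_cm))
    return -np.log(transmitted_fraction) / thickness_cm


# Two-bin toy spectrum: a "soft" bin that is attenuated strongly and a "hard" bin.
spectrum_weights = [0.5, 0.5]
attenuation_per_cm = [0.5, 0.2]

mu_thin = effective_attenuation(1.0, spectrum_weights, attenuation_per_cm)
mu_thick = effective_attenuation(5.0, spectrum_weights, attenuation_per_cm)
print(mu_thin, mu_thick)
```

The longer path yields the smaller effective coefficient because the soft bin is removed first, so the remaining beam is harder.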
The working principle of traditional MAR algorithms may be categorized into three overall approaches: image inpainting techniques, 11 sinogram inpainting techniques, 12 and model-based iterative reconstruction (MBIR) techniques, 13 or their combination. The image inpainting techniques are applied to already reconstructed CT scans and replace artifact-corrupted CT pixels with well-estimated values. The sinogram inpainting techniques follow a similar principle, but are used on projection data (sinograms) instead of on reconstructed CT slices. Finally, MBIR techniques are advanced CT reconstruction techniques which use probabilistic forward and backward models to reduce error propagation during CT reconstruction. 14 Recently, thanks to the increasing availability of computational resources, very promising results in the field of medical imaging have been produced using machine learning (and in particular its subset deep learning), 15-18 including metal artifact reduction in CT scans. 19,20 For example, the performance of convolutional neural networks (CNNs) has been assessed in combination with sinogram inpainting for artifact correction. 19,21 Deep learning techniques are powerful in learning and capturing the detailed features and patterns of the metal artifacts.
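As an illustration of sinogram inpainting, the sketch below fills metal-affected detector bins by linear interpolation along each projection. It is a minimal example under stated assumptions: the sinogram is stored with one row per projection angle and one column per detector bin, the metal trace is already available as a boolean mask, and the function name is hypothetical. It is not the implementation of any specific algorithm reviewed here.

```python
import numpy as np


def inpaint_sinogram_linear(sinogram, metal_trace):
    """Replace metal-affected bins by linear interpolation along each projection row.

    sinogram: 2D array, rows = projection angles, columns = detector bins (assumed layout).
    metal_trace: boolean array of the same shape, True where rays pass through metal.
    """
    sinogram = np.asarray(sinogram, dtype=float)
    metal_trace = np.asarray(metal_trace, dtype=bool)
    corrected = sinogram.copy()
    bins = np.arange(sinogram.shape[1])
    for row in range(sinogram.shape[0]):
        bad = metal_trace[row]
        # Skip rows without metal and rows with no valid neighbours to interpolate from.
        if bad.any() and not bad.all():
            corrected[row, bad] = np.interp(bins[bad], bins[~bad], sinogram[row, ~bad])
    return corrected


# One projection with two corrupted bins in the middle.
toy_sinogram = np.array([[1.0, 2.0, 100.0, 100.0, 5.0]])
toy_trace = np.array([[False, False, True, True, False]])
print(inpaint_sinogram_linear(toy_sinogram, toy_trace))
```

Interpolating only along the detector axis is the simplest choice; it ignores information from neighbouring angles, which is one reason such corrections can introduce secondary artifacts.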
In general, the application of a MAR method on a CT scan with artifacts (CT art) results in the creation of a corrected CT scan (CT cor) on which the impact of the artifacts is reduced, either in terms of image quality or dosimetric outcome on the treatment. To measure the effectiveness of the methods, several different metrics have been introduced in the literature to compare CT art and CT cor. Image quality metrics proposed include visual inspection, quantification of HU values, artifact index, 22 contrast-to-noise ratio (CNR), signal-to-noise ratio (SNR), peak SNR (PSNR), structural similarity (SSIM), Hausdorff distance (HD), 23 and the Dice similarity coefficient (DSC). 24-26 To evaluate the dosimetric impact, instead, the calculated dose distributions on CT art and CT cor for the target and OARs provided by a treatment planning system (TPS) can be compared. Various dose metrics can be used to express the dosimetric impact, including gamma (γ) index, 27,28 dose-area histogram (DAH), 29 quantifications of D 90%, D 100%, V 100%, and V 150%, 30 and therapeutic range calculation (in particular, water equivalent thickness (WET) 31 […]
[…] artifacts on treatment planning and the potential dosimetric improvements resulting from the application of various MAR methods. 10 The article addressed the impact of sinogram inpainting and MBIR on the dose distributions, mainly focusing on research-based MAR methods while, among the commercially available MAR methods, only the Orthopaedics Metal Artifact Reduction (O-MAR (Philips Health System, Cleveland, USA)) algorithm was reported. In contrast, this systematic review article aims to include all MAR methods which have been investigated or proposed for RT applications in the last 5 yrs at the time of publication (2015-2020). These methods include commercial MAR methods and research-based MAR methods based on either traditional algorithms or deep learning.
In addition, our review extensively reports not only the works on dosimetric impact of the methods but also on the ones evaluating the effects on organ contouring, and image quality and HU restoration for RT applications.
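Two of the image quality metrics named in this Introduction, the DSC and the PSNR, follow standard definitions and can be sketched in a few lines. The function names are hypothetical, and the choice of `data_range` for PSNR on HU data is an assumption that each study makes for itself.

```python
import numpy as np


def dice_similarity(mask_a, mask_b):
    """Dice similarity coefficient, 2|A and B| / (|A| + |B|), of two binary masks."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        return 1.0  # Convention used here: two empty masks agree perfectly.
    return 2.0 * np.logical_and(a, b).sum() / total


def psnr(reference, test, data_range):
    """Peak signal-to-noise ratio in dB for a chosen intensity range."""
    reference = np.asarray(reference, dtype=float)
    test = np.asarray(test, dtype=float)
    mse = np.mean((reference - test) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(data_range ** 2 / mse)


contour_reference = np.array([1, 1, 0, 0])
contour_on_ct = np.array([1, 0, 0, 0])
print(dice_similarity(contour_reference, contour_on_ct))

print(psnr(np.array([0.0, 0.0]), np.array([10.0, 10.0]), data_range=100.0))
```

A DSC close to 1 indicates that a contour drawn on CT art or CT cor overlaps well with the reference contour; a higher PSNR indicates that the image is closer to an artifact-free reference.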

2.A | Literature search
The systematic review search was performed in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. 32,33 A comprehensive electronic search of the PubMed® (U.S. National Library of Medicine, USA) and Web of Science (Clarivate Analytics, USA) databases was performed in October 2020. Combinations and synonyms of the main keywords "metal artifact reduction," "computed tomography," and "radiotherapy" were used. The search was limited to the English language and to the last 5 yrs. Initially, the title and abstract of the identified articles were read to screen their suitability for selection. Then, the full texts of the selected articles were read to check their eligibility for inclusion. Finally, a manual search was performed using the reference lists of the included articles to find any additional data missed by the initial database searches (see Fig. 2).

2.B | Inclusion and exclusion criteria
An article was considered if it investigated the use of one or multiple MAR methods on CT scans for RT applications, with the exclusion of dual-energy CT (DECT), dental cone-beam CT (CBCT), C-arm CT, spectral CT, micro-CT, or photoacoustic CT. Also, editorial commentaries and book chapters were excluded.

3 | RESULTS
A total of 40 full-text publications were selected for this systematic review, including one review article, as mentioned in the Introduction section. The selected publications (except the review article) have further been categorized into application of commercial methods (n = 25), and application of research-based MAR methods (n = 14). The category of commercial methods includes the articles on commercial MAR algorithms (n = 21) and TPS-based density correction (n = 4). Research-based MAR methods include the articles on traditional MAR algorithms (n = 11) and deep learning-based MAR algorithms (n = 3).
[…] 37 These algorithms work on projection data (projection-based MAR algorithms) and they typically use an image-based metal segmentation method as a starting point. 7 Their basic concept is to detect and segment the corrupted projection data which corresponds to the metal components. Subsequently, the corrupted data are replaced by esti[…]
A density correction or a density override method on the TPS can also be used to reduce the metal artifacts on CT art. When these approaches are used, regions corrupted by metal artifacts and metal regions are identified manually through contouring on CT art. Then, a manual density override or density correction is performed by replacing the metal artifacts commonly with the density of water or by replacing the physical density of a metal implant by the appropriate value. 38
[Figure caption fragment] […] Subtraction (c) of the tissue class sinogram from the original sinogram results in a difference sinogram. Then, the metal-only sinogram is used to mask (d) the difference sinogram and a correction image is produced after filtered back projection (e). Subtraction of the correction image from the input (f and g) produces a corrected image which then undergoes step (h) in order to apply further corrections.
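The density override described in this section can be sketched as a simple masked assignment. This is an illustration only: it works on HU values rather than on the physical densities a TPS would handle, the masks stand in for the manual contours, and the function name and the metal value are hypothetical.

```python
import numpy as np

WATER_HU = 0.0  # Water is 0 HU by definition of the Hounsfield scale.


def override_density(ct_hu, artifact_mask, metal_mask, metal_hu):
    """Return a copy of the CT with artifact regions set to water and metal set to a chosen value.

    artifact_mask, metal_mask: boolean arrays standing in for manual contours.
    metal_hu: value assigned to the implant (hypothetical; depends on the implant material).
    """
    corrected = np.array(ct_hu, dtype=float, copy=True)
    corrected[np.asarray(artifact_mask, dtype=bool)] = WATER_HU
    # Metal is assigned last so it wins wherever the two contours overlap.
    corrected[np.asarray(metal_mask, dtype=bool)] = metal_hu
    return corrected


ct_art = np.array([[-500.0, 3000.0], [40.0, 200.0]])
streaks = np.array([[True, False], [False, True]])
implant = np.array([[False, True], [False, False]])
print(override_density(ct_art, streaks, implant, metal_hu=8000.0))
```

Because every overridden voxel receives the same value, real tissue heterogeneity inside the contoured region is lost, which is consistent with the limitation reported for density correction later in this review.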

3.A.2 | Evaluation of organ contouring
The commercial MAR algorithms have also been evaluated on their ability to improve organ contouring in RT (see Table 2). […] 53 The GTV of the tongue increased (P = 0.267) from 28 ± 6 cm³ (mean ± STD) on CT art to 30 ± 7 cm³ (mean ± STD) on CT cor. However, the mean volume of the parotid as an OAR was reduced in this study. Moreover, the authors evaluated the size of the prostate GTVs on pelvis CT scans with bilateral implants, and they were reduced (P = 0.168) from 87 ± 44 cm³ (mean ± STD) on CT art to 75 ± 22 cm³ (mean ± STD) on CT cor after iMAR application. For the OARs on CT cor after iMAR application in the pelvis case, the mean volume for rectum and bladder was reduced and increased, respectively. The DSC of the contours with respect to the reference increased more for the CT cor after iMAR application than for the CT art. Both the GTV of the tongue and the prostate on CT cor after iMAR application were underestimated in comparison with the predefined reference. Nevertheless, iMAR improved the confidence in contouring, as indicated by higher DSC values. Axente et al. assessed the image quality and visual conspicuity of CT art and CT cor after iMAR application. 35 Different types of clinical images were used, such as hip cases with unilateral or bilateral metal implants, H&N cases with dental fillings, a spine with metal implants, a knee with prosthesis, and a breast with expander. The median score for the image quality and visual conspicuity of CT art and CT cor after iMAR application increased from 3 to 4 of 5. During the image quality assessment, new secondary artifacts were identified on CT cor. Another study investigated iMAR for its anatomical delineation accuracy. 54 […]
The SEMAR algorithm was evaluated by Shiraishi et al. to quantify its ability to improve the detection accuracy of implanted iodine seeds which contain titanium (Ti) and silver (Ag) in brachytherapy. 55 To identify seeds on both CT art and CT cor after SEMAR application, an automatic seed finder was used, and the results were compared with reference positions. The mean true-positive fraction (TPF) was calculated, and it had significantly higher values (P < 0.05) for CT cor after SEMAR application (0.992 ± 0.0103, [mean ± STD]) than for CT art (0.982 ± 0.0159, [mean ± STD]). Thus, the application of SEMAR on CT art improved implanted seed detection.
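A true-positive fraction of the kind reported for seed detection can be computed as the share of reference seeds that have a matching detection. The sketch below is an assumption about how such a comparison could be set up, not the procedure of the cited study: the greedy nearest-neighbour matching, the 2 mm tolerance, and the function name are all hypothetical.

```python
import numpy as np


def true_positive_fraction(reference_mm, detected_mm, tolerance_mm=2.0):
    """Fraction of reference seeds matched by a detection within tolerance_mm.

    Each detection can be matched to at most one reference seed (greedy matching).
    """
    reference = np.asarray(reference_mm, dtype=float).reshape(-1, 3)
    detected = np.asarray(detected_mm, dtype=float).reshape(-1, 3)
    if len(reference) == 0:
        raise ValueError("at least one reference seed is required")
    unused = np.ones(len(detected), dtype=bool)
    true_positives = 0
    for seed in reference:
        if not unused.any():
            break
        distances = np.linalg.norm(detected - seed, axis=1)
        distances[~unused] = np.inf  # Already matched detections are not reused.
        nearest = int(np.argmin(distances))
        if distances[nearest] <= tolerance_mm:
            unused[nearest] = False
            true_positives += 1
    return true_positives / len(reference)


reference_seeds = [(0, 0, 0), (10, 0, 0), (20, 0, 0), (30, 0, 0)]
detected_seeds = [(0.5, 0, 0), (10, 1, 0), (20, 0, 0.5), (50, 0, 0)]
print(true_positive_fraction(reference_seeds, detected_seeds))
```

In this toy case three of the four reference seeds are found, and the detection at 50 mm counts as a false positive that does not enter the TPF.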
Overall, CT cor after application of the above-mentioned commercial MAR algorithms showed improved anatomical conspicuity and contouring accuracy. On the other hand, new artifacts induced by the MAR algorithms appeared on the resulting CT cor, and external factors such as the physician's knowledge and experience considerably influence these results.

Dosimetric impact of commercial MAR algorithms
The commercial MAR algorithms were also evaluated to assess their ability to improve the dose calculation accuracy in RT. Table 3 […] and of a pelvis with a metal prosthesis. 45 The average conformity index (CI), D 99%, and V 100% were calculated on CT cor after Smart MAR application and CT art and then compared. The average percentage (mean ± STD) differences in CI, D 99%, and V 100% on H&N CT scans were −0.3% ± 0.9%, −0.1% ± 0.1%, and −0.1% ± 0.5%, respectively. For the CT scans of the pelvis, they were (mean ± STD) −8.8% ± 11.4%, −0.1% ± 0.4%, and −8.8% ± 12.1%, respectively.
Also, this study found that the calculated dose differences between the CT cor and CT art were not significant. In another study, Inal and Sarpün evaluated Smart MAR for dose calculation accuracy in 12 different intensity-modulated radiation therapy (IMRT) 64 plans with 5-, 7-, and 9-field beam arrangements and segment numbers. 46
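Dose metrics such as D 99% and V 100% can be derived from the voxel doses inside a contoured structure. The sketch below uses the common definitions (D x%: minimum dose received by the hottest x% of the volume; V x%: share of the volume receiving at least x% of the prescribed dose) and assumes equal voxel volumes. The function names and the sign convention of the percentage difference (CT cor relative to CT art) are assumptions, since the source does not state them.

```python
import numpy as np


def dose_at_volume(dose_gy, structure_mask, volume_percent):
    """D_x%: dose received by at least volume_percent of the structure (equal voxel volumes)."""
    voxels = np.asarray(dose_gy, dtype=float)[np.asarray(structure_mask, dtype=bool)]
    return float(np.percentile(voxels, 100.0 - volume_percent))


def volume_at_dose(dose_gy, structure_mask, dose_percent, prescription_gy):
    """V_x%: percentage of the structure receiving at least dose_percent of the prescription."""
    voxels = np.asarray(dose_gy, dtype=float)[np.asarray(structure_mask, dtype=bool)]
    threshold = prescription_gy * dose_percent / 100.0
    return float(100.0 * np.mean(voxels >= threshold))


def percent_difference(value_cor, value_art):
    """Relative difference of a metric on CT cor with respect to CT art (assumed convention)."""
    return 100.0 * (value_cor - value_art) / value_art


# Toy structure with 100 voxels receiving 1, 2, ..., 100 Gy.
toy_dose = np.arange(1.0, 101.0)
toy_mask = np.ones(100, dtype=bool)
print(dose_at_volume(toy_dose, toy_mask, 99.0))
print(volume_at_dose(toy_dose, toy_mask, 100.0, prescription_gy=50.0))
print(percent_difference(99.0, 100.0))
```

Applying the same functions to the dose grids calculated on CT art and CT cor, with identical structure masks, gives the paired values from which a mean ± STD difference over patients can be formed.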

Dosimetric impacts of density correction methods
The density correction methods which are available in the TPSs can also be used to reduce the metal artifacts on CT scans for RT applications, see Table 4. […]
For the metal deletion technique (MDT) 75 (Fig. 4), which uses sinogram inpainting iteratively, pixels which contain metal data are first segmented from CT art (Fig. 4, image 2). Then, linear interpolation (LI) 83 and edge-preserving blur filters (Fig. 4, image 4) are applied on this CT scan to calculate the missing pixel values and to reduce the noise, respectively. Subsequently, the linearly interpolated and noise-reduced image is forward projected to create an initial sinogram (Fig. 4, number 5). This sinogram is used iteratively (four iterations in total) to replace the pixels which contain metal artifacts in the original sinogram. On each iteration, rays that pass through the metal are replaced with the value from the previous iteration (Fig. 4, number 6). This procedure results in a corrected sinogram.
Finally, the filtered back-projection of the corrected sinogram (Fig. 4, image 7) with added metal data produces the CT cor .
An MRI-based MAR algorithm was proposed by Park et al. 79 The […]
[…] Finally, the application of filtered back projection on the corrected kVCT sinogram produces a kVCT scan with corrected artifacts.
A study by Kim et al. proposed to acquire an additional tilted CT scan in which fewer metal artifacts are present. 76 First, an artifact map is generated from the denoised initial CT scan (CT art) and the addi[…] gamma map is used to identify and replace the artifact-corrupted pixels on the CT art slice by the corresponding pixels on the artifact-free CT slice.
A MAR method which includes a sinogram precorrection (which is not described) and a hardware adaptation for proton therapy was proposed by Jin et al. 77 The hardware adaptation includes an increase in the X-ray energy from 120 kVp (standard energy) to 180 kVp, and an increase in the triggering rate of the data acquisition system (DAS) 88 of the X-ray detectors. In the end, the prelog iterative CT reconstruction method 89 […]

Evaluation of HU values retrieval and image quality
Research-based MAR algorithms based on traditional image processing methods were evaluated for their ability to restore HU values and their ability to improve image quality of CT scans which will be used for RT applications. […] 84 During the image quality evaluation, CT scans from a custom-made veal shank phantom with metal inserts and clinical H&N CT scans with dental implants were used. In addition to kerMAR, also O-MAR was applied to the CT scans for a com[…]

Dosimetric impacts of traditional MAR algorithms
In addition to the improvement of HU and image quality, dosimetric impacts of research-based traditional MAR algorithms were evaluated in several research studies.

The MRI-based MAR algorithm 79 was evaluated to assess its ability to reduce the proton range error (ΔWET) for proton therapy applications. On clinical CT scans of a brain and H&N, the application of this proposed method reduced the absolute ΔWET from 2.4 cm in brain and 1.7 cm in H&N to less than 2 mm for both cases. In a study by Nielsen et al., 84 […]
A similar study, also using a residual learning-based artifact reduction CNN (RL-ARCNN) and paired data for artifact reduction, was proposed for brachytherapy applications by Huang et al. 95 […]
[…] 103 : an instance normalization layer 104 replaced the BN layer in the generator, the kernel size of the convolutional layer was set to 4 × 4 with a stride of 2 instead of the maximum pooling filter, the image size was expanded by using the upsampling filters, and one filter convolution layer was applied followed by a tanh activation function as the last filter. For all the other layers, a leaky rectified linear unit (LeakyReLU) with a slope of 0.2 was used as an activation function. In this proposed method, the CT art is translated into CT cor using adversarial loss, cycle consistency loss, 105 and identity map loss, which were introduced to regularize the generator during the conversion.
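The cycle consistency and identity map losses mentioned above follow the usual CycleGAN formulation and can be sketched without a deep learning framework. In the sketch below the two generators are plain Python callables standing in for trained networks, the L1 norm is assumed for both losses, and all names are hypothetical; the adversarial loss is omitted. A LeakyReLU with the slope of 0.2 stated in the text is included for completeness.

```python
import numpy as np


def leaky_relu(x, slope=0.2):
    """LeakyReLU activation with the negative slope of 0.2 stated in the text."""
    x = np.asarray(x, dtype=float)
    return np.where(x > 0, x, slope * x)


def l1_distance(a, b):
    """Mean absolute difference between two images."""
    return float(np.mean(np.abs(np.asarray(a, dtype=float) - np.asarray(b, dtype=float))))


def cycle_and_identity_losses(ct_art, ct_clean, art_to_clean, clean_to_art):
    """Cycle consistency and identity losses for a pair of unpaired images.

    art_to_clean, clean_to_art: callables standing in for the two generators.
    """
    # Cycle consistency: translating there and back should reproduce the input.
    cycle = (l1_distance(clean_to_art(art_to_clean(ct_art)), ct_art)
             + l1_distance(art_to_clean(clean_to_art(ct_clean)), ct_clean))
    # Identity: a generator fed an image already in its target domain should not change it.
    identity = (l1_distance(art_to_clean(ct_clean), ct_clean)
                + l1_distance(clean_to_art(ct_art), ct_art))
    return cycle, identity


# Toy "generators" that shift every voxel by a constant offset.
toy_art = np.full((2, 2), 150.0)
toy_clean = np.full((2, 2), 100.0)
cycle_loss, identity_loss = cycle_and_identity_losses(
    toy_art, toy_clean, lambda x: x - 50.0, lambda x: x + 50.0
)
print(cycle_loss, identity_loss)
print(leaky_relu([-10.0, 5.0]))
```

The toy generators are perfect inverses of each other, so the cycle loss vanishes, while the identity loss stays large because each generator still alters images that already belong to its target domain. This is the kind of behaviour the identity term is meant to penalize.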

Evaluation of HU values restoration and image quality
The proposed dual-stream deep network, RL-ARCNN, and DL-MAR were evaluated for their ability to restore HU values and image quality. The findings of these studies are summarized in Table 7.

4 | DISCUSSION
Table (advantages and limitations of the reviewed MAR methods; caption and first rows not recovered)
Method | Advantages | Limitations
[…] | […] | […] the plan dose distribution on CT cor after application of density correction may not reflect the actual dose distribution.
[…] | Reduces the HU errors better than LI and NMAR. | Additional radiation burden.
ALIR 92 | Improves the dose calculation accuracy more than density correction method and LI do. | Increases the calculated doses for OAR.
MAR with hardware adaptation 77 | Addresses the photon starvation during the artifact reduction. | Modified CT image acquisition in comparison with the standard.
MRI-based CT MAR 79 | Does not require sinogram and threshold-based tissue classification. | Requires aligned MRI & CT scan.
kerMAR 84 | Reduces the metal artifacts better than O-MAR in H&N cases. | Applicable only for H&N case.
MVCBCT & kVCT method 81 | Performs better than LI and NMAR for artifact reduction. | […]
Hybrid sinogram-based MAR 82 | Performs better than LI for artifact reduction. | […]
Deep learning-based MAR algorithms
Dual-stream CNN with residual learning 96 | Reduces the remaining metal artifacts on CT scans after NMAR application. | Requires paired data and depends on the performance of NMAR.
RL-ARCNN 95 | Does not require sinogram data. | Needs paired data.
DL-MAR 97 | Does not require paired data. Comparable performance to density correction for accuracy in dose calculation. | Applicable only in H&N cases.

Moreover, recent developments of MAR algorithms which utilize deep learning, for example, CycleGANs, do not require paired clinical CT scans for artifact reduction. Therefore, developing a MAR method that targets a specific pattern of metal artifacts and a specific anatomical structure using a deep learning approach would be a promising solution. This approach should be explored further and then evaluated for RT applications.

CONFLICTS OF INTEREST
The authors declare that they have no conflict of interest.

AUTHORS CONTRIBUTIONS
Sathyathas Puvanasunthararajah was involved in conceptualization, data extraction, and drafting of the manuscript. Davide Fontanarosa and Marie-Luise Wille were involved in conceptualization, reviewing, and supervision. Saskia M. Camps was involved in conceptualization, reviewing, and supervision, and checked the accuracy of data extraction.