Evaluation of cassette‐based digital radiography detectors using standardized image quality metrics: AAPM TG‐150 Draft Image Detector Tests

The purpose of this study was to evaluate several of the standardized image quality metrics proposed by the American Association of Physicists in Medicine (AAPM) Task Group 150. The task group suggested region‐of‐interest (ROI)‐based techniques to measure nonuniformity, minimum signal‐to‐noise ratio (SNR), number of anomalous pixels, and modulation transfer function (MTF). This study evaluated the effects of ROI size and layout on the image metrics by using four different ROI sets, assessed result uncertainty by repeating measurements, and compared results with two commercially available quality control tools, namely the Carestream DIRECTVIEW Total Quality Tool (TQT) and the GE Healthcare Quality Assurance Process (QAP). Seven Carestream DRX‐1C (CsI) detectors on mobile DR systems and four GE FlashPad detectors in radiographic rooms were tested. Images were analyzed using MATLAB software that had been previously validated and reported. Our values for signal and SNR nonuniformity and MTF agree with values published by other investigators. Our results show that ROI size affects nonuniformity and minimum SNR measurements, but not detection of anomalous pixels. Exposure geometry affects all tested image metrics except for the MTF. TG‐150 metrics in general agree with the TQT, but agree with the QAP only for local and global signal nonuniformity. The difference in SNR nonuniformity and MTF values between the TG‐150 and QAP may be explained by differences in the calculation of noise and acquisition beam quality, respectively. TG‐150's SNR nonuniformity metrics are also more sensitive to detector nonuniformity than those of the QAP. Our results suggest that a fixed ROI size should be used for consistency because nonuniformity metrics depend on ROI size. Ideally, detector tests should be performed at the exact calibration position. If this is not feasible, a baseline should be established from the mean of several repeated measurements. Our study indicates that the TG‐150 tests can be used as an independent standardized procedure for detector performance assessment.

PACS number(s): 87.57.‐s, 87.57.C


I. INTRODUCTION
Digital radiography (DR) has been widely adopted in the clinic in recent years. A variety of clinical implementations of digital X-ray detectors have been developed. Although some image quality control (QC) tools are available for DR, there is an absence of standardized metrics needed to ensure the constancy of digital detector performance. (1) Therefore, the American Association of Physicists in Medicine (AAPM) formed Task Group 150 (TG-150) to address this issue. Its detector subgroup drafted a report for digital image detector testing and developed a set of recommended tests (E. Gingold, personal communication, July 2012). Automated analysis software was previously developed in the MATLAB (MathWorks, Natick, MA) environment and validated by two independent groups. (2-4) However, at this early stage, there remains a lack of understanding of the expected outcomes and of the challenges that will be encountered in performing these tests. This study performed eight quantitative tests out of eleven recommended tests for DR (Table 1): detector response, signal, noise, signal-to-noise-ratio (SNR), nonuniformity, anomalous pixels, correlated artifacts, and modulation transfer function (MTF). The other three tests are visual inspection, large signal capacity, and assessment of correlated image artifacts, or a contrast-to-noise ratio (CNR) test, which requires an additional test object. The results of the CNR test may depend on the test object because the aluminum step phantom recommended by the Task Group can be expected to affect the beam quality differently for different calibration beam conditions. This study also documents interferences that the authors experienced when implementing these tests with mobile DR systems and wireless DR rooms in direct comparison with each vendor's automated QC tests. This study is intended to improve our understanding of the proposed TG-150 detector tests and their relationship to commercial QC metrics.

II. MATERIALS AND METHODS

A. Tests with Carestream wireless detectors
Images were acquired using seven Carestream (Carestream Health Inc., Rochester, NY) DRX-1C (cesium iodide, CsI) detectors with four new Carestream DRX-Revolution mobile DR systems and one Carestream retrofit mobile DR system (based on a GE AMX-4, GE Healthcare, Milwaukee, WI). These detectors consist of 20 × 24 detector chips. Each detector chip has 128 × 128 detector elements, producing images with a matrix size of 2560 × 3072. Three of the seven detectors (Detectors 1, 6, and 7) were tested on the same DRX-Revolution mobile unit. Detector 1 was the original detector of the mobile unit but was replaced by Detector 6 after Detector 1 was damaged. Detector 6 was then replaced by Detector 7 one month later when electronic problems developed with Detector 6. The DR detector (Detector 5) on the retrofit system is the same model as those supplied with the DRX-Revolution systems, although its X-ray tube and generator are different. DRX detectors were calibrated according to manufacturer instructions using the built-in calibration procedure:
1. Place the detector in the plastic tray provided by the manufacturer for calibration and testing.
2. Set the tray on top of a lead apron spread on the floor to exclude backscatter from the calibration.
3. Adjust the X-ray tube to a source-to-image distance (SID) of 182 cm.
4. Center the detector in the light field and open the collimator to cover the entire detector.
5. Insert a filter consisting of a 0.5 mm copper sheet and a 1 mm aluminum sheet into the built-in filter holder of the tube housing with the copper side facing the tube.
6. Run the automatic calibration procedure to calibrate the detector with four preset mAs stations at 80 kVp. The four mAs stations include the following: (i) a "calibration condition" (~9 mAs), (ii) a high level exposure (~18 mAs), (iii) an intermediate level exposure (2.8 mAs), and (iv) a low level exposure (~0.71 mAs).

The mAs stations varied slightly for each unit depending on the specific tube output determined by the service engineer during installation.
After the DRX detectors were calibrated, without repositioning the detectors and X-ray tubes, four flat-field images were acquired at the previously mentioned four mAs stations at 80 kVp to determine the detector nonuniformity, signal-to-noise ratio (SNR), detector response, and anomalous pixels. Images of a Type 53 bar pattern (Fluke Biomedical, Cleveland, OH), oriented in directions parallel and perpendicular to the anode-to-cathode (A-C) axis, were also acquired at the "calibration condition" (~ 9 mAs, 80 kVp) for measurement of the MTF. The TQT test (DIRECTVIEW Total Quality Tool, Carestream) was then performed without repositioning the detector and X-ray tube to exclude any effects from a change in geometry. TQT analysis was performed using built-in software.

B. Tests with GE wireless detectors
The TG-150 tests were performed with four GE FlashPad detectors on two Discovery XR656 systems (GE Healthcare, Milwaukee, WI), one in a general radiographic room and the other in a chest radiographic room. The FlashPad detector consists of eight scan modules and eight data modules. Each module contains 256 scan or data lines (J. Sabol, personal communication, April 2016). Thus, the detector pixels are addressed by the readout electronics in 64 individual blocks, each 256 rows × 256 columns. For consistency of analysis with the Carestream DRX detectors, we treated the FlashPad detectors as if they were composed of a 16 × 16 array of 128 × 128 detector elements, producing images with a total matrix size of 2022 × 2022. However, in our opinion, the underlying detector read-out configuration should not affect our results as long as the images were analyzed in a consistent manner.
The cassette-based wireless detectors are not integrated into the exposure stations in the general radiographic room, but are interchangeable between five possible exposure positions: the wall stand, the table Bucky, the tabletop, the cross-table holder, or arbitrary positions such as for use with a gurney. However, the system maintains only three distinct calibration files for each individual detector for the first three positions: one for wall stand use, one for table Bucky use, and one for tabletop use. In addition to nonuniformities in gain and offset of detector elements, each calibration file also corrects for the exposure gradient that is presented to the detector from the heel effect under each of those specific conditions of exposure. The tabletop calibration is performed with the A-C axis orientation similar to the table Bucky. The calibration file for the tabletop is then used for all the rest of the positions: tabletop, cross-table, and arbitrary positions. However, for exams in positions other than the wall stand and table Bucky positions, the system has no control over the detector orientation. Hence, the detectors could be placed in the reversed calibration position during exams in the tabletop, cross-table, and arbitrary positions. Each GE FlashPad detector is calibrated using a 20-mm aluminum filter at 80 kVp with the grid removed. However, the calibration SID varies for different detector positions (e.g., 180 cm for the wall stand position, 100 cm for the table Bucky position, and 127 cm for the cross-table position); therefore, the calibration mAs setting varies accordingly to compensate for the difference in the calibration SID and to yield a similar exposure level to the detector.
Thus, with the 20-mm aluminum phantom in the field, the four flat-field images for TG-150 were also acquired with four mAs stations at 80 kVp: 1) the calibration mAs, 2) half of the calibration mAs, 3) a low exposure condition (< 1 mAs), and 4) ~180% of the calibration mAs. The high level exposure was performed at 180% instead of 200% because the detectors were partly saturated at 200%. In addition to the flat-field images, two Type 53 bar pattern images were also acquired in two directions using the calibration mAs at 80 kVp, as was done with the Carestream DRX-1C detectors.
Two sets of TG-150 tests were performed on the table Bucky and cross-table detectors in the general radiographic room before their detector calibrations. One set was performed in the detectors' usual position, and the other set was performed after these two detectors' positions were exchanged (between the table Bucky and cross-table positions). Subsequently, without changing the detectors back to their usual positions, detector calibrations were performed on these two detectors, which made the exchange permanent. Then, a third set of TG-150 tests was immediately performed on these two detectors after the calibrations, which tested the detectors in their "new" usual positions. The detectors in the wall stand of the general radiographic room and the dedicated chest room, however, were only tested in their usual positions without immediate detector calibrations.
The Quality Assurance Process (QAP, GE Healthcare) is performed weekly in our clinic and after detector calibrations by service engineers. (5) The results of the QAP testing reside on the system temporarily. The results are automatically harvested remotely via ftp, parsed, and stored in a database for longitudinal analysis. The QAP data used in this study were extracted from the test reports closest to the dates of the TG-150 testing. The QAP test uses the 20-mm Al filter for uniformity tests, but not for images of the Image Quality Signature Test (IQST) phantom, from which the MTF was calculated. (6) Unlike the TQT and TG-150 tests of the Carestream detectors, the QAP and TG-150 tests of the GE detectors were separated by a time interval. This interval may have resulted in larger differences between the QAP and TG-150 measurements of the GE detectors than between the TQT and TG-150 measurements of the Carestream detectors. The interval was unavoidable because the GE units were all in clinical service, affording us less flexibility in scheduling testing and detector calibrations by vendor service engineers in the clinics. However, we ensured that there were no inconsistencies in the values reported in routine weekly QAP data before and after the TG-150 tests; therefore, we do not expect this difference in time between tests to have affected our comparisons.

C. Image quality metrics and image analysis
For-processing images (also known as "ORIGINAL DATA"), which had a linear response to exposure levels, were obtained for the TG-150 tests. These images were harvested from the mobile systems directly through a built-in system function with the linearization function enabled. The for-processing images from the XR656 systems are minimally processed raw images that were first sent to Philips iSite Enterprise PACS (Picture Archiving and Communication System) and then downloaded. These for-processing images were analyzed using MATLAB software previously validated and reported. (2) Minimum SNR, global and local signal, noise and SNR nonuniformity, MTF, number of anomalous pixels, and detector response were measured from images acquired at the calibration mAs according to the definitions provided in the TG-150 detector subgroup draft report, as described below (E. Gingold, personal communication, July 2012).

C.1 Regions of interest (ROI)
The image analyses for nonuniformity, anomalous pixels, and minimum SNR were based on ROI sampling techniques. The TQT used 128 × 128 ROIs in order to match the dimensions of the detector chips. For TG-150 testing, we analyzed the GE images using 128 × 128 ROIs. (7) Three JPEG image files automatically generated by QAP and stored on the GE acquisition console suggest that different ROI dimensions may be used for different metrics. The specifications of these ROIs were provided by GE (Table 2; J. Sabol, personal communication, September 2015).
TG-150 analysis of signal nonuniformity requires an array of ROIs to be defined across the flat-field image. This study arranged the ROIs in a two-dimensional array to cover the central 82.5% of the flat-field image, which was similar to the ROI arrangement in the TQT and QAP.
The effect of ROI size on the TG-150 image metrics was investigated using four sets of ROIs (Fig. 1).

C.2 TG-150 metrics of detector nonuniformity
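The mean and standard deviation (SD) of pixel values in the i-th ROI of each of the four sets of ROIs are denoted by $\mu_i$ and $\sigma_i$, and the SNR of this ROI is denoted by $\mathrm{SNR}_i = \mu_i / \sigma_i$. The following metrics were then defined by TG-150, where an overbar denotes the mean over all ROIs:

$$\text{Minimum SNR} = \min_i\left(\mathrm{SNR}_i\right) \tag{1}$$

$$\text{Global signal nonuniformity} = \frac{\max_i(\mu_i) - \min_i(\mu_i)}{\bar{\mu}} \times 100\% \tag{2}$$

$$\text{Global noise nonuniformity} = \frac{\max_i(\sigma_i) - \min_i(\sigma_i)}{\bar{\sigma}} \times 100\% \tag{3}$$

$$\text{Global SNR nonuniformity} = \frac{\max_i(\mathrm{SNR}_i) - \min_i(\mathrm{SNR}_i)}{\overline{\mathrm{SNR}}} \times 100\% \tag{4}$$

$$\text{Local signal nonuniformity} = \frac{\max_{j,k}\left|\mu_j - \mu_{j,k}\right|}{\bar{\mu}} \times 100\% \tag{5}$$

where $\mu_{j,k}$, $k = 1, 2, \ldots, 8$, are the means of the eight ROIs neighboring the j-th ROI;

$$\text{Local noise nonuniformity} = \frac{\max_{j,k}\left|\sigma_j - \sigma_{j,k}\right|}{\bar{\sigma}} \times 100\% \tag{6}$$

where $\sigma_{j,k}$, $k = 1, 2, \ldots, 8$, are the standard deviations of the eight ROIs neighboring the j-th ROI; and

$$\text{Local SNR nonuniformity} = \frac{\max_{j,k}\left|\mathrm{SNR}_j - \mathrm{SNR}_{j,k}\right|}{\overline{\mathrm{SNR}}} \times 100\% \tag{7}$$

where $\mathrm{SNR}_{j,k}$, $k = 1, 2, \ldots, 8$, are the SNRs of the eight ROIs neighboring the j-th ROI.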
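As a rough illustration of the ROI-based metrics in this section, the following Python sketch computes ROI means and SDs on a tiled grid and then evaluates global and local nonuniformity and minimum SNR. It is not the validated MATLAB software used in the study. The function names are hypothetical, and the sketch assumes that global nonuniformity is the range of ROI values divided by their mean, that local nonuniformity is the largest difference between an ROI and any of its neighbors divided by the same global mean, and that the population SD is used within each ROI.

```python
from statistics import mean, pstdev


def roi_stats(image, roi):
    """Tile a 2D image (list of rows) into roi x roi blocks; return grids of means and SDs."""
    rows, cols = len(image) // roi, len(image[0]) // roi
    mu, sd = [], []
    for r in range(rows):
        mu_row, sd_row = [], []
        for c in range(cols):
            pixels = [image[y][x]
                      for y in range(r * roi, (r + 1) * roi)
                      for x in range(c * roi, (c + 1) * roi)]
            mu_row.append(mean(pixels))
            sd_row.append(pstdev(pixels))  # assumption: population SD
        mu.append(mu_row)
        sd.append(sd_row)
    return mu, sd


def snr_grid(mu, sd):
    """Per-ROI SNR, defined here as mean / SD."""
    return [[m / s for m, s in zip(mr, sr)] for mr, sr in zip(mu, sd)]


def global_nonuniformity(grid):
    """(max - min) of the ROI values, as a percentage of their mean (assumed form)."""
    flat = [v for row in grid for v in row]
    return 100.0 * (max(flat) - min(flat)) / mean(flat)


def local_nonuniformity(grid):
    """Largest |ROI - neighbor| over all ROIs, as a percentage of the global mean (assumed form)."""
    flat = [v for row in grid for v in row]
    rows, cols = len(grid), len(grid[0])
    worst = 0.0
    for r in range(rows):
        for c in range(cols):
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    # Edge ROIs simply have fewer than eight neighbors.
                    if (dr or dc) and 0 <= rr < rows and 0 <= cc < cols:
                        worst = max(worst, abs(grid[r][c] - grid[rr][cc]))
    return 100.0 * worst / mean(flat)


def minimum_snr(mu, sd):
    """Smallest per-ROI SNR."""
    return min(v for row in snr_grid(mu, sd) for v in row)
```

The same two nonuniformity functions can be applied to the grid of means, the grid of SDs, or the grid returned by `snr_grid`, which mirrors the signal, noise, and SNR variants of the metrics.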

C.3 TG-150 MTF
An interactive MATLAB subroutine was created (2) to position an ROI over each bar pattern group by having the user designate two lines on the image. One line is drawn along the group indicators of the bar pattern (arrow A in Fig. 2) and the other is drawn across the pattern group (arrow B in Fig. 2). The user then positions two circular ROIs to calculate the mean signal behind the pattern (0% modulation) and outside the pattern (100% modulation; arrow C in Fig. 2). The variance of each bar pattern ROI was translated into discrete values of the MTF by means of the square wave response. (8) This method is reported to be valid down to one-third of the cutoff frequency, f_c (3.60 lp/mm for the DRX-1C detector). A continuous function was fitted to the MTF values using the Logit method and a weighted least-squares fit. (9) However, since the TQT and QAP report MTF only at certain frequencies, this study reported MTF with the bar pattern method only at the corresponding frequencies.
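To make the variance-to-MTF step more concrete, the sketch below uses one commonly cited square-wave relation, MTF(f) = (π√2/4)·M(f)/M0, where M(f) is the SD of pixel values within the bar group and M0 is half the difference between the two reference means. It is an assumption that this is the exact form used by the cited method, and the optional noise-variance subtraction and the function name are hypothetical additions for illustration. The relation holds only above one-third of the cutoff frequency, where only the fundamental of the square wave is passed.

```python
import math


def mtf_from_bar_variance(var_bars, mean_behind, mean_outside, var_noise=0.0):
    """Estimate MTF at one bar-group frequency from the pixel variance in that group.

    var_bars     : variance of pixel values in the bar-group ROI
    mean_behind  : mean signal behind the pattern
    mean_outside : mean signal outside the pattern
    var_noise    : optional noise variance to subtract (assumption, default 0)
    """
    m0 = abs(mean_outside - mean_behind) / 2.0        # half the full modulation
    m_squared = max(var_bars - var_noise, 0.0)        # clamp to avoid sqrt of a negative
    return math.pi * math.sqrt(2.0) / 4.0 * math.sqrt(m_squared) / m0
```

As a sanity check on the constant, a perfectly transferred fundamental has amplitude 4·M0/π and therefore variance 8·M0²/π², for which the function returns 1.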

C.4 Definition of anomalous pixels in TG-150
Anomalous pixels were defined in TG-150 as pixels for which $|p_{i,j} - \mu_i| \geq 3\sigma_i$ in all four flat-field images, where $p_{i,j}$ is the j-th pixel in the i-th ROI. (2,3) Anomalous pixel analysis was designed to find the pixels that were not identified or adequately corrected by the automated gain and offset calibration and dead pixel correction software.
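The criterion above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, ROIs are taken as non-overlapping tiles, the population SD is used, and ROIs with zero SD are skipped so that a perfectly uniform tile does not flag every pixel.

```python
from statistics import mean, pstdev


def anomalous_in_image(image, roi, k=3.0):
    """Return the set of (row, col) pixels with |p - mu_i| >= k * sigma_i in their ROI."""
    flagged = set()
    for r0 in range(0, len(image) - roi + 1, roi):
        for c0 in range(0, len(image[0]) - roi + 1, roi):
            coords = [(y, x)
                      for y in range(r0, r0 + roi)
                      for x in range(c0, c0 + roi)]
            values = [image[y][x] for y, x in coords]
            mu, sigma = mean(values), pstdev(values)
            if sigma == 0:
                continue  # assumption: a perfectly uniform ROI has no anomalous pixels
            flagged.update(p for p, v in zip(coords, values)
                           if abs(v - mu) >= k * sigma)
    return flagged


def anomalous_pixels(images, roi, k=3.0):
    """Pixels flagged in every one of the supplied flat-field images."""
    per_image = [anomalous_in_image(img, roi, k) for img in images]
    return set.intersection(*per_image)
```

Requiring the pixel to be flagged in all four flat-field images, as the definition states, filters out pixels that exceed the threshold in a single image because of random noise.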

C.5 Calculation of detector response in TG-150
Detector response in TG-150 is characterized by a simple linear regression of mean pixel values of all sampled pixels of four flat-field images on four mAs stations used to acquire corresponding images. The detector response is usually linear when the "for-processing" images are used, although it may be log-linear depending on how the images are obtained from these DR systems, and for other CR and DR systems. There are consequences for the values of metrics obtained from data that is log-linear in the exposure domain. (10,11) After the logarithmic transform, the distribution of the signal response is altered. However, the signal noise should be inversely proportional to the square root of the exposure. (10) The detector response reported in this study is the slope of the linear regression, which characterizes the detector sensitivity.
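The regression described here can be sketched as an ordinary least-squares fit of mean pixel value against mAs. The function name is hypothetical, and the pixel values in the usage note are made-up numbers chosen only to show the call, not measurements from this study.

```python
def detector_response(mas_stations, mean_pixel_values):
    """Least-squares line through (mAs, mean pixel value); returns (slope, intercept)."""
    n = len(mas_stations)
    mean_x = sum(mas_stations) / n
    mean_y = sum(mean_pixel_values) / n
    sxx = sum((x - mean_x) ** 2 for x in mas_stations)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(mas_stations, mean_pixel_values))
    slope = sxy / sxx                      # detector sensitivity (pixel value per mAs)
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

For example, `detector_response([0.71, 2.8, 9.0, 18.0], [121.0, 330.0, 950.0, 1850.0])` fits the four mAs stations of a linear for-processing image set; with log-linear data the mAs values would need to be transformed first.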

C.6 TQT image quality metrics
Based on 128 × 128 ROIs that match the detector chips, the TQT reports minimum SNR, as well as global nonuniformity metrics for signal, noise, and SNR with the flat-field images. However, TQT does not report local nonuniformity measurements. TQT measures MTF using the edge method (12) instead of the bar pattern method, and reports MTF in two directions perpendicular and parallel to the A-C axis at 1 lp/mm, 2 lp/mm, 50% and 95% Nyquist frequency (1.80 lp/mm and 3.41 lp/mm, respectively). TQT reports values for "Normalized Defects," a proprietary metric, which is a weighted combination of several metrics for detector defects. It serves the same purpose as the analysis of anomalous pixels in TG-150, but the two values are not directly comparable.
TQT also includes contrast-to-noise ratio (CNR) and dark noise measurements. In TQT, CNR values are measured with multiple low-contrast patches included in its image quality phantom. The CNR test was recommended by TG-150 but was not included in our limited implementation. Dark noise testing was discussed, but not recommended, by TG-150.

C.7 QAP image quality metrics
The QAP software and its associated IQST phantom have undergone an evolution since described by Belanger et al. in 2004. (6) The original version of the QAP reported 17 metrics, as shown in Table 3, including Dynamic Range measurements from the image of a step-wedge, CNR measurements from the image of contrast patches, and resolution nonuniformity from the image of a uniform mesh. Noise power spectrum was also calculated and stored in a JPEG file. The original version was present on the XQi and XRd systems included in our original study. (2) Although the features in the IQST remained the same, the QAP software with the Definium systems stopped reporting the six metrics associated with dynamic range, CNR, and resolution nonuniformity, added an overall pass or fail result, and relaxed some of the performance limits.
With the introduction of the Discovery systems, the features necessary for the dynamic range, CNR, and resolution nonuniformity measurements were removed from the IQST phantom. The QAP for wireless systems now reports pass or fail results for "ARC analog and digital" tests, where ARC is an acronym for Apollo Readout Chip, an application-specific integrated circuit, and Apollo is a name given to its flat-panel X-ray detectors by GE Healthcare. The QAP software no longer reports MTF measurements when it is performed in the tabletop position, where the IQST is not used; the same is true for the GE Optima AMX220, GE's mobile wireless DR system. The system considers a detector in the cross-table position to be tabletop; therefore, metrics for MTF are not reported for the cross-table positions either.

Table 3. QAP image quality metrics. Column headings (by model): GE Revolution XQi & XRd; GE Definium and XR656.

To calculate QAP SNR nonuniformity, the QAP uses Eq. (4) for global SNR nonuniformity. However, it uses the difference of two flat-field images instead of a single flat-field image as the basis for the measurement.
QAP uses the edge method to measure MTF (12) and reports the average MTF of two vertical edges, which are parallel to the A-C axis, at 0.5 lp/mm, 1 lp/mm, 1.5 lp/mm, 2 lp/mm, and 2.5 lp/mm. QAP also reports the number of bad pixels, but this is an incremental measurement relative to the last gain and offset calibration. Therefore, it is not directly comparable to the total number of anomalous pixels assessed by TG-150.
QAP also reports electric noise and correlated noise. Correlated noise and CNR are recommended by TG-150 but not included in our limited implementation.

D. Effect of positioning on TG-150 metrics
With a cassette-based detector used in a bedside application, unless the detector quality tests are performed immediately after the gain and offset calibration, there is limited coincidence between the exposure geometry during calibration and the geometry during testing. To determine the effects of uncertainty in positioning with respect to the A-C orientation during the calibration, four sets of TG-150 tests were performed on one detector using one of the Carestream Revolution systems in the following sequence:
1. Prepare to acquire images for Test Set 1 (unknown calibration position with repositioning between repeated acquisitions): align the detector to the A-C axis, but without knowledge of the A-C axis direction during the previous calibration.
2. Acquire four flat-field images at the four mAs stations specified in Materials and Methods Section A, item 6(i)-(iv).
3. Acquire two bar pattern images in directions parallel and perpendicular to the A-C axis.
4. Reposition the detector and X-ray tube while maintaining the detector orientation relative to the A-C axis.
5. Repeat Steps 2 through 4 four more times.
6. Prepare to acquire Test Set 2 (at the exact calibration condition without repositioning): calibrate the detector in a new position.
7. Repeat Steps 2 and 3 five times while maintaining the exact calibration position.
8. Prepare to acquire Test Set 3 (at the reversed calibration position without repositioning): rotate the detector 180° from the calibration orientation to the reversed calibration orientation. (The detector was rotated while holding down the plastic tray to immobilize it so that translational displacement of the detector was minimized.)
9. Repeat Steps 2 and 3 in the reversed calibration position five times.
10. Prepare to acquire Test Set 4 (at the calibration condition with repositioning between repeats): return the detector to the calibration position.
11. Repeat Steps 2 through 4 five times.

E. Interferences
Interferences from artifacts (structured noise, gradient, image lag, detector saturation, and collimator misalignment) and features in the periphery of the image (penumbra, scatter, and artificially filled-in pixels) were encountered during our efforts to perform these tests in a clinical setting. They were observed using detectors from a variety of DR systems including the two cassette-based systems in the present study and other integrated DR systems (Siemens Aristos FX Plus, GE Revolution XQi, GE Revolution XRd, and GE Definium 8000). These illustrate the range of interferences that may be encountered in practice and their impact on the TG-150 metrics.

F. Statistical analysis
Paired t-tests were performed to compare the image metrics obtained with 128 × 128 ROIs to those obtained with ROIs of other sizes. The reported p-values were adjusted with Bonferroni correction (13,14) using Stata 11 (StataCorp LP, College Station, TX). Outliers were excluded according to Chauvenet's criterion. (15)

III. RESULTS

A. Effect of ROI size on TG-150 metrics
Figure 3 shows the median and distribution of nonuniformity metrics, minimum SNR, and the number of anomalous pixels of all detectors. Figure 4 shows differences between these measurements with the 128 × 128 ROI and with the other ROIs. There were no statistically significant differences between the shifted 128 × 128 ROI and the 128 × 128 ROI over all metrics, indicating well-calibrated gain and offset. However, compared with the matching ROIs (128 × 128), the statistical analyses indicate that the nonuniformity measurements were higher using smaller ROIs (64 × 64) but lower using larger ROIs (256 × 256). The only exception was that there was no significant difference between the 128 × 128 ROIs and the 256 × 256 ROIs for local signal nonuniformity, likely attributable to the large variance with the 256 × 256 ROIs. The analysis of minimum SNR values shows that minimum SNR was lowest with 64 × 64 ROIs and highest with 256 × 256 ROIs. The number of anomalous pixels was unaffected by ROI size.

B. Effect of positioning on TG-150 metrics
Figure 5 shows the effect of positioning uncertainty on the TG-150 metrics that are derived from ROI analysis of flat-field images. The first group of repeated measurements, acquired without knowledge of the orientation of the detector during calibration, appeared similar to those obtained in the reversed calibration orientation. This suggests that the detector was likely aligned to the reversed calibration orientation during the first sequence. The second sequence of repeated measurements, acquired after the detector calibration, agreed with those obtained at the exact calibration condition, as expected. However, the second group of repeated measurements varied somewhat and was usually inferior to those acquired at the exact calibration condition. Unlike the ROI-based metrics derived from the flat-field images, which were highly sensitive to detector orientation and positioning, as reported by Dave et al., (4) the MTF test, which involves only small areas at the center of the images, appeared insensitive to detector positioning (Fig. 6). The maximum absolute difference in MTF between the repeated measurements after calibration and the one in the exact calibration position was only 1.8%.

C. Comparison of TG-150 tests and TQT
TQT nonuniformity measurements were systematically higher than those of TG-150 (Table 4), although the two were similar. The two sets of values were not linearly correlated. This may be because TQT uses a subset of 16 × 16 ROIs inside each of the 128 × 128 ROIs, and the means of the subset ROIs are used to calculate the mean and standard deviation of their parent 128 × 128 ROI, which results in a smaller variance than the mean and standard deviation of individual pixels. In addition to global nonuniformity measurements, TQT also reports minimum SNR, which is linearly correlated with the TG-150 measurement (R² = 0.76).
TQT measures MTF using the edge method instead of the bar pattern method. Despite this difference in methodology, the values were highly correlated with an R-squared value greater than 0.99 (Table 5 and Fig. 7).

D. Comparison of TG-150 tests and QAP
Local and global signal nonuniformities of TG-150 were linearly correlated with those of QAP (Fig. 8, R² ≈ 0.60) when one data point for the local signal nonuniformity metric was excluded as an outlier according to Chauvenet's criterion, but their SNR nonuniformity metrics were not (R² ≤ 0.07). In addition to the lack of correlation, TG-150 global and local SNR nonuniformity values decreased by 50%-60% after the detector calibration on Det2 and Det3, while in contrast, there was essentially no change in QAP SNR nonuniformity values before and after the calibration (Table 6). Therefore, it seems that TG-150 SNR nonuniformity metrics are more sensitive to image nonuniformity than QAP's SNR nonuniformity metrics. As for MTF values (Table 7), the absolute differences between MTF values measured by TG-150 and QAP were up to 24%, although they were linearly correlated (Table 7 and Fig. 9, R² > 0.95).

E. Interferences

E.1 Image periphery
The periphery of a flat-field image can be affected by penumbra, scatter, collimator misalignment (Fig. 10(a)), and artificially filled-in pixels ("padding," Fig. 10(b)). (3,8) An example of the last type occurs in images harvested from a Carestream Revolution mobile system. The system crops the four edges during postacquisition processing to produce "for-presentation" images but pads these edges with the maximum available value in harvested "for-processing" images (IEC nomenclature: ORIGINAL DATA). These extreme pixel values on the edges can greatly increase the values of the nonuniformity and anomalous pixel metrics when they are included in the analysis, even if the central field of view is fairly uniform.

E.2 Structured noise
Structured noise can adversely affect the detector nonuniformity metrics. (3) An improperly calibrated image may show structured noise such as detector blocks and the outline of Automatic Exposure Control (AEC) chambers (Fig. 11(a) and (b)). These images should be excluded during the artifact evaluation and a service call should be placed for the detector. Some systems have fixed grids (e.g., GE Revolution XQi). The grid lines as well as the detector structure might appear in high exposure flat-field images (Fig. 11(c)), although they were not as noticeable with low exposure. Performing nonuniformity analysis on the subtraction of two images at the calibration condition is recommended by TG-150 for this scenario. However, images should be carefully evaluated for correlated noise, since the subtraction will also eliminate it and result in a biased nonuniformity measurement.

E.3 Detector position relative to the A-C axis at calibration
Our results demonstrate that the relationship between the A-C orientation of the detector during calibration and its orientation during testing can affect the nonuniformity metric. This is not only an issue for the Carestream DRX wireless detectors, but for any cassette-based detectors that are not permanently integrated into the detector support assemblies including the GE Discovery XR656 wireless detectors. Because the detectors can be placed in the reversed calibration orientation (worst-case scenario) the gradient from the heel effect can be doubled (Fig. 12). We found that the magnitude of the gradient was 33% from one edge of the detector to the other, which was large enough to elicit complaints from radiologists viewing clinical images. The nonuniformity values, especially global nonuniformities for noise and SNR, increased as much as 186% from this effect (see Det2 Table Bucky vs. cross-table position, Table 6).
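The doubling of the gradient in the reversed orientation can be illustrated with a toy one-dimensional model. The sketch below assumes a linear heel-effect profile along the A-C axis and an ideal gain correction equal to the reciprocal of the beam profile at calibration; both are simplifying assumptions, the function names are hypothetical, and the 10% gradient in the usage note is an arbitrary illustrative value rather than a measurement.

```python
def residual_profile(gradient, n=11, reversed_orientation=True):
    """Corrected flat-field profile along the A-C axis for a linear heel-effect model.

    gradient : fractional edge-to-edge change in beam intensity (assumed linear)
    n        : number of sample positions across the detector
    """
    positions = [i / (n - 1) - 0.5 for i in range(n)]     # -0.5 ... +0.5 across the detector
    beam = [1.0 + gradient * x for x in positions]        # incident intensity at each position
    gain = [1.0 / b for b in beam]                        # gain map learned at calibration
    if reversed_orientation:
        gain = gain[::-1]                                 # detector rotated 180 degrees
    return [b * g for b, g in zip(beam, gain)]


def edge_to_edge(profile):
    """Edge-to-edge variation of the corrected profile, relative to a nominal value of 1."""
    return max(profile) - min(profile)
```

In this model, `edge_to_edge(residual_profile(0.10, reversed_orientation=False))` is essentially zero because the gain map cancels the gradient, whereas the reversed orientation leaves a residual of roughly twice the original gradient because the correction and the beam gradient now act in the same direction.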

E.4 Image lag
Another phenomenon affecting nonuniformity is image lag. If the flat-field images are acquired immediately after acquisition of images with high contrast objects or background, such as lateral chest radiography with overexposed background (Fig. 13), the lag image will appear in the flat-field images. Without appropriate caution, inverse lag images can be burned into the gain and offset correction maps when detector calibration is performed shortly after the acquisition of images with high signal levels. In this scenario, the values of nonuniformity and anomalous pixel metrics are artificially increased.

E.5 Saturation
TG-150 suggested exposure levels for the additional flat-field exposures relative to the calibration condition. These exposure levels worked well for all systems that we tested, except the GE XR656. The calibration condition for the XR656 was increased to 40 mAs for the wall stand position with SID = 180 cm. When the mAs was doubled, some of the detectors were either partially or fully saturated (Fig. 14). In this scenario, the detector response will be skewed.
Our limited implementation did not include the large signal capacity test recommended by TG-150; it only evaluated nonuniformity metrics from images acquired at high exposures. These metrics would nevertheless also be adversely affected by saturation. Therefore, care should be exercised to ensure that the mAs selected for the high-exposure flat-field image does not saturate the detector.

IV. DISCUSSION
Images should be carefully inspected before performing quantitative analyses since the presence of artifacts can render these analyses ineffective. Dave et al. (3) reported that anomalous pixel detection would fail if an unusually large number of defects or structured noise was present, leading to a large standard deviation in an ROI such that the criterion of mean ± 3σ for anomalous pixels is no longer sufficiently sensitive. Lag images, which are temporary signals from past exposures, are superimposed on subsequent acquisitions such that quantitative measurements do not reflect true detector performance. Similarly, if the test images do not contain the entire field of view because of errors in autopositioning, misaligned collimation, excessive edge padding, or penumbra, the metrics do not correctly represent detector performance either. Instead, they suggest system malfunctions. Such malfunctions affect the quality of diagnostic images and, therefore, may eventually affect diagnosis. Thus, careful inspection should be the first step in assessing the test images and, if unusual artifacts are identified, testing should be repeated only after the system is corrected.
Nitrosi et al. (11) have published image quality metrics for detectors from four different manufacturers, including older DR systems from Carestream and GE. Their work was based on a quality control protocol for direct DR systems advanced by the Italian Association of Physicists in Medicine (AIFM). (16) Their metrics included local and global nonuniformity of signal and SNR with definitions similar to TG-150. The numerators of their metrics are identical to the TG-150 metrics in Eqs. (2), (4), (5), and (7); however, the denominators are different. The local nonuniformities are normalized to the local ROI mean rather than the global ROI mean. The global nonuniformities are normalized to the range of ROI means instead of the global ROI mean. Both differences tend to make the numerical values of the metrics smaller. When the detectors are properly calibrated, their values are of the same order of magnitude as our TG-150 measurements, despite differences in hardware and beam quality. However, when the detectors were miscalibrated, the TG-150 metrics gave local SNR nonuniformity of 30% to 50% and global SNR nonuniformity of 40% to 80%, which were much higher than the 4.6% ± 1% and 15.5% ± 1% reported for these metrics by Nitrosi et al. Hence, it seems that detector miscalibration affects the TG-150 metrics of SNR nonuniformity more than the AIFM metrics.
MTF values determined by TG-150 testing were consistent with studies from other groups. One group recently reported an MTF of about 20% at a spatial frequency of 2.5 lp/mm with the GE FlashPad and Carestream DRX-1C, (17) compared to approximately 22% in our results. The same group also reported 1.25 lp/mm and 2.55 lp/mm for 50% and 20% MTF with Carestream DRX-1C detectors using an RQA5 beam with a 0.5 mm Cu + 1.0 mm Al filter, (16) which were similar to the 1.1 lp/mm and 2.6 lp/mm measured at the same MTF levels in our study, despite a difference in kVp (70 vs. 80 kVp).
The image quality metrics varied with the ROI sizes, as indicated by Figs. 3 and 4. The same metric varied up to threefold across the four sets of ROIs. For example, the median of global SNR nonuniformity for the 64 × 64 ROIs was 14.2% while the median for the 256 × 256 ROIs was 4.2%. The metrics for nonuniformity were usually higher for a smaller set of ROIs because extreme pixel values had a greater influence on the mean and variance when the ROIs were smaller. For the same reason, the minimum SNR was lower for a smaller set of ROIs. In the optimum case, the ROIs would be sized and aligned to coincide with the electronic components reading out the detector elements so that the metrics would be representative of specific portions of the detector subassembly.
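The dependence on ROI size can be reproduced on a synthetic flat field. The sketch below is not the validated MATLAB analysis used in this study; it assumes non-overlapping square ROIs, takes global SNR nonuniformity as (maximum ROI SNR minus minimum ROI SNR) divided by the mean ROI SNR, and uses an artificial image with purely Gaussian noise. The function names and image parameters are hypothetical.

```python
import numpy as np

def roi_stats(image, roi):
    """Tile the image with non-overlapping roi x roi blocks and return the
    per-ROI mean and sample standard deviation."""
    rows = (image.shape[0] // roi) * roi
    cols = (image.shape[1] // roi) * roi
    blocks = image[:rows, :cols].reshape(rows // roi, roi, cols // roi, roi)
    return blocks.mean(axis=(1, 3)), blocks.std(axis=(1, 3), ddof=1)

def global_snr_nonuniformity(image, roi):
    """Assumed definition: (max ROI SNR - min ROI SNR) / mean ROI SNR, in percent."""
    mean, std = roi_stats(image, roi)
    snr = mean / std
    return 100.0 * (snr.max() - snr.min()) / snr.mean()

def minimum_snr(image, roi):
    """Smallest per-ROI SNR over all ROIs."""
    mean, std = roi_stats(image, roi)
    return (mean / std).min()

# Synthetic flat field: uniform signal of 1000 with Gaussian noise (sigma = 20).
rng = np.random.default_rng(0)
flat = rng.normal(loc=1000.0, scale=20.0, size=(1024, 1024))

results = {roi: (global_snr_nonuniformity(flat, roi), minimum_snr(flat, roi))
           for roi in (64, 128, 256)}
for roi, (nonuniformity, snr_min) in results.items():
    print(f"{roi:3d} x {roi:<3d}  global SNR nonuniformity = {nonuniformity:5.2f}%"
          f"  minimum SNR = {snr_min:6.2f}")
```

Even for a perfectly uniform detector, the smaller ROIs give a larger nonuniformity and a lower minimum SNR, because each ROI statistic is estimated from fewer pixels and there are more ROIs from which extremes can be drawn.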
The positioning tests showed that detector orientation affects all of the image quality metrics except for the MTF. After the uncertainty of orientation was ruled out, the repeated measurements were consistent with the ones at the exact calibration location; however, the variation in repeated measurements for the same detector from repositioning (Fig. 5) was greater than or equal to the variation among seven different detectors tested at the exact calibration location (Table 4). Repositioning not only introduced additional uncertainty, but also yielded repeated measurements that were always worse than those measured at the exact calibration location.
Performing the tests immediately after calibration in the exact same location yields the optimum results, which should ideally constitute the baseline values for testing. However, since the exact exposure location is unlikely to be reproduced unless the wireless detector is held in an exposure station and the X-ray tube is automatically repositioned to the proper location, subsequent tests are always going to be disadvantaged, and their values may be misinterpreted as indicative of degraded performance. For the purpose of establishing a baseline, it may be wiser to use the mean of several repeated measurements after repositioning. Subsequent measurements to assess constancy of performance are more likely to reflect the mean than the optimum results at the exact calibration position. This problem is particular to cassette-based DR detectors used without exposure position controls, such as in bedside or tabletop applications, rather than to integrated DR detectors.
Conversely, the problem for integrated DR detectors and cassette-based detectors with position controls is that autopositioning must be accurate and reproducible. Otherwise, test metrics are neither accurate nor precise.
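A baseline built from repeated measurements could be computed as in the following sketch. The source recommends only the mean of several repeats; the tolerance of two sample standard deviations, the function names, and the example values are assumptions made for illustration and are not data from this study.

```python
from statistics import mean, stdev

def baseline_from_repeats(values, k=2.0):
    """Return (baseline, lower, upper): the mean of the repeated measurements
    and limits at +/- k sample standard deviations (k is an assumed choice)."""
    if len(values) < 2:
        raise ValueError("at least two repeated measurements are required")
    baseline = mean(values)
    spread = stdev(values)
    return baseline, baseline - k * spread, baseline + k * spread

def is_consistent(value, lower, upper):
    """True if a later constancy measurement lies within the baseline limits."""
    return lower <= value <= upper

# Hypothetical global SNR nonuniformity values (%) from five repositioned repeats.
repeats = [4.1, 4.6, 4.3, 4.9, 4.4]
baseline, lower, upper = baseline_from_repeats(repeats)
print(f"baseline = {baseline:.2f}%, limits = [{lower:.2f}%, {upper:.2f}%]")
print(is_consistent(4.8, lower, upper), is_consistent(6.0, lower, upper))
```

With limits derived from repositioned repeats, a later measurement is compared against the variation expected from repositioning rather than against the optimum value obtained at the exact calibration position.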
The metrics reported by both TQT and TG-150, in general, agree with each other. However, because the size of the ROIs and positioning affect the measurement, caution should be taken when comparing their numerical values.
The metrics reported by both QAP and TG-150 agreed with each other for local and global signal nonuniformity measurements, but not for SNR nonuniformity and MTF. Two possible causes exist for the lower global SNR nonuniformity values reported by QAP. First, QAP measured noise by the subtraction of two flat-field images at the calibration condition, while TG-150 measured noise using only one image at the calibration condition. The subtraction eliminates the contribution of structured noise and gradients, and would result in lower SNR nonuniformity, as we observed. Second, the ROI size, as illustrated with the DRX-1C detectors, might have affected the results. The ROIs used by QAP for the three nonuniformity metrics were larger (150 × 150, Table 2) than those used by our implementation of TG-150 (128 × 128). Our observation of the QAP measurements agreed with our findings with the DRX-1C (Figs. 3 and 4): larger ROIs yield lower values for nonuniformity. Further investigation is needed to determine the relative magnitude of these effects.
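The first cause can be illustrated with a synthetic ROI in which two exposures share the same fixed pattern (a gradient plus structured noise) but have independent random noise. This is a sketch under stated assumptions and not the QAP algorithm: the division of the difference-image standard deviation by the square root of 2, the fixed-pattern model, and all numerical values are assumed for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
shape = (128, 128)
sigma = 10.0  # true random (quantum plus electronic) noise, assumed

# Fixed pattern common to both exposures: linear gradient plus structured noise.
gradient = np.tile(np.linspace(-30.0, 30.0, shape[1]), (shape[0], 1))
structure = rng.normal(0.0, 5.0, size=shape)
fixed_pattern = 1000.0 + gradient + structure

# Two flat-field images at the same condition with independent random noise.
image_a = fixed_pattern + rng.normal(0.0, sigma, size=shape)
image_b = fixed_pattern + rng.normal(0.0, sigma, size=shape)

# Single-image estimate: includes the gradient and structured noise.
noise_single = image_a.std(ddof=1)

# Subtraction estimate: the fixed pattern cancels; dividing by sqrt(2)
# accounts for the variance of a difference of two independent images.
noise_subtraction = (image_a - image_b).std(ddof=1) / np.sqrt(2.0)

print(f"single-image noise: {noise_single:.1f}")
print(f"subtraction noise:  {noise_subtraction:.1f}")
print(f"SNR (single):       {image_a.mean() / noise_single:.1f}")
print(f"SNR (subtraction):  {image_a.mean() / noise_subtraction:.1f}")
```

Because the subtraction estimate responds only to random noise, ROIs that differ in local gradient or structure yield more similar noise values, which is consistent with the lower SNR nonuniformity expected from a subtraction-based method.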
Although the MTF values measured by the TG-150 tests and QAP were highly linearly correlated (R² > 0.95), their absolute differences were as much as approximately 24%. The difference in the methods of measuring MTF (TG-150's bar pattern method versus QAP's edge method) was unlikely to be the cause because TQT also used the edge method and its MTF values agreed with those measured by the TG-150 bar pattern method. In the case of the QAP, the lower MTF reported by the TG-150 method might be explained by the difference in beam quality between the two methods. The TG-150 test used a 20-mm aluminum filter to harden the beam to the calibration condition, which decreased the contrast compared to the unfiltered beam used in the QAP exposures of the IQST (also known as the MTF phantom) for MTF measurements.

V. CONCLUSIONS
This study shows that the values of common image metrics employed by TG-150 and TQT, such as nonuniformity, minimum SNR, and MTF, are consistent. Comparisons between TG-150 and QAP indicate that their local and global signal nonuniformity values are similar and correlated, but their SNR nonuniformity and MTF values are not. The inconsistency might be explained by the differences in methods, beam quality, and ROI size used by the QAP and TG-150 tests. However, this study suggests that TG-150's SNR nonuniformity metrics are more sensitive to detector nonuniformity than those of the QAP. The study also suggests that the same ROI size should be used for consistent results since nonuniformity metrics and minimum SNR depend on ROI size. Detector testing at the exact calibration location after calibration would reduce uncertainty, but if this is not feasible, repeated measurements should be performed to establish a baseline. All images should be carefully reviewed before analysis to avoid any nonuniformity contributed by artifacts. Overall, TG-150 provides a good guide for digital detector testing, but the testing and analysis should be performed in a consistent manner.

ACKNOWLEDGMENTS
The authors would like to express their gratitude to Dr. Eric Gingold and the members of the Image Receptor Subgroup of TG-150 for their development of the proposed tests, Dr. John Yorkston for helpful discussions on TQT test results, Dr. John Sabol for helpful information and discussions on the GE QAP test, and Dr. R. Benton Pahlka and Steve Bache for their help with data collection.

COPYRIGHT
This work is licensed under a Creative Commons Attribution 3.0 Unported License.