Comparison of six phantoms for entrance skin dose evaluation in 11 standard X‐ray examinations

Entrance skin dose (ESD) is an important parameter for assessing the dose received by a patient in a single radiographic exposure. The most useful way to evaluate ESD is either by direct measurement on phantoms using an ionization chamber or using calculations based on a mathematical model. We compared six phantoms (three anthropomorphic, two physical, and one mathematical) in 11 standard clinical examinations (anterior‐posterior (AP) abdomen, posterior‐anterior (PA) chest, AP chest, lateral (LAT) chest, AP lumbar spine, LAT lumbar spine, LAT lumbo‐sacral joint, AP pelvis, PA skull, LAT skull, and AP urinary tract) for two reasons: to determine the conversion factors to use for ESDs measured on different phantoms and to validate the mathematical model used. First, a comparison was done between the three anthropomorphic phantoms (Alderson Rando, chest RSD‐77SPL, and 3M skull) and the two physical phantoms (Uniform and AAPM 31); for each examination we obtained “relative entrance skin dose factors.” Second, we compared these five phantoms with the mathematical phantom: the overall accuracy of the model was better than 14%. Total mathematical model and total ionization chamber uncertainties, calculated by quadratic propagation of errors of the single components, were estimated to be on the order of ±12% and ±3%, respectively. To reduce the most significant source of uncertainty, the overall accuracy of the model was recalculated using new backscatter factors. The overall accuracy of the model improved: better than 12%. For each examination an anthropomorphic phantom was considered as the gold standard relative to the physical phantoms. In this way, it was possible to analyze the variations in phantom design and characteristics. Finally, the mathematical model was validated by more than 400 measurements taken on different phantoms and using a variety of radiological equipment. 
We conclude that the mathematical model can be used satisfactorily in ESD evaluations because it optimizes available resources, it is based on direct measurements, and it is an easy dynamic tool. PACS number(s): 87.66.Xa


I. INTRODUCTION
Entrance skin dose (ESD) is an important parameter in assessing the dose received by a patient in a single radiographic exposure. The European Union has identified this physical quantity as one to be monitored as a diagnostic reference level in the hope of optimizing patient dose. (1,2) It is possible to evaluate ESD either by direct measurements (on suitable phantoms using ionization chambers or on patients using thermoluminescent dosimeters, TLDs) or using mathematical model calculations based on the X-ray tube output. (1) Using TLDs is time-consuming in large hospitals. Therefore, in this paper ESDs were evaluated using both measurements taken by ionization chambers and values calculated by a mathematical model; this allowed us to study the accuracies inherent in different experimental setups. To evaluate the ESD, it is necessary to use "standard phantoms," (1,3) and it is important to know the differences between them because some are bought commercially and others are "home-made." (4,5) There is no advice in the literature to help with phantom selection in different clinical situations. (6) Therefore, two or more similar phantoms are often available in medical physics departments for dosimetric measurements in conventional radiology. Moreover, these phantoms are not always available simultaneously (for instance, one phantom may be in use by someone else), so it is also necessary to have conversion factors between different phantoms. This paper reports a comparison between ESDs measured on five phantoms in 11 standard clinical examinations (anterior-posterior (AP) abdomen, posterior-anterior (PA) chest, AP chest, lateral (LAT) chest, AP lumbar spine, LAT lumbar spine, LAT lumbo-sacral joint, AP pelvis, PA skull, LAT skull, and AP urinary tract) in order to obtain "relative ESD factors" (REFs) between each phantom and the others.
These REFs can be used in routine dosimetry when ESD measurements for the same kind of examination have been made on different phantoms, either after long time intervals with the same radiological system (one X-ray tube and one generator in a well-identified radiological room) or using similar clinical technique factors with different radiological equipment (different X-ray tubes and generators but similar types of radiological apparatus). In addition, a comparison with ESDs calculated by a mathematical model (which can be considered a sixth phantom) is made, because this is another possibility, easier but less accurate, for this kind of evaluation.

II. MATERIALS AND METHODS
As explained by Moores, (7) the phantoms for dose assessment can be anthropomorphic (they possess aspects of anatomical structure), physical (they do not attempt to reproduce anatomical details directly and may range from a single block of material to more sophisticated structures), or mathematical (they may be simple mathematical models to represent the interaction of X-ray beams with biological tissue in order to assess ESD or other dosimetric parameters). In this study, we first compared three anthropomorphic and two physical phantoms. Then these five phantoms were compared with one mathematical phantom, as described below.

A. Anthropomorphic and physical phantom comparison
The standard examinations, the clinical technique factors used (with respective standard deviations), and the numbers of radiographic systems and measurements considered in this study are shown in Table 1. The technique factors reported for each examination are the average kilovolt peak (kVp), milliampere × seconds (mAs), and focus-to-phantom surface distance (FSD) values used clinically in our hospital. Nearly all the X-ray generators were three-phase (6 or 12 pulse) models or high-frequency generators. All radiographic systems were controlled by an ISO 9001:2000 certified quality assurance program, which provides good equipment performance according to acceptance, status, and constancy tests. In particular, to control the reproducibility and the linearity of the X-ray tube output, three free-in-air measurements were taken at each of two kVp values and five mAs settings (total: 30 output measurements, 15 at 80 kVp and 15 at 100 kVp, for each X-ray tube). These measurements were taken yearly or when a tube was replaced. The acceptability limit for both reproducibility and linearity was 10%.
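The reproducibility and linearity checks described above lend themselves to a short script. The sketch below is illustrative only: the text does not define the two metrics, so it assumes reproducibility is the coefficient of variation of repeated readings and linearity is (max − min)/(max + min) of the output per mAs across settings. The function names and all readings are hypothetical.

```python
from statistics import mean, stdev

LIMIT = 0.10  # acceptability limit quoted in the text (10%)

def reproducibility(readings):
    """Coefficient of variation of repeated readings (assumed metric)."""
    return stdev(readings) / mean(readings)

def linearity(output_per_mas):
    """(max - min) / (max + min) of output per mAs (assumed metric)."""
    hi, lo = max(output_per_mas), min(output_per_mas)
    return (hi - lo) / (hi + lo)

# Hypothetical free-in-air readings (mGy) at one kVp:
# three exposures at each of five mAs settings.
readings_by_mas = {
    10: [0.50, 0.51, 0.49],
    20: [1.01, 1.00, 0.99],
    40: [2.04, 2.02, 2.06],
    80: [4.00, 4.10, 3.90],
    160: [8.20, 8.10, 8.30],
}

worst_reproducibility = max(reproducibility(r) for r in readings_by_mas.values())
tube_linearity = linearity([mean(r) / mas for mas, r in readings_by_mas.items()])
passes = worst_reproducibility < LIMIT and tube_linearity < LIMIT
```

The same per-setting averages can be reused later as the two output values that feed the tube output model, which is how the study avoids extra measurements.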
The direct ESD measurements were made on five different phantoms, the first three anthropomorphic (Fig. 1): Alderson Rando (ALD); chest RSD-77SPL (CHE; Radiology Support Device, Long Beach, USA); skull 3M (SKU); uniform (UNI), 25 cm × 25 cm × 20 cm polymethyl methacrylate (PMMA); and the American Association of Physicists in Medicine (AAPM) phantom described in AAPM Report No. 31 (A31), (8) which combines PMMA with aluminum sheets and air gaps in order to simulate various anatomical parts. For each measurement, the phantoms were positioned as a standard patient, and for each phantom the same technical parameters (kVp, mAs, FSD, and X-ray field size as indicated by senior radiologists) were selected. The ESDs were measured as indicated in other papers, (9) by positioning an ionization chamber model 90X6-6, connected to a Radiation Monitor Controller model 9010 (Radcal Corporation, Monrovia, CA), at the surface of each phantom on the beam central axis.
The experimental setup was chosen so that the variables possibly affecting the results were carefully controlled: the detector did not significantly perturb the photon fluence on the phantom surface underneath it; the cross-sectional area of the detector was significantly less than the area of the irradiated phantom; and the FSD was measured by taking into account the position of the focal spot inside the X-ray tube.
All instruments are calibrated yearly, with the calibration traceable to an SIT (National Calibration Service in Italy) center. To verify kVp accuracy, direct measurements during the exposures were taken with a noninvasive kVp meter model Mult-O-Meter 510 (Unfors Instruments, Billdal, Sweden).

B. Mathematical versus anthropomorphic and physical phantom comparison
Every ESD measured on anthropomorphic and physical phantoms was compared with the corresponding ESD calculated with the mathematical phantom.
To determine the output K of a diagnostic X-ray tube (in terms of absorbed dose to air or exposure free-in-air), many mathematical models have been suggested. (10)(11)(12)(13)(14) In this study, the model proposed by Harpen (14) was adopted:

$$K = \alpha \, (\mathrm{kVp})^{\beta} \, \mathrm{mAs} \qquad (1)$$

where parameters α and β depend on the type of X-ray generator, anode material, FSD, and X-ray tube filtration.
Equation (1) gives K as a function of kVp and mAs, by taking only two X-ray tube output measurements at two different voltages. To this end, the average of the 15 measurements taken at 80 kVp and the average of the 15 measurements taken at 100 kVp during the reproducibility and linearity quality controls mentioned above were used as the two output values.
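The two-point determination of α and β can be sketched as follows. This assumes the power-law form K = α · kVp^β · mAs for Eq. (1), which is consistent with the later statement that K depends linearly on α and exponentially on β. The function names and the two output values are hypothetical; the outputs are taken per mAs at a fixed distance, so the FSD dependence is absorbed into α.

```python
import math

def fit_alpha_beta(kvp1, output1, kvp2, output2):
    """Solve output_per_mAs = alpha * kVp**beta from two measured points."""
    beta = math.log(output2 / output1) / math.log(kvp2 / kvp1)
    alpha = output1 / kvp1 ** beta
    return alpha, beta

def tube_output(alpha, beta, kvp, mas):
    """Free-in-air output K for a given technique (assumed form of Eq. 1)."""
    return alpha * kvp ** beta * mas

# Hypothetical mean outputs (mGy/mAs) from the yearly quality control,
# one average at 80 kVp and one at 100 kVp.
alpha, beta = fit_alpha_beta(80.0, 0.040, 100.0, 0.062)

# Output predicted for a technique between the two calibration voltages.
k_90kvp_20mas = tube_output(alpha, beta, 90.0, 20.0)
```

By construction the fitted curve passes exactly through both calibration points, so any model error appears only at other voltages or when the tube output drifts after the quality control.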
Harpen's formula gives the absorbed dose to air, free-in-air; therefore, to determine ESD, some corrections must be made for backscatter factors (BSF). European guidelines (1) propose a simple and generic value for conventional radiography: 1.35.
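The backscatter correction amounts to a single multiplication, shown here as a minimal sketch. The generic value 1.35 is the one quoted from the European guidelines; the function name, the free-in-air output, and the alternative BSF of 1.30 are hypothetical values chosen only to show how sensitive the ESD is to the BSF.

```python
GENERIC_BSF = 1.35  # generic value for conventional radiography quoted in the text

def entrance_skin_dose(free_in_air_output, bsf=GENERIC_BSF):
    """ESD = free-in-air output at the entrance surface times the BSF."""
    return free_in_air_output * bsf

k_air = 2.0  # hypothetical free-in-air output at the entrance surface (mGy)

esd_generic = entrance_skin_dose(k_air)
esd_specific = entrance_skin_dose(k_air, bsf=1.30)  # 1.30 is a made-up value

# Percentage change in ESD caused by swapping the BSF.
change_percent = 100.0 * (esd_specific - esd_generic) / esd_generic
```

Because the ESD scales linearly with the BSF, any relative error in the chosen BSF carries over unchanged to the calculated ESD.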
For each phantom j and examination k, the integral accuracy $A_{\mathrm{phantom}\,j}$ of the mathematical model relative to a single phantom and the differential accuracy $A_{\mathrm{singlepoint}}$ relative to a single measurement were calculated. The $A_{\mathrm{phantom}\,j}$ relative to examination k is defined by Eq. (2), where $N_j$ is the number of measurements made with the phantom j in each examination k, MOD is the ESD value calculated by the mathematical model, and PHA is the ESD value measured on the phantom j. In this way, it was possible to calculate $A_{\mathrm{ALD}}$, $A_{\mathrm{CHE}}$, $A_{\mathrm{SKU}}$, $A_{\mathrm{UNI}}$, and $A_{\mathrm{A31}}$ for each specific phantom and examination.
The $A_{\mathrm{singlepoint}}$ is defined by Eq. (3), which is analogous to but slightly different from Eq. (2): the numerator of Eq. (2) uses the absolute value of the difference in the ESDs, whereas the numerator in Eq. (3) does not. To achieve better accuracy of the mathematical model, for each examination every MOD value was recalculated using the more accurate BSF values given in Harrison (15) and Grosswendt (16) (see Table 2 for examples of typical values used), obtaining new integral accuracies ${}^{\mathrm{NEW}}A_{\mathrm{ALD}}$, ${}^{\mathrm{NEW}}A_{\mathrm{CHE}}$, ${}^{\mathrm{NEW}}A_{\mathrm{SKU}}$, ${}^{\mathrm{NEW}}A_{\mathrm{UNI}}$, and ${}^{\mathrm{NEW}}A_{\mathrm{A31}}$ for each phantom.
The "overall accuracy of the mathematical model" $A_{\mathrm{OVE}}$ relative to all the phantoms and all the examinations was also calculated. It is defined by Eq. (4), where $N_{j\mathrm{TOT}}$ is the number of measurements made with the phantom j in all examinations, and M is the number of examinations in which the phantom j was used. When the more accurate BSF values are used, a new overall accuracy ${}^{\mathrm{NEW}}A_{\mathrm{OVE}}$ is obtained.
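The three accuracy figures can be illustrated with a short sketch. The exact forms of Eqs. (2) to (4) are not reproduced here, so the code rests on assumptions: the single-point accuracy is taken as the signed percentage difference (MOD − PHA)/PHA, the integral accuracy as the mean of its absolute values, and the overall accuracy as a measurement-count-weighted mean of the integral accuracies. The sign convention, the choice of denominator, the weighting, and all function names are hypothetical.

```python
def single_point_accuracy(mod, pha):
    """Signed percentage difference between model and measured ESD (assumed form)."""
    return 100.0 * (mod - pha) / pha

def integral_accuracy(mods, phas):
    """Mean absolute percentage difference over one phantom and examination."""
    pairs = list(zip(mods, phas))
    return sum(abs(single_point_accuracy(m, p)) for m, p in pairs) / len(pairs)

def overall_accuracy(groups):
    """Count-weighted mean of integral accuracies over all (mods, phas) groups."""
    total = sum(len(mods) for mods, _ in groups)
    return sum(integral_accuracy(m, p) * len(m) for m, p in groups) / total

# Hypothetical ESD values (mGy): model predictions and phantom measurements.
mods = [1.1, 0.9]
phas = [1.0, 1.0]

signed = [single_point_accuracy(m, p) for m, p in zip(mods, phas)]
integral = integral_accuracy(mods, phas)
```

In this example the signed values are about +10% and −10% and average to zero, while the integral accuracy is about 10%. This is why the absolute value is used for the summary figure and the signed value is used for plotting deviations around zero.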

III. RESULTS
All the factors presented in this study are "relative" values. Therefore, to obtain a general idea and a comparison of the "absolute" values, consider Table 2. Typical ESD values measured on the different anthropomorphic phantoms, the values obtained from the mathematical model (using the BSF values shown), and typical values of α and β of Eq. (1) are listed in Table 2, based on the values in Table 1. The value of K depends linearly on the α value, which is related to the clinical technique, in particular to the inverse of FSD, and exponentially on parameter β, which is more sensitive to intermachine variations and has an almost constant value.
The mean values of REFs measured on anthropomorphic and physical phantoms are given in Table 3; they too are based on Table 1. For each measurement, the ratio of the ESD measured on Phantom1 to the ESD measured on Phantom2 (in the columns Phantom1/Phantom2) was calculated. For each examination, the averages of the ratios taken with all radiological systems used are shown in Table 3. When a phantom is not appropriate for an examination (e.g., a chest phantom for an AP abdomen examination), N/A is indicated.
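The REF calculation described above is an average of per-system ratios, which can be sketched as follows. The function name and the paired ESD values are hypothetical and are not taken from Table 3.

```python
from statistics import mean

def relative_esd_factor(esd_phantom1, esd_phantom2):
    """Mean of the per-system ratios ESD(Phantom1) / ESD(Phantom2)."""
    if len(esd_phantom1) != len(esd_phantom2):
        raise ValueError("need one paired measurement per radiological system")
    return mean([a / b for a, b in zip(esd_phantom1, esd_phantom2)])

# Hypothetical ESDs (mGy) for one examination on three radiological systems,
# measured with identical technique factors on each phantom.
esd_uniform = [2.2, 2.6, 3.0]
esd_rando = [2.0, 2.0, 3.0]

ref_uni_over_ald = relative_esd_factor(esd_uniform, esd_rando)
```

Used as a conversion factor, an ESD measured on the second phantom multiplied by the REF gives an estimate of the ESD that would have been measured on the first phantom under the same conditions.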
The reproducibility and the linearity of the output were better than 10% (typically, about 5%) for all radiographic systems. The kVp accuracy was ±3% for all measurements.
In Table 4 (fourth column) the accuracies $A_{\mathrm{ALD}}$, $A_{\mathrm{CHE}}$, $A_{\mathrm{SKU}}$, $A_{\mathrm{UNI}}$, and $A_{\mathrm{A31}}$ of the mathematical model relative to the other phantoms are given, based on Table 1. In this case, the BSF value used is 1.35. The averages of all accuracies are also provided to show how different phantoms compare in general, even though not all phantoms are suitable for all examinations.
The $A_{\mathrm{OVE}}$ is better than 14%. This value is of the same order as that reported in other works. (11,13,14) Figures 2(a) to (e) show the accuracies $A_{\mathrm{singlepoint}}$ of the mathematical model relative to the other phantoms plotted as a function of the kVp used. Because Eq. (3) keeps the sign of the difference, this parameter shows the displacements around the 0 value (which would mean perfect correspondence between phantoms). Total mathematical model uncertainty, estimated to be on the order of ±12%, was calculated by quadratic propagation of errors of the single model components, that is, the use of only one BSF and the possible variation in X-ray tube output owing to the time elapsed since the latest quality control. Total ionization chamber uncertainty, calculated in the same way, was estimated to be on the order of ±3%, due to both the calibration factor given by the SIT center and the experimental FSD measurements. The use of only one BSF value was the most significant source of error. To reduce it, a wider set of BSF values was adopted.
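Quadratic propagation of errors means combining independent relative uncertainties as the square root of the sum of their squares. The sketch below shows only the arithmetic. The component values are hypothetical and are not the study's actual uncertainty budget, which is not broken down in the text.

```python
import math

def combine_in_quadrature(relative_uncertainties):
    """Root-sum-square of independent relative uncertainties (in percent)."""
    return math.sqrt(sum(u * u for u in relative_uncertainties))

# Hypothetical component uncertainties in percent, for illustration only.
components = {
    "component A": 3.0,
    "component B": 4.0,
}

total_percent = combine_in_quadrature(components.values())
```

With these two illustrative components the total is 5%, less than their arithmetic sum of 7%. In a root-sum-square the largest component dominates, which is why reducing the BSF contribution was the most effective way to improve the model.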
Many papers have been published regarding the calculated or measured BSF. (15)(16)(17)(18)(19)(20)(21) Listed in the fifth column of Table 4 are the new accuracies and their averages relative to the anthropomorphic and physical phantoms (${}^{\mathrm{NEW}}A_{\mathrm{ALD}}$, ${}^{\mathrm{NEW}}A_{\mathrm{CHE}}$, ${}^{\mathrm{NEW}}A_{\mathrm{SKU}}$, ${}^{\mathrm{NEW}}A_{\mathrm{UNI}}$, and ${}^{\mathrm{NEW}}A_{\mathrm{A31}}$), recalculated using the BSF values from Harrison (15) and Grosswendt, (16) which are listed as a function of both the field size in the various examinations and the kVp values used. The ${}^{\mathrm{NEW}}A_{\mathrm{OVE}}$ is improved: it is better than 12%.

IV. DISCUSSION
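The backscatter factor for a simple water phantom can be written as

$$\mathrm{BSF} = \frac{X^{(w)}}{X^{(\mathrm{free})}} \cdot \frac{[\mu_{tr}/\rho]^{(w)}_{w,\mathrm{air}}}{[\mu_{tr}/\rho]^{(\mathrm{free})}_{w,\mathrm{air}}} \qquad (5)$$

where $X^{(w)}$ is the exposure at the surface of the water phantom, $X^{(\mathrm{free})}$ is the exposure at the same point in space without the phantom, and $[\mu_{tr}/\rho]^{(w)}_{w,\mathrm{air}}$ and $[\mu_{tr}/\rho]^{(\mathrm{free})}_{w,\mathrm{air}}$ are the ratios of the mass-energy transfer coefficients for water and air in the presence of the scatter medium and in free space, respectively.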
In Eq. (5), the ratios of the energy transfer coefficients are for the same media, but they cannot be cancelled because they are determined for different photon spectra: $[\mu_{tr}/\rho]^{(w)}_{w,\mathrm{air}}$ is averaged over the spectral energy fluence distribution of the beam at the phantom surface, and $[\mu_{tr}/\rho]^{(\mathrm{free})}_{w,\mathrm{air}}$ is averaged over the spectral energy fluence distribution of the primary beam without the phantom. If phantoms made of materials other than water are used, an analogous relationship can of course be applied. These theoretical considerations can explain why the backscatter factors and the ESDs are very strongly dependent on the material, shape, and size of the different phantoms.
Three of the six phantoms used in this study are anthropomorphic. The ALD phantom is used more in radiotherapy, but it can be used at diagnostic energies as well. (22) The CHE and SKU phantoms are more useful in radiology because they are optimized for image quality studies. The two physical phantoms can be more easily available than the anthropomorphic phantoms, but the latter better simulate the full scatter properties of human tissue. The ALD, CHE, and SKU phantoms have soft-tissue-equivalent material, which almost exactly duplicates water (in radio-absorptive and scatter properties), and synthetic skeletons with cortical bone, which is radio-equivalent to natural bone and matches the volumetric electron densities and the mass attenuation coefficients of ICRU-44 (23) across the entire range of diagnostic energies. It is therefore interesting to select for every examination an anthropomorphic phantom to be used as a "gold standard" relative to the physical phantoms and to analyze the comparisons between physical and anthropomorphic phantoms.
The gold standard for the PA chest, AP chest, and LAT chest examinations is the CHE phantom. In every measurement it gives lower ESDs than the physical phantoms. A possible explanation for this result is that less radiation is backscattered when lungs are involved, and the nonanthropomorphic phantoms cannot account for this physical process. Moreover, when automatic exposure controls (AECs) are used, the phantoms can have a significantly different "equivalent thickness" (24) and, therefore, a different ESD value. In the AEC systems we investigated, there are usually three detectors located at the vertices of a triangle. Each detector can be used alone or together with the others, depending on the examination. For example, in PA chest the two lateral detectors are used; in LAT chest the central detector is used; and in AP pelvis all three detectors are usually used. Therefore, in chest examinations, if lung-centered AEC detectors are used, the physical phantoms, which are soft-tissue-equivalent in terms of primary transmission, will transmit less radiation to a "lung-field detector" than would anthropomorphic phantoms, and the AEC will terminate the exposure later than it would with an anthropomorphic phantom. This effect leads to (ESD physical phantoms)/(ESD CHE) ratios higher than one for chest examinations in Table 3.
The reference for PA skull and LAT skull is the SKU phantom. In every measurement it gives lower ESDs than the physical phantoms. In Figs. 2(a) to (e), at lower kVp values there is a large spread of the $A_{\mathrm{singlepoint}}$ values because the dependence of ESDs on the different atomic numbers of the phantoms is accentuated. The kVp values used in clinical practice for skull examinations are the lowest of all the examinations considered. At these low energies the interactions of photons in the phantoms are modulated differently by the atomic numbers of soft tissue and bone. In the case of soft tissue, the Compton effect predominates; in the case of bone, the Compton and photoelectric effects are approximately equivalent. (25) For this reason, more radiation is backscattered in soft tissue than in bone. Therefore, the ESD in physical phantoms (almost entirely made of PMMA) is higher than in the SKU phantom (made of rubber and bone-equivalent material). Another factor that affects the different responses is that a rounded surface such as that of the SKU phantom reduces the amount of scattered radiation contributing to the ESD.
For the remaining examinations (AP abdomen, AP lumbar spine, LAT lumbar spine, LAT lumbo-sacral joint, AP pelvis, and AP urinary tract), the gold standard is the Alderson Rando phantom: it gives (except in two cases) higher ESDs than the physical phantoms. A possible explanation is that in these examinations, where the energies are higher, the more accurate composition of anthropomorphic phantoms better accounts for the higher backscattered radiation. Of the two physical phantoms, A31 is slightly better because its structure is more sophisticated than that of the simpler uniform phantom.
Note in Table 3 that, in general, for every pair of phantoms, the REFs Phantom1/Phantom2 are not always equal to the inverse of the corresponding REFs Phantom2/Phantom1. The explanation is that the REFs reported in Table 3 are averages of ratios of ESDs measured on many different radiological systems, where the same kind of examination is performed with different technique factors, and the phantom's response is nonlinear at different energies. From a strictly theoretical point of view, this is a limitation because all the examinations should be performed with a clinical protocol that gives exactly the same ESD to the standard patient. However, from a more practical and realistic point of view, that aim is seldom reached clinically because it depends on the characteristics of the radiological equipment as well as the working procedures of the operators. Therefore, the above-reported "theoretical limitation" becomes an interesting "working hypothesis." The number of radiological systems reported in Table 1 covers a wide range of equipment types used in clinical practice and, together with the high number of measurements made, should make the comparisons presented in this paper reliable across operative clinical conditions.
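The lack of reciprocity follows directly from averaging ratios: the mean of a/b equals the inverse of the mean of b/a only when every system gives the same ratio. The short sketch below illustrates this with two hypothetical radiological systems; the ESD values are made up.

```python
from statistics import mean

# Hypothetical ESDs (mGy) for the same examination on two radiological systems.
esd_phantom1 = [0.8, 1.2]
esd_phantom2 = [1.0, 1.0]

pairs = list(zip(esd_phantom1, esd_phantom2))

# REF Phantom1/Phantom2 and REF Phantom2/Phantom1, each a mean of ratios.
ref_12 = mean([a / b for a, b in pairs])
ref_21 = mean([b / a for a, b in pairs])

# The product equals 1 only if all per-system ratios are identical.
product = ref_12 * ref_21
```

Here the per-system ratios are 0.8 and 1.2, so the first REF is 1.0 while the second is about 1.04 rather than 1.0. The wider the spread of ratios across systems, the further the product of the two REFs rises above 1.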
Since the individual radiographic systems had reproducibility and linearity values better than 10% (typically about 5%), the variability of the ESD values, and hence of the REF values, is most likely attributable to variations in phantom material, design, and characteristics, which are exactly the distinctive features under study.
As far as the mathematical phantom is concerned, the model has been validated by more than 400 ESD calculations compared with measurements taken on different phantoms and radiological equipment. An overall accuracy better than 12% was obtained, comparable with data reported in literature. Therefore, in ESD calculations, this model can be used satisfactorily because it presents many advantages. First, it allows the optimization of available resources by using data already taken during quality controls. Second, the ESDs are calculated theoretically with a general formula but are also based on direct measurements, so that every radiological system keeps its own characteristics. Finally, the model is very easy to implement using an electronic spreadsheet and to use as a dynamic tool.

V. CONCLUSIONS
It is difficult to make comparisons between patient dose results in different studies because different measuring phantoms are used. In this respect, the comparison performed in the present study could be useful for obtaining data comparable with those taken at different times and with different phantoms, in the same or in other radiology departments. In this way, the phantom used most often for ESD measurements in diagnostic radiology, the uniform phantom, can be compared with other more sophisticated but less available phantoms, and each of them can be compared with a simple and easy-to-use mathematical model.
This study shows that in measuring ESD values, the phantoms are not as "standard" as the medical physicist wishes, but that it is possible to take into account the relative differences in order to have more comparable and consistent data.