Appraisal of testicular volumes: volumes matching ultrasound values referenced to stages of genital development

Background Testicular volumes obtained with orchidometers or external linear measurements in the scrotum (centimeter ruler or calipers) grossly overestimate ultrasound volumes, have much variability and may not be accurate or reproducible. Referencing the values obtained by orchidometers or ultrasound (US) to age or Tanner stages is not useful to determine the normal values for stages of puberty, because of the overlapping of ages and values. Pubertal development is determined by two events, genital and pubic hair development, that should be analyzed independently because one could be out of step with the other. The US measurement of testicular volumes is the gold standard but is somewhat inconvenient, because it requires another procedure and, mainly, is costly. The solution to these problems would be to determine testicular volumes matching US values from the width of the testis obtained in the scrotum with a centimeter ruler, by formulas recently described, and to reference them to the stages of genital development. Methods The width and length of the testes in the scrotum were obtained with a centimeter ruler in 159 study subjects, in different stages of genital development and adults, for a total of 318 testicular determinations, from the age of 3 to 34 years. The width obtained in the scrotum was corrected by subtracting the values of the double scrotal skin (ss). The formulas were then applied and the testicular volumes matching US values were calculated. The volumes and the range of ages for different stages of genital development were determined. Penile measurements were obtained in 145 subjects and pubic and other hair recorded. Paired and unpaired two-tailed Student t-tests were used to compare the means of the different groups, expressed as means and SD, and, in addition, the Wilcoxon rank sum test and Bootstrap methods were used for the testicular volume groups. A p value of 0.05 or less was considered significant.
The Institutional Review Board (IRB) of Nationwide Children’s Hospital determined that this study did not require IRB approval. Results With a simple measurement of the width of the testis in the scrotum, with a centimeter ruler, testicular volumes matching US values were calculated and normative values for each stage of genital development were determined. Conclusion This information should solve present problems. Electronic supplementary material The online version of this article (doi:10.1186/s13633-017-0046-x) contains supplementary material, which is available to authorized users.


Background
The determination of the testicular volume is of considerable importance to assess the onset, progression and disorders of puberty, including the effect of cryptorchidism and orchiopexy, hypogonadism with respect to tubular function, the effect of a varicocele, abnormal testicular development, damage to the testis by torsion or inflammation, compensatory hypertrophy, detection of Klinefelter syndrome, effect of the administration of sexual steroids or drugs, and, in adults, assessment of fertility. Low testicular volume correlates with tubular size, function and spermatogenesis [1].
In addition, the testicular volume is of interest to assess macroorchidism, such as in Fragile X syndrome, FSH secreting pituitary macroadenomas, immunoglobulin superfamily member 1 (IGSF1) deficiency syndrome [2], long-standing hypothyroidism, adrenal rest cell tumors in congenital adrenal hyperplasia, lymphomas and so on.
A number of clinical methods have been used for the measurement of testicular volumes in the scrotum. Some use an ordinary ruler or sliding calipers [3][4][5], others use orchidometers [6][7][8][9][10]. Testicular volume is usually measured using the Prader orchidometer. All the clinical methods calculate the volumes by the ellipsoid equation $\text{Width}^2 \times \text{Length} \times \pi/6$ ($W^2 \times L \times 0.52$), and overestimate US volumes.
Ultrasound measurements have a high degree of accuracy and reproducibility and are the standard for quantitation of testicular volume [11,12].
The volumes obtained by ultrasound have been calculated by different ellipsoid equations. Some have used only the width (W) and length (L) of the testes, $W^2 \times L \times \pi/6$, which when resolved is $W^2 \times L \times 0.52 = \text{Volume}$. More frequently they have included the height (H), $W \times H \times L \times 0.52$, and others, recently, have used the constant 0.71 (suggested by Lambert [13]) to closely match the "true" testicular volumes obtained by water displacement, $W \times H \times L \times 0.71 = \text{Volume}$ [14,15].
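The three ellipsoid equations just described can be written as a short Python sketch. The function names and the example dimensions are illustrative assumptions, not values from the study; with dimensions in cm the result is in ml.

```python
import math


def ellipsoid_volume_wl(width_cm, length_cm):
    """W^2 x L x pi/6 (about W^2 x L x 0.52): height is taken equal to width."""
    return width_cm ** 2 * length_cm * math.pi / 6


def ellipsoid_volume_whl(width_cm, height_cm, length_cm, constant=0.52):
    """W x H x L x constant, with constant 0.52 or Lambert's 0.71."""
    return width_cm * height_cm * length_cm * constant


# Hypothetical dimensions in cm, chosen only for illustration.
w, h, l = 2.0, 1.6, 3.0
print(round(ellipsoid_volume_wl(w, l), 2))            # width and length only
print(round(ellipsoid_volume_whl(w, h, l), 2))        # with height, constant 0.52
print(round(ellipsoid_volume_whl(w, h, l, 0.71), 2))  # with height, constant 0.71
```

For the same testis, the width-and-length equation gives a larger value than the equation that includes the height whenever the height is smaller than the width.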
Formulas, equivalent to the ellipsoid equations used, with inclusion of the values observed in ultrasound measurements, were developed to approximate or match ultrasound volumes, with corrections of the width and length of the testis obtained in the scrotum, to avoid the inclusion of the scrotal skin (ss) and epididymis; the Width minus the double scrotal skin (W-ss), to match the US width.
For the US equation $W \times H \times L \times 0.52$, the equivalent formula would be $(W-ss)^3 \times 0.64$, and for $W \times H \times L \times 0.71$ it would be $(W-ss)^3 \times 0.88$ [16,27].

Problems
1. The testicular volumes obtained with orchidometers grossly overestimate ultrasound values, mainly because of the inclusion of the scrotal skin and epididymis, have much variability and may not be accurate or reproducible; this has been amply documented [9-12, 17, 18]. The ultrasound measurement, however, is somewhat inconvenient because it requires another procedure and, mainly, it is costly. It does not appear practical or reasonable to use ultrasound to assess the onset and progression of puberty or to assess some of the other conditions that have been mentioned.
2. Referencing the values obtained by orchidometers or ultrasound to age or Tanner stages is not useful to determine the normal values for stages of puberty, because of the overlapping of ages and values.

Solutions
Pubertal development is determined by two events: pubic hair development (from adrenarche and gonadarche) and genital development (testes, scrotum and penis, from gonadarche). Because one could be out of step with the other, Tanner [19] recommended that genital and pubic hair development be analyzed independently.

Methods
The US observations in our hospital of the width, height, length, height/width and length/width ratios and volumes of 110 testes, from 55 children from 1 month to 17 years of age, were reviewed. In a cross-sectional study, the width and length of the testis in the scrotum were obtained with a cm ruler in 159 study subjects in different stages of genital development and adults, for a total of 318 testicular determinations, from the age of 3 to 34 years. The width of the testis obtained in the scrotum was corrected by subtracting the values of the double scrotal skin. The formulas were then applied and the testicular volumes matching ultrasound values were calculated. The volumes and the range of ages for different stages of genital development were determined. The 159 study subjects consisted of 42 normal subjects and 117 patients attending the endocrine clinic who had normal growth and gonadal development. Penile measurements of the length, while gently stretched, from the pubopenile skin to the tip of the glans, and of the width, at the mid-shaft, after smoothing the skin, were obtained in 145 subjects. Pubic and other hair (axillary, inner thigh, linea alba, abdominal, chest and facial) were recorded.
The Institutional Review Board (IRB) of Nationwide Children's Hospital determined that this study is a retrospective record review for quality improvement in accuracy, diagnosis and quality of care for appraisal of testicular volumes, and does not meet the definition of human subjects research under 45 CFR part 46, and consequently, this study does not require IRB approval.

Statistical analysis
Paired and unpaired two-tailed Student t-tests were used to compare the means of the different groups, expressed as means and standard deviations (SD).
The testicular volumes of the genital stage (G) groups and adults had a non-normal distribution. Data transformation to normality was not possible, because the distribution varied by group. Therefore, the data are presented as the median and interquartile range, instead of the mean and standard deviation. Pairwise comparisons of adjacent G groups were run using three methods: the Wilcoxon rank sum test [20] and the Bootstrap method [21], which do not assume that the data have a normal distribution, and the two-sample Student t-test, because although the data deviate from normality, the group sizes (between 22 and 100) may be large enough to tolerate some degree of non-normality. A p value of 0.05 or less was considered significant.
All three tests showed a highly significant difference for G1 vs. G2, G2 vs. G3, G3 vs. G4, and G4 vs. G5 (p < 0.00001). There was no difference between G5 and adults (0.05 < p < 0.1).
All analyses were run in SAS 9.4 (Cary, NC).
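The analyses were run in SAS, and the exact Bootstrap procedure is not detailed here. As a rough illustration of the idea only, the following Python sketch computes a percentile bootstrap confidence interval for the difference in medians between two groups. The function name, the resampling scheme and the two small data sets are assumptions made for illustration; the numbers are invented and are not study data.

```python
import random
import statistics


def bootstrap_median_diff_ci(group_a, group_b, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for median(group_b) - median(group_a)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        # Resample each group with replacement, keeping the original group size.
        resampled_a = rng.choices(group_a, k=len(group_a))
        resampled_b = rng.choices(group_b, k=len(group_b))
        diffs.append(statistics.median(resampled_b) - statistics.median(resampled_a))
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_boot)]
    upper = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Invented volumes (ml) for two adjacent groups, for illustration only.
group_a = [0.8, 1.0, 1.1, 1.3, 1.5, 1.6]
group_b = [3.9, 4.4, 5.0, 5.2, 6.1, 6.8]
low, high = bootstrap_median_diff_ci(group_a, group_b)
print(low, high)  # an interval that excludes 0 suggests the medians differ
```

Like the Wilcoxon rank sum test, this approach makes no assumption of normality, which is why it suits groups whose distributions differ in shape.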

Results: analysis of the problems
1. Overestimation: A number of orchidometers have been described; the Prader orchidometer, described in 1966 [6], and the Takihara orchidometer (also known as the Rochester orchidometer) [10], described in 1983, are probably the most frequently used.
We conducted an analysis of this overestimation. The ratios of the volumes obtained by the Prader orchidometer to those obtained by ultrasound by Goede et al. [8], Table 1, were 1.8- to 2-fold for adults, 2- to 2.5-fold for pubertal and up to 3.4-fold for prepubertal males, and the ratios for the caliper external measurements by Osemlak et al. [5] were similar. The greater overestimation for prepubertal males than for adults is related to the greater proportion of scrotal skin over the width of the testis: $ss/W$ = 15% for prepubertal males, 8.6% for pubertal males, and 7.2% for adults. Moreover, since the scrotal measurements do not include height ($W^2 \times L \times 0.52$), there would be an overestimation of the US volumes ($W \times H \times L \times 0.52$) of about 20% to 30%, because height often is 0.7 to 0.8 of the width. The testicular volumes we obtained by the measurement of the width and length of the testis in the scrotum with a cm ruler ($W^2 \times L \times 0.52$) are similar to those obtained by the Prader orchidometer (Additional file 1: Figure S1). The ratios of the volumes we obtained with a cm ruler in the scrotum to the ultrasound values by our formula $(W-ss)^3 \times 0.64$ (equivalent to ultrasound $W \times H \times L \times 0.52$) were 1.8-fold for adults, 1.6- to 2-fold for pubertal, and 1.9- to 3.5-fold for prepubertal males, similar to those obtained with the Prader orchidometer by Goede et al. [8]; for values see Additional file 1: Table S1. Thus, the testicular volume should be referenced not by age only, but also by stage of genital development.
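To see why the same thickness of scrotal skin inflates small testes more than large ones, the following Python sketch cubes the uncorrected-to-corrected width ratio for the three $ss/W$ proportions quoted above. It isolates the skin effect only and ignores the height and epididymis contributions; the function name is an illustrative assumption.

```python
def skin_overestimation_factor(ss_over_w):
    """Factor by which a width-cubed volume is inflated when the double
    scrotal skin is left in the width: W^3 / (W - ss)^3 = 1 / (1 - ss/W)^3."""
    return 1.0 / (1.0 - ss_over_w) ** 3


# ss/W proportions quoted in the text.
proportions = {"prepubertal": 0.15, "pubertal": 0.086, "adult": 0.072}
for group, ss_over_w in proportions.items():
    print(group, round(skin_overestimation_factor(ss_over_w), 2))
```

Under this simplification the skin alone accounts for a factor of roughly 1.6 in prepubertal males against roughly 1.25 in adults, in line with the direction of the ratios reported above.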
There are also problems with the reference of the values obtained by ultrasonography or orchidometers to Tanner stage. The staging system most frequently used is known as Tanner stages consisting of genital and pubic hair changes [23]. Correlation of the volumes obtained for the different stages of genital development will provide normative data and solve the problem of reference to age and Tanner stages.

Stages of genital development
The main characteristics of the stages of genital development (testes, scrotum, and penis), separate from pubic hair development, were well defined by Tanner by photography [19] (Additional file 1: Figure S4). There is no mention of pubic hair in the genital stages and all these changes are related to gonadal activity. The stages of pubic hair development were also well defined [19]. Table 2 shows our findings for different genital stages. We measured the width of the testis in the scrotum, calculated the volumes, and measured the penile length for each stage of genital development and for adults. Our findings for different G-stages have not previously been reported, to our knowledge. The pubic and other hair were recorded. Table 2 could be quite helpful for the identification of the stage of genital development.
Let's consider a 14-year-old who has some enlargement of the testes, but still an infantile penis and no pubic hair. He would be a normal Stage 2 of genital development: beginning puberty with enlargement of the testes, by development of seminiferous tubules from the effect of FSH, but with an infantile penis because he does not yet have LH stimulation of Leydig cells and testosterone secretion. Pulsatile secretion of LH normally begins about 6 months or more after the pulsatile secretion of FSH. He does not have pubic hair because he does not yet have adrenarche or testosterone from gonadarche. If a different 14-year-old has further enlargement of the testes, and growth of the penis in length and width from testosterone, he would be Stage 3. He may have some or no pubic hair (the effect of testosterone is seen sooner in the penis than in pubic hair). If another 14- or 15-year-old has further growth of the testes and penis and development of the glans, he would be Stage 4. The pubic hair would be from adrenarche and gonadarche. The values of the penile measurements are included in Table 3.
The difference between the means of the penile length for G3 and G2 is highly significant (p < 0.001). Even though these stages are supported by serum levels of FSH, LH, and testosterone, one does not need to obtain them.
The testicular volumes obtained in our subjects for each stage of genital development and adults are included in Table 4. The volumes were calculated with the formula $(W-ss)^3 \times 0.88$, because the 0.71 constant is presently the preferred one, as it closely matches the "true" testicular volumes. The range of ages is consistent with that reported by a number of authors [24,25]. To help the provider compare the volumes he/she obtains with the normal volumes obtained by us, the median, quartiles, and minimal and maximal values we observed are included in Figs. 3 and 4. The testicular volumes for genital stages 1, 2, 3, 4, and 5 are different (p < 0.0001). There is no difference between genital stage 5 and adults (0.05 < p < 0.1). The small increase in volumes in genital stage 1 is thought to be related to proliferation of Sertoli cells. The increase in volume from genital stage 1 to 2, which indicates the beginning of puberty, is related to the development of seminiferous tubules and the effect of FSH and is highly significant.

Discussion
The information presented would solve the problems of overestimation by orchidometers and of reference to age, and would permit assessment of the beginning and progression of puberty, of micro- and macroorchidism, Klinefelter syndrome and the other conditions mentioned. It would also give information on precocious and late development (with some caution, because this is a cross-sectional study and some of the subjects may have had that stage of genital development for some time). Nevertheless, the range of ages we observed is consistent with those reported by others [24,25]. Longitudinal studies and a larger number of subjects may provide more accurate information. Factors affecting differential testicular volumes of the left and right testis could be assessed with the formulas, from the difference in the widths obtained in the scrotum [16].
The penile measurements that we obtained are practically identical to those reported by Tomova in 6200 children (300 for every year of age) [26].
If we presently have a simple, low-cost clinical method (the width of the testis obtained in the scrotum with a centimeter ruler) that closely matches the results obtained by ultrasound, this method would seem to be preferable. Orchidometers and calculation from the scrotal measurements mentioned would no longer need to be used.
Since we analyzed the overestimation of the US values by the orchidometers, it would appear appropriate to analyze the possible difference in ultrasound values for other measurements and those we are recommending.
Though the measurements of width, height, and length on ultrasound are done with an electronic caliper that can measure to within 0.1 mm, depending on the location and compression one can easily obtain a 1 mm variation. That is why it is usually recommended to obtain 3 measurements. Then, however, some use the highest measurements and others the average. A 1 mm difference in the measurement of the width of the testis on ultrasound or in the scrotum will yield a difference in volume of 10% for adults (19.1 to 21.2 ml), 14% for pubertal (6.1 to 7.1 ml), or 26% for prepubertal (0.7 to 0.9 ml) males, a variability that constitutes the normal range. A difference in the width of the left and right testes does not indicate an error in measurement. Of interest is that the measurement of the width in the scrotum seems quite accurate and statistically different for each genital stage (Table 2). Of the 318 testes we measured, 93.6% had an equal width of the left and right testis, 2.8% had a 1 mm, 1.8% a 2 mm, and 1.8% a 3 mm difference.
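The effect of a 1 mm change in width can be reproduced approximately with the $(W-ss)^3 \times 0.88$ formula. In the Python sketch below, the corrected widths (2.79, 1.91 and 0.93 cm) are back-calculated here from the quoted volumes and are assumptions, as is the choice to express the change relative to the larger volume.

```python
def volume_ml(corrected_width_cm, constant=0.88):
    """Volume from the corrected width, (W - ss)^3 x constant."""
    return corrected_width_cm ** 3 * constant


def one_mm_effect(corrected_width_cm):
    """Volumes before and after adding 1 mm (0.1 cm) to the width, and the
    percent difference relative to the larger volume."""
    before = volume_ml(corrected_width_cm)
    after = volume_ml(corrected_width_cm + 0.1)
    return before, after, 100.0 * (after - before) / after


# Corrected widths in cm, back-calculated for illustration (assumed values).
for group, width in {"adult": 2.79, "pubertal": 1.91, "prepubertal": 0.93}.items():
    before, after, percent = one_mm_effect(width)
    print(group, round(before, 1), round(after, 1), round(percent))
```

Because the width is cubed, the same 1 mm matters proportionally more for the smallest testes.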
A difference in the H/W ratio used in the formulas of 0.7 versus 0.8 will yield a 14% difference in the US volumes for different cohorts, or 7% for 0.75 versus 0.8, and so on. And a difference in the L/W ratio of the formulas of 1.5 versus 1.55 will yield a difference of 7% in the US volumes. These differences do not indicate errors. Different cohorts may have different H/W or L/W ratios. It simply indicates that if one uses the same formula for all the cohorts the difference in volumes is minor.
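One way to see how the H/W ratio enters a width-only formula is to substitute $H = (H/W) \times W$ and $L = (L/W) \times W$ into the ellipsoid equation, which turns the constant into ellipsoid constant × H/W × L/W. The Python sketch below does this; the ratios 0.8 and 1.55 are assumed here because they approximately reproduce the constants 0.64 and 0.88, and they are not stated as the values used to derive the formulas.

```python
def width_only_constant(ellipsoid_constant, h_over_w, l_over_w):
    """Constant c such that W x H x L x k equals c x W^3 when
    H = h_over_w x W and L = l_over_w x W."""
    return ellipsoid_constant * h_over_w * l_over_w


def percent_difference(smaller, larger):
    """Percent difference expressed relative to the smaller value."""
    return 100.0 * (larger - smaller) / smaller


# Assumed ratios (H/W = 0.8, L/W = 1.55), used for illustration only.
print(round(width_only_constant(0.52, 0.8, 1.55), 2))  # close to 0.64
print(round(width_only_constant(0.71, 0.8, 1.55), 2))  # close to 0.88

# The volume scales linearly with H/W, so changing the ratio changes the
# volume by the same proportion.
print(round(percent_difference(0.7, 0.8)))   # H/W of 0.7 versus 0.8
print(round(percent_difference(0.75, 0.8)))  # H/W of 0.75 versus 0.8
```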
The observations described should be helpful to assess the onset and progression and disorders of puberty and disorders previously mentioned. The US remains the method of choice for the evaluation of extra testicular (i.e. hydrocele, spermatocele, epididymal cyst, varicocele) or intratesticular (i.e. tumors) abnormalities.
The process for the determination of the testicular volume seems simple.
1. The width of the testis in the scrotum can be measured by smoothing the scrotal skin around the testis, avoiding compression, and using the ruler.
2. The genital stage of development is determined visually by the appearance of the penis, testes and scrotum (Additional file 1: Figure S4), and by the measurement of the width of the testis and, if needed, the penis (Table 2), without consideration of pubic hair.
3. The double scrotal skin for the genital stage is subtracted from the width. One could make it simpler by subtracting 1.5 mm for genital stages 1, 2, and 3 and 2 mm for genital stages 4, 5, and adults. The error or variation would be minor.
We certainly encourage the readers to use the formulas (or to develop formulas) and compare the volumes with those obtained on US in their institution, to confirm that volumes matching US volumes can be obtained with a simple measurement of the width of the testis in the scrotum. More observations could provide information on accuracy and inter-observer difference.

Conclusion
With a simple measurement of the width of the testis in the scrotum, with a centimeter ruler, testicular volumes matching US values were calculated and normative values for each stage of genital development were determined. This information should solve the present problems of overestimation of US values with orchidometers or external linear measurements, and the lack of normative values for stages of genital development.

Additional file
Additional file 1: Figure S1. Testicular volumes determined by the Prader orchidometer (Goede et al.) [8] and in our subjects by measurements in the scrotum with a centimeter ruler ($W^2 \times L \times 0.52$). Figure S2. Testicular volumes for different stages of genital development (G), obtained with the formula $(W-ss)^3 \times 0.88$, equivalent to ultrasound $W \times H \times L \times 0.71$. The boxes show the volumes in ml on the ordinate with the range of ages on the abscissa. Figure S3. Testicular volumes for different genital stages obtained by measurements in the scrotum with a centimeter ruler ($W^2 \times L \times 0.52$). The boxes show the volumes in ml on the ordinate with the range of ages on the abscissa. (The volumes are similar to those obtained with a Prader orchidometer by Goede et al. (Additional file 1: Figure S1).) Figure S4.