The value of hysterosalpingography in the diagnosis of tubal pathology among infertile patients

Summary. Objective. To evaluate the diagnostic accuracy of hysterosalpingography in the diagnosis of tubal pathology among infertile patients. Patients and methods. A prospective cross-sectional study was performed at Kaunas University of Medicine Hospital over a period of 18 months. Consecutive infertile women formed the study group according to defined criteria. Hysterosalpingography was performed in the preovulatory phase of the menstrual cycle. Laparoscopy with dye test was performed within one to three months after hysterosalpingography. General tubal pathology, tubal occlusion, and peritubal adhesions detected at hysterosalpingography were compared with general tubal pathology, tubal occlusion, and peritubal adhesions detected at laparoscopy. Results. The study population comprised 149 infertile women. For hysterosalpingography in the evaluation of general tubal pathology, the sensitivity was 81.4%, the specificity 47.8%, the likelihood ratio of a positive test result 1.6, and the likelihood ratio of a negative test result 0.4. When tubal occlusion was defined as any abnormality of tubal patency, the sensitivity was 84.1%, the specificity 59.1%, and the likelihood ratios 2.1 and 0.3, respectively. When the definition of tubal occlusion was limited to two-sided occlusion, the sensitivity and specificity were 89.5% and 90% and the likelihood ratios 9.0 and 0.1, respectively. As a test of peritubal adhesions, hysterosalpingography had a sensitivity of 35.5%, a specificity of 81.3%, and likelihood ratios of 1.9 and 0.8, respectively. Conclusion. The diagnostic accuracy of hysterosalpingography in the diagnosis of general tubal pathology and peritubal adhesions is poor. Hysterosalpingography is more accurate in the diagnosis of tubal occlusion.


Introduction
Tubal pathology is one of the main causes of infertility. It is estimated to account for 12-33% of cases (1)(2)(3). This is probably an underestimate, since most aspects of tubal dysfunction escape our observation. Tubal pathology is usually accompanied by peritubal adhesions and tubal occlusion. In the routine fertility work-up, our ability to evaluate tubal function is limited. We currently judge the degree of tubal damage mainly by tubal patency and the extent of peritubal adhesions (4).
Tests available for evaluation of tubal function can be divided into diagnostic and screening tests (5). The main aim of diagnostic tests is to prove pathology. Today, laparoscopy with dye (LS) is considered the best available diagnostic test for tubal factor infertility (6)(7)(8). It is used as a reference standard in most clinical studies. LS involves hospital admission, general anesthesia, a 1 to 2% complication rate including post-operative infection and injury to bowel or blood vessels, and a mortality rate of 8 per 100 000 (8). Traditionally, LS is the final diagnostic procedure of any infertility investigation.
Screening tests are useful in establishing the risk for tubal pathology in an individual patient. Depending on the risk estimate, decisions can be made concerning additional testing and treatment.
Hysterosalpingography (HSG) is a method used for screening purposes in the routine infertility evaluation. It is used in many infertility centers as a preliminary investigation tool. For many years, HSG has been employed to assess tubal patency and peritubal adhesions. In addition, it gives information about the uterine cavity. Although it is a relatively quick outpatient procedure, HSG is uncomfortable and often painful for the patient (5). It also involves exposure to ionizing radiation and iodized contrast material. The incidence of febrile morbidity after HSG has been estimated at up to 4% in subfertile patients and as high as 10% in patients with tubal pathology (9).
The diagnostic performance of HSG in comparison with LS was discussed by various investigators (10)(11)(12)(13)(14)(15)(16). Most of them have found significant shortcomings with the diagnostic accuracy of HSG.
An ideal screening test for the diagnosis of tubal pathology should be both highly sensitive and highly specific. Sensitivity measures the proportion of people who truly have the disease who test positive, whereas specificity measures the proportion of people who do not have the disease who test negative (4). Sensitivity and specificity can be converted into likelihood ratios (LRs). Conceptually, LRs are among the most complicated characteristics of a diagnostic test (8). The LR is a semiquantitative measure of the performance of a diagnostic test, which indicates how much a diagnostic procedure modifies the probability of the disease (17). LRs assist in putting the value of testing in proper perspective (17). LRs are not affected by the prevalence of the disease in the population studied (5,17). The likelihood ratio of a positive test result (LR+) is the likelihood of an abnormal test result in a patient with the disease divided by the likelihood of an abnormal test result in a patient without the disease (5). The likelihood ratio of a negative test result (LR-) is the likelihood of a normal test result in a patient with the disease divided by the likelihood of a normal test result in a patient without the disease (5). Calculation of LRs yields a score that allows categorization of test results: an LR+ of 2-5 indicates a fair clinical test, 5-10 is good, and >10 is excellent (5). An LR- of 0.5-0.2 indicates a fair clinical test, 0.2-0.1 is good, and <0.1 is excellent (5).
The aim of this study was to evaluate the diagnostic accuracy of HSG in the diagnosis of tubal pathology among infertile patients and to discuss its clinical implications.
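The conversion from sensitivity and specificity to likelihood ratios can be sketched in a few lines of Python. This is a minimal illustration rather than the software used in the study; the function name `likelihood_ratios` is hypothetical, and the inputs are the rounded sensitivity and specificity reported in the Summary for general tubal pathology.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a test with the given sensitivity and specificity.

    LR+ = sensitivity / (1 - specificity)
    LR- = (1 - sensitivity) / specificity
    Both arguments are proportions strictly between 0 and 1.
    """
    if not (0 < sensitivity < 1 and 0 < specificity < 1):
        raise ValueError("sensitivity and specificity must lie strictly between 0 and 1")
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg


# Rounded values reported in the Summary for general tubal pathology.
lr_pos, lr_neg = likelihood_ratios(0.814, 0.478)
print(f"LR+ = {lr_pos:.1f}, LR- = {lr_neg:.1f}")  # LR+ = 1.6, LR- = 0.4
```

Because both ratios depend only on sensitivity and specificity, the result does not change with the prevalence of disease in the sample, which is the property noted above.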

Methods and patients
We used data prospectively collected in the Department of Obstetrics and Gynecology of Kaunas University of Medicine Hospital.
Index test. HSG was performed in the preovulatory phase of the menstrual cycle as an outpatient procedure without anesthesia and without spasmolytics. Patients with a history of pelvic inflammatory disease, evidence of cervicitis, cervical chlamydial infection, or suspected tubal disease were given antibacterial prophylaxis. A vaginal speculum was used to visualize the cervix, which was then cleaned with antiseptic before being grasped with a single-tooth tenaculum. The instruments after Shultze were used; approximately 10-15 mL of a water-soluble contrast medium was injected manually through the cannula. Fluoroscopic examination was performed during the injection. Patients were lying on their backs during the procedure. Two supine radiograms were taken. A third, oblique radiogram was taken in uncertain cases. No delayed pictures were taken. The HSGs were performed by a staff gynecologist and a staff radiologist. The results of the HSGs were evaluated by one of three staff radiologists. The evaluator was blinded to the results of other tests. At the time this study was conducted, there were no written guidelines for interpretation of HSG in our department.
HSG was considered normal when both tubes were well outlined by free flow of dye, without loculation in the peritoneal cavity. HSG was considered abnormal when there was evidence of either unilateral or bilateral tubal obstruction and/or peritubal adhesions. Criteria for tubal occlusion: 1) Proximal occlusion: filling of the intramural or intramural/isthmic portion of the tube with contrast medium, with no passage to the distal portion of the tube. 2) Distal occlusion: passage of the contrast to the distal portion of the tube with or without ampullary dilatation, with no spillage into the peritoneal cavity.
Criteria for peritubal adhesions (defined for those in whom patency of at least one tube was demonstrated): 1) Convoluted tube. 2) Loculation of the contrast medium in the peritoneal cavity. 3) Peritubal halo effect.
For the radiographic diagnosis of peritubal adhesions, two of the criteria listed above were required.
Abnormal findings of HSG were classified as: 1) General tubal pathology: cases with evidence of either unilateral or bilateral tubal obstruction and/or peritubal adhesions. 2) Tubal pathology considering tubal occlusion: cases with evidence of any form of tubal occlusion (one-sided or two-sided) or cases with evidence of only two-sided tubal occlusion. 3) Tubal pathology considering peritubal adhesions: cases with evidence of peritubal adhesions (patency of at least one tube had to be demonstrated).
In patients known to have only one tube, the HSG was interpreted as abnormal when the remaining tube demonstrated obstruction and/or evidence of peritubal adhesions.
Reference standard. Laparoscopy with dye test (LS) was performed within one to three months after HSG. The procedure was carried out at the Department of Obstetrics and Gynecology of Kaunas University of Medicine Hospital under general anesthesia by staff gynecologists. A Storz laparoscope was used, and pneumoperitoneum was established with CO2. A thorough inspection of the pelvis, internal genitalia, appendix, and liver region was performed, followed by testing of the patency of the Fallopian tubes using dye. A dilute (0.5%) solution of methylene blue dye (15 to 20 mL) was injected through the uterine cervix with a Shultze cannula. Tubal status (patency or occlusion) and periadnexal adhesions were assessed by a surgeon who was blinded to the results of the other tests. The data were recorded on a standardized form.
Bilateral spill of dye and absence of periadnexal adhesions were considered normal tubal status at LS.
Proximal tubal occlusion was diagnosed whenever the dye was injected under pressure and did not fill the tube. Distal tubal occlusion was diagnosed when the entire tube was filled and distended, with or without ampullary dilatation, but with no free spillage.
Periadnexal adhesions were scored according to American Fertility Society criteria. The same criteria were used for distal tubal obstruction (18). Revised American Fertility Society criteria were used for endometriosis (19).
Abnormal findings at LS were classified as: 1) General tubal pathology: cases with evidence of either unilateral or bilateral tubal obstruction and/or peritubal adhesions. 2) Tubal pathology considering tubal occlusion: cases with evidence of any form of tubal occlusion (one-sided or two-sided) or cases with evidence of only two-sided tubal occlusion. 3) Tubal pathology considering peritubal adhesions: cases with evidence of peritubal adhesions.
In patients with only one tube, the LS was interpreted as abnormal when the remaining tube demonstrated obstruction and/or evidence of peritubal adhesions.
Approval for the study was received from the local Ethics Committee.
Statistical methods. General tubal pathology, tubal occlusion, and peritubal adhesions detected at HSG were compared with general tubal pathology, tubal occlusion, and peritubal adhesions detected at laparoscopy in a 2×2 table. In the case of peritubal adhesions, only those patients in whom patency of at least one tube was demonstrated were included in the 2×2 table. In the case of tubal occlusion, tubal pathology was defined either as any form of tubal occlusion or as only two-sided tubal occlusion. Sensitivity, specificity, LR+, LR-, and pretest and posttest probabilities of HSG in the diagnosis of general tubal pathology, tubal occlusion, and peritubal adhesions were calculated, regarding LS as the reference standard. Confidence intervals (95% CI) were reported to allow statistical comparisons.
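The 2×2 calculation described above can be illustrated with a short Python sketch. Several points here are assumptions and not taken from the study: the function names are hypothetical, the Wilson score interval is used only as one common choice because the text does not state which confidence interval method was applied, and the cell counts are back-calculated from the reported prevalence of general tubal pathology (59/149) and the reported sensitivity (81.4%) and specificity (47.8%), not copied from Table 2.

```python
import math


def wilson_ci(successes, total, z=1.96):
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half


def accuracy_from_2x2(tp, fp, fn, tn):
    """Diagnostic accuracy measures from a 2x2 table (index test vs. reference).

    tp/fn: reference-positive patients with abnormal/normal index test.
    fp/tn: reference-negative patients with abnormal/normal index test.
    Assumes fp > 0 and tn > 0 so that both likelihood ratios are defined.
    """
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "prevalence": (tp + fn) / (tp + fp + fn + tn),
        "sensitivity": sens,
        "sensitivity_ci": wilson_ci(tp, tp + fn),
        "specificity": spec,
        "specificity_ci": wilson_ci(tn, tn + fp),
        "lr_pos": sens / (1 - spec),
        "lr_neg": (1 - sens) / spec,
    }


# Back-calculated (assumed) counts for general tubal pathology: 59 of 149
# women abnormal at LS, 48 of them abnormal at HSG; 43 of the remaining 90
# normal at HSG.
result = accuracy_from_2x2(tp=48, fp=47, fn=11, tn=43)
for name, value in result.items():
    print(name, value)
```

With these assumed counts the sensitivity, specificity, and likelihood ratios round to the values reported for general tubal pathology; the confidence limits printed by the sketch depend on the interval method and should not be read as the study's own.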

Results
A total of 203 consecutive women were approached within the study period (Table 1).
The index test (HSG) was performed in 153 women. Febrile morbidity after the procedure was registered in 2 (1.3%) patients. They were hospitalized and treated with antibiotics. The LS in these cases was postponed to a later date. Two women conceived within 1-3 months after the HSG. These 4 patients dropped out of further examination and analysis (Table 1). No complications were registered during LS.
The Figure shows tubal pathology as assessed by HSG and LS. Following LS, 39.5% (59/149) of women were found to have general tubal pathology; 29.5% (44/149) had one-sided or two-sided tubal occlusion, 12.8% (19/149) had two-sided tubal occlusion, and 36.2% (54/149) had periadnexal adhesions. From the latter group, only those who demonstrated patency of at least one tube on HSG were included in the analysis. Therefore, a group of 119 patients with 31 cases of peritubal adhesions was analyzed. The prevalence of peritubal adhesions in this group was 26.1% (31/119).
The mean (SD) score for distal tubal occlusion according to the classification of the American Fertility Society was 21.1 (12.8), with a range of 5-46. The mean (SD) score for periadnexal adhesions was 28.7 (21.0), with a range of 4-72.
During laparoscopic examination, other pelvic pathologies were found among the infertile patients of the study group: minimal and mild endometriosis (stage I or II) in 40 cases, moderate and severe endometriosis (stage III and IV) in 11 cases, polycystic ovaries in 14 cases, ovarian cysts in 4 cases, uterine myomas in 12 cases, and uterine anomalies in 5 cases. Perihepatic adhesions (Fitz-Hugh-Curtis syndrome) were found in 11 cases.
Table 2 shows general tubal pathology detected at HSG as compared to general tubal pathology detected at laparoscopy. Table 3 demonstrates the diagnostic accuracy of HSG in the diagnosis of tubal pathology. Table 4 shows tubal occlusion detected at HSG as compared to tubal occlusion detected at laparoscopy. The diagnostic value of HSG is demonstrated in Table 5. Diagnostic properties of HSG were evaluated twice: once when tubal occlusion was defined as one-sided or two-sided occlusion, and once when the definition of tubal pathology was limited to two-sided occlusion. Table 6 shows the cross-tabulation of HSG and laparoscopic findings considering peritubal adhesions. The diagnostic performance of HSG is presented in Table 7.

Discussion
A useful screening test should have high estimates of both sensitivity and specificity. These properties of the test correspond to good diagnostic accuracy. Our study evaluated the diagnostic performance of HSG in a prospective manner. The evaluation of radiographic and laparoscopic results was performed independently. Restricting the period between the index test and the reference standard to at most three months minimized verification bias. Two patients dropped out of our study after the index test because of pregnancy and two because of febrile morbidity. In only one study did all patients have HSG and LS performed on the same day (16). In most studies, a 3- to 6-month interval was observed after a normal HSG to allow for the "positive perturbation effect" of HSG.
Only patients who did not conceive (a selected population) were referred for LS (8,10,11,20). The issue of routine use of HSG at an early stage in the fertility workup was challenged by the conclusions of a recent randomized controlled trial (13). Like other published studies, our study did not evaluate the results of HSG in a second, independent group of patients. We estimated a sensitivity of 81.4% and a specificity of 47.8% for HSG as a test of tubal pathology. For HSG as a test of tubal patency, a sensitivity of 84.1% and a specificity of 59.1% were calculated when tubal occlusion was defined as any abnormality of tubal patency. When the definition of tubal occlusion was limited to two-sided occlusion, the estimated sensitivity and specificity were 89.5% and 90%, respectively. As a test of peritubal adhesions, HSG had a sensitivity of 35.5% and a specificity of 81.3%. These results are comparable to the numbers calculated in the meta-analysis of Swart and coworkers (15). The authors of the meta-analysis limited their assessment to retrospective cohort studies, because no RCTs and no prospective cohort studies investigating the validity of HSG in diagnosing tubal pathology had been published. A point estimate of 65% (95% CI, 50-78) for sensitivity and of 83% (95% CI, 77-88) for specificity was calculated for tubal patency. These calculations were made for three studies that judged HSG and LS independently (10,11,16). HSG was found to be unreliable in diagnosing peritubal adhesions, with sensitivity below 50% (range 13-83%) (15). Another retrospective study by Opsahl and coworkers estimated a sensitivity of 96.5% and a specificity of 71.2% for HSG as the test of tubal patency. The authors attributed the high false-positive rate to suspicious HSG cases (21). The corresponding test characteristics from Meikle et al. (22) were 78% and 84%, respectively.
The prospective cohort study by Mol and coworkers (12) reported a sensitivity of 81% and a specificity of 75% when disease was defined as any abnormality of tubal patency. The sensitivity of HSG was estimated to be 72% and the specificity 82% when disease was defined as two-sided tubal abnormality (12). The recent prospective study of Perquin et al. reported a sensitivity of 69% and a specificity of 73% for HSG as the test of tubal patency (20).
Good diagnostic accuracy for HSG as the test of tubal patency was reached when tubal pathology was defined as two-sided tubal occlusion (Table 5). When tubal occlusion was defined as one-sided or two-sided occlusion, HSG as a test of tubal patency was less accurate (Table 5). For the diagnosis of general tubal pathology and peritubal adhesions, the accuracy of HSG is lacking (Table 3 and Table 7). This was accompanied by a high rate of false-positive and false-negative results. For example, the rate of false-positives was 32% in the diagnosis of general tubal pathology and 29% in the diagnosis of tubal occlusion; the rate of false-negatives was 17% in cases of peritubal adhesions.
The lack of accuracy could be influenced by faulty technique and artefacts occurring while performing HSG. Hofmann et al. (23) studied HSGs from 100 consecutive patients referred for IVF and found that 17% of the films were technically inadequate. Although only films with a technically adequate view of the uterus and tubes were included in our study, the results of Hofmann et al. underline the importance of technical factors when performing and interpreting HSG films (23). This issue was also discussed by other authors (24)(25)(26)(27). Artefacts might include inadvertent insertion of the cannula, premature ending of the procedure, insufficient pressure because of vaginal reflux, or differences in muscle tonus of the tubes (4,27). Some authors underline the importance of possible cornual spasm during the procedure (14). Late radiographs for detection of contrast depots were not performed in our study. Accuracy could therefore be improved by optimizing the technique for performing HSG and by training the personnel.
On the other hand, the diagnostic accuracy of HSG in our study could be influenced by the evaluation of the test results, that is, by a lack of reproducibility. If observers disagree on the reading of a test result, the test is unlikely to have very good accuracy. The HSG results were assessed by three staff radiologists who were not provided with information concerning the patient. At the time this study was conducted, there were no written guidelines for interpretation of HSG in our department. For these reasons, the interpretation of HSG results might be biased due to variability among observers. Gladstein and coworkers estimated that the interobserver reliability of HSG varied from poor to fair (24). Overall, agreement among observers regarding the presence of an abnormal tubal pattern was fair and regarding adhesions only marginal (24). Another paper, by Mol et al. (27), analyzed interobserver as well as intraobserver reproducibility on four HSG items: proximal tubal obstruction, distal tubal obstruction, hydrosalpinx, and peritubal adhesions. The authors found that reproducibility within and between observers was almost perfect for proximal tubal occlusion, only substantial for distal obstruction and hydrosalpinx, and slight to fair for adhesions (27). The results by Renbaum and coworkers were similar (28). They estimated that clinicians tend to more reliably diagnose hydrosalpinx and tubal obstruction, while radiologists tend to more reliably detect the more subtle findings of salpingitis isthmica nodosa and uterine adhesions (28). The possible variability among observers was not taken into account in our study.
Finally, the issue of the "gold standard" should be discussed. LS with dye test was considered the reference standard. This procedure is commonly used in most clinical studies on tubal factor subfertility. Some of these studies found the choice of LS as a "gold standard" procedure questionable. Findings in a meta-analysis comparing the results of HSG and LS for the diagnosis of tubal pathology (15) indicated that 35% of the tubes that were found to be occluded at LS showed patency at HSG. This particular finding might be an argument that LS could be incorrect in diagnosing tubal occlusion in these patients (12). In our study, 7 patients who demonstrated bilaterally patent tubes at HSG had one-sided distal obstruction (6 cases) or one-sided proximal occlusion (1 case) at LS. Interesting information can be found in a few studies analyzing HSG as a prognostic test for the occurrence of pregnancy (12,20,29). For patients diagnosed with bilateral tubal occlusion at LS, a 3-year cumulative pregnancy rate was estimated to be 2% (12). This underlines that LS is not a perfect test in the diagnosis of tubal pathology. If some patients with tubal blockage at LS conceive, LS obviously is not a real gold standard, but it is the best we have (8). Recently, fertiloscopy was analyzed as a procedure of choice for evaluation of tubal status (30), but more data are necessary to assess the accuracy of this procedure.
The calculated LR+ (1.6) shows that abnormal HSG findings are useless for ruling in the diagnosis of general tubal pathology (Table 3). Assuming that the "normal" pretest probability of tubal pathology among infertile patients in our population is 30%, a positive test (abnormal HSG) would increase the posttest probability by approximately 10 percentage points. For clinical management, this does not seem to be important. On the other hand, with an LR- of 0.4, a negative test (normal HSG) would change the "normal" pretest probability to 14.6%. This qualifies HSG as a fair clinical test in ruling out tubal pathology.
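The step from pretest to posttest probability uses Bayes' theorem in odds form: the pretest probability is converted to odds, multiplied by the likelihood ratio, and converted back to a probability. The sketch below is only an illustration of that arithmetic; the function name `posttest_probability` is hypothetical, and the 30% pretest probability and the likelihood ratios of 1.6 and 0.4 are the values stated in the text.

```python
def posttest_probability(pretest_probability, likelihood_ratio):
    """Posttest probability from a pretest probability and a likelihood ratio.

    odds = p / (1 - p); posttest odds = pretest odds * LR;
    probability = odds / (1 + odds).
    """
    if not 0 < pretest_probability < 1:
        raise ValueError("pretest_probability must lie strictly between 0 and 1")
    if likelihood_ratio < 0:
        raise ValueError("likelihood_ratio must be non-negative")
    pretest_odds = pretest_probability / (1 - pretest_probability)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1 + posttest_odds)


pretest = 0.30  # "normal" pretest probability assumed in the text
print(f"abnormal HSG: {posttest_probability(pretest, 1.6):.1%}")  # abnormal HSG: 40.7%
print(f"normal HSG:   {posttest_probability(pretest, 0.4):.1%}")  # normal HSG:   14.6%
```

The same function can be applied to any other pair of pretest probability and likelihood ratio, for example a pretest probability of 14% with a likelihood ratio of 3.8, which gives roughly 38%.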
In clinical practice, tubal patency and peritubal adhesions are usually more important than general tubal pathology. When the target condition is any abnormality of tubal patency, HSG can be qualified as a fair clinical test in ruling in and ruling out any form of tubal occlusion (Table 5). After a positive test (abnormal HSG with any form of tubal occlusion), the posttest probability of tubal occlusion would increase to 47.4%. After a negative test (HSG with normal tubal patency), the probability of tubal occlusion would decrease to 11.4%. Similar results were reported by Swart and coworkers (15). They calculated an LR of 3.8 for a positive test (abnormal HSG) and of 0.4 for a negative test. With an applied pretest probability of 14% for tubal occlusion, the posttest probability would be 38% after an abnormal HSG and 6% after a normal HSG (8).
When the target condition was defined as two-sided tubal occlusion, HSG seems to be a good clinical test for ruling in (LR+ = 9.0) and ruling out (LR- = 0.1) abnormalities of tubal patency (Table 5). In that case, the posttest probability of tubal occlusion would increase to approximately 80% after an abnormal HSG and decrease to 4% after a normal HSG (Table 5).
The values of LR+ and LR- qualify HSG as an unsatisfactory test in the diagnosis of peritubal adhesions (Table 7). A positive result would increase the "normal" probability of the disorder by approximately 15 percentage points, and a negative result would decrease it by only 5 percentage points (Table 7). The calculated LR+ and LR- do not correspond to satisfactory diagnostic properties of HSG as the test of peritubal adhesions.

Conclusions
1) The diagnostic accuracy of hysterosalpingography in the diagnosis of tubal pathology depends on the selected target condition.
2) The diagnostic accuracy of hysterosalpingography is lacking in the diagnosis of general tubal pathology, peritubal adhesions, and tubal occlusion when the target condition is defined as any form of tubal occlusion.
3) The diagnostic accuracy of hysterosalpingography is good in the diagnosis of tubal occlusion when the target condition is defined as two-sided tubal occlusion.
4) Hysterosalpingography is a useless test in ruling in and ruling out the diagnosis of general tubal pathology and peritubal adhesions.
5) Hysterosalpingography is a fair clinical test in ruling in and ruling out the diagnosis of tubal occlusion when pathology is defined as any form of tubal occlusion.
6) Hysterosalpingography is a good clinical test in ruling in and ruling out the diagnosis of tubal occlusion when pathology is defined as two-sided tubal occlusion.