Using Hashimoto thyroiditis as gold standard to determine the upper limit value of thyroid stimulating hormone in a Chinese cohort

Background Subclinical hypothyroidism, commonly caused by Hashimoto thyroiditis (HT), is a risk factor for cardiovascular diseases. This disorder is defined as merely having elevated serum thyroid stimulating hormone (TSH) levels. However, the upper limit of reference range for TSH is debated recently. This study was to determine the cutoff value for the upper normal limit of TSH in a cohort using the prevalence of Hashimoto thyroiditis as “gold” calibration standard. Methods The research population was medical staff of 2856 individuals who took part in health examination annually. Serum free triiodothyronine (FT3), free thyroxine (FT4), TSH, thyroid peroxidase antibody (TPAb), thyroglobulin antibody (TGAb) and other biochemistry parameters were tested. Meanwhile, thyroid ultrasound examination was performed. The diagnosis of HT was based on presence of thyroid antibodies (TPAb and TGAb) and abnormalities of thyroid ultrasound examination. We used two different methods to estimate the cutoff point of TSH based on the prevalence of HT. Results Joinpoint regression showed the prevalence of HT increased significantly at the ninth decile of TSH value corresponding to 2.9 mU/L. ROC curve showed a TSH cutoff value of 2.6 mU/L with the maximized sensitivity and specificity in identifying HT. Using the newly defined cutoff value of TSH can detect patients with hyperlipidemia more efficiently, which may indicate our approach to define the upper limit of TSH can make more sense from the clinical point of view. Conclusions A significant increase in the prevalence of HT occurred among individuals with a TSH of 2.6–2.9 mU/L made it possible to determine the cutoff value of normal upper limit of TSH.


Background
Patients with subclinical hypothyroidism have normal level of serum thyroid hormones (T4, T3, FT3, FT4) and elevated TSH. These patienshave a higher incidence of lipid abnormalities, coronary heart disease, psychiatric disorders and pregnancy complications [1][2][3][4][5][6][7][8][9], although their clinical symptoms are very mild. Proper screening and treatment of these patients may help to improve the adverse outcome of the involving diseases. Therefore, to define the upper limit of TSH precisely had an important role in detecting patients who had mild thyroid dysfunction and might benefit from early intervention.
The upper limit of reference range for "normal" TSH has been the focus of debate in the recent decade. Some authors insisted on the conventional value of 4.0-5.0 mU/L as the upper limit of normal thyroid stimulating hormone (TSH) [10], others suggested it narrowed to 2.5-3.0 mU/L [11]. As mentioned in the National Academy of Clinical Biochemistry (NACB) guideline, more than 95 % of normal individuals had TSH below 2.5 mU/L. There were even data showing that African-Americans had very low incidence of HT with a mean TSH level of 1.18 mU/L This value maybe the "true normal" upper limit for TSH, because African-Americans have very low prevalence of Hashimoto thyroiditis (HT) to elevate TSH [11].
Another question was put forward that which was the true set point of normal upper limit. Which centile (90, 95 or 99th) should be the real normal limit? The approximate questions related to glycemic threshold for diagnosing diabetes were already answered by using diabetic retinopathy (DR) as "gold" calibration standard [12]. Referring to that and based on the correlation between HT (as end-point) and TSH value, we tried using the prevalence of HT as the calibration standard to determine the upper limit of TSH.

Research population
A total of 2856 medical staffs with age between 20 and 60 years old participated in a health examination in the year of 2013 were included in our research. Questionnaires related to medication history and other healthrelated behaviors were completed beforehand. Blood sampling was performed in the morning after eight hours of fasting, followed by the detection of height, weight, waist-to-hip ratio (WHR), blood pressure and physical examinations.
The study was conducted with the approval of the Ethics Committee of Beijing Tongren Hospital, Capital Medical University.

The diagnostic criteria for HT
The diagnosis of HT was established by a combination of presence of thyroid antibodies (TPAb and TGAb) and abnormalities of thyroid sonogram including reduced echo or diffused heterogeneity echo of thyroid with or without nodularity [8]. Ultrasound examination of thyroid was performed by four experienced professional ultrasound physicians. Consistent diagnosis standard was used, including description for the size, nodules and echo of the thyroid.

Framingham score
Framingham score (FS), a 10-year estimated risk of CHD, was calculated by the equation including: age, gender, blood pressure, smoking status and cholesterol levels [13]. FS equaling to or over 9 was considered to be a predictor for the higher CHD risk of over 5 %.

Statistical analysis
Unpaired t-test, Mann-Whitney U test or Pearson Chisquare test was used to compare the difference between cohorts with or without HT. We used two different methods to estimate the cutoff point of TSH based on the prevalence of HT: Joinpoint regression [14] and ROC curve by maximizing the sensitivity and specificity. All statistical analyses were conducted with the software package SPSS version 13 for windows, and ljr package of R software (http://www.r-project.org). A two-sided p value of less than 0.05 was considered to be statistically significant.

Characteristics of the observed cohort
By the exclusion standard, those with abnormal thyroid function (FT3 and FT4 out of the normal range), being pregnant, taking thyroid related medicine, accepted thyroid operation and patients with personal history of autoimmune disorders were excluded. A total number of 2856 individuals were included in the research, with the average age of 36.0. The proportion of women (75 %) was higher than men (25 %), which was consistent with the gender characteristic of the medical staff in China.
The research population was further divided into HT and non-HT groups. According to the diagnosis standard, 187 (7 %) in the observed cohort had HT (Table 1), which comprised of 14 males and 173 females. The prevalence of HT in females (8 %) was 4 times higher than in males (2 %).
No significant differences between the HT and non-HT groups were found in age, BMI, WHR, blood pressure or biochemical parameters concerning liver function, blood glucose and lipid. However, serum creatinine and uric acid were lower in HT group, which may be related to the high prevalence of HT in females, because females have relatively low level of serum creatinine and uric acid than males.
The distribution of TSH in the total population and subgroups (HT and non-HT group) Although exclusion standard was used in our research to get a relatively homogeneous cohort without overt thyroid diseases, the distribution of TSH still showed asymmetrical curve with extension to the right side. We then separated HT from the whole population, and the TSH distribution of HT and non-HT groups was described in Fig. 1. Non-HT group showed a more likely Gaussian curve, and the HT group showed an uneven curve with more rightward trend representing higher TSH values. The different centiles of TSH in both groups were listed in Table 2. By ANOVA analysis, there was no statistically significant influence of age to the upper limit of TSH in our research population (P = 0. 5).
The prevalence of HT corresponding to each decile of TSH value A curve was drawn to describe the correlation between HT and TSH value (Fig. 2). Each decile of TSH value was marked in the X-axis, meanwhile the proportion of participants with HT between each decile of TSH was shown in Y-axis. The curve was relatively flat when TSH <2.9 mU/L. Whereas the curve showed a sharp increase at the ninth decile of TSH (corresponding to TSH≧2.9 mU/L) The prevalence of HT in the ninth decile was 21 %, while that in the eighth decile was 8 %. Regression analysis showed a good Joinpoint at the ninth decile (P = 0.0012). Logistic regression models adjusted for sex, age, BMI, FT4, systolic blood pressure and biochemical parameters also confirmed a statistically significant difference for the prevalence of HT below vs. above the cutoff point for TSH value (OR 3.07 [95 % CI: 2.24-4.21]; P = 0.000).

ROC curve describing the cutoff value of TSH in detecting participants with HT
When maximizing the sensitivity and specificity, the cutoff value of TSH equaled to 2.6 mU/L, with the sensitivity of 52 % and the specificity of 77 %. The area under the ROC curve was 66 % (95 % CI: 61.01-70.51) (Fig. 3). The proportion of participants with HT was 4 % when TSH was below 2.6 mU/L. As compared, when TSH was above 2.6, the proportion of participants with HT was 14 % (P = 0.000 by Fisher's exact test).

Comparison of metabolism-related parameters between groups divided by different cutoff values of TSH
As shown in Table 3, the cutoff value of 2.9 mU/L (based on the Joinpoint regression), and 2.6 mU/L (based on the ROC curve analysis), and 4.5 mU/L (conventionally accepted cutoff value) were used to categorize the population. All three grouping methods showed a higher probability of hyperlipidemia and a higher Framingham Score in participants with TSH above the cutoff point, and a relatively younger age in participants with TSH below the cutoff point. As compared to using 4.5 mU/L as the cutoff point, using 2.6 mU/L or 2.9 mU/L as the cutoff value of TSH can detect more patients with abnormal TC and LDL. That may tell us it's more effective to apply the lowered cutoff value of TSH for detecting patients with dyslipidemia.

Discussion
To get a more convincing reference range of TSH, previous researchers used a more stringent standard for screening the reference population proposed by the NHANES III study [2], and more sensitive assays were used to measure TSH of those reference individuals. The upper limit of TSH was lowered to 3.6-3.7 mU/L [3,15]. It was believed that the skewing right tail of the TSH distribution curve was caused by the inclusion of those with occult thyroid dysfunctions most representatively as HT. Besides, other factors such as gender and age are also critical to the reference range of TSH [16]. In this study, age does not serve as an influence factor to the upper limit of TSH. The reason may be the research population was a homogeneously young cohort. Besides, It is not an iodine deficient area in Beijing. Therefore, the daily iodine intake and total body iodine status should not be a major factor to influence TSH value in our research population. We aimed to clear up the confounding influence of those individuals with HT to determine the cutoff point of TSH in our research population. If we defined the upper limit of normal TSH as the 95th centile as adopted by previous researchers, the value would be 4.2, which could be comparable to most publications and more approximate to the traditional opinions insisting on 4.0-5.0 mU/L as the upper limit of normal TSH.
However, there were still considerable proportion of people with occult subclinical thyroid disease (mostly HT) within the upper part of the normal TSH range. Fig. 2 a The prevalence of HT corresponding to each decile of TSH value (Each point in the chart corresponded to the number labeled in X axis. 2.89 is the cutoff value of TSH revealed by joinpoint analysis). b The joinpoint regression analysis showed only one joinpoint was found. This joinpoint presented in the ninth decile group, and the lower limit value of this group of 2.89 was taken as the cutoff Whereas we expect the normal range as a tool to screen out abnormal individuals. We tried an alternative approach to exclude the influence to TSH value confounded by individuals with HT. Based on the fact that TSH and the prevalence of HT was well correlated, we used "the prevalence of HT" as a checkout factor to determine the cutoff upper limit of TSH. Mimicking the diagnostic methodology in determining glycemic thresholds using DR as the checkout factor, we drew a curve describing the correlation between the prevalence of HT and TSH value. From this curve we got a turning point at TSH 2.9 mU/L by Joinpoint regression analysis. The prevalence of HT above this point was 14.5 % and much higher than that below the cutoff point (4.6 %). Similarly, ROC curve revealed a cutoff value of 2.6 mU/L, by which we can effectively detect HT with the best sensitivity and specificity. These two measures showed very close results which are more close to the proposal in recent NACB guideline about narrowing the upper limit of TSH to 2.5 mU/L [11]. As proposed by Surks [16], the normal range for TSH should meet the needs to categorize population into groups 1) normal, 2) those who need to be observed closely and 3) those who need medication. Although the lowered cutoff value of TSH may not be used as a treatment threshold, it can help to find more patients with occult subclinical thyroid diseases. If the limit was increased to 4.5U/L, 2.2 % of individuals with HT would be missed.
It has been indicated by several studies [17][18][19], that elevated TSH within the normal range was positively related to lipid abnormalities or even to the future occurring of hypertension and hyperlipidemia. Likewise, we used the lowered cutoff value of TSH to categorize population to watch if it made sense concerning the metabolism parameters among groups. As expected, people in the group with higher TSH had a much higher age, higher rate of hyperlipidemia and a higher Framingham score. When the cutoff value of 2.6 mU/L was used, it showed better results with more significant difference between groups concerning hyperlipidemia (hyper-TC, hyper-LDL and hyper-TG). Meanwhile, if a lowered upper limit of TSH was used, we can detect more patients with hyperlipidemia with less missed diagnosis. The Framingham score went higher with the increasing of TSH. This was also consistent with Asvold's observation [9] and suggested increased incidence of coronary heart disease in population with SCH. Therefore, a lowered upper limit of TSH can also help to detect individuals with increased risk of ischemic heart diseases effectively.  Certainly, there were several limitations in our study, including the higher proportion of females in the research population leading to a bias of TSH value which may be caused by the higher prevalence of HT in females. Secondly, the age of the research population was no older than 60, so the conclusion may not be applied to older people.

Conclusion
This study shows a high prevalence of HT occurred among individuals with a TSH of 2.6-2.9 mU/L. These values are possible to the "true" values of normal upper limit of TSH for Chinese population. From the trend of higher abnormalities in metabolismrelated indicators and Framingham score with the elevation of TSH, a proper cutoff value of normal upper limit of TSH will be useful in patients with mild thyroid dysfunction.