The efficiency and safety of methimazole and propylthiouracil in hyperthyroidism

Abstract Purpose: The aim of this study was to evaluate the efficacy and safety of methimazole (MMI) and propylthiouracil (PTU) in the treatment of hyperthyroidism. Methods: Articles were searched in PubMed, EMBASE, Cochrane Library, Web of Science, CNKI, Wanfang, and QVIP. The primary outcomes were clinical efficacy and thyroid hormone levels in the MMI and PTU groups. The secondary outcomes were liver function indexes and adverse reactions in the MMI and PTU groups. Results were expressed as weighted mean difference (WMD) or odds ratio (OR) with 95% confidence intervals (CIs). The Begg test was applied to assess publication bias. Results: In total, 16 randomized controlled trials were retained in this meta-analysis, with 973 patients receiving MMI and 933 receiving PTU. The levels of triiodothyronine (T3) (WMD = −1.321, 95% CI: −2.271 to −0.372, P = .006), thyroxine (T4) (WMD = −37.311, 95% CI: −61.012 to −13.610, P = .002), Free T3 (FT3) (WMD = −1.388, 95% CI: −2.543 to −0.233, P = .019), Free T4 (FT4) (WMD = −3.613, 95% CI: −5.972 to −1.255, P = .003), and the risk of liver function damage (OR = 0.208, 95% CI: 0.146–0.296, P < .001) in the MMI group were lower than those in the PTU group. The thyroid-stimulating hormone level (WMD = 0.787, 95% CI: 0.380–1.194, P < .001) and the risk of hypothyroidism (OR = 2.738, 95% CI: 1.444–5.193, P = .002) were higher in the MMI group than in the PTU group. Conclusions: Although MMI might carry a higher risk of hypothyroidism than PTU, the efficacy of MMI may be better than that of PTU in patients with hyperthyroidism with regard to reducing T3, T4, FT3, and FT4 levels, decreasing the risk of liver function damage, and increasing the level of thyroid-stimulating hormone. Register number: osf.io/ds637 (https://osf.io/search/).


Introduction
Hyperthyroidism is one of the most common endocrine diseases and is caused by excessive production of thyroid hormones. [1] Excessive thyroid hormone inhibits the production of serum thyroid-stimulating hormone (TSH). [2] The prevalence of hyperthyroidism is reported to be up to 1.3% in iodine-sufficient areas. [3] Its incidence is higher in females than in males, with a female-to-male ratio of about 5 to 10:1. [4] Hyperthyroidism manifests clinically as goiter, protruding eyeballs, and an increased basal metabolic rate. [5] Hyperthyroidism progresses rapidly, and treatment should begin as soon as possible once it is diagnosed.
Evidence indicates that hyperthyroidism can elevate the risk of multiple comorbidities, such as cardiovascular, pulmonary, and psychiatric diseases. [6][7][8] The association between hyperthyroidism and excess mortality has been confirmed by several studies. [9,10] Currently, anti-thyroid drugs (ATDs) are one of the main treatments for patients with hyperthyroidism; they can preserve the function of thyroid hormone production and carry a low likelihood of hypothyroidism. [5] Methimazole (MMI) and propylthiouracil (PTU) are the 2 most extensively used ATDs for patients with hyperthyroidism. [11] MMI and PTU are effective inhibitors of thyroid iodide peroxidase, the enzyme that catalyzes the biosynthesis of thyroid hormone from the initial step. [12] MMI acts by inhibiting peroxidase activity in the thyroid, thereby suppressing the synthesis of triiodothyronine (T3) and thyroxine (T4). [13] PTU inhibits peroxidase and the iodination of tyrosine in the thyroid, thereby restraining the synthesis of T4. In addition, PTU can interfere with the conversion of T4 to T3, which decreases the level of serum free T3 (FT3). [14,15] Although MMI and PTU have been shown to be effective in treating hyperthyroidism, they may cause adverse reactions. One study demonstrated that PTU carries a higher risk of adverse reactions than MMI in the treatment of hyperthyroidism. [16] Another study suggested that PTU and MMI have a similar risk of adverse events during the treatment of hyperthyroidism. [17] These conflicting results call for additional studies to clarify the clinical outcomes of patients with hyperthyroidism after treatment with PTU and MMI. This meta-analysis was performed to better understand the efficacy and safety of PTU and MMI in the treatment of hyperthyroidism.

Eligibility criteria
The inclusion criteria were as follows. First, patients with hyperthyroidism; the diagnosis was based on clinical symptoms (hypermetabolic symptoms including heat intolerance, sweating, palpitations, hand tremor, increased hunger, hyperphagia, and emaciation, together with goiter and ophthalmic signs, among others) and laboratory examinations (increased serum levels of T3, T4, FT3, and free T4 (FT4), and a decreased serum level of TSH). Second, an experimental group treated with MMI and a control group treated with PTU. Third, randomized controlled trials (RCTs). Fourth, literature published in English or Chinese. The exclusion criteria were: animal experiments; articles on topics different from that of this study; articles from which data could not be extracted; and conference articles, dissertations, case reports, meta-analyses, and reviews.

Methodological quality appraisal
For the RCTs included in this study, the modified Jadad scale was used to evaluate quality; [18] it has a total score of 7, with 1 to 3 indicating low quality and 4 to 7 indicating high quality (Supplementary Table 1-2, http://links.lww.com/MD/G311). Additionally, the Cochrane Collaboration's tool for assessing risk of bias in RCTs was applied to evaluate the quality of the included studies. [19] The tool covers Random Sequence Generation, Allocation Concealment, Blinding of Participants and Personnel, Blinding of Outcome Assessment, Incomplete Outcome Data Addressed, Free of Selective Reporting, and Free of Other Bias. Each item was classified as "Yes," "No," or "?." The results of the quality evaluation of the included studies are shown in Supplementary Table 3, http://links.lww.com/MD/G311 and Supplementary Figure 6, http://links.lww.com/MD/G310. Moreover, the Grading of Recommendations, Assessment, Development and Evaluation (GRADE) approach was applied to measure the overall quality of the evidence included in our study. [20] Evidence was evaluated in two respects: factors that decrease the quality of evidence (Study limitation, Indirectness, Inconsistency, Imprecision, and Publication bias) and factors that increase the quality of evidence (Large magnitude of effect, Residual confounding, and Dose-response gradient). The detailed results are given in Supplementary Table 4, http://links.lww.com/MD/G311.

Data collection process
Data were assessed and extracted by 2 reviewers (ST and LC). The extracted data included author, year, country, length of study, interventions (MMI or PTU), sex, age, number of study subjects, and outcome indicators: clinical efficacy (effective rate and drug withdrawal rate); thyroid hormone levels (TSH, T3, T4, FT3, FT4, thyrotropin receptor antibody [TRAb], and thyroid peroxidase antibody [TPOAb]); liver function indexes (alanine aminotransferase [ALT], aspartate aminotransferase [AST], and alkaline phosphatase [ALP] levels); and adverse reactions (hypothyroidism, liver function damage, rash, pruritus, and leukopenia) (Table 1). When the 2 reviewers disagreed, consensus was reached by consulting a third person (LJ).

Objectives
The primary objective was to compare the outcomes of patients receiving MMI or PTU, including clinical efficacy (effective rate and drug withdrawal rate) and thyroid hormone levels (T3, T4, TSH, FT3, FT4, TRAb, and TPOAb levels). The secondary outcomes were liver function indexes (ALP, ALT, and AST levels) and adverse reactions (hypothyroidism, liver function damage, rash, pruritus, and leukocytopenia). Subgroup analysis was conducted according to length of study, literature quality, and the results of the Cochrane risk of bias evaluation.

Statistical analysis
Stata 15.1 software (Stata Corporation, College Station, TX) was used for statistical analysis in this meta-analysis. The weighted mean difference (WMD) was used as the effect index for measurement data, and the odds ratio (OR) was used as the effect index for enumeration data, each with its 95% confidence interval (CI). A heterogeneity test was performed for each outcome; a random-effects model was used when heterogeneity was high (I² ≥ 50%), and a fixed-effects model was used otherwise. When the difference was statistically significant and heterogeneity was high (I² ≥ 50%), subgroup analysis was performed by length of study and literature quality. Meta-regression analysis was used to explore the source of heterogeneity. Sensitivity analysis was performed for all outcomes by omitting one study at a time and checking whether the final conclusion changed. The Begg test was applied to assess publication bias. P < .05 was considered statistically significant.
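The model-selection rule and the leave-one-out sensitivity analysis described above can be illustrated with a minimal Python sketch. It is not the authors' Stata code: the function names, the example numbers, and the choice of the DerSimonian-Laird estimator for the random-effects model are assumptions made for illustration, since the text specifies only the I² ≥ 50% switching rule.

```python
import math

def pool(effects, std_errors, i2_cutoff=50.0):
    """Inverse-variance pooling of per-study mean differences.

    Fixed-effect model when I^2 < i2_cutoff, otherwise a random-effects
    model. The DerSimonian-Laird tau^2 estimator is an assumption; the
    text only says "random-effects model".
    """
    k = len(effects)
    w = [1.0 / se ** 2 for se in std_errors]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    # Cochran's Q and the I^2 statistic (as a percentage)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    model = "fixed"
    if i2 >= i2_cutoff:
        model = "random"
        c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
        tau2 = max(0.0, (q - df) / c)  # between-study variance
        w = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    se = math.sqrt(1.0 / sum(w))
    p = math.erfc(abs(pooled / se) / math.sqrt(2.0))  # two-sided normal P
    return {"model": model, "i2": i2, "wmd": pooled,
            "ci": (pooled - 1.96 * se, pooled + 1.96 * se), "p": p}

def leave_one_out(effects, std_errors):
    """Sensitivity analysis: re-pool after omitting each study in turn."""
    results = []
    for i in range(len(effects)):
        rest_y = effects[:i] + effects[i + 1:]
        rest_se = std_errors[:i] + std_errors[i + 1:]
        results.append(pool(rest_y, rest_se))
    return results

# Hypothetical mean differences (MMI minus PTU) and their standard errors.
example_effects = [-3.0, 0.0, 3.0]
example_ses = [0.5, 0.5, 0.5]
summary = pool(example_effects, example_ses)
print(summary["model"], round(summary["i2"], 1), round(summary["wmd"], 3))
```

In this sketch the sensitivity analysis simply returns one pooled result per omitted study, so a reader can compare each against the overall estimate.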

Included studies
According to the search strategy, 11,219 articles were identified by searching English databases and 575 articles were identified by searching Chinese databases. After removing duplicates, 7446 articles remained. Then 1108 reviews or meta-analyses, 3498 irrelevant studies, 1831 abstracts or case reports, and 893 animal experiments were eliminated. After screening the titles and abstracts, 4 articles from which data could not be extracted and 96 articles whose control group did not meet the requirements were excluded. Finally, 16 RCTs were retained. [21][22][23][24][25][26][27][28][29][30][31][32][33][34][35][36] In total, 1906 subjects were involved in this study, with 973 receiving MMI and 933 receiving PTU. Figure 1 shows the screening process.
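The selection counts reported above can be checked with simple arithmetic. The short Python sketch below uses only the numbers stated in the text; the variable names are illustrative, and the number of duplicates removed is derived here rather than reported.

```python
# Counts as reported in the text.
identified = 11_219 + 575                      # English + Chinese database hits
after_dedup = 7446                             # articles left after removing duplicates
first_pass_excluded = 1108 + 3498 + 1831 + 893 # reviews, irrelevant, abstracts/case reports, animal
second_pass_excluded = 4 + 96                  # no extractable data, unsuitable control group

duplicates_removed = identified - after_dedup  # derived, not reported in the text
remaining_after_first_pass = after_dedup - first_pass_excluded
retained = remaining_after_first_pass - second_pass_excluded
total_subjects = 973 + 933                     # MMI + PTU

print(duplicates_removed, remaining_after_first_pass, retained, total_subjects)
```

The final count equals the 16 RCTs retained, and the two treatment arms sum to the 1906 subjects stated.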

3.3.2. Drug withdrawal rate. The data on drug withdrawal rate were described in 2 articles (I² = 66.8%). Similar drug …

Table 1. Characteristics of articles involved in this meta-analysis.
According to the results of the Cochrane Collaboration's tool for assessing risk of bias in RCTs, 6 studies presented a high risk of bias in Blinding of Outcome Assessment. Subgroup analysis was also conducted based on the results of Blinding of Outcome Assessment. The data showed significant differences in Blinding of Outcome Assessment (Yes) (WMD = −1.474, 95% CI: −2.762 to −0.185, P = .025) and Blinding of Outcome Assessment (No) (WMD = −0.890, 95% CI: −1.403 to … Table 2). The results suggested that the T3 level in the MMI treatment group was lower than that in the PTU treatment group.

TSH level (mIU/mL).
The data on the level of TSH (mIU/mL) were available in 9 studies. According to the pooled analysis, the TSH level was higher in the MMI treatment group than in the PTU treatment group (WMD = 0.787, 95% CI: 0.380-1.194, P < .001) (Fig. 5A, Table 2). … Table 2), implying that the T3 level in the MMI treatment group was lower than that in the PTU treatment group in studies with risk of bias in Blinding of Outcome Assessment.

FT3 level (pmol/L).
Eight studies included data on FT3 level (pmol/L). The pooled data indicated that the FT3 level in the MMI treatment group was lower than that in the PTU treatment group (WMD = −1.388, 95% CI: −2.543 to −0.233, P = .019) (Fig. 6A, Table 2). The sensitivity analysis showed that WMD = −1.388 (95% CI: −2.543 to −0.233). As the heterogeneity between studies was considerable (I² = 97.7%), subgroup analysis was conducted based on length of study and literature quality. The results showed statistically significant differences in the 1-year subgroup (WMD = −1.767, 95% CI: −2.992 to −0.542, P = .005) and the low-quality subgroup (WMD = −2.311, 95% CI: −2.667 to −1.955, P < .001) (Fig. 6B and C, Table 2). The results of meta-regression revealed that length of study (3 vs 6 months or 3 months vs 1 year) and literature quality (high quality vs low quality) had no effect on the heterogeneity (P > .05). Additionally, we found significant difference of MMI and PTU in subgroup … Table 2).
… Table 2). The sensitivity analysis showed that WMD = −3.613 (95% CI: −5.972 to −1.255). The heterogeneity test showed substantial heterogeneity (I² = 98.6%). Subgroup analysis was carried out because of the substantial heterogeneity and demonstrated a significant difference in the 1-year subgroup (WMD = −4.573, 95% CI: −7.442 to −1.704, P = .002) (Fig. 7B and C, Table 2). According to the results of meta-regression, length of study (3 vs 6 months or 3 months vs 1 year) and literature quality (high quality vs low quality) were not the sources of the heterogeneity. Subgroup analysis concerning the risk of bias in Blinding of Outcome Assessment according to the Cochrane Collaboration's tool for assessing risk of bias in RCTs was also performed to compare the level of FT4 in the MMI and PTU treatment groups. The data showed that, in studies in the Blinding of Outcome Assessment (No) group, the level of FT4 was lower in the MMI treatment group than in the PTU treatment group (WMD = −6.759, 95% CI: −7.448 to −6.071, P < .001) (Supplementary Figure 5, http://links.lww.com/MD/G310, Table 2).

TPOAb level.
In total, 2 trials provided information about TRAb level (IU/mL) in patients. The heterogeneity test showed no significant heterogeneity (I² = 0.0%), so a fixed-effect model was used for the pooled analysis. The pooled data showed no significant difference in TPOAb level between the MMI treatment group and the PTU treatment group (WMD = 11.540, 95% CI: −5.873 to 28.952, P = .194) (Fig. 9, Table 2). The sensitivity analysis showed that WMD = 11.540 (95% CI: −5.873 to 28.952).
3.5. Liver function indexes
3.5.1. ALP level. ALP level (U/L) was reported in 4 trials. The pooled data showed that the ALP level was similar in the MMI treatment group and the PTU treatment group (WMD = −4.708, 95% CI: −19.606 to 10.189, P = .536) (Fig. 10, Table 2). The sensitivity analysis showed that WMD = −4.708 (95% CI: −19.606 to 10.189). To investigate the source of heterogeneity (I² = 96.8%), meta-regression was performed on length of study, and the results indicated that length of study was not associated with the heterogeneity (P > .05).
3.6. Adverse reactions
3.6.1. Hypothyroidism. The risk of hypothyroidism was analyzed in 6 trials, and the results indicated that the risk was higher in the MMI treatment group than in the PTU treatment group (OR = 2.738, 95% CI: 1.444-5.193, P = .002) (Fig. 13, Table 2). The sensitivity analysis showed that OR = 2.738 (95% CI: 1.444-5.193).
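As a consistency check on the hypothyroidism result reported above, the following Python sketch recovers the standard error of log(OR) from the published 95% CI and recomputes the point estimate and P value. It assumes the CI is a Wald interval that is symmetric on the log scale with a 1.96 multiplier, which is the usual convention but is not stated in the text; the function name is hypothetical.

```python
import math

def check_odds_ratio(or_point, ci_low, ci_high, z_crit=1.96):
    """Recover SE(log OR), the implied point estimate, and the two-sided P.

    Assumes a Wald CI symmetric on the log scale (an assumption, not
    something the text states).
    """
    se_log = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    implied_or = math.sqrt(ci_low * ci_high)  # geometric mean of the bounds
    z = math.log(or_point) / se_log
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return se_log, implied_or, p_two_sided

# Values as reported for hypothyroidism (MMI vs PTU).
se_log, implied_or, p_value = check_odds_ratio(2.738, 1.444, 5.193)
print(f"SE(log OR) = {se_log:.3f}, implied OR = {implied_or:.3f}, P = {p_value:.3f}")
```

Under these assumptions, the implied point estimate and P value round to the reported OR = 2.738 and P = .002.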
3.6.2. Liver function damage. Liver function damage was defined as AST and ALT levels more than double the upper limit of the reference range. [37] The data on liver function damage were extracted from 9 studies. We observed that the risk of liver function damage in the MMI treatment group was lower than that in the PTU treatment group (OR = 0.208, 95% CI: 0.146-0.296, P < .001) (Fig. 14, Table 2). The sensitivity analysis showed that OR = 0.208 (95% CI: 0.146-0.296).
3.6.3. Rash. A total of 8 articles included data on rash in the patients. The pooled data revealed no significant difference in the risk of rash between the MMI treatment group and the PTU treatment group (OR = 1.419, 95% CI: 0.980-2.056, P = .064) (Fig. 15, Table 2). The sensitivity analysis showed that OR = 1.419 (95% CI: 0.980-2.056).
3.6.4. Pruritus. The data on the risk of pruritus in patients were available in 3 trials. As displayed in Figure 16 and Table 2, no significant difference was shown in the risk of pruritus between the MMI treatment group and the PTU treatment group (OR = 0.247, 95% CI: 0.099-1.220, P = .099). The sensitivity analysis showed that OR = 0.247 (95% CI: 0.099-1.220).

Recurrence of hyperthyroidism
In total, 2 articles explored the recurrence of hyperthyroidism.

Discussion
This meta-analysis compared the efficacy and safety of MMI and PTU in the treatment of hyperthyroidism. The results showed that the levels of T3, T4, FT3, and FT4 and the risk of liver function damage in the MMI treatment group were lower than those in the PTU treatment group. The TSH level and the risk of hypothyroidism were higher in the MMI treatment group than in the PTU treatment group. The findings of our study might offer a reference for the treatment of hyperthyroidism with ATDs.
T3 and T4 are iodine-containing tyrosine derivatives; 90% of them bind to plasma proteins, including thyroxine-binding globulin, when released into the blood, and only a small fraction remain in the free state as FT3 and FT4. [38] An increase in T3 and T4 inhibits the secretion of TSH. TSH serves as the first-line indicator for evaluating thyroid function and the best index for screening overt and subclinical hyperthyroidism. [39] MMI suppresses the peroxidase system in thyroid cells to inhibit the iodination of tyrosine, which can decrease the levels of T3 and T4 and increase the level of TSH; PTU inhibits the conversion of T4 into T3 and further elevates the level of TSH. [40] In our study, the levels of T3, T4, FT3, and FT4 in the MMI treatment group were lower than those in the PTU treatment group, whereas the TSH level was higher in the MMI treatment group than in the PTU treatment group. This suggests that MMI is superior to PTU in the treatment of hyperthyroidism and can more effectively reduce the synthesis of T3 and T4. This conclusion is supported by a study from He et al indicating that MMI treatment induced a more rapid decrease in serum T3 levels than PTU treatment. [21] Okamura et al emphasized that MMI treatment was more effective than PTU treatment in reducing the serum T3 level. [37] This may be because MMI has a greater effect on the substrate for T3 production from T4.
Heterogeneity existed in the results for T3, T4, TSH, FT3, and FT4 levels, so subgroup analysis and sensitivity analysis were conducted. Significant differences were observed in the 3-month, ≥1-year, high-quality, and low-quality subgroups for T3 level; the 3-month, high-quality, and low-quality subgroups for T4 level; the ≥1-year, high-quality, and low-quality subgroups for TSH level; the 1-year and low-quality subgroups for FT3 level; and the 1-year subgroup for FT4 level. However, meta-regression indicated that length of study (3 vs 6 months or 3 months vs 1 year) and literature quality (high quality vs low quality) were not the sources of the heterogeneity. Additionally, based on the results of the Cochrane Collaboration's tool for assessing risk of bias in RCTs, [19] subgroup analysis was also conducted according to Blinding of Outcome Assessment. Evident differences were shown in T3 and T4 levels in both the Blinding of Outcome Assessment (Yes) and Blinding of Outcome Assessment (No) groups. Statistical differences were also found in FT3 level in the Blinding of Outcome Assessment (Yes) group. In the Blinding of Outcome Assessment (No) group, the levels of TSH and FT4 were also significantly different between the MMI and PTU groups. This may be because Blinding of Outcome Assessment is only one of the items of the Cochrane Collaboration's tool for assessing risk of bias in RCTs.
In our study, the risk of liver function damage in the MMI treatment group was lower than that in the PTU treatment group. Liver function damage is a pivotal adverse event of PTU and MMI treatment in patients with hyperthyroidism. [41] PTU may carry a higher risk of liver function damage than MMI. A study from Liaw et al reported that subclinical and asymptomatic liver injury is commonly induced by PTU. [42] Tamagno revealed that PTU treatment has a higher risk of hepatotoxicity than MMI. [43] According to the report of Russo et al, PTU ranked as the third leading cause of drug-induced liver failure requiring transplantation, with 23 cases receiving liver transplants between 1990 and 2007 in the United States. [44] This may be because PTU can give rise to active metabolites, resulting in hepatocellular injury and an increase in serum ALT. Accordingly, regular measurement of liver function in patients with hyperthyroidism undergoing PTU treatment is of great value, and effective measures should be taken promptly when transaminase or bilirubin levels rise markedly.
The risk of hypothyroidism was higher in the MMI treatment group than in the PTU treatment group in our meta-analysis. In a previous study, daily administration of 10 mg of MMI was found to cause spontaneous hypothyroidism in 2 patients with diffuse goiter among 36 participants. [45] These findings imply that clinicians may need to be careful with the dose of MMI to avoid hypothyroidism.
The implication of the present study is that MMI might be superior to PTU in terms of reducing T3, T4, FT3, and FT4 levels, decreasing the risk of liver function damage, and increasing the level of TSH. However, this study has some limitations. First, it lacked a detailed analysis of sex differences among the patients, although hyperthyroidism has been reported to have a higher incidence in females.
Second, the effects of MMI and PTU are dose-dependent, and the doses of MMI and PTU were not uniform across the included studies. Third, publication bias was present in this study because positive results are published more easily than negative results. In addition, as more drugs emerge for treating hyperthyroidism in the clinic, their efficacy and safety might be analyzed by network meta-analysis to identify the best drugs for patients with hyperthyroidism. These limitations imply that the results of our study should be interpreted with caution.

Conclusions
This meta-analysis compared the efficacy and safety of MMI and PTU in treating hyperthyroidism. The results indicated that MMI was better than PTU in patients with hyperthyroidism with regard to reducing T3, T4, FT3, and FT4 levels, decreasing the risk of liver function damage, and increasing the level of TSH. The findings of the present study might serve as a guide for clinicians in the treatment of hyperthyroidism.