The Effectiveness and Safety of Wu Tou Decoction on Rheumatoid Arthritis—A Systematic Review and Meta-Analysis

Rheumatoid arthritis (RA) is an autoimmune disease primarily affecting the joints and requires various treatments, including medication, injection, and physiotherapy. Wu tou decoction (WTD) is a traditional Chinese medicine prescribed for RA, with several articles documenting its effectiveness in RA treatment. This systematic review and meta-analysis aimed to evaluate the efficacy and safety of WTD for RA. We searched for randomized controlled trials (RCTs) comparing WTD with conventional treatments (including medication, injection, and physiotherapy) from database inception to May 2024. Primary outcomes were disease activity scores, including effective rate, tender joint count, and morning stiffness. Secondary outcomes comprised blood test results (erythrocyte sedimentation rate, C-reactive protein, and rheumatoid factor) and adverse events. Nineteen RCTs involving 1794 patients were included. Statistically, WTD demonstrated better improvement than conventional treatments (18 medications and 1 injection) across the effective rate, joint scale, and blood tests, regardless of the treatment type (monotherapy or combination therapy). Adverse events were reported in 11 studies, with no statistically significant differences observed between groups. The numerical results showed that WTD may offer potential benefits for managing RA. However, the marked discrepancy with clinical practice and the low quality of the RCTs remain limitations. Therefore, further well-designed studies with larger patient cohorts are needed to draw definitive conclusions.


Introduction
Rheumatoid arthritis (RA) is a chronic autoimmune disease characterized by persistent inflammation in the joints and other organs [1], leading to immune system dysfunction [2]. RA can develop at any age and affects 0.5-1% of the global population [3,4]. The etiology is unknown, but tumor necrosis factor (TNF)-α and interleukin (IL)-6 are reported to play important roles in the pathogenesis and maintenance of inflammation in RA [5]. Additionally, T cells play an important role in bone destruction and inflammation, and B cells are the main source of autoantibodies, such as rheumatoid factor (RF) and anti-citrullinated peptide antibody (ACPA), and also contribute to cytokine secretion and antigen presentation [6,7]. In the later stages of the disease, macrophages produce cytokines, while dendritic cells and natural killer cells also contribute to the disease process [8].
RA is characterized by arthritis that can cause joint damage, systemic inflammation, and extra-articular symptoms in other organs, including the heart, kidney, lungs, digestive system, eye, skin, and nervous system [9]. Among extra-articular symptoms, approximately 30% of patients present with rheumatoid nodules, whereas 10% of patients suffer from Sjogren's syndrome [10], and excessive complications lead to increased mortality [9]. Its diagnosis involves abnormal erythrocyte sedimentation rate (ESR), C-reactive protein (CRP), and RF levels, in addition to joint swelling, autoantibody production, and duration of symptoms [11]. ACPA is also specific to RA. Like CRP elevation, the presence of ACPA and RF indicates a very early onset of RA development [12].
In clinical practice, several treatment strategies and conventional treatments for RA are available [13,14], with the most common being medications, including nonsteroidal anti-inflammatory drugs (NSAIDs) [15], glucocorticoids, and disease-modifying antirheumatic drugs (DMARDs) [16]. Recent recommendations are to start DMARDs plus glucocorticoids immediately. Effective doses of methotrexate (MTX, oral or subcutaneous) are used for 3-6 months [17]. If medication is ineffective in controlling RA symptoms, treatment is rapidly expanded to include various medications with a treat-to-target strategy [18]. In addition, various treatments are administered to alleviate symptoms and enable daily activities with regular disease activity monitoring [18]. In contrast, the use of natural ingredients from Oriental medicine offers a viable alternative for RA treatment. Consequently, healthcare providers and patients are increasingly interested in traditional Chinese medicine (TCM), particularly Wu tou decoction (WTD) [19].
WTD is an herbal medicine that has been used to treat joint-related diseases and consists of Radix Aconiti (Wu tou), Herba Ephedrae (Ma huang), Radix Astragali (Huang qi), Radix Paeoniae Alba (Bai shao), and Radix Glycyrrhizae (Gan cao) [20]. There have been attempts to determine the pharmacological components, actions, and mechanisms of WTD. Chemical profiling of WTD in the rat model via high-performance liquid chromatography revealed that Radix Aconiti and Herba Ephedrae contain alkaloids with anti-inflammatory and analgesic effects, that Radix Astragali and Radix Glycyrrhizae exert antioxidant effects through flavones and glucosides, and that monoterpene glycosides from Radix Paeoniae Alba have neuroprotective effects [21,22].
Studies using an animal model of arthritis reported that WTD modulates C-C chemokine receptor 5 (CCR5), affects the inflammatory response in macrophages [24], inhibits nuclear factor kappa B (NF-κB) phosphorylation (through the action of Herba Ephedrae), and enhances nuclear factor erythroid 2-related factor 2 (Nrf2) expression (via Radix Astragali and Radix Aconiti) [25]. Other mechanisms include reducing angiogenesis in the joint synovium by inhibiting VEGF165/MH7A, which is crucial for endothelial cell activation [26].
These active components, receptors, and mechanisms are closely related to RA. Benzoylaconitine inhibits the expression of IL-6 and IL-8 by inhibiting the activation of the mitogen-activated protein kinase (MAPK), Akt, and NF-κB pathways in human synovial cells [27]. In an arthritic rat model, pseudoephedrine reduces the expression of TNF-α, IL-1β, and IL-6, while paeoniflorin additionally alters cyclooxygenase-2 protein expression [28,29]. CCR5 is a key gene that regulates the cellular immune response and cytokine signaling, which are crucial for distinguishing RA [30]. Nrf2 regulates oxidative stress, immune response, and cartilage and bone metabolism [31]. Inflammation-related signals associated with NF-κB have been reported in the context of RA [32]. Thus, the possibility that WTD might alleviate symptoms of RA has increased [24,25,33].
Studies have examined the clinical efficacy of WTD in RA [34]; however, knowledge about its effects and safety remains limited due to the lack of a systematic review (SR). Therefore, this study aims to evaluate the clinical efficacy and adverse events of WTD in treating RA through an SR and meta-analysis.

Ethics
Ethical approval was not required because no personal information of patients was collected.

Study Registration
This SR followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) Protocols 2020 statement [35]. The protocol was registered in PROSPERO (Registration number: CRD42022310337) and published in March 2022 [36].

Search Strategy
Data searches were conducted from inception to May 2024 across multiple databases, including MEDLINE, Cochrane Library, Web of Science, ScienceDirect, Wiley, EMBASE, China National Knowledge Infrastructure, CiNii, Wanfang Data, J-STAGE, KoreaMed, Korean Studies Information Service System, National Digital Science Library, Korea Institute of Science and Technology Information, and Oriental Medicine Advanced Searching Integrated System. Searches were performed in the appropriate language for each database (e.g., 'Wu tou decoction' and 'rheumatoid arthritis' in English-language databases). Additionally, related literature materials, reports, and papers were searched. Manual searches were also performed using textbooks on RA and by contacting authors via e-mail when necessary.

Inclusion and Exclusion Criteria
We set the following inclusion and exclusion criteria.

Inclusion Criteria
We included studies involving patients with RA, regardless of age and sex. Randomized controlled trials (RCTs) were included, excluding those that omitted the "randomization" phase or implemented incorrect randomization. We included studies that used WTD as an experimental treatment for RA and compared its efficacy with nonoperative conventional treatments, including medication, injection, and physiotherapy.

Exclusion Criteria
Patients with other forms of arthritis, such as osteoarthritis and gout, were excluded from the study. Non-RCTs, case reports, SRs, and studies where the control group did not receive treatment or a placebo were excluded. Additionally, studies that newly initiated or changed interventions during the treatment period or did not clearly present pre- and post-experiment comparison values were excluded.

Study Selection and Data Extraction
Two reviewers (JHM and GEP) independently screened studies and extracted information. They excluded studies based on titles and abstracts and then reviewed the full texts of the articles included to assess their suitability for inclusion in this SR. Any disagreements were resolved through discussion or by involving an additional reviewer (WSS) responsible for reaching a final decision.

The Characteristics of Study
The extracted information included first author, publication year, patient characteristics, interventions in the two groups, session frequencies, duration periods, outcome measures, results, adverse events, and quality of studies. In cases of incomplete information, attempts were made to contact the authors for complete data. If complete data were not obtainable, the meta-analysis was conducted using as much complete data as possible.

Outcome Measures
According to the protocol [36], disease activity measures were defined, with the effective rate (ER), tender joint count (TJC), and morning stiffness (MS) as primary outcome measures. Secondary outcome measures included blood test results (ESR, CRP, and RF) and adverse events. A meta-analysis was conducted if there were two or more studies that met the criteria for data synthesis.

Statistical Analysis
The differences from baseline to endpoint were combined to calculate the mean difference (MD) and 95% confidence intervals (CI) for the same outcome measures, while the standardized mean difference and 95% CI were calculated for different outcome measures. These were evaluated using either a random-effects or fixed-effects model. Review Manager (Version 5.3; Copenhagen; The Nordic Cochrane Center, The Cochrane Collaboration, 2014) software for Windows was utilized for the SR. Chi-squared and I-squared tests were employed to assess heterogeneity across the selected studies [37]. The interpretation of heterogeneity was categorized as follows: heterogeneity levels of 0-40%, 30-60%, 50-90%, and 75-100% were classified as unimportant, moderate, substantial, and considerable, respectively. Subgroup analyses were performed when considered necessary.
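As a rough illustration of the pooling and heterogeneity statistics described above, the following minimal Python sketch computes a fixed-effect inverse-variance pooled MD, Cochran's Q (the chi-squared statistic), and I-squared. It is not the Review Manager implementation; the function names and any input values are hypothetical, and a random-effects model would additionally require an estimate of the between-study variance.

```python
import math

def pool_mean_differences(mds, ses):
    """Fixed-effect inverse-variance pooling (illustrative sketch only).

    mds: per-study mean differences; ses: their standard errors.
    Returns (pooled MD, 95% CI, Cochran's Q, I-squared in percent).
    """
    weights = [1.0 / se ** 2 for se in ses]
    total_w = sum(weights)
    pooled = sum(w * md for w, md in zip(weights, mds)) / total_w
    pooled_se = math.sqrt(1.0 / total_w)
    ci = (pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se)
    # Cochran's Q: weighted squared deviations from the pooled estimate
    q = sum(w * (md - pooled) ** 2 for w, md in zip(weights, mds))
    df = len(mds) - 1
    # I-squared: share of variability beyond chance, floored at zero
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, ci, q, i2

def heterogeneity_labels(i2):
    """Return every interpretation band that contains i2.

    The bands overlap by design, so a value may carry two labels.
    """
    bands = [
        (0, 40, "unimportant"),
        (30, 60, "moderate"),
        (50, 90, "substantial"),
        (75, 100, "considerable"),
    ]
    return [label for lo, hi, label in bands if lo <= i2 <= hi]
```

Because the interpretation bands overlap, the helper returns all applicable labels rather than forcing a single category; the final judgment would still depend on the size and direction of the effects.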

Quality Assessment
Two reviewers independently assessed the risk of bias using the "Risk of Bias" tool from the Cochrane Collaboration [38], which evaluates seven areas: sequence generation, allocation concealment, blinding of participants and investigators, blinding of outcome assessment, incomplete outcome data, selective reporting, and other biases. The risk of bias for each domain was categorized as "low risk", "high risk", or "unclear risk". Disagreements between reviewers were resolved through discussions; if unresolved, an additional reviewer mediated the final decision. The quality of evidence was rated using the Grading of Recommendations, Assessment, Development, and Evaluation (GRADE) framework, starting from high-quality evidence and stepping down to moderate, low, and very low quality [39]. The quality of evidence was assessed for several outcomes. The primary outcomes included ER, TJC, and MS, while secondary outcomes consisted of ESR, CRP, and RF. The quality of evidence was initially rated high if all included studies were RCTs. Five factors, namely, risk of bias, inconsistency, indirectness, imprecision, and other considerations, could lead to downgrading. Based on the seriousness of each factor after evaluation, a decision was made whether to downgrade by one or two grades. Considering the prevalence of RA, the optimal information size (OIS) was determined to be 200. Consequently, a sample size exceeding 200 was considered adequate.

Publication Bias
Funnel plots were to be presented if more than 10 studies were included in an analysis in this SR.

Study Selection
According to the protocol, articles were searched, resulting in the identification of 1677 studies in the databases. After removing 405 duplicate records, 1272 studies underwent screening based on their abstracts and titles. Overall, 1210 studies were excluded due to being non-RCT, non-RA, non-WTD, or for other reasons. After screening, 62 studies were retrieved and assessed for eligibility. The full text of these 62 studies was reviewed, resulting in the exclusion of 37 studies for the following reasons: (1) improper interventions, including modification of herbal medicine according to pattern identification (34 studies); (2) improper randomization (two studies); and (3) inability to obtain the full text (one study). Finally, 25 studies were included in this review, with 19 of them selected for synthesis in this SR and meta-analysis (Figure 1).
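The record counts above can be cross-checked with simple arithmetic. The sketch below is only an illustration; the helper name prisma_flow and its argument names are hypothetical and are not part of the review's methods.

```python
def prisma_flow(identified, duplicates, excluded_at_screening, excluded_full_text):
    """Recompute the counts at each stage of the selection flow.

    excluded_full_text maps an exclusion reason to a number of studies.
    """
    screened = identified - duplicates
    assessed = screened - excluded_at_screening
    included = assessed - sum(excluded_full_text.values())
    return {"screened": screened, "assessed": assessed, "included": included}

# Counts taken from the study selection paragraph
flow = prisma_flow(
    identified=1677,
    duplicates=405,
    excluded_at_screening=1210,
    excluded_full_text={
        "improper intervention": 34,
        "improper randomization": 2,
        "full text unavailable": 1,
    },
)
print(flow)
```

With the reported numbers, the stages come to 1272 screened, 62 assessed, and 25 included, which matches the text.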
Regarding outcome measures, the ER was used most often (16 studies). TJC was used in five studies, while MS was used in four studies. Blood tests included ESR and CRP, each used in 10 studies, and RF, used in seven studies. Additionally, the disease activity score in 28 joints (DAS28) was used in six studies. Other RA-related scales utilized in the selected studies included the American College of Rheumatology response criteria (ACR20, ACR50, and ACR70) and the health assessment questionnaire (HAQ), each used in three studies.

Efficacy Assessment of WTD Monotherapy
WTD was used in five studies [40][41][42][43][44] as monotherapy (n = 426), with modified WTD used in three studies, while the original WTD was used in two. In the control group, MTX was used in two studies, while leflunomide (LEF) was used in the other two.

Risk of Bias Assessment
Regarding random sequence generation, 10 studies demonstrated an unclear risk of selection bias because they did not specify the method of randomization. Five studies demonstrated a high risk of bias, while four studies showed a low risk of bias utilizing the visiting order method. Apart from random sequence generation, all 19 studies showed comparable results. However, owing to the lack of clear criteria for assessing performance and detection biases in each study, the risk of performance and detection bias remained unclear. The risk of attrition bias was low across all studies. Additionally, reporting bias was deemed low in all the studies, while the risk of other biases remained unclear across all included studies (Figure 9).

Sensitivity Analysis
We performed sensitivity analyses for ESR, CRP, and ER. We examined how efficacy figures (MD and RR) and heterogeneity changed when each trial was excluded individually. In monotherapy (five included studies), efficacy measures and heterogeneity changed significantly when excluding three studies (Liu [41] and Wei [42] for ER; and Wang [43] for CRP). In contrast, in combination treatment (14 included studies), two studies (Hu [55] for ESR and Zhou [56] for CRP) showed significant changes (Supplementary Materials).
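The leave-one-out procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions: it uses fixed-effect inverse-variance pooling, the function names are hypothetical, and the effects would be MDs or log risk ratios with their standard errors rather than any values taken from the included trials.

```python
def pooled_effect(effects, ses):
    """Fixed-effect inverse-variance pooled estimate."""
    weights = [1.0 / se ** 2 for se in ses]
    return sum(w * e for w, e in zip(weights, effects)) / sum(weights)

def i_squared(effects, ses):
    """I-squared (percent) derived from Cochran's Q."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled = pooled_effect(effects, ses)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    return max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

def leave_one_out(labels, effects, ses):
    """Re-pool the data with each study omitted in turn.

    Returns {omitted study label: (pooled estimate, I-squared)}.
    """
    effects, ses = list(effects), list(ses)
    results = {}
    for i, label in enumerate(labels):
        rest_e = effects[:i] + effects[i + 1:]
        rest_s = ses[:i] + ses[i + 1:]
        results[label] = (pooled_effect(rest_e, rest_s), i_squared(rest_e, rest_s))
    return results
```

A study whose omission shifts the pooled estimate or I-squared markedly is the kind of trial flagged in the paragraph above; what counts as a marked shift is a judgment this sketch does not encode.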

Publication Bias Assessment
The ER of combination therapy was assessed based on data from > 10 studies, prompting an evaluation of publication bias. Across the 13 studies included in the analysis, a generally symmetrical figure was observed, indicating no obvious publication bias (Figure 10).
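For readers unfamiliar with funnel plots, each study is typically plotted as its effect estimate against a measure of precision. The sketch below shows one common way to obtain those coordinates for a dichotomous outcome such as the ER, assuming the log risk ratio and its large-sample standard error; the function name and the example counts are hypothetical and are not data from the included trials.

```python
import math

def log_risk_ratio(events_t, n_t, events_c, n_c):
    """Log risk ratio and its standard error from 2x2 counts.

    events_t / n_t: responders and total in the treatment group;
    events_c / n_c: responders and total in the control group.
    Assumes no zero cells (no continuity correction is applied).
    """
    rr = (events_t / n_t) / (events_c / n_c)
    se = math.sqrt(1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c)
    return math.log(rr), se

# Hypothetical study: 40/50 responders with treatment vs 30/50 with control
x, y = log_risk_ratio(40, 50, 30, 50)
print(round(x, 4), round(y, 4))
```

Plotting the first value on the horizontal axis and the second on an inverted vertical axis for every study gives the funnel shape whose symmetry is inspected visually.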

Evidence Evaluation
Table 3 shows the summarized overall results of the GRADE evaluation. Subjective outcomes such as ER, TJC, and MS were downgraded by one level in monotherapy and combination therapy, owing to the risk of bias. Objective outcomes with < 200 participants, such as ESR, CRP, and RF in WTD monotherapy, were downgraded by two levels owing to serious imprecision. Considering the heterogeneity, ER and RF in monotherapy were downgraded by one level. For combination therapy, ESR, CRP, RF, TJC, and MS were also downgraded by one level. Finally, the efficacy of combination therapy was rated as "moderate" for ER, ESR, CRP, and RF. For monotherapy, ER, ESR, and CRP were rated as "low", as were RF, ER, TJC, and MS for combination therapy. Lastly, the RF of monotherapy was rated as "very low" (Table 3).
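The way individual downgrades combine into a final rating can be illustrated with a short sketch. It assumes a simplified rule in which evidence from RCTs starts at "high" and drops one level per downgrade step; the function and factor names are hypothetical, and real GRADE assessments involve judgment that this arithmetic does not capture.

```python
LEVELS = ["high", "moderate", "low", "very low"]

def grade_rating(downgrades):
    """Map downgrade steps to a certainty level (simplified sketch).

    downgrades: dict of factor name -> levels subtracted (0, 1, or 2).
    The rating cannot fall below "very low".
    """
    steps = sum(downgrades.values())
    return LEVELS[min(steps, len(LEVELS) - 1)]

# Examples built from the downgrades described in the text above
er_mono = grade_rating({"risk_of_bias": 1, "inconsistency": 1})
esr_mono = grade_rating({"imprecision": 2})
rf_mono = grade_rating({"imprecision": 2, "inconsistency": 1})
print(er_mono, esr_mono, rf_mono)
```

Under this simplified rule, the three monotherapy examples come out as low, low, and very low, in line with the ratings reported above.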

Discussion
RA is a prevalent autoimmune systemic disease [64] characterized by joint deformities and functional impairments, which significantly affect patients [65]. While DMARDs are established treatments that lower CRP and ESR [66,67], a growing interest exists in alternative rheumatic treatments, including natural ingredients, particularly those used in TCM, such as WTD [68]. Ba et al. demonstrate that WTD may suppress RA through various chemical mechanisms, employing safer and more patient-friendly approaches [23]. In contrast to previous SRs that explored various TCM formulations [69], our review focuses specifically on WTD, covering 19 studies with 1794 participants.
Among these 19 studies, those employing combination therapy (14 studies) constitute a larger proportion than those utilizing monotherapy (five studies). Chae et al. assessed the efficacy of Simiao Xiaobi decoction for RA through an SR [70]. In contrast to this study, which emphasizes combination therapy, Simiao Xiaobi decoction was mainly administered as monotherapy, and it demonstrated improvements in RA symptoms. Therefore, we conclude that either combination therapy or monotherapy can be appropriately administered based on the condition of the patient or preference, emphasizing that rational decision making in prescribing medication is essential during treatment.
The experimental group showed a higher ER than the control group, with reductions observed in TJC and MS in monotherapy and combination therapy. Additionally, significant differences between WTD and the control group were observed in DAS28, ACR20, and ACR50 results. DAS28 results were reported in six studies [43][44][45][46][47][58], followed by assessments of HAQ improvement [46][47][48], Lansbury score, activities of daily living (ADL), and quality of life (QOL) [54]. Objective numerical indicators such as ESR, CRP, and RF were included as secondary outcome measures. In monotherapy and combination therapy, the experimental group exhibited significant differences in ESR and CRP compared with the control group.
Significant differences in RF were observed between the two groups in monotherapy and combination therapies. However, data from monotherapy studies indicate a higher risk owing to significant heterogeneity between the two included studies. RF, a recognized diagnostic marker for RA [71], is elevated in over 70% of RA cases compared to < 15% in other forms of arthritis [72]. However, elevated RF levels are also common in other autoimmune conditions, such as systemic lupus erythematosus and Sjogren's disease [73]. Additionally, WTD demonstrated efficacy across several inflammatory diseases beyond RA, emphasizing the need to clarify the correlation between RA and RF.
In the control groups of the included studies, MTX and LEF were the most common treatments, followed by NSAIDs. Moreover, among the nine studies with single-treatment controls, LEF was used in four. MTX is recognized as more effective than other conventional synthetic DMARDs (csDMARDs) [74], and LEF is often considered the primary treatment option for patients who cannot tolerate MTX [16]. Compared with the primary csDMARDs typically used in early stages, the favorable outcomes linked to WTD, with no significant increase in adverse events, are notable. This suggests that WTD may be a safe and effective option for initial RA treatment.
Regarding adverse events, skin irritation, gastrointestinal issues, including nausea with vomiting, liver failure, diarrhea with vomiting, and decreased white blood cell count were reported. However, no significant differences were observed between the experimental and control groups, with the incidence rates of all adverse events lower in the experimental groups than in the control groups. For example, the combination of WTD and MTX resulted in fewer adverse events, including abnormal liver and renal function, than MTX alone [20]. MTX is the most popular DMARD; however, MTX frequently induces side effects such as nausea, vomiting, diarrhea, hepatotoxicity, pulmonary toxicity, and hematologic toxicity, which can lead patients to discontinue its use [75]. Therefore, the use of WTD in RA treatment warrants careful consideration.
These statistical results showed a promising potential for WTD. However, marked discrepancies exist between the study results and clinical practice. First, regarding the duration of DMARD treatment, medications such as MTX and LEF usually require 3-6 months to show noticeable improvement. However, eight of the selected studies reported significant effects compared to those of MTX or LEF within 12 weeks [76]. Second, the recent clinical practice guidelines for RA emphasize medication strategies, guiding physicians to prescribe alternative medications if initial medications are ineffective [77]. In essence, RA treatment typically builds on previous therapeutic strategies [16][17][18], a factor not addressed in the selected studies. Therefore, applying these findings comprehensively in clinical practice may be challenging. Finally, while the included studies aim to compare the effectiveness of WTD with commonly used conventional medications in clinical practice, several limitations affect their practical application.
Another limitation is the low quality of the included studies. First, a high risk of bias exists owing to inaccurate random assignment and inadequate blinding procedures. Second, despite the efforts of the authors, gaps in comprehensive database coverage may exist. The number of studies on WTD monotherapy and combination therapies is limited, and some studies have short treatment durations, making it challenging to assess the clear efficacy of WTD in either monotherapy or combination therapy. Third, all studies were conducted in China, which may limit their generalizability to the global population. Most did not adhere to international journal standards, and some may have employed inappropriate statistical methods, potentially leading to inaccurate data or methodology. These discrepancies could affect the applicability of the findings to actual clinical practice. Fourth, while the indicators generally used the same units, the risk of numerical errors could exist owing to studies employing units that differ from the commonly used ones. As a result, this study contains some results with high heterogeneity (Figure 7). Fifth, the presence of studies with sex ratios differing from actual clinical practice [41,48] or identified in sensitivity analyses [40][41][42][43][55][56] undermines the credibility of the statistical results. Finally, the WTD used in the experimental group lacks consistency across different studies owing to varying additions and subtractions. Moreover, determining the optimal treatment is challenging because of the diverse types and durations of interventions employed in the control group.
In conclusion, this SR numerically demonstrated that WTD enhanced various indicators compared to conventional treatments, whether used as monotherapy or in combination therapy. Despite these positive statistical results, significant gaps with clinical practice make practical application challenging. Therefore, future studies should prioritize factors such as randomization assignment, blinding of participants or results, and outcome reviews. Additionally, as discussed by an experienced rheumatologist, clinical studies should reflect situations that could benefit patients with RA to reach a definitive conclusion.

Conclusions
This review highlights the therapeutic potential of WTD compared to conventional treatment based on statistical aspects. However, owing to the significant discrepancies with clinical practice and the low quality of the included studies, applying these findings in real-world clinical settings is challenging for physicians. Therefore, further well-designed studies with larger patient cohorts are needed to draw definitive conclusions.

Table 1. The characteristics of the included studies.

Table 2. Adverse events in each group.

Table 3. GRADE certainty of evidence assessments.
GRADE has four levels of quality of evidence: high, moderate, low, and very low. More ⊕ indicates better quality of evidence. Abbreviations: CRP: C-reactive protein; ER: Effective rate; ESR: Erythrocyte sedimentation rate; MD: Mean difference; MS: Morning stiffness; RF: Rheumatoid factor; RR: Risk ratio; TJC: Tender joint count.