Identifying Compound Effect of Drugs on Rheumatoid Arthritis Treatment Based on the Association Rule and a Random Walking-Based Model

Rheumatoid arthritis (RA) is a chronic autoimmune disorder that is diagnosed mainly on the basis of patient signs, symptoms, and laboratory indices. However, the exact causes of RA are unclear. Moreover, there is a lack of any method of dynamically evaluating the efficacy of the medication administered to treat RA. Here, we applied a random walk model to reveal the compatibility among the various constituents of traditional Chinese medicine and evaluate their therapeutic efficacy against RA. Drugs commonly used to treat RA were investigated using cluster analysis. The association rule analysis was applied to identify compatibilities among the constituents. A random walk model was developed to evaluate drug efficacy based on an in-house database comprising the clinical records of 9,408 RA patients. Frequently administered medicines were combined into three correlated sets. The evaluation based on the random walk method showed that the drug combination improved ESR, CRP, C3, C4, and IgA more effectively than any single drug. The present study demonstrated that the TCM constituents complement each other and various combinations of them produce different therapeutic effects on RA treatment.


Introduction
Rheumatoid arthritis (RA) is a common, refractory systemic autoimmune disease caused by multiple factors [1]. RA morbidity is increasing worldwide. However, it is difficult to treat this condition as its etiology and pathogenesis are unclear [2]. In certain patients, the symptoms cannot be effectively controlled [3]. Several new synthetic and biological disease-modifying antirheumatic drugs (DMARDs) are administered as RA therapy. Nevertheless, these treatments are lengthy and costly and induce adverse reactions [4,5]. Therefore, it is imperative to find new RA therapies that are efficacious and economical and cause few side effects.
For many years, Chinese herbal medicine has been administered for RA treatment [6,7]. Combinations of various medications such as oral Chinese herbal decoctions, prescription preparations, and external applications have attracted interest in recent years as they have diverse applications [8-11]. They regulate multiple targets in immunity and inflammation, work synergistically, are unlikely to induce resistance, and may have optimal clinical efficacy [12]. At present, however, there is still no effective clinical evaluation system for Chinese herbal medicine application. Further, the rationality and regularity of herb compatibility in such combination therapy remain to be investigated. Hence, we pursued these research objectives in the present study.
Data mining technology uncovers and compiles potentially valuable knowledge based on a large amount of data. It has been widely used in the field of medicine. This process comprises data preparation and mining followed by expression and analysis of the results. It is mature information processing that involves database applications. Database technology entails data processing and information management. Data are processed in the database and analyzed by studying the basic theory and practice of structure, storage, design, management, and application [13]. Data mining technology includes various methods and extracts different aspects of the information content.
Here, we evaluated the clinical therapeutic effects and the pharmaceutical rule of medicine in the treatment of patients with RA. We used clustering analysis, association rules, a baseline matching algorithm, and a random walk model to analyze clinical RA data. The results showed complementary information among the drugs. Various combinations of them could produce different therapeutic effects in RA treatment.

Material and Methods
2.1. Materials. Hospitalization data were compiled for RA inpatients treated between July 2009 and May 2019 in the Department of Rheumatology and Immunology of the First Affiliated Hospital of Anhui University of Chinese Medicine. The dataset consisted of records of the use of Chinese herbal medicine, Xinfeng capsule prescription preparation, Furong ointment, and disease-related laboratory indices including the inflammatory markers CRP and ESR and the immune indicators CCP, RF, IgA, IgM, IgG, C3, and C4. The research scheme was approved by the Ethics Committee of the First Affiliated Hospital of Anhui University of Chinese Medicine. A total of 10,155 patients with RA were identified, of whom 9,408 were treated with Chinese herbal medicine. The patients were assigned either to a control group (Chinese herbal medicine alone) or an experimental group (Chinese herbal medicine plus Xinfeng capsule/Furong ointment prescription preparation). There were 3,533 cases in the control group and 5,875 cases in the experimental group.

Methods
2.2.1. Cluster Analysis. The designation for the use of Chinese herbal medicine was 1 while that for nonuse was 0. Chinese herbal medicine compatibility was investigated by systematic clustering in SPSS v. 21.0 (IBM Corp., Armonk, NY, USA). In the clustering analysis algorithm, each herb was initially regarded as its own cluster, and the N clusters were successively combined into new classes based on the similarity between objects. The Euclidean metric was used to calculate similarity between herbs [14]:

$$d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2},$$

where x and y are the 0/1 usage vectors of two herbs and n is the number of records.
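As an illustration of the distance computation that drives the clustering, the following minimal sketch computes pairwise Euclidean distances between herb usage vectors and finds the pair that a bottom-up procedure would merge first. It is not the SPSS procedure used in the study; the herb names, the toy prescriptions, and the helper names `herb_vector` and `euclidean` are hypothetical.

```python
from itertools import combinations
from math import sqrt

# Hypothetical toy data: each row is one prescription, each column one herb
# (1 = herb used, 0 = not used). Names and values are placeholders.
herbs = ["herb_A", "herb_B", "herb_C"]
prescriptions = [
    [1, 1, 0],
    [1, 1, 0],
    [1, 0, 1],
    [0, 0, 1],
]

def herb_vector(index):
    """Usage pattern of one herb across all prescriptions."""
    return [row[index] for row in prescriptions]

def euclidean(x, y):
    """Euclidean distance between two equal-length usage vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

# Pairwise distances between herbs. An agglomerative (bottom-up) clustering
# would merge the closest pair first and then repeat on the merged clusters.
distances = {
    (herbs[i], herbs[j]): euclidean(herb_vector(i), herb_vector(j))
    for i, j in combinations(range(len(herbs)), 2)
}
closest_pair = min(distances, key=distances.get)
```

With binary usage data, the squared Euclidean distance simply counts the prescriptions in which exactly one of the two herbs appears, so herbs that are usually prescribed together end up close to each other.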

Association Rules
(1) Apriori Algorithm. The designation for the use of Chinese herbal medicine was 1 while that for nonuse was 0. The Apriori module in SPSS Clementine v. 11.1 (IBM Corp., Armonk, NY, USA) was used to identify correlations among Chinese herbal medicines. We set the minimum support and confidence to 80% and the lift (degree of improvement) to >1. The Apriori algorithm was implemented to establish the relationships among items within a dataset. It is also known as shopping basket analysis. Each drug was treated as a variable in this dataset. The formulae [14] applied were as follows:

$$\mathrm{support}(X \to Y) = \frac{\sigma(X \cup Y)}{N},$$

$$\mathrm{confidence}(X \to Y) = \frac{\sigma(X \cup Y)}{\sigma(X)},$$

$$\mathrm{expected\ confidence}(X \to Y) = \frac{\sigma(Y)}{N},$$

$$\mathrm{lift}(X \to Y) = \frac{\mathrm{confidence}(X \to Y)}{\mathrm{expected\ confidence}(X \to Y)},$$

where X → Y is an association rule, X (left-hand side (LHS)) and Y (right-hand side (RHS)) represent sets of herb items, N is the total number of records, σ(X) is the frequency of itemset X, X ∪ Y is the union of itemsets X and Y, σ(X ∪ Y) is the frequency with which itemsets X and Y appear together, support(X → Y) is the relative frequency with which X and Y appear together, and confidence(X → Y) is the probability that itemset Y appears in the presence of X. Expected confidence(X → Y) is the probability that itemset Y appears without any conditional influence. Lift is the ratio of the probability that itemset Y appears in the presence of itemset X to the frequency of itemset Y. Support and confidence are often used to eliminate meaningless combinations. Lift indicates the validity of the association rules.
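The four rule measures defined above can be checked by hand on a small example. The sketch below is illustrative only and assumes each prescription is represented as a set of herb names; the function name `rule_metrics` and the toy transactions are hypothetical and do not come from the study data.

```python
def rule_metrics(transactions, lhs, rhs):
    """Return (support, confidence, expected confidence, lift) for lhs -> rhs."""
    lhs, rhs = frozenset(lhs), frozenset(rhs)
    n = len(transactions)
    count_lhs = sum(1 for t in transactions if lhs <= t)
    count_rhs = sum(1 for t in transactions if rhs <= t)
    count_both = sum(1 for t in transactions if (lhs | rhs) <= t)
    if count_lhs == 0 or count_rhs == 0:
        raise ValueError("both sides of the rule must occur at least once")
    support = count_both / n              # share of records containing X and Y
    confidence = count_both / count_lhs   # P(Y | X)
    expected_confidence = count_rhs / n   # P(Y), ignoring X
    lift = confidence / expected_confidence
    return support, confidence, expected_confidence, lift

# Hypothetical prescriptions, each a set of herbs.
transactions = [
    frozenset({"herb_A", "herb_B"}),
    frozenset({"herb_A", "herb_B", "herb_C"}),
    frozenset({"herb_A", "herb_C"}),
    frozenset({"herb_B"}),
    frozenset({"herb_A", "herb_B"}),
]
support, confidence, expected, lift = rule_metrics(
    transactions, {"herb_A"}, {"herb_B"}
)
```

In this toy case the lift is below 1, so the rule herb_A → herb_B would be discarded under the lift > 1 criterion even though both herbs are individually common.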
(2) FP-Growth Method. Frequent pattern growth (FP-Growth) adopts a divide-and-conquer technique: it recursively projects a transactional database into a set of smaller projected databases and mines frequent itemsets in each projected database by exploring only locally frequent items. This mines the complete set of frequent itemsets while avoiding the generation of candidate itemsets that do not exist in the database. FP-Growth stores the transactional database in a highly condensed, much smaller data structure called a frequent pattern tree (FP-tree). The support of candidate itemsets is counted directly from the FP-tree without scanning the original database multiple times. This improves the processing speed of the algorithm [15]. We set each Chinese herbal medicine as an item and explored the co-occurrence frequency among itemsets.
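A compact sketch of the FP-Growth idea is given below to make the tree construction and the recursive projection concrete. It is a teaching example under simplifying assumptions (absolute minimum count, in-memory data), not the implementation used in the study; the names `Node`, `build_tree`, `fp_growth`, `min_count`, and the toy prescriptions are hypothetical.

```python
from collections import defaultdict

class Node:
    """One FP-tree node: an item, a count, and links to parent and children."""
    def __init__(self, item, parent):
        self.item, self.parent = item, parent
        self.count = 0
        self.children = {}

def build_tree(weighted_transactions, min_count):
    """Build an FP-tree and return its header table (item -> list of nodes)."""
    freq = defaultdict(int)
    for items, weight in weighted_transactions:
        for item in items:
            freq[item] += weight
    frequent = {i for i, c in freq.items() if c >= min_count}
    root, header = Node(None, None), defaultdict(list)
    for items, weight in weighted_transactions:
        # Keep only frequent items, ordered by descending frequency so that
        # transactions sharing common items also share tree prefixes.
        ordered = sorted((i for i in items if i in frequent),
                         key=lambda i: (-freq[i], i))
        node = root
        for item in ordered:
            if item not in node.children:
                node.children[item] = Node(item, node)
                header[item].append(node.children[item])
            node = node.children[item]
            node.count += weight
    return header

def fp_growth(weighted_transactions, min_count, suffix=frozenset()):
    """Yield (itemset, count) for every frequent itemset."""
    header = build_tree(weighted_transactions, min_count)
    for item, nodes in header.items():
        itemset = suffix | {item}
        yield itemset, sum(n.count for n in nodes)
        # Conditional pattern base: the prefix paths that lead to `item`,
        # each weighted by the count of the node it ends in.
        conditional = []
        for n in nodes:
            path, parent = [], n.parent
            while parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                conditional.append((path, n.count))
        # Recurse on the projected (smaller) database.
        yield from fp_growth(conditional, min_count, itemset)

# Hypothetical prescriptions, each a set of herbs with weight 1.
prescriptions = [
    {"herb_A", "herb_B"},
    {"herb_A", "herb_B", "herb_C"},
    {"herb_A", "herb_C"},
    {"herb_B"},
    {"herb_A", "herb_B"},
]
frequent = dict(fp_growth([(p, 1) for p in prescriptions], min_count=3))
```

Only two passes over the raw prescriptions are needed (one to count items, one to build the tree); all further counting happens on the tree and its conditional projections.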

Baseline Matching Algorithm. The baseline matching algorithm resolves real-world inconsistencies in patient condition (immune-inflammatory indices). It is based on a 2D Euclidean distance. Starting from its minimum value, the target area is tracked and stripped from small to large, one unit at a time, until the target area is empty. In this way, objects near the target value are obtained. Matching is executed as shown in Figure 1.

Random Walk Model. For an uncorrelated walk, the direction of each step is independent of those of the previous steps. For a correlated random walk, the direction of each step depends on the history ("memory") of the walker. A random walk naturally motivates the quantification of this correlation by calculating the "net displacement" y(l) of the walker after l steps, which is the sum of the unit steps u(i) for each step i [16]:

$$y(l) = \sum_{i=1}^{l} u(i).$$

An important statistical quantity characterizing any walk is the root mean square fluctuation F(l) about the average of the displacement. F(l) is defined in terms of the difference between the average of the square and the square of the average,

$$F^2(l) = \overline{[\Delta y(l)]^2} - \left[\overline{\Delta y(l)}\right]^2, \tag{4a}$$

of a quantity Δy(l) defined by

$$\Delta y(l) = y(l_0 + l) - y(l_0),$$

where the bars denote an average over all starting positions l_0. The output of this operation is equivalent to (1) walking a set of calipers for a fixed distance l, (2) sequentially moving the starting point from l_0 = 1 to l_0 = 2 and so on, (3) calculating the quantity Δy(l) and its square for each l_0, and (4) averaging all calculated quantities to obtain F^2(l) in equation (4a).

2.3. Statistical Processing. All data were analyzed in SPSS v. 21.0 (IBM Corp., Armonk, NY, USA). A nonparametric test on two related samples was run for the control and experimental groups before and after treatment. Differences between groups before and after treatment were compared with a Mann-Whitney rank sum test. Differences were considered statistically significant at P < 0.05.
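The net displacement and root mean square fluctuation of the random walk model can be sketched in a few lines. This is a minimal illustration, not the authors' evaluation software; it assumes that a patient's sequence of laboratory records has already been encoded as unit steps (+1 or -1), an encoding that is hypothetical here, as are the function names `net_displacement` and `fluctuation`.

```python
def net_displacement(steps):
    """Return [y(0), y(1), ..., y(L)], the cumulative sum of unit steps u(i)."""
    y = [0]
    for u in steps:
        y.append(y[-1] + u)
    return y

def fluctuation(steps, l):
    """Root mean square fluctuation F(l), averaged over all start points l0."""
    if not 1 <= l <= len(steps):
        raise ValueError("l must be between 1 and the number of steps")
    y = net_displacement(steps)
    # Delta y(l) = y(l0 + l) - y(l0) for every admissible start point l0
    # (0-based here, whereas the text counts start points from 1).
    deltas = [y[l0 + l] - y[l0] for l0 in range(len(y) - l)]
    mean = sum(deltas) / len(deltas)
    mean_sq = sum(d * d for d in deltas) / len(deltas)
    # F^2(l) = mean of squares minus square of mean; clamp rounding noise.
    return max(mean_sq - mean ** 2, 0.0) ** 0.5

# Hypothetical step sequences: a strictly alternating walk and a steady walk.
alternating = [1, -1, 1, -1, 1, -1]
steady = [1, 1, 1, 1, 1, 1]
f_alt_1 = fluctuation(alternating, 1)
f_alt_2 = fluctuation(alternating, 2)
f_steady_3 = fluctuation(steady, 3)
```

How F(l) grows with l is what distinguishes correlated from uncorrelated walks, so in practice the function would be evaluated over a range of window lengths l rather than at a single value.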

Association Rule Analysis of Chinese Herbal Medicine Used in RA Treatment. The minimum support and confidence were set to 80%. The Apriori module analysis indicated the correlations among Chinese herbal medicines. The degree of lift was >1 and P < 0.05 (Table 2). We set each Chinese herbal medicine as an itemset. We obtained a pair of highly related drugs and the frequency of these highly related drugs (Figure 3).
3.4. Improvement of Immune-Inflammatory Indices. Compared with those before treatment, ESR, CRP, IgA, IgG, C3, C4, CCP, and RF decreased significantly in both groups after treatment. After treatment, ESR, CRP, IgA, IgM, IgG, C3, and C4 decreased more significantly in the experimental group than in the control group (Table 3).

Evaluation of Immune-Inflammatory Indices by Random Walking Model. For ESR, the control and experimental groups had 2,923 and 6,420 comprehensive evaluation records, respectively. The patient improvement coefficients were 0.369 and 0.452, respectively; clinically, the patients had to walk 5.850 and 4.210 steps, respectively, for each improvement in the comprehensive index. For CRP, the two groups had 3,254 and 6,840 comprehensive evaluation records, respectively. The patient improvement coefficients were 0.466 and 0.510, respectively; each improvement in the comprehensive index required 4.440 and 3.630 steps, respectively. For C3, there were 1,795 and 3,430 comprehensive evaluation records, respectively. The patient improvement coefficients were 0.292 and 0.330, respectively; each improvement in the comprehensive index required 9.880 and 8.040 steps, respectively. For C4, there were 1,795 and 3,430 comprehensive evaluation records, respectively. The patient improvement coefficients were 0.416 and 0.432, respectively; each improvement in the comprehensive index required 6.930 and 6.140 steps, respectively. For IgA, there were 1,796 and 3,426 comprehensive evaluation records, respectively. The patient improvement coefficients were 0.202 and 0.269, respectively; each improvement in the comprehensive index required 14.310 and 9.860 steps, respectively (Table 4 and Figure 4).

Discussion and Conclusion
Clustering analysis groups similar objects into sets. Clustering is an unsupervised learning process of searching for clusters. The Apriori algorithm clarifies the relationship between items in a dataset. This process is known as shopping basket analysis. The Apriori algorithm divides association rule discovery into two steps. First, all frequent itemsets in the transaction database are retrieved via iteration. The itemsets here are those whose support is not lower than the threshold set by the user. Second, the frequent itemsets are used to construct rules that satisfy the minimum confidence set by the user.

Table footnote: Values are % degrees of relevancy.
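The two steps of the Apriori algorithm described in the paragraph above can be sketched as follows. This is a minimal illustration rather than the SPSS Clementine module used in the study; the function names `apriori` and `rules`, the thresholds, and the toy prescriptions are hypothetical.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Step 1: level-wise search; return {itemset: count} for frequent itemsets."""
    n = len(transactions)
    counts = {}
    candidates = {frozenset([item]) for t in transactions for item in t}
    k = 1
    while candidates:
        level = {}
        for c in candidates:
            cnt = sum(1 for t in transactions if c <= t)
            if cnt / n >= min_support:
                level[c] = cnt
        counts.update(level)
        k += 1
        # Join frequent (k-1)-itemsets into k-itemsets, then prune every
        # candidate with an infrequent (k-1)-subset (the Apriori property).
        joined = {a | b for a in level for b in level if len(a | b) == k}
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return counts

def rules(counts, min_confidence):
    """Step 2: split each frequent itemset into LHS -> RHS rules."""
    found = []
    for itemset, cnt in counts.items():
        for size in range(1, len(itemset)):
            for lhs in map(frozenset, combinations(itemset, size)):
                confidence = cnt / counts[lhs]
                if confidence >= min_confidence:
                    found.append((lhs, itemset - lhs, confidence))
    return found

# Hypothetical prescriptions, each a set of herbs.
transactions = [
    {"herb_A", "herb_B"},
    {"herb_A", "herb_B", "herb_C"},
    {"herb_A", "herb_C"},
    {"herb_B"},
    {"herb_A", "herb_B"},
]
frequent_counts = apriori(transactions, min_support=0.5)
found_rules = rules(frequent_counts, min_confidence=0.7)
```

Because every subset of a frequent itemset is itself frequent, the count of each rule's left-hand side is always available from step 1, so step 2 needs no further pass over the data.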

BioMed Research International
Mining or identifying all frequent itemsets is the core of the algorithm and accounts for most of the computation. A baseline matching algorithm is based on a two-dimensional Euclidean distance. Starting at its minimum value, the target area is tracked and stripped from small to large one unit at a time until it is empty and the objects near the target value are obtained. The random walk model explores the law of motion and integrates the probability and dissipative structure theories [17]. The presence or absence of a long-range correlation in the random walk model indicates whether the index system is effective. When a long-range correlation is confirmed, the curative effect is measured by calculating the ratio of the random walk cumulative fluctuation value to the random walk point or the random positive increase rate [18]. Here, we used the random walk model to evaluate the therapeutic efficacy of RA drugs.
In traditional Chinese medicine (TCM) theory, RA is in the Bi syndrome category. According to TCM theory, Bi syndrome occurs in response to incoordination among pathological factors and is mainly attributed to external pathogens such as wind, dampness, and heat, and to a lack of vital body energy. It may manifest as spleen deficiency, in which the body cannot resist pathological factors [19]. The transport of dampness and wet phlegm is weakened in spleen deficiency, which gradually progresses to humid heat and blood stasis syndrome. RA patients may present with joint swelling, chronic pain, fever, joint deformity, and loss of joint function. Hence, we hypothesized that RA pathogenesis comprises spleen deficiency, dampness resistance, heat exuberance, and blood stasis according to TCM theory. The characteristics of TCM include holism and syndrome differentiation-based treatment. However, most patients have a variety of RA symptoms that might change over time. Depending on their symptoms, patients may be classified according to various patterns and treated by different approaches. The use of Chinese herbal medicine in the treatment of the syndrome could alleviate RA symptoms and attenuate side effects caused by chemical drugs [20,21]. The Chinese herbal medicine administered for RA treatment in our hospital is divided into four categories and can significantly improve RA immunity and inflammation indicators. The efficacy of Chinese herbal medicine combined with prescription drugs is superior to that of Chinese herbal medicine alone [5,8].
Here, we applied cluster and association rule analyses, baseline matching algorithms, and random walking model data mining to identify Chinese herbal medicine for RA treatment, compatible combinations, and significant therapeutic efficacy against RA.

Table footnote: CCP: anti-cyclic citrullinated peptide; RF: rheumatoid factor. d0 is the difference in the control group before and after treatment. P0 is the comparison between the control group before and after treatment. d1 is the difference in the experimental group before and after treatment. P1 is the comparison between the experimental group before and after treatment. P2 is the comparison between both groups after treatment.
We identified Chinese herbal medicine commonly used for RA treatment (Table 1). Here, we divided them into four categories according to the efficacy of the constituent herbs.

Poria, Pericarpium Citri Reticulatae, Semen Coicis, Rhizoma Dioscoreae Oppositae, and Fructus Hordei Germinatus were used 33,818 times to invigorate the spleen and resolve dampness. Radix Salviae Miltiorrhizae, Flos Carthami, Semen Persicae, Caulis Spatholobi, and Rhizoma Chuanxiong were used 28,614 times to promote blood circulation and dredge collaterals. Radix et Rhizoma Clematidis Chinensis, Sigesbeckia orientalis L., Rhizoma Alismatis, Semen Plantaginis, and Radix Angelicae Biserratae were used 20,063 times to dispel wind and dehumidify. Herba Taraxaci Mongolici, Herba Hedyotidis, Radix Scutellariae Baicalensis, Cortex Phellodendri Amurensis, and Rhizoma Anemarrhenae were applied for heat clearing and detoxification. Within the four classes of Chinese herbal medicine, herbs of the spleen meridian were used 35,478 times, sweet-tasting herbs 57,052 times, and bitter-tasting herbs 50,762 times. In traditional Chinese medicine, bitter taste is used to dehumidify while sweet taste is used to tonify the spleen. Both observations are consistent with the view that RA pathogenesis originates mainly from spleen deficiency and dampness. Data mining technology disclosed that the Chinese herbal medicines most efficacious at treating RA were those that invigorated the spleen, resolved dampness, cleared heat, and dredged collaterals. Thus, administration of these preparations may reverse RA pathogenesis. The main pathogenic factors underlying RA joint symptoms are wind, dampness, heat, spleen deficiency, and collateral stasis.
By cluster analysis of the commonly used Chinese herbal medicines, we extracted common combinations for RA treatment and the rules governing their combined use (Figure 2). The Chinese herbal medicines were divided into three sets. Herbs in the first set invigorate the spleen, remove dampness, and dredge collaterals. The herbs in the second set clear heat and dampness. The herbs in the third set clear heat, promote dampness, invigorate the spleen, and dredge collaterals.
To elucidate the compatibility of the herbs commonly used for RA treatment in our hospital, we analyzed their association rules (Table 2; Figure 3). This process determines the degree of support and confidence. We set the minimum support and confidence to 80%, the lift (degree of improvement) to >1, and P < 0.05. We obtained one pair of highly correlated drugs and the frequency of these highly correlated drugs. Herb compatibility clarification is invaluable in planning rational clinical drug use, enhancing curative efficacy, and developing modern pharmacy.
The efficacy of Chinese herbal medicine at treating RA has been confirmed (Tables 3 and 4; Figure 4). In Asian countries, compatible Chinese medicines have been widely used in clinical RA treatment as the combinations are simple, flexible, and efficient [22]. In the present study, we applied data mining technology to identify the rules of use of Chinese herbal medicine for RA treatment in our hospital and tested the efficacy of herbal medicine in RA treatment. Xinfeng capsule (Anhui medicine No. Z20050062) is a prescription hospital preparation of Anhui Traditional Chinese Medicine Hospital. It consists of Radix Astragali Mongolici, Semen Coicis, Radix et Rhizoma Tripterygii, and Scolopendra. It invigorates the spleen, replenishes qi, resolves dampness, removes arthralgia, promotes blood circulation, and removes meridian obstructions. It is a traditional Chinese medicine prescription with a long history of use in clinical RA treatment. Furong ointment is a prescription preparation of our hospital. It clears heat, detoxifies, reduces swelling, relieves pain, and promotes healing in RA.
Of the 9,408 RA patients in this study, 3,533 were in the control group and 5,875 were in the experimental group. The degree of immune-inflammatory response varied among RA patients and was reflected mainly in the differences in the numerical values of their immune-inflammatory indices. We matched the immune-inflammatory indices of both groups by a computer matching algorithm before treatment. In this manner, we made the baseline disease status of the two groups comparable before therapy. After treatment, the immune-inflammatory indices of both groups significantly decreased. Compared with the control group, ESR, CRP, IgA, IgG, C3, and C4 were significantly lower in the experimental group. Hence, there was strong therapeutic efficacy of Chinese herbal decoctions combined with prescription preparations.
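One way to read the pre-treatment matching step is as greedy nearest-neighbor pairing on two baseline indices. The sketch below follows that reading: it sorts all control-experimental pairs by 2D Euclidean distance and pairs patients from the smallest distance upward until no admissible pair remains. It is an interpretation offered for illustration, not the authors' implementation; the choice of two indices per patient, the `max_distance` caliper, the function name `greedy_match`, and the toy values are all assumptions.

```python
from math import hypot

def greedy_match(control, experimental, max_distance):
    """Pair records across groups in order of increasing 2D distance."""
    # Every cross-group pair with its Euclidean distance, smallest first.
    pairs = sorted(
        (hypot(c[0] - e[0], c[1] - e[1]), i, j)
        for i, c in enumerate(control)
        for j, e in enumerate(experimental)
    )
    used_control, used_experimental, matches = set(), set(), []
    for dist, i, j in pairs:
        if dist > max_distance:
            break  # remaining pairs are too far apart to count as matched
        if i not in used_control and j not in used_experimental:
            used_control.add(i)
            used_experimental.add(j)
            matches.append((i, j))
    return matches

# Hypothetical baseline values, one (index_1, index_2) tuple per patient.
control = [(30.0, 10.0), (60.0, 25.0)]
experimental = [(61.0, 25.0), (31.0, 10.0), (90.0, 50.0)]
matches = greedy_match(control, experimental, max_distance=5.0)
```

Patients without a sufficiently close counterpart, such as the third experimental record here, are left unmatched, which is how this kind of procedure aligns the baseline condition of the two groups.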
We also applied a random walking model to evaluate the immune-inflammatory indices of both groups of RA patients. The improvement coefficients of ESR, CRP, C3, C4, and IgA were higher for the experimental group than those for the control group. There were fewer walking steps in the experimental group than the control group for the improvement of each comprehensive index. Thus, there is a long-term correlation between RA treatment and Chinese herbal decoctions combined with prescription preparations. Further, the combination had a superior curative effect to that of the Chinese herbal decoction alone.
RA pathogenesis is complex. Compatible Chinese herbal medicines administered for RA treatment have numerous pharmacological effect targets [23]. The commonly used Chinese herbal medicine we extracted had a positive therapeutic effect on RA. The 2015 edition of the Chinese Pharmacopoeia lists the functions of commonly used Chinese herbal medicines including spleen invigoration, dampness resolution, blood circulation promotion, blood stasis removal, heat clearing, and detoxification. These findings are consistent with RA pathogenesis in traditional Chinese medicine, and these Chinese herbal medicines have been widely used in RA treatment [24].
The data mining method used here has several advantages. First, it does not require a uniform data structure. This property is very useful for data mining in Chinese herbal medicine as the data structure for most Chinese herbal medicines is not uniform. Second, a variety of data mining methods are applied for comprehensive analyses, which improves the reliability of the results. Third, we can verify the clinical efficacy of the extracted Chinese herbal medicine to ensure the accuracy of the conclusion. The most important aspect of data mining technology is that it facilitates learning the main treatment methods and the uses of Chinese herbal medicine. Our study also has some limitations. We collected only prescription information and not the diagnostic information on TCM syndrome classification, so our results may be subject to some bias. In addition, the safety of the drugs was not evaluated and should be investigated in future research.

In conclusion, we applied clustering analysis, association rules, a baseline matching algorithm, and a random walking model to show that there are complementary relationships among the constituent herbs in Chinese herbal medicine used to treat RA. Various combinations of these materials produce different therapeutic effects in RA treatment. The new clinical evaluation method known as random walking dynamically evaluates the therapeutic efficacy of drugs against RA and provides a novel technique for evaluating clinically applied medications.

Data Availability
The datasets generated for this study are available on request to the corresponding authors.