Article

Comparison of Data Mining Methods for the Signal Detection of Adverse Drug Events with a Hierarchical Structure in Postmarketing Surveillance

Division of Biostatistics, Department of Biomedical Systems Informatics, Yonsei University College of Medicine, 50-1 Yonsei-ro, Seodaemun-gu, Seoul 03722, Korea
* Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Life 2020, 10(8), 138; https://doi.org/10.3390/life10080138
Submission received: 29 May 2020 / Revised: 31 July 2020 / Accepted: 4 August 2020 / Published: 5 August 2020
(This article belongs to the Section Pharmaceutical Science)

Abstract

Several data mining methods have been proposed for the postmarketing surveillance of drug safety. Adverse events are often classified into a hierarchical structure. Our objective was to compare the performance of several of these data mining methods on adverse drug event data with a hierarchical structure. We generated datasets based on the World Health Organization’s Adverse Reaction Terminology (WHO-ART) hierarchical structure. Through an extensive simulation study, we evaluated different data mining methods for signal detection, including frequentist methods such as the reporting odds ratio (ROR), proportional reporting ratio (PRR), information component (IC), and the likelihood ratio test-based method (LRT); Bayesian methods such as the gamma Poisson shrinker (GPS), Bayesian confidence propagating neural network (BCPNN), the new IC method, and the simplified Bayesian method (sB); and the tree-based scan statistic. We also applied the methods to real data on two diabetes drugs, voglibose and acarbose, from the Korea Adverse Event Reporting System. Only the tree-based scan statistic method maintained the type I error rate at the desired level. Likelihood ratio test-based methods and Bayesian methods tended to be more conservative than other methods in the simulation study and detected fewer signals in the real data example. No method was superior to the others in terms of the statistical power and sensitivity of detecting true signals. We recommend that those conducting drug‒adverse event surveillance base their decisions on several methods rather than on a single one.

1. Introduction

It is critical to detect signals of adverse drug reactions from real-world data early enough to protect public health. From the real-world data, we could identify new effects of drugs that had not been identified during premarketing clinical trials. Adverse event (AE) information after drug marketing is often collected via a spontaneous reporting system to identify any long-term adverse drug reactions. In Korea, for example, the Korea Institute of Drug Safety and Risk Management (www.drugsafe.or.kr) collects the information through a spontaneous reporting system.
Through this system, anyone, for example, a patient who has taken the drug, a doctor, or the manufacturer, can report an AE. They report information such as the symptoms of the AE, the date of onset, the name of the drug, the frequency and duration of the dose, patient information, and causality assessment information. As the causality can only be reported by medical experts, the information reported by the patients does not confirm that an AE has been caused by a particular drug. In addition, there could be issues related to data quality and under-reporting. The total number of people who received the drug and the number of AEs are not precisely known. It is difficult to determine a causal relationship between drugs and adverse effects from a spontaneous reporting system database. We can only identify signals of adverse drug reactions, so additional in-depth studies are needed [1].
Statistical analysis is performed to detect whether any particular AEs have occurred more frequently or whether there are any unexpected AEs. Among the various data mining tools, disproportionality methods are widely used for AE signal detection. Different disproportionality methods exist based on different measures such as the reporting odds ratio (ROR) [2], proportional reporting ratio (PRR) [3], information component (IC), likelihood ratio test-based method (LRT) [4], gamma Poisson shrinker (GPS) [5], Bayesian confidence propagating neural network (BCPNN) [6], new IC method [7,8], and the simplified Bayesian method (sB) [9]. ROR, PRR, IC, and LRT are frequentist methods, while GPS, BCPNN, the new IC method, and sB are Bayesian methods [10]. Some studies suggest that Bayesian methods, such as multi-item gamma Poisson shrinker (MGPS) and BCPNN, outperform frequentist methods such as PRR [2,3,6,7,11]. Other studies showed that the sB method performed better than BCPNN and PRR [9,11]. Unlike other methods, the GPS method needs to estimate hyperparameters of the prior distribution using the whole data. Because of this process, the GPS method requires more computation time. Thus, most pharmaceutical companies and national and international pharmacovigilance organizations use other methods more often than the GPS method [12].
Another type of data mining method for signal detection is the tree-based scan statistic (TreeScan) proposed by Kulldorff et al. [13]. This method simultaneously searches for signals at any level (or layer) of AE in a hierarchical structure, adjusting for the multiple testing problem. It has been applied to drug safety surveillance as well as occupational disease surveillance [13,14].
Both the LRT and TreeScan methods are based on a likelihood ratio test that uses the maximum likelihood ratio as the test statistic, and both use the Monte Carlo method to obtain the empirical distribution for statistical inference. The LRT method handles AEs coded at one layer at a time, such as system-organ classes (SOC), preferred terms (PT), or included terms (IT), but not all layers together. If the AEs are coded as PTs, the LRT method detects signals of single PTs. The TreeScan method, which detects AE signals for a fixed drug in data with multiple layers, can consider all the layers at once and search for signals at the PT and SOC (or other) levels together. The LRT method is more general and covers different aspects of safety signal detection. The TreeScan method detects AEs or certain prespecified AE groups as signals for a fixed drug, which is a special case of the LRT method.
Clinical information related to adverse drug reactions is generally coded using medical dictionaries such as the World Health Organization Adverse Reaction Terminology (WHO-ART) and the Medical Dictionary for Regulatory Activities (MedDRA), which have a hierarchical structure. There are several studies comparing the performance of disproportionality methods in a usual database format with AEs and drug combination data [2,3,6,7,9,11,12,15]. However, it is not clear which method performs better than the others when AEs are classified into a hierarchical structure. There is a study comparing the application of the GPS and TreeScan methods to real cohort data by Brown et al. [15]. They showed that the signaled regions were similar. However, in some cases, the TreeScan method detected signals that were not detected by the GPS. We do not know which results are more reliable.
The purpose of this study was to compare the performance of several different data mining methods for signal detection in adverse drug event data grouped into hierarchical structures. Through an extensive simulation study, we evaluated the performance of ROR, PRR, IC, LRT, GPS, BCPNN, sB, and TreeScan on datasets generated based on the WHO-ART’s hierarchical structure. Originally, the methods except TreeScan were not developed for hierarchically structured data. We evaluated all the methods by considering all layers, instead of limiting to a single layer, to better reflect the hierarchical data structure and make fair comparisons. We used the type I error rate, power, sensitivity, and positive predictive value as performance measures. We also compared the application results of the methods to a real dataset from the Korea adverse event reporting system (KAERS).

2. Signal Detection Method

Several different data mining methods have been developed to detect unusually high disproportionate reporting rates from the large drug safety databases. In this paper, we considered ROR, PRR, IC, LRT, GPS, BCPNN, new IC, sB, and TreeScan, which have been relatively widely used. We mainly referred to Huang et al. [10] for a review of the methods apart from TreeScan.
From a large drug safety database, the number of AEs by drugs can be presented in matrix form with I rows of AEs and J columns of drugs. For a particular AE (ith AE) at a particular drug (jth drug), the data can be summarized in a 2 × 2 table, as shown in Table 1.

2.1. Frequentist Methods

2.1.1. Reporting Odds Ratio (ROR)

The ROR is the odds of a particular AE being reported in patients who take a specific drug relative to the odds in patients who take other drugs [2]. The ROR for the ith AE and the jth drug ($ROR_{ij}$) is estimated as
$$\widehat{ROR}_{ij} = \frac{n_{ij}/(n_{i.}-n_{ij})}{(n_{.j}-n_{ij})/(n_{..}-n_{i.}-n_{.j}+n_{ij})} = \frac{n_{ij}\,(n_{..}-n_{i.}-n_{.j}+n_{ij})}{(n_{i.}-n_{ij})\,(n_{.j}-n_{ij})},$$
and if either $n_{i.}-n_{ij}$ or $n_{.j}-n_{ij}$ is equal to 0, then $\widehat{ROR}_{ij}$ is not defined. The distribution of the log-transformed $\widehat{ROR}_{ij}$ can be approximated by a normal distribution as follows:
$$\log \widehat{ROR}_{ij} \;\dot\sim\; N\!\left(\log ROR_{ij},\, \sigma^2_{ROR_{ij}}\right),$$
$$\hat\sigma^2_{ROR_{ij}} \approx \frac{1}{n_{ij}} + \frac{1}{n_{i.}-n_{ij}} + \frac{1}{n_{.j}-n_{ij}} + \frac{1}{n_{..}-n_{i.}-n_{.j}+n_{ij}}.$$
We can obtain an approximate $100(1-\alpha)\%$ confidence interval (CI) for $ROR_{ij}$ as:
$$CI_{ROR_{ij},\,100(1-\alpha)\%} = \exp\!\left(\log \widehat{ROR}_{ij} \pm z_{1-\alpha/2}\sqrt{\hat\sigma^2_{ROR_{ij}}}\right),$$
where $z_{1-\alpha/2} = \Phi^{-1}(1-\alpha/2)$ and $\Phi$ is the standard normal distribution’s cumulative distribution function.
The null and the alternative hypotheses to test whether the ith AE for the jth drug is a signal or not are expressed as:
$$H_0: ROR_{ij} = 1 \quad \text{vs.} \quad H_a: ROR_{ij} > 1.$$
Following Evans et al. [3], if the lower bound of $CI_{ROR_{ij},\,100(1-\alpha)\%}$ is greater than 2, we reject the null hypothesis and conclude that the ith AE can be interpreted as a signal of disproportionate rates (SDR) for the jth drug.
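As an illustration of the ROR estimate and its confidence interval, the following minimal Python sketch computes both from the four counts of the 2 × 2 table. The function name `ror_with_ci`, the example counts, and the choice to return `None` whenever any cell is zero are assumptions made for this illustration and are not taken from the study.

```python
import math

def ror_with_ci(n_ij, n_i, n_j, n_total, z=1.959964):
    """Return (ROR, lower, upper) for one AE-drug cell, or None if undefined.

    n_ij: reports of AE i with drug j; n_i: all reports of AE i;
    n_j: all reports of drug j; n_total: all reports in the database.
    z defaults to the normal quantile for a 95% interval.
    """
    a = n_ij                          # AE i, drug j
    b = n_i - n_ij                    # AE i, other drugs
    c = n_j - n_ij                    # other AEs, drug j
    d = n_total - n_i - n_j + n_ij    # other AEs, other drugs
    # Assumption of this sketch: treat any empty cell as "undefined",
    # because the log estimate or its variance cannot be computed.
    if min(a, b, c, d) <= 0:
        return None
    ror = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(ror) - z * se)
    upper = math.exp(math.log(ror) + z * se)
    return ror, lower, upper

# Hypothetical counts, for illustration only.
result = ror_with_ci(n_ij=20, n_i=100, n_j=200, n_total=10000)
is_signal = result is not None and result[1] > 2   # cutoff stated in the text
```

With these hypothetical counts the point estimate is 13.5; whether the pair is flagged depends only on the lower bound and the chosen cutoff.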

2.1.2. Proportional Reporting Ratio (PRR)

The PRR is the ratio of the proportion of patients who reported a particular AE after taking a specific drug to the proportion of patients who reported the same AE after taking other drugs [3]. We estimate the PRR for the ith AE and the jth drug ($PRR_{ij}$) as:
$$\widehat{PRR}_{ij} = \frac{n_{ij}/n_{i.}}{(n_{.j}-n_{ij})/(n_{..}-n_{i.})} = \frac{n_{ij}\,(n_{..}-n_{i.})}{n_{i.}\,(n_{.j}-n_{ij})}.$$
If $n_{i.}-n_{ij} \approx n_{i.}$ and $n_{..}-n_{i.}-n_{.j}+n_{ij} \approx n_{..}-n_{i.}$, then $\widehat{ROR}_{ij} \approx \widehat{PRR}_{ij}$. We use the normal approximation for the distribution of $\log \widehat{PRR}_{ij}$ for inference as follows:
$$\log \widehat{PRR}_{ij} \sim N\!\left(\log PRR_{ij},\, \sigma^2_{PRR_{ij}}\right),$$
$$\hat\sigma^2_{PRR_{ij}} \approx \frac{1}{n_{ij}} - \frac{1}{n_{i.}} + \frac{1}{n_{.j}-n_{ij}} - \frac{1}{n_{..}-n_{i.}}.$$
Therefore, an approximate $100(1-\alpha)\%$ CI for $PRR_{ij}$ is expressed as follows:
$$CI_{PRR_{ij},\,100(1-\alpha)\%} = \exp\!\left(\log \widehat{PRR}_{ij} \pm z_{1-\alpha/2}\sqrt{\hat\sigma^2_{PRR_{ij}}}\right).$$
The null hypothesis $H_0: PRR_{ij} = 1$ is rejected if the lower bound of $CI_{PRR_{ij},\,100(1-\alpha)\%}$ is greater than 2, as for the ROR.
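The PRR calculation can be sketched in the same way. The snippet below follows the estimator and variance given in this subsection; the function name `prr_with_ci`, the guard against empty cells, and the example counts are assumptions made for illustration.

```python
import math

def prr_with_ci(n_ij, n_i, n_j, n_total, z=1.959964):
    """Return (PRR, lower, upper) for one AE-drug cell, or None if undefined."""
    other = n_j - n_ij        # reports of drug j with other AEs
    rest = n_total - n_i      # reports of all other AEs
    # Assumption of this sketch: an empty count makes the log estimate undefined.
    if n_ij <= 0 or n_i <= 0 or other <= 0 or rest <= 0:
        return None
    prr = (n_ij / n_i) / (other / rest)
    var = 1 / n_ij - 1 / n_i + 1 / other - 1 / rest
    se = math.sqrt(max(var, 0.0))   # guard against tiny negative rounding error
    lower = math.exp(math.log(prr) - z * se)
    upper = math.exp(math.log(prr) + z * se)
    return prr, lower, upper

# Hypothetical counts, for illustration only.
prr_result = prr_with_ci(n_ij=20, n_i=100, n_j=200, n_total=10000)
prr_signal = prr_result is not None and prr_result[1] > 2
```

With these hypothetical counts the point estimate is 11, somewhat below the ROR computed from the same table, as expected when the approximation conditions above hold only roughly.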

2.1.3. Information Component (IC)

The IC is based on the relative reporting rate $RR_{ij}$, which indicates how many particular events were reported in excess for a specific drug over the expected number of reported counts under the null hypothesis that a drug and AE are independent. The relative reporting rate is estimated by $n_{ij}/E_{ij}$, where $E_{ij} = n_{i.}\,n_{.j}/n_{..}$ is the expected number of reports for the ith AE and the jth drug under the null hypothesis. The IC for the ith AE and the jth drug is defined as follows:
$$IC_{ij} = \log_2 RR_{ij} = \frac{\log RR_{ij}}{\log 2}.$$
The $IC_{ij}$ is estimated as $\widehat{IC}_{ij} = \log_2 (n_{ij}/E_{ij})$ for $n_{ij} > 0$, $n_{i.} > 0$, and $n_{.j} > 0$. The estimated variance of $\widehat{IC}_{ij}$ is given by $\hat\sigma^2_{\widehat{IC}_{ij}} \approx \frac{1}{(\log 2)^2}\left(\frac{1}{n_{ij}} + \frac{1}{n_{i.}} + \frac{1}{n_{.j}}\right)$. An approximate $100(1-\alpha)\%$ CI for $IC_{ij}$ is expressed as follows:
$$CI_{100(1-\alpha)\%} = \exp\!\left(\log \widehat{IC}_{ij} \pm z_{1-\alpha/2}\sqrt{\hat\sigma^2_{\widehat{IC}_{ij}}}\right).$$
If the lower bound of $CI_{100(1-\alpha)\%}$ is greater than 1, the ith AE can be interpreted as a signal.

2.1.4. Likelihood Ratio Test-Based Method (LRT)

Huang et al. [4] proposed the likelihood ratio test statistic, which controls the type I error and false discovery rates by using Monte Carlo hypothesis testing. The null and alternative hypotheses to test whether the ith AE for a specific drug ($j^*$) is a signal or not are expressed as follows:
$$H_0: p_i = q_i \quad \text{vs.} \quad H_a: p_i > q_i,$$
where $p_i$ and $q_i$ are defined as the reporting rates of the ith AE and of the other AEs for a specific drug, respectively. The maximum likelihood ratio (MLR) is expressed as follows:
$$MLR = \max_i \frac{\left(\dfrac{n_{ij^*}}{n_{i.}}\right)^{n_{ij^*}} \left(\dfrac{n_{.j^*}-n_{ij^*}}{n_{..}-n_{i.}}\right)^{n_{.j^*}-n_{ij^*}}}{\left(\dfrac{n_{.j^*}}{n_{..}}\right)^{n_{.j^*}}} \times I\!\left(\hat p_i > \hat q_i\right),$$
where $I(\cdot)$ is the indicator function, $\hat p_i = n_{ij^*}/n_{i.}$, and $\hat q_i = (n_{.j^*}-n_{ij^*})/(n_{..}-n_{i.})$. As the distribution of the MLR under the null hypothesis is unknown, Monte Carlo hypothesis testing is used to calculate p-values. For details, see Section 2.3.
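To make the statistic concrete, the sketch below evaluates the likelihood ratio on the log scale for each AE of one fixed drug and returns the AE with the largest value. It is an illustration only: the function names, the example counts, and the convention of scoring AEs without excess reporting as 0 on the log scale (that is, a ratio of 1, which never exceeds the ratio of an AE with excess reporting) are assumptions of this sketch. The Monte Carlo step that turns the statistic into a p-value is not included.

```python
import math

def xlogy(x, y):
    """x * log(y), with the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)

def log_lr(n_ij, n_i, n_j, n_total):
    """Log likelihood ratio for one AE of the fixed drug (0 if no excess)."""
    p_hat = n_ij / n_i
    q_hat = (n_j - n_ij) / (n_total - n_i)
    if p_hat <= q_hat:          # indicator I(p_hat > q_hat) is 0
        return 0.0
    return (xlogy(n_ij, p_hat)
            + xlogy(n_j - n_ij, q_hat)
            - xlogy(n_j, n_j / n_total))

def max_log_lr(drug_counts, ae_totals, n_total):
    """Return (index of the AE with the largest log LR, that log LR)."""
    n_j = sum(drug_counts)
    scores = [log_lr(c, t, n_j, n_total)
              for c, t in zip(drug_counts, ae_totals)]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]

# Hypothetical data: three AEs, 1000 reports overall, 100 for the drug.
ae_totals = [100, 300, 600]
drug_counts = [30, 20, 50]
best_ae, best_score = max_log_lr(drug_counts, ae_totals, n_total=1000)
```

Working on the log scale avoids overflow when counts are large and leaves the ranking of AEs unchanged.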

2.2. Bayesian Methods

2.2.1. Gamma Poisson Shrinker (GPS)

DuMouchel [5] suggested the GPS method, which is an empirical Bayes signal detection method. The GPS method uses the relative report rate, defined as follows:
$$\lambda_{ij} = \frac{n_{ij}}{E_{ij}}, \quad \text{where } E_{ij} = \frac{n_{i.} \times n_{.j}}{n_{..}}.$$
This compares the actual frequency with the expected frequency. $E_{ij}$ is calculated under the null hypothesis that there is no association between the drug and the AE. The null and alternative hypotheses are expressed as follows:
$$H_0: \lambda_{ij} = 1 \quad \text{vs.} \quad H_a: \lambda_{ij} > 1.$$
The GPS method assumes that the model and prior distributions are as follows:
$$\text{model}: \; n_{ij} \mid \lambda_{ij} \overset{iid}{\sim} Poisson(\mu_{ij}),$$
$$\text{prior}: \; \lambda_{ij} \sim w \times Gamma(\alpha_1, \beta_1) + (1-w) \times Gamma(\alpha_2, \beta_2),$$
where the observed report count $n_{ij}$ follows the Poisson distribution with unknown mean $\mu_{ij} = E_{ij} \times \lambda_{ij}$. The relative report rate follows a mixture of gamma distributions, where $Gamma(\alpha, \beta)$ is a gamma distribution with mean $\alpha/\beta$ and variance $\alpha/\beta^2$, and $0 < w < 1$ is the prior probability that $\lambda_{ij}$ came from the first gamma distribution of the mixture. The hyperparameters $\alpha_1, \beta_1, \alpha_2, \beta_2, w$ are estimated by the empirical Bayes method, which is also known as the maximum marginal likelihood.
As the gamma distribution is a conjugate prior for the Poisson distribution, the posterior distribution of $\lambda_{ij}$ can be obtained in a closed form as follows:
$$\text{posterior}: \; \lambda_{ij} \mid n_{ij} \sim w_{ij}^* \times Gamma(\alpha_1 + n_{ij}, \beta_1 + E_{ij}) + (1 - w_{ij}^*) \times Gamma(\alpha_2 + n_{ij}, \beta_2 + E_{ij}),$$
where $w_{ij}^*$ is the posterior probability that $\lambda_{ij}$ came from the first gamma distribution of the mixture. This is expressed as follows:
$$w_{ij}^* = \frac{w \times f(n_{ij} \mid \alpha_1, \beta_1, E_{ij})}{w \times f(n_{ij} \mid \alpha_1, \beta_1, E_{ij}) + (1-w) \times f(n_{ij} \mid \alpha_2, \beta_2, E_{ij})},$$
where $f(n_{ij} \mid \alpha, \beta, E_{ij})$ is the marginal distribution. This marginal distribution follows the negative binomial distribution as follows:
$$n_{ij} \mid \alpha, \beta, E_{ij} \sim NB\!\left(\alpha, \frac{E_{ij}}{E_{ij} + \beta}\right),$$
where $NB(x \mid r, p) = \binom{r+x-1}{x} p^x (1-p)^r$.
The 5th percentile of the posterior distribution of $\lambda_{ij}$ (EB05) is used for decision making. EB05 can be obtained by solving the following equation:
$$0.05 = \int_0^{EB05(\lambda_{ij})} f(\lambda_{ij} \mid n_{ij}) \, d\lambda_{ij}.$$
This equation can be solved numerically using iterative techniques such as Newton’s method. If EB05($\lambda_{ij}$) is greater than 2, this drug‒AE pair is considered a signal of disproportionate rates (SDR).
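The posterior mixing weight $w_{ij}^*$ can be computed directly from the negative binomial marginal. The Python sketch below does this on the log scale and also returns the posterior mean of $\lambda_{ij}$, which follows from the mixture-of-gammas posterior. The hyperparameter values are hypothetical placeholders, not estimates from any dataset, and the function names are assumptions of this sketch. Computing EB05 additionally requires the gamma cumulative distribution function, which is not part of the Python standard library, so it is left out here.

```python
import math

def log_nb_marginal(n, alpha, beta, e):
    """log f(n | alpha, beta, E): negative binomial with p = E / (E + beta)."""
    p = e / (e + beta)
    return (math.lgamma(alpha + n) - math.lgamma(alpha) - math.lgamma(n + 1)
            + n * math.log(p) + alpha * math.log(1 - p))

def posterior_weight(n, e, w, a1, b1, a2, b2):
    """Posterior probability that lambda came from the first gamma component."""
    l1 = math.log(w) + log_nb_marginal(n, a1, b1, e)
    l2 = math.log(1 - w) + log_nb_marginal(n, a2, b2, e)
    m = max(l1, l2)                      # subtract the max for numerical stability
    t1, t2 = math.exp(l1 - m), math.exp(l2 - m)
    return t1 / (t1 + t2)

def posterior_mean(n, e, w, a1, b1, a2, b2):
    """Mean of the two-component gamma posterior of lambda."""
    ws = posterior_weight(n, e, w, a1, b1, a2, b2)
    return ws * (a1 + n) / (b1 + e) + (1 - ws) * (a2 + n) / (b2 + e)

# Hypothetical hyperparameters and counts, for illustration only.
hyper = dict(w=0.2, a1=0.5, b1=0.1, a2=2.0, b2=4.0)
w_star = posterior_weight(n=20, e=2.0, **hyper)
lam_mean = posterior_mean(n=20, e=2.0, **hyper)
```

In practice the five hyperparameters would first be estimated by maximizing the marginal likelihood over all drug‒AE cells, which is the computationally expensive part of the method.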

2.2.2. Bayesian Confidence Propagation Neural Network (BCPNN)

Bate et al. [6] proposed the BCPNN method based on the IC measure. In the BCPNN method, the IC measure is defined as follows:
$$IC_{ij} = \log_2 \frac{\theta_{ij}}{\theta_{i.} \times \theta_{.j}}.$$
The observed reporting counts and marginal counts are assumed to follow a binomial distribution with a beta distribution for priors as follows:
$$n_{ij} \mid \theta_{ij} \sim Bin(n_{..}, \theta_{ij}) \quad \text{with} \quad \theta_{ij} \sim Beta(\alpha_{ij}, \beta_{ij}),$$
$$n_{i.} \mid \theta_{i.} \sim Bin(n_{..}, \theta_{i.}) \quad \text{with} \quad \theta_{i.} \sim Beta(\alpha_{i.}, \beta_{i.}),$$
$$n_{.j} \mid \theta_{.j} \sim Bin(n_{..}, \theta_{.j}) \quad \text{with} \quad \theta_{.j} \sim Beta(\alpha_{.j}, \beta_{.j}),$$
where $\alpha_{ij} = \alpha_{i.} = \beta_{i.} = \alpha_{.j} = \beta_{.j} = 1$ and $\beta_{ij} = \dfrac{1}{E(n_{i.} \mid \theta_{i.})\, E(n_{.j} \mid \theta_{.j})} - 1$.
Using the delta method, the posterior mean and variance of $IC_{ij}$ can be obtained as follows:
$$E(IC_{ij} \mid data) = \log_2 \frac{(n_{ij}+1)(n_{..}+1)^2}{(n_{..}+\gamma)(n_{i.}+1)(n_{.j}+1)},$$
$$Var(IC_{ij} \mid data) = \frac{1}{(\log 2)^2}\left[\frac{n_{..}-n_{ij}+\gamma-1}{(n_{ij}+1)(1+n_{..}+\gamma)} + \frac{n_{..}-n_{i.}+1}{(n_{i.}+1)(n_{..}+3)} + \frac{n_{..}-n_{.j}+1}{(n_{.j}+1)(n_{..}+3)}\right],$$
where $\gamma = \hat\beta_{ij} + 1 = \dfrac{(n_{..}+2)^2}{(n_{i.}+1)(n_{.j}+1)}$.
The lower limit of the 95% credible interval for $IC_{ij}$ is calculated by:
$$IC_{\alpha/2} = E(IC_{ij} \mid data) - z_{1-\alpha/2}\sqrt{Var(IC_{ij} \mid data)},$$
and if $IC_{\alpha/2}$ is greater than 2, this drug‒AE pair is a possible signal with a higher reporting rate.

2.2.3. New IC Method

The new IC is an improved method for posterior inference in IC analysis, including an accurate estimate for the mode and significantly improved credibility interval estimates. This method also assumes that the number of reports follows $n_{ij} \overset{iid}{\sim} Poisson(\lambda_{ij} E_{ij})$, where $\lambda_{ij}$ denotes the relative reporting rate. The prior of the parameter $\lambda_{ij}$ is given by $\lambda_{ij} \overset{iid}{\sim} Gamma(0.5, 0.5)$, and the posterior distribution of $\lambda_{ij}$ is given by $\lambda_{ij} \mid data \overset{iid}{\sim} Gamma(n_{ij}+0.5, E_{ij}+0.5)$. Then, the new $IC_{ij}$ is the posterior mean of $\log_2 \lambda_{ij}$, which is $E(\log_2 \lambda_{ij} \mid data) \approx \log_2 \dfrac{n_{ij}+0.5}{E_{ij}+0.5}$.
The 95% credible interval limits ($\lambda_{0.025}, \lambda_{0.975}$) are obtained by:
$$\int_0^{\lambda_\alpha} Gamma(y \mid n_{ij}+0.5, E_{ij}+0.5)\, dy = \alpha$$
for $\alpha = 0.025$ and $\alpha = 0.975$. If the lower limit $\lambda_{0.025} > 0$, the ith AE can be interpreted as a signal.

2.2.4. Simplified Bayesian

For small datasets, the GPS method is usually not recommended because of instability in the estimation of the hyperparameters. Thus, Huang et al. [9] suggested the simplified Bayesian (sB) method, which makes a weaker assumption on the prior distribution than the GPS method. The sB method uses a single gamma distribution as a prior as follows:
$$\text{prior}: \; \lambda_{ij} \sim Gamma(\alpha, \alpha),$$
with mean 1 and variance $1/\alpha$. Huang et al. [9] proposed using three values (0.5, 0.01, and 0.0001) for $\alpha$. They also called the prior distribution with $\alpha = 0.5$ a less noninformative prior. The other prior distributions were called noninformative priors. The posterior distribution is also a single gamma distribution as follows:
$$\text{posterior}: \; \lambda_{ij} \mid n_{ij} \sim Gamma(\alpha + n_{ij}, \alpha + E_{ij}).$$
The lower bound of the 95% credible interval for $\lambda_{ij}$ ($sB_{\alpha, ij}$) is used for detecting signals of SDR. $sB_{\alpha, ij}$ is expressed as follows:
$$E(\lambda_{ij} \mid n_{ij}) = \frac{\alpha + n_{ij}}{\alpha + E_{ij}},$$
$$Var(\lambda_{ij} \mid n_{ij}) = \frac{\alpha + n_{ij}}{(\alpha + E_{ij})^2},$$
$$sB_{\alpha, ij} = E(\lambda_{ij} \mid n_{ij}) - 1.645\sqrt{Var(\lambda_{ij} \mid n_{ij})}.$$
If $sB_{\alpha, ij}$ is greater than 2, this drug‒AE pair is a possible signal with a higher reporting rate. With $\alpha = 0.5$, the sB method is identical to the new IC method [10]. Hence, we only included the sB method in the simulation.
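Because the sB posterior has a closed form, the lower bound is a few lines of arithmetic. The sketch below follows the three expressions above; the function names and the example counts are assumptions for illustration.

```python
import math

def expected_count(n_i, n_j, n_total):
    """E_ij = n_i. * n_.j / n_.. under independence of drug and AE."""
    return n_i * n_j / n_total

def sb_lower_bound(n_ij, e_ij, alpha=0.5, z=1.645):
    """Lower bound sB of the relative reporting rate from the gamma posterior."""
    shape = alpha + n_ij
    rate = alpha + e_ij
    mean = shape / rate
    var = shape / rate ** 2
    return mean - z * math.sqrt(var)

# Hypothetical counts: 20 observed reports against 2 expected.
e = expected_count(n_i=100, n_j=200, n_total=10000)   # 2.0
sb = sb_lower_bound(n_ij=20, e_ij=e, alpha=0.5)
sb_signal = sb > 2                                     # cutoff stated in the text
```

Smaller values of `alpha` weaken the pull of the prior toward a relative reporting rate of 1, which matters most for cells with very small expected counts.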

2.3. Tree-Based Scan Statistic

In a medical dictionary, all AEs are categorized into a hierarchical tree structure. Kulldorff et al. [13,14] proposed the tree-based scan statistic, which simultaneously searches for signals at any level (or layer) of AEs in a hierarchical structure. We call the terminal cells of the tree leaves and all the other cells nodes; that is, the nodes sit at the levels above the leaves. Of two connected nodes, the higher-level node is defined as the parent node and the lower-level node as the child node. Let $c_i$ be the observed number of AEs for leaf $i$ among patients who have taken a specific drug $j$, and let $x_i$ be the number of AEs for leaf $i$ among patients who have taken any drug. Then $C = \sum_i c_i = \sum_i n_{ij}$ is the total observed number of AEs reported in patients who have taken the specific drug $j$, and $X = \sum_i x_i = \sum_i n_{i.}$ is the total number of AEs reported in patients who have taken any drug.
When the branches of a tree are cut, the sums of the observed and total numbers of AEs in the leaves of each cut $G$, $c_G = \sum_{i \in G} c_i$ and $x_G = \sum_{i \in G} x_i$, respectively, are obtained. $G$ includes both the child nodes and parent nodes as a unit of AE. For each cut $G$, we can calculate the log likelihood ratio and test statistic:
$$LR(G) = c_G \log\frac{c_G}{x_G} + (C - c_G)\log\frac{C - c_G}{X - x_G},$$
$$T = \max_G LR(G) \times I\!\left(\frac{c_G}{x_G} > \frac{C - c_G}{X - x_G}\right),$$
where $I(\cdot)$ is the indicator function. The cut $G$ that maximizes $LR(G)$ is the most likely cut of related AEs. The null hypothesis implies that the group defined by cut $G$ has the same ratio of observed to expected AEs as the rest of the tree. For inference, Monte Carlo hypothesis testing is used, calculating the most likely cut in each random dataset. Firstly, the likelihood of the most likely cut in the real dataset is calculated. Secondly, 9999 random datasets are generated under the null hypothesis and the test statistic for each random dataset is calculated. Then, the p-value is calculated as R/(9999 + 1), where R is the rank of the test statistic of the real dataset among those of the random datasets.
The LRT and TreeScan methods basically use the same test statistic. Because the TreeScan method inherently accounts for the hierarchical structure, the distribution of its test statistic is also obtained by comparing all possible cuts in the hierarchical structure. Even if the two methods detect the same signal, their p-values could be different.
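The scan over cuts and the Monte Carlo p-value can be sketched on a toy two-level tree as follows. This is a simplified illustration, not the TreeScan software: the tree, the counts, and all function names are hypothetical. The sketch also subtracts the term $C \log(C/X)$, which is the same for every cut, so that the statistic is non-negative and cuts without excess reporting can be scored as 0; this normalization is an assumption of the sketch. Null datasets are drawn by allocating the $C$ reports to leaves with probabilities $x_i/X$, and 999 replicates are used instead of 9999 to keep the example fast.

```python
import math
import random
from collections import Counter

def log_lr(c_g, x_g, c_total, x_total):
    """Log likelihood ratio of cut G, or 0 if G shows no excess reporting."""
    c_out, x_out = c_total - c_g, x_total - x_g
    if c_g * x_out <= c_out * x_g:        # c_G/x_G <= (C - c_G)/(X - x_G)
        return 0.0
    inside = c_g * math.log(c_g / x_g)
    outside = c_out * math.log(c_out / x_out) if c_out > 0 else 0.0
    return inside + outside - c_total * math.log(c_total / x_total)

def scan_statistic(c, x, tree):
    """Return (most likely cut, T) over every leaf and every parent node."""
    c_total, x_total = sum(c.values()), sum(x.values())
    cuts = {leaf: [leaf] for leaf in x}
    cuts.update(tree)                     # parent node -> list of its leaves
    scores = {g: log_lr(sum(c.get(l, 0) for l in leaves),
                        sum(x[l] for l in leaves), c_total, x_total)
              for g, leaves in cuts.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]

def monte_carlo_p(c, x, tree, n_sim=999, seed=1):
    """p = R / (n_sim + 1), R being the rank of the observed statistic."""
    rng = random.Random(seed)
    leaves = list(x)
    weights = [x[l] for l in leaves]
    c_total = sum(c.values())
    t_obs = scan_statistic(c, x, tree)[1]
    rank = 1
    for _ in range(n_sim):
        sim = Counter(rng.choices(leaves, weights=weights, k=c_total))
        if scan_statistic(sim, x, tree)[1] >= t_obs:
            rank += 1
    return rank / (n_sim + 1)

# Hypothetical two-level tree (SOC -> PT) with made-up counts.
tree = {"GI": ["diarrhea", "flatulence"], "Skin": ["rash", "pruritus"]}
x = {"diarrhea": 400, "flatulence": 100, "rash": 300, "pruritus": 200}
c = {"diarrhea": 10, "flatulence": 30, "rash": 6, "pruritus": 4}
best_cut, t_stat = scan_statistic(c, x, tree)
p_value = monte_carlo_p(c, x, tree)
```

Because the maximum is taken over leaves and parent nodes together in both the observed and the simulated datasets, the resulting p-value is already adjusted for searching across levels.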

3. Simulation Study

3.1. Data Generation and Evaluation Measures

We generated datasets that reflect WHO-ART’s hierarchical structure, which can be expressed as system-organ classes (SOC), preferred terms (PT), and included terms (IT) for AEs [16]. In the simulation study, we included only SOC and PT levels. To reduce the computation time, we only considered 500 drugs and 300 AEs, which were randomly selected from a total of 2161 PT levels. We followed the approach in the study by Huang et al. [4] to generate our simulation data.
First, we generated marginal counts of AEs $n_{1.}, \ldots, n_{I.}$ ($I = 300$) and drugs $n_{.1}, \ldots, n_{.J}$ ($J = 500$) as follows:
$$(n_{1.}, \ldots, n_{I.}) \mid n_{..} \sim Multinomial\!\left(n_{..}, \frac{u_1}{\sum_{i=1}^{I} u_i}, \ldots, \frac{u_I}{\sum_{i=1}^{I} u_i}\right),$$
$$(n_{.1}, \ldots, n_{.J}) \mid n_{..} \sim Multinomial\!\left(n_{..}, \frac{u_1}{\sum_{j=1}^{J} u_j}, \ldots, \frac{u_J}{\sum_{j=1}^{J} u_j}\right),$$
where $u \sim Uniform(0, 1)$ with $n_{..} = \sum_{i=1}^{I} n_{i.}$.
Next, we generated the number of cases reported for a specified drug $j^*$, $n_{1j^*}, \ldots, n_{Ij^*}$, using
$$(n_{1j^*}, \ldots, n_{Ij^*}) \mid n_{.j^*} \sim Multinomial(n_{.j^*}, p_{rr}),$$
where $p_{rr} = \left(rr_{1j^*} \times r_0 \times \frac{n_{1.}}{n_{..}}, \ldots, rr_{Ij^*} \times r_0 \times \frac{n_{I.}}{n_{..}}\right)$ is a vector of probabilities with $rr_{1j^*}, \ldots, rr_{Ij^*}$ as the relative reporting rates. With $r_0$ as the baseline risk, $p_{rr}$ satisfies the constraints $0 \le rr_{ij^*} \times r_0 \times \frac{n_{i.}}{n_{..}} \le 1$, $i = 1, \ldots, I$, and $\sum_{i=1}^{I} rr_{ij^*} \times r_0 \times \frac{n_{i.}}{n_{..}} = 1$. Note that the number of reported cases was generated for a specific drug, and hence the true signals are signals for each drug. This means that the relative reporting rate for an AE with a true signal is higher than those for all the other AEs for one fixed drug. If an AE is a true signal, the relative reporting rate is greater than 1, while the relative reporting rate is equal to 1 when the AE is not a true signal [11]. The cells for the true signals were randomly selected first, depending on the assumed proportion of true signals. The relative reporting rate for each of the cells selected as true signals was generated from $Uniform(1.2, 10)$ in the setting with high relative reporting rates and from $Uniform(1.2, 4)$ in the setting with low relative reporting rates.
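The generation steps above can be sketched with the Python standard library alone. This is a scaled-down illustration, not the authors' R code: the function names, the seed, and the small sizes in the usage lines are assumptions, multinomial draws are emulated by sampling category labels, and the baseline risk $r_0$ is obtained from the constraint that the probabilities sum to 1.

```python
import random
from collections import Counter

def multinomial(rng, n, probs):
    """Draw multinomial counts by sampling n category labels."""
    draws = Counter(rng.choices(range(len(probs)), weights=probs, k=n))
    return [draws.get(k, 0) for k in range(len(probs))]

def simulate(n_total, n_aes, n_drugs, signal_frac, rr_range, j_star=0, seed=0):
    """Return AE totals, drug totals, counts for drug j_star, and true signals."""
    rng = random.Random(seed)
    # Marginal counts of AEs and drugs from normalized Uniform(0, 1) weights.
    u = [rng.random() for _ in range(n_aes)]
    ae_totals = multinomial(rng, n_total, [v / sum(u) for v in u])
    u = [rng.random() for _ in range(n_drugs)]
    drug_totals = multinomial(rng, n_total, [v / sum(u) for v in u])
    # Randomly pick the true-signal AEs and give them a rate above 1.
    signals = set(rng.sample(range(n_aes), round(signal_frac * n_aes)))
    rr = [rng.uniform(*rr_range) if i in signals else 1.0 for i in range(n_aes)]
    # Baseline risk r0 chosen so that the probability vector sums to 1.
    raw = [rr[i] * ae_totals[i] / n_total for i in range(n_aes)]
    r0 = 1.0 / sum(raw)
    p_rr = [r0 * v for v in raw]
    counts = multinomial(rng, drug_totals[j_star], p_rr)
    return ae_totals, drug_totals, counts, signals

# Scaled-down hypothetical setting: 30 AEs, 10 drugs, 10% true signals.
ae_totals, drug_totals, counts, signals = simulate(
    n_total=20000, n_aes=30, n_drugs=10, signal_frac=0.10, rr_range=(1.2, 10))
```

Setting `signal_frac=0` makes every relative reporting rate equal to 1, which corresponds to the null setting used for the type I error comparison.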
While the TreeScan method detected signals simultaneously for both SOC and PT levels, all the other methods detected signals from SOC and PT levels separately. To evaluate the performances of the methods considering the hierarchical data structure, we merged two separate results from each level for all methods except the TreeScan method.
We generated 1000 datasets for each of nine different settings with three different total sample sizes (300,000, 500,000, 1,000,000) and three different percentages of true signals (3%, 5%, 10%). We used five different cutoffs, which are the criteria for signal detection for each method. Different criteria have been used depending on the organization for different methods [17]. In practice, one may change the criteria based on experience. We used the same criterion of the lower bound of the 95% CI for fair comparison in our simulation.
To compare the performance, we calculated the type I error rate, sensitivity, positive predictive value (PPV), and power for specific drugs. Under the null hypothesis, the type I error is estimated as follows:
$$\text{Type I error} = \frac{\#\text{ of times detecting at least one false-positive signal}}{\text{total } \#\text{ of simulated datasets}}.$$
The sensitivity, PPV, and power are estimated as:
$$\text{Sensitivity} = \frac{1}{S}\sum_{s=1}^{S} \frac{\#\text{ of true-positive signals in the } s\text{th simulated dataset}}{\#\text{ of true signals in the } s\text{th simulated dataset}},$$
$$\text{PPV} = \frac{1}{S}\sum_{s=1}^{S} \frac{\#\text{ of true-positive signals in the } s\text{th simulated dataset}}{\#\text{ of detected signals in the } s\text{th simulated dataset}},$$
$$\text{Power} = \frac{\#\text{ of times detecting at least one signal}}{\text{total } \#\text{ of simulated datasets}},$$
where $S$ is the total number of simulated datasets with at least one signal detected. We used R software version 3.5.2 (Vienna, Austria) for all simulations and data analyses.
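These measures are straightforward to compute once each simulated dataset has a set of detected signals and a set of true signals. The sketch below is an illustration with hypothetical AE labels and function names; it assumes every non-null dataset contains at least one true signal.

```python
def evaluate(detected, truth):
    """Sensitivity, PPV, and power over simulated datasets.

    detected[s] and truth[s] are sets of AE labels for dataset s.
    Sensitivity and PPV are averaged over the S datasets with >= 1 detection.
    """
    hits = [(d, t) for d, t in zip(detected, truth) if d]
    power = len(hits) / len(detected)
    if not hits:
        return {"sensitivity": None, "ppv": None, "power": power}
    sensitivity = sum(len(d & t) / len(t) for d, t in hits) / len(hits)
    ppv = sum(len(d & t) / len(d) for d, t in hits) / len(hits)
    return {"sensitivity": sensitivity, "ppv": ppv, "power": power}

def type_i_error(detected_under_null):
    """Share of null datasets in which at least one signal was flagged."""
    flagged = sum(1 for d in detected_under_null if d)
    return flagged / len(detected_under_null)

# Four hypothetical datasets sharing the same three true signals.
truth = [{"a", "b", "c"}] * 4
detected = [{"a", "b"}, set(), {"a", "c", "d"}, {"b"}]
metrics = evaluate(detected, truth)
alpha_hat = type_i_error([set(), {"a"}, set(), set()])
```

Under the null setting every detection is a false positive, so counting datasets with any detection gives the type I error estimate.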

3.2. Results

3.2.1. Comparison of Type I Error Rate

To compare the type I error rate of each method and cutoff, all relative reporting rates were set to 1 for each total sample size (Table 2). The type I error rates of the ROR, PRR, and IC methods were relatively high for the standard cutoff and for all total sample sizes, which means that spurious detection could frequently occur even when there are no actual signals. The type I error rates of the GPS and sB methods were close to 0 for the standard cutoff and all total sample sizes. The type I error rates of the ROR, PRR, IC, GPS, BCPNN, and sB methods varied depending on how the cutoff was set. On the other hand, the type I error rates of the LRT and TreeScan methods were close to the prespecified significance level in most cases, although the LRT method had slightly higher type I error rates.

3.2.2. Comparison of Sensitivity, PPV, and Power

Table 3 and Table 4 present the results for sensitivity, PPV, and power of each method when the total sample size is equal to 300,000. The other results are presented in Appendix A. For all simulation settings and the standard cutoff for each method, the ROR, PRR, and IC methods had relatively higher sensitivity and power than the other methods. However, the LRT, GPS, BCPNN, sB, and TreeScan methods had relatively higher PPV than the other methods. This means that the ROR, PRR, and IC methods may detect too many signals regardless of whether they are actually true, so these methods could detect many false signals as well as true ones. In contrast, the LRT, GPS, BCPNN, sB, and TreeScan methods detected far fewer signals, but more true signals than false ones.
When the relative reporting rates were low (Table 4), all the methods had lower performance compared to when the relative reporting rates were high (Table 3). The GPS, BCPNN, and sB methods had a significant decrease in power and sensitivity, especially the GPS method.
As the percentage of true signals increased for all settings of total sample size, the sensitivity decreased but the PPV and power increased for all methods. As the total sample size increased for all settings of the percentage of true signals, the sensitivity, PPV, and power increased for all methods. However, depending on the cutoff of each method, the sensitivity, PPV, and power varied. No single method was superior to the others overall for all settings.

4. Example

4.1. Korea Adverse Event Reporting System (KAERS)

The KAERS is a spontaneous reporting system, provided by the Korea Institute of Drug Safety and Risk Management, that receives and manages adverse drug events reported by patients, manufacturers, or medical experts. It contains drug, AE, basic demographic, and causality assessment information. A drug and an AE must be reported together as a pair. The same pair can be reported several times depending on the dose and time. If the same drug and AE were reported in duplicate, depending on dose or time, only the first report was counted. Therefore, each drug and AE are paired only once.
Causality was assessed at six levels: certain, probable, possible, unlikely, unclassified, and unassessable. The assessment criteria are shown in Table 5. We used all drug‒AE pairs except for those with an unassessable level. A signal can be reported information on a possible causal relationship between an AE and a drug, or a previously unknown or incompletely documented relationship. The causality assessment was performed by the reporter, such as a medical institution, expert, manufacturer, pharmacy, or public health center.
In KAERS, AEs were organized under the WHO-ART’s hierarchical structure [16]. This consists of four hierarchical levels: system-organ class (SOC), high-level terms (HLT), preferred terms (PT), and included terms (IT). SOC is the highest level. IT represents various expressions for the same AE at the PT level. An HLT is a set of PTs that are related to each other or share similar symptoms. An HLT may or may not exist for a given PT, and therefore the HLT level was excluded from the analysis. A small subset of the hierarchical structure is listed in Table 6. However, in the KAERS database, more than half of the reports were reported only up to the PT level. Thus, we used the PT level as the lowest level of AEs. In the following illustration, we used the SOC and PT levels in the WHO-ART’s hierarchical structure.

4.2. Data

We used drug‒AE pair data reported to KAERS between 2012 and 2016. During this period, there were approximately 3.1 million drug‒AE pairs with 1615 kinds of PT-level AEs and 1950 kinds of drugs. After restricting the causality assessment information to the certain, probable, possible, unlikely, or unclassified levels, approximately 2.5 million drug‒AE pairs with 1484 kinds of PT-level AEs and 1716 kinds of drugs were left. These data contained 32 SOC levels, 1484 PT levels, and 3557 IT levels. Analyses were done with these drug‒AE pairs.

4.3. Analysis

We selected two diabetes drugs, voglibose and acarbose, to compare specific results. Both are hypoglycemic agents that are used for type 2 diabetes, along with diet and exercise. These two drugs were selected because of their substantial exposure and comparable characteristics. Voglibose has a simpler structure than acarbose. Moreover, it is known to be more economical and safer because its absolute administration dose is 1000 times lower than that of acarbose. However, some severe AEs tend to be reported more often for voglibose [17,18]. Therefore, we searched for specific AEs of acarbose and voglibose in the KAERS data using the signal detection methods described previously.
First, we compared the number of signals detected by each method from all drug‒adverse effect pairs with 1484 kinds of PT level AEs and 1716 kinds of drugs. Second, the specific signals detected by each method were compared for the two diabetes drugs mentioned above. The detection criteria for each method are shown in Table 7 and the TreeScan method was performed with a simple cut.

4.4. Results

Table 8 provides the overall signal detection results of all methods. We used the signal detection criteria presented in Table 7. We summarized the number of detected signals separately for the PT and SOC levels. The GPS, BCPNN, and sB methods detected fewer signals than the other methods. The ROR and PRR detected the most signals.
The results of applying all methods to the two drugs, voglibose and acarbose, are summarized in Table 9. We report only the AEs that were detected by more than two of the signal detection methods. Voglibose had a higher reported count of all AEs than acarbose. The number of AEs detected by at least one method was higher for voglibose (36 AEs) than for acarbose (31 AEs). For both drugs, the common AEs detected were diarrhea, flatulence, and hypoglycemia at the PT level, and metabolic and nutritional disorders at the SOC level. Only one AE was detected by all methods for both acarbose and voglibose: flatulence at the PT level. Both drugs signaled strongly for flatulence, which is an AE commonly observed in patients with type 2 diabetes [19,20]. In addition, for voglibose, dyspepsia and hypoglycemia at the PT level and metabolic and nutritional disorders at the SOC level were detected by all methods.

5. Discussion

A number of disproportionality methods for data mining and the TreeScan method were compared for signal detection in drug surveillance, using AE data grouped into hierarchical structures. We included various frequentist methods such as ROR, PRR, IC, LRT, and TreeScan as well as Bayesian methods such as GPS, BCPNN, and sB. The LRT, GPS, BCPNN, sB, and TreeScan methods detected fewer signals than the ROR, PRR, and IC methods. The power and sensitivity of the GPS, sB, LRT, and TreeScan methods tended to be lower than those of the others, which implies that these methods are more conservative. The higher power and sensitivity of the ROR, PRR, and IC methods seemed to be due to their higher type I error rates. These three methods also had lower PPV. The TreeScan method controls the type I error rate at the desired level, while the other methods cannot control it or find appropriate cutoffs for the desired type I error rate. However, no method was superior to the others in relation to all performance measures.
We observed similar patterns in the analysis results of the KAERS data. The GPS and sB methods detected far fewer signals than the others overall. For the two specific drugs, some common AEs were detected by all methods. The ROR, PRR, and IC methods detected additional signals that were not detected by the GPS, sB, LRT, or TreeScan methods. The ROR and PRR methods detected rather too many signals, even when the reported count was small. Thus, the restriction that the reported count must be three or more for a pair to be a signal under the ROR and PRR methods, which is usually imposed in practice [3], might be sensible.
In terms of computation time, the GPS, LRT, and TreeScan methods are more intensive than the other methods. The other methods have a closed form for the confidence interval of each statistic, so only the cell count ($n_{ij}$) and the marginal counts ($n_{i.}$ or $n_{.j}$) of the matrix are required to calculate the confidence interval. On the other hand, the GPS method requires all cell counts in the matrix to estimate the parameters of the prior distribution. For the LRT and TreeScan methods, a Monte Carlo simulation is required to obtain p-values.
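To make the contrast concrete, the following minimal sketch computes the ROR and PRR with the lower bound of their 95% CI from one cell count and its marginals, and shows the rank-based Monte Carlo p-value commonly used with simulation-based tests. The CI formulas are the standard log-scale Wald intervals and are an assumption here, since this section does not restate the exact formulas used in the study; all function names are hypothetical.

```python
import math


def cells(n_ij, n_i_dot, n_dot_j, n_dot_dot):
    """Rebuild the 2x2 cells of Table 1 from one cell count and the marginals."""
    a = n_ij                                   # AE i, drug j
    b = n_dot_j - n_ij                         # other AEs, drug j
    c = n_i_dot - n_ij                         # AE i, other drugs
    d = n_dot_dot - n_i_dot - n_dot_j + n_ij   # other AEs, other drugs
    return a, b, c, d


def ror_with_lower_bound(n_ij, n_i_dot, n_dot_j, n_dot_dot, z=1.96):
    """Return (ROR, lower 95% bound). Assumes all four cells are non-zero."""
    a, b, c, d = cells(n_ij, n_i_dot, n_dot_j, n_dot_dot)
    ror = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return ror, math.exp(math.log(ror) - z * se)


def prr_with_lower_bound(n_ij, n_i_dot, n_dot_j, n_dot_dot, z=1.96):
    """Return (PRR, lower 95% bound). Assumes a and c are non-zero."""
    a, b, c, d = cells(n_ij, n_i_dot, n_dot_j, n_dot_dot)
    prr = (a / (a + b)) / (c / (c + d))
    se = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    return prr, math.exp(math.log(prr) - z * se)


def monte_carlo_p_value(observed, simulated):
    """Rank-based p-value: share of replicates at least as extreme as observed."""
    exceed = sum(1 for s in simulated if s >= observed)
    return (1 + exceed) / (len(simulated) + 1)


# Illustrative counts only (not KAERS data).
ror, ror_low = ror_with_lower_bound(20, 100, 200, 10000)
prr, prr_low = prr_with_lower_bound(20, 100, 200, 10000)
```

The first two functions need four numbers per drug‒AE pair, whereas the Monte Carlo p-value needs the test statistic recomputed on every simulated dataset, which is where the extra computation time of the LRT and TreeScan methods comes from.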
The methods considered in this paper are approaches that can be applied to an existing database. In some cases, one may want to monitor continuously or sequentially to detect a signal as early as possible. For this purpose, the sequential probability ratio test (SPRT) [21,22] can be used. The method has also been applied to a spontaneous adverse event reporting system [23,24]. However, the result of the SPRT method is highly dependent on the relative risk used to specify the alternative hypothesis [25]. Although we did not include the SPRT in this study for these reasons, it would be interesting to compare the method in appropriate situations in future research.
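The dependence on the assumed relative risk can be illustrated with a minimal sketch of Wald's SPRT. It assumes a Poisson count with a known expected value under the null hypothesis, which is a simplification and not necessarily the formulation used in [23,24]; the function name `sprt_poisson` and the default error rates are hypothetical.

```python
import math


def sprt_poisson(observed, expected, rr_alt, alpha=0.05, beta=0.2):
    """Wald's SPRT for H0: mean = expected vs. H1: mean = rr_alt * expected.

    Returns "signal", "no signal", or "continue" (keep monitoring).
    """
    # Poisson log-likelihood ratio of H1 against H0.
    llr = observed * math.log(rr_alt) - (rr_alt - 1.0) * expected
    upper = math.log((1.0 - beta) / alpha)   # cross above: reject H0
    lower = math.log(beta / (1.0 - alpha))   # cross below: accept H0
    if llr >= upper:
        return "signal"
    if llr <= lower:
        return "no signal"
    return "continue"


# Same data, three different alternatives (illustrative numbers only).
decisions = [sprt_poisson(12, 5.0, rr) for rr in (2.0, 5.0, 8.0)]
```

With 12 observed and 5 expected reports, the three alternatives lead to three different decisions, which is the sensitivity to the prespecified relative risk described above.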
Drug safety databases such as KAERS are built from spontaneous reports, and only a small fraction of the AEs that occur are reported, so they contain a large number of zero-count cells. In this situation, a zero-inflated Poisson (ZIP) model could be considered. Hu et al. [11] proposed ZIP-sB and ZIP-DP (Dirichlet process). Huang et al. [26] proposed a ZIP model based on the likelihood ratio test. According to these studies, ZIP models detected fewer signals in data containing a large number of zero counts. This means that they are more conservative because they account for zero counts. In a further study, we will evaluate the performance of ZIP models and apply them to real data for comparison.
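For readers unfamiliar with the ZIP distribution, the sketch below gives its probability mass function in generic form. It is not the ZIP-sB, ZIP-DP, or ZIP-LRT model of [11,26]; the function name `zip_pmf` and the parameter names `lam` (Poisson mean) and `omega` (probability of a structural zero) are assumptions for illustration.

```python
import math


def zip_pmf(k, lam, omega):
    """Zero-inflated Poisson pmf.

    With probability omega the count is a structural zero; otherwise it is
    drawn from a Poisson distribution with mean lam.
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        return omega + (1.0 - omega) * poisson
    return (1.0 - omega) * poisson


# Probability of a zero count with and without zero inflation.
p_zero_plain = zip_pmf(0, 2.0, 0.0)
p_zero_inflated = zip_pmf(0, 2.0, 0.5)
```

Setting `omega` to 0 recovers the ordinary Poisson model, so the extra parameter only absorbs the excess zeros.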
Huang et al. [9] proposed an extension of the likelihood ratio test-based (LRT) method that can detect signals for a single AE or for several AEs within one AE group. The extended LRT method could be used for hierarchical structures of AEs for a fixed drug. The threshold for a signal in a multiple-layer analysis should be higher than that in a single-layer analysis. It would be very interesting to compare the extended LRT and TreeScan methods with multiple layers (PT, SOC, or others) in a simulation study. This is a topic for future research.
Currently, drug companies use different AE detection criteria. For example, AstraZeneca detects an AE when the EB05 is greater than 1.8, whereas GlaxoSmithKline detects an AE when it is greater than 2 [12]. In our simulation study, we confirmed that the performance of each method can vary depending on the cutoff used as the signal detection criterion. Therefore, how to set the cutoff for signal detection is very important and deserves attention.

6. Conclusions

In summary, the LRT, GPS, BCPNN, sB, and TreeScan methods are more conservative than the ROR, PRR, and IC methods. Only the TreeScan method controls the type I error rate at the desired level. No method is superior to the others in relation to all performance measures. It is recommended that those conducting drug‒AE surveillance use not just one method, but make a decision based on several methods.

Author Contributions

Conceptualization, I.J.; Data curation, G.P., H.J., and S.-J.H.; Formal analysis, G.P., H.J., and S.-J.H.; Funding acquisition, I.J.; Investigation, G.P., H.J., S.-J.H., and I.J.; Methodology, G.P., H.J., S.-J.H., and I.J.; Project administration, I.J.; Software, S.-J.H.; Supervision, I.J.; Validation, G.P., and H.J.; Writing—original draft, G.P., H.J., and S.-J.H.; Writing—review and editing, I.J. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (2019R1F1A1057182).

Acknowledgments

We are very grateful to the reviewers for their insightful comments.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

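Table A1. Summary of performance for each method at various cutoff points when the total sample size = 500,000 and $rr \sim U(1.2, 10)$.
True Signal Ratio | | 0.03 | 0.03 | 0.03 | 0.05 | 0.05 | 0.05 | 0.1 | 0.1 | 0.1
Method | Cutoff * | Power | Sensitivity | PPV | Power | Sensitivity | PPV | Power | Sensitivity | PPV
ROR | 1 | 0.996 | 0.799 | 0.303 | 0.995 | 0.764 | 0.462 | 0.998 | 0.728 | 0.729
ROR | 1.5 | 0.996 | 0.747 | 0.493 | 0.995 | 0.709 | 0.631 | 0.998 | 0.663 | 0.821
ROR | 2 | 0.995 | 0.696 | 0.621 | 0.995 | 0.651 | 0.726 | 0.998 | 0.596 | 0.865
ROR | 2.5 | 0.994 | 0.641 | 0.694 | 0.995 | 0.595 | 0.780 | 0.998 | 0.532 | 0.888
ROR | 3 | 0.994 | 0.590 | 0.740 | 0.995 | 0.539 | 0.813 | 0.998 | 0.465 | 0.904
PRR | 1 | 0.996 | 0.799 | 0.303 | 0.995 | 0.764 | 0.462 | 0.998 | 0.728 | 0.729
PRR | 1.5 | 0.996 | 0.747 | 0.495 | 0.995 | 0.708 | 0.631 | 0.998 | 0.663 | 0.821
PRR | 2 | 0.995 | 0.695 | 0.622 | 0.995 | 0.650 | 0.727 | 0.998 | 0.596 | 0.865
PRR | 2.5 | 0.994 | 0.639 | 0.695 | 0.995 | 0.593 | 0.781 | 0.998 | 0.530 | 0.889
PRR | 3 | 0.994 | 0.588 | 0.741 | 0.995 | 0.537 | 0.813 | 0.998 | 0.462 | 0.904
IC | log2(1) | 0.993 | 0.758 | 0.285 | 0.994 | 0.735 | 0.468 | 0.992 | 0.682 | 0.760
IC | log2(1.5) | 0.992 | 0.689 | 0.663 | 0.992 | 0.658 | 0.810 | 0.991 | 0.593 | 0.927
IC | log2(2) | 0.990 | 0.619 | 0.866 | 0.992 | 0.586 | 0.922 | 0.987 | 0.512 | 0.967
IC | log2(2.5) | 0.983 | 0.583 | 0.910 | 0.988 | 0.540 | 0.944 | 0.987 | 0.466 | 0.978
IC | log2(3) | 0.975 | 0.494 | 0.952 | 0.987 | 0.446 | 0.973 | 0.984 | 0.366 | 0.985
LRT | 0.2 | 0.958 | 0.558 | 0.962 | 0.974 | 0.527 | 0.989 | 0.974 | 0.440 | 0.994
LRT | 0.1 | 0.955 | 0.535 | 0.979 | 0.967 | 0.502 | 0.994 | 0.966 | 0.414 | 0.997
LRT | 0.05 | 0.946 | 0.515 | 0.988 | 0.961 | 0.477 | 0.997 | 0.959 | 0.392 | 1.000
LRT | 0.025 | 0.938 | 0.495 | 0.995 | 0.955 | 0.456 | 0.999 | 0.959 | 0.370 | 1.000
LRT | 0.01 | 0.935 | 0.471 | 0.999 | 0.944 | 0.433 | 0.999 | 0.949 | 0.344 | 1.000
GPS | 1 | 0.926 | 0.511 | 0.998 | 0.934 | 0.493 | 0.999 | 0.968 | 0.478 | 0.999
GPS | 1.5 | 0.925 | 0.509 | 0.998 | 0.934 | 0.491 | 0.999 | 0.968 | 0.476 | 0.999
GPS | 2 | 0.925 | 0.506 | 0.999 | 0.934 | 0.482 | 0.999 | 0.967 | 0.447 | 0.999
GPS | 2.5 | 0.922 | 0.487 | 0.999 | 0.932 | 0.439 | 0.999 | 0.965 | 0.355 | 0.999
GPS | 3 | 0.918 | 0.430 | 0.999 | 0.926 | 0.353 | 1.000 | 0.947 | 0.242 | 1.000
sB | 1 | 0.940 | 0.593 | 0.845 | 0.962 | 0.552 | 0.937 | 0.961 | 0.491 | 0.987
sB | 1.5 | 0.929 | 0.505 | 0.990 | 0.948 | 0.462 | 0.997 | 0.954 | 0.388 | 0.999
sB | 2 | 0.910 | 0.426 | 0.998 | 0.938 | 0.373 | 1.000 | 0.948 | 0.301 | 1.000
sB | 2.5 | 0.880 | 0.362 | 1.000 | 0.920 | 0.298 | 1.000 | 0.934 | 0.226 | 1.000
sB | 3 | 0.841 | 0.310 | 1.000 | 0.889 | 0.234 | 1.000 | 0.902 | 0.162 | 1.000
BCPNN | log2(1) | 0.964 | 0.670 | 0.737 | 0.972 | 0.646 | 0.870 | 0.972 | 0.568 | 0.964
BCPNN | log2(1.5) | 0.947 | 0.542 | 0.982 | 0.960 | 0.509 | 0.996 | 0.958 | 0.422 | 0.999
BCPNN | log2(2) | 0.930 | 0.435 | 1.000 | 0.936 | 0.396 | 1.000 | 0.937 | 0.305 | 1.000
BCPNN | log2(2.5) | 0.893 | 0.342 | 1.000 | 0.907 | 0.299 | 1.000 | 0.914 | 0.215 | 1.000
BCPNN | log2(3) | 0.837 | 0.268 | 1.000 | 0.875 | 0.223 | 1.000 | 0.865 | 0.142 | 1.000
TreeScan | 0.2 | 0.956 | 0.569 | 0.970 | 0.960 | 0.525 | 0.981 | 0.981 | 0.464 | 0.996
TreeScan | 0.1 | 0.951 | 0.550 | 0.985 | 0.957 | 0.504 | 0.992 | 0.976 | 0.440 | 0.997
TreeScan | 0.05 | 0.942 | 0.534 | 0.990 | 0.951 | 0.485 | 0.995 | 0.968 | 0.419 | 0.999
TreeScan | 0.025 | 0.932 | 0.519 | 0.996 | 0.940 | 0.471 | 0.998 | 0.961 | 0.400 | 1.000
TreeScan | 0.01 | 0.921 | 0.495 | 0.999 | 0.929 | 0.450 | 1.000 | 0.957 | 0.376 | 1.000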
ROR, Reporting Odds Ratio; PRR, Proportional Reporting Ratio; IC, Information Component; LRT, Likelihood ratio test; GPS, Gamma Poisson Shrinker; BCPNN, Bayesian Confidence Propagation Neural Network; sB simplified Bayes; TreeScan, Tree-based Scan Statistic; * Cutoff values for the lower bound of the 95% CI for ROR, PRR, IC, BCPNN, and sB, for EB05 for GPS, and for the p-value for LRT and TreeScan.
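Table A2. Summary of performance for each method at various cutoff points when the total sample size = 1,000,000 and $rr \sim U(1.2, 10)$.
True Signal Ratio | | 0.03 | 0.03 | 0.03 | 0.05 | 0.05 | 0.05 | 0.1 | 0.1 | 0.1
Method | Cutoff * | Power | Sensitivity | PPV | Power | Sensitivity | PPV | Power | Sensitivity | PPV
ROR | 1 | 0.997 | 0.853 | 0.391 | 0.996 | 0.829 | 0.569 | 1.000 | 0.779 | 0.810
ROR | 1.5 | 0.997 | 0.800 | 0.649 | 0.996 | 0.772 | 0.755 | 1.000 | 0.711 | 0.886
ROR | 2 | 0.997 | 0.745 | 0.763 | 0.996 | 0.710 | 0.834 | 1.000 | 0.645 | 0.916
ROR | 2.5 | 0.997 | 0.691 | 0.818 | 0.996 | 0.645 | 0.871 | 1.000 | 0.575 | 0.932
ROR | 3 | 0.997 | 0.640 | 0.852 | 0.996 | 0.587 | 0.892 | 1.000 | 0.502 | 0.941
PRR | 1 | 0.997 | 0.853 | 0.391 | 0.996 | 0.829 | 0.569 | 1.000 | 0.779 | 0.810
PRR | 1.5 | 0.997 | 0.800 | 0.650 | 0.996 | 0.772 | 0.755 | 1.000 | 0.711 | 0.886
PRR | 2 | 0.997 | 0.745 | 0.763 | 0.996 | 0.709 | 0.835 | 1.000 | 0.644 | 0.916
PRR | 2.5 | 0.997 | 0.691 | 0.819 | 0.996 | 0.645 | 0.871 | 1.000 | 0.574 | 0.932
PRR | 3 | 0.997 | 0.638 | 0.852 | 0.996 | 0.585 | 0.893 | 1.000 | 0.501 | 0.941
IC | log2(1) | 0.994 | 0.837 | 0.263 | 0.992 | 0.817 | 0.451 | 1.000 | 0.777 | 0.794
IC | log2(1.5) | 0.992 | 0.766 | 0.748 | 0.990 | 0.739 | 0.859 | 1.000 | 0.686 | 0.960
IC | log2(2) | 0.989 | 0.694 | 0.912 | 0.988 | 0.663 | 0.954 | 1.000 | 0.600 | 0.983
IC | log2(2.5) | 0.986 | 0.651 | 0.943 | 0.988 | 0.622 | 0.967 | 0.998 | 0.550 | 0.987
IC | log2(3) | 0.983 | 0.556 | 0.971 | 0.987 | 0.525 | 0.983 | 0.994 | 0.438 | 0.991
LRT | 0.2 | 0.969 | 0.664 | 0.982 | 0.982 | 0.642 | 0.992 | 0.988 | 0.571 | 0.997
LRT | 0.1 | 0.966 | 0.645 | 0.992 | 0.978 | 0.623 | 0.996 | 0.986 | 0.550 | 0.999
LRT | 0.05 | 0.959 | 0.625 | 0.996 | 0.971 | 0.605 | 1.000 | 0.982 | 0.532 | 0.999
LRT | 0.025 | 0.953 | 0.607 | 0.998 | 0.971 | 0.589 | 1.000 | 0.977 | 0.513 | 0.999
LRT | 0.01 | 0.949 | 0.584 | 1.000 | 0.967 | 0.567 | 1.000 | 0.972 | 0.492 | 1.000
GPS | 1 | 0.954 | 0.641 | 0.998 | 0.967 | 0.621 | 0.999 | 0.980 | 0.590 | 0.999
GPS | 1.5 | 0.954 | 0.639 | 0.998 | 0.967 | 0.619 | 0.999 | 0.979 | 0.583 | 0.999
GPS | 2 | 0.953 | 0.625 | 0.998 | 0.967 | 0.587 | 0.999 | 0.978 | 0.524 | 1.000
GPS | 2.5 | 0.953 | 0.577 | 0.998 | 0.965 | 0.522 | 1.000 | 0.974 | 0.422 | 1.000
GPS | 3 | 0.952 | 0.501 | 0.998 | 0.964 | 0.428 | 1.000 | 0.963 | 0.314 | 1.000
sB | 1 | 0.976 | 0.756 | 0.753 | 0.983 | 0.743 | 0.892 | 0.990 | 0.683 | 0.977
sB | 1.5 | 0.958 | 0.640 | 0.991 | 0.968 | 0.617 | 0.998 | 0.976 | 0.545 | 1.000
sB | 2 | 0.941 | 0.537 | 1.000 | 0.962 | 0.515 | 1.000 | 0.970 | 0.426 | 1.000
sB | 2.5 | 0.921 | 0.450 | 1.000 | 0.949 | 0.420 | 1.000 | 0.958 | 0.324 | 1.000
sB | 3 | 0.889 | 0.372 | 1.000 | 0.938 | 0.334 | 1.000 | 0.939 | 0.237 | 1.000
BCPNN | log2(1) | 0.970 | 0.716 | 0.825 | 0.971 | 0.681 | 0.937 | 0.968 | 0.609 | 0.989
BCPNN | log2(1.5) | 0.961 | 0.635 | 0.994 | 0.966 | 0.583 | 0.997 | 0.961 | 0.508 | 1.000
BCPNN | log2(2) | 0.953 | 0.545 | 1.000 | 0.962 | 0.491 | 0.999 | 0.955 | 0.411 | 1.000
BCPNN | log2(2.5) | 0.941 | 0.472 | 1.000 | 0.955 | 0.409 | 1.000 | 0.943 | 0.326 | 1.000
BCPNN | log2(3) | 0.926 | 0.406 | 1.000 | 0.948 | 0.335 | 1.000 | 0.932 | 0.247 | 1.000
TreeScan | 0.2 | 0.972 | 0.688 | 0.980 | 0.979 | 0.651 | 0.992 | 0.985 | 0.574 | 0.999
TreeScan | 0.1 | 0.966 | 0.673 | 0.988 | 0.977 | 0.632 | 0.996 | 0.983 | 0.553 | 0.999
TreeScan | 0.05 | 0.964 | 0.658 | 0.992 | 0.974 | 0.615 | 0.997 | 0.983 | 0.533 | 1.000
TreeScan | 0.025 | 0.959 | 0.646 | 0.996 | 0.966 | 0.603 | 1.000 | 0.977 | 0.518 | 1.000
TreeScan | 0.01 | 0.955 | 0.627 | 0.997 | 0.963 | 0.581 | 1.000 | 0.968 | 0.500 | 1.000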
ROR, Reporting Odds Ratio; PRR, Proportional Reporting Ratio; IC, Information Component; LRT, Likelihood ratio test; GPS, Gamma Poisson Shrinker; BCPNN, Bayesian Confidence Propagation Neural Network; sB simplified Bayes; TreeScan, Tree-based Scan Statistic; * Cutoff values for the lower bound of the 95% CI for ROR, PRR, IC, BCPNN, and sB, for EB05 for GPS, and for the p-value for LRT and TreeScan.
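Table A3. Summary of performance for each method at various cutoff points when the total sample size = 500,000 and $rr \sim U(1.2, 4)$.
True Signal Ratio | | 0.03 | 0.03 | 0.03 | 0.05 | 0.05 | 0.05 | 0.1 | 0.1 | 0.1
Method | Cutoff * | Power | Sensitivity | PPV | Power | Sensitivity | PPV | Power | Sensitivity | PPV
ROR | 1 | 0.991 | 0.592 | 0.204 | 0.997 | 0.571 | 0.313 | 0.996 | 0.522 | 0.510
ROR | 1.5 | 0.989 | 0.471 | 0.341 | 0.997 | 0.452 | 0.465 | 0.996 | 0.403 | 0.637
ROR | 2 | 0.979 | 0.360 | 0.434 | 0.996 | 0.340 | 0.547 | 0.996 | 0.294 | 0.686
ROR | 2.5 | 0.954 | 0.260 | 0.465 | 0.989 | 0.238 | 0.576 | 0.996 | 0.198 | 0.693
ROR | 3 | 0.859 | 0.174 | 0.453 | 0.948 | 0.158 | 0.564 | 0.990 | 0.128 | 0.673
PRR | 1 | 0.991 | 0.592 | 0.204 | 0.997 | 0.571 | 0.313 | 0.996 | 0.522 | 0.510
PRR | 1.5 | 0.989 | 0.470 | 0.342 | 0.997 | 0.451 | 0.465 | 0.996 | 0.403 | 0.638
PRR | 2 | 0.978 | 0.359 | 0.434 | 0.996 | 0.339 | 0.548 | 0.996 | 0.292 | 0.686
PRR | 2.5 | 0.954 | 0.258 | 0.466 | 0.989 | 0.237 | 0.576 | 0.996 | 0.197 | 0.694
PRR | 3 | 0.852 | 0.172 | 0.450 | 0.944 | 0.156 | 0.563 | 0.989 | 0.126 | 0.673
IC | log2(1) | 0.968 | 0.578 | 0.177 | 0.979 | 0.556 | 0.281 | 0.987 | 0.511 | 0.493
IC | log2(1.5) | 0.948 | 0.430 | 0.457 | 0.966 | 0.402 | 0.592 | 0.981 | 0.358 | 0.771
IC | log2(2) | 0.914 | 0.296 | 0.698 | 0.944 | 0.274 | 0.777 | 0.965 | 0.231 | 0.866
IC | log2(2.5) | 0.880 | 0.229 | 0.777 | 0.922 | 0.210 | 0.829 | 0.956 | 0.171 | 0.897
IC | log2(3) | 0.647 | 0.106 | 0.850 | 0.769 | 0.093 | 0.881 | 0.857 | 0.068 | 0.909
LRT | 0.2 | 0.810 | 0.230 | 0.910 | 0.860 | 0.210 | 0.940 | 0.918 | 0.179 | 0.968
LRT | 0.1 | 0.763 | 0.202 | 0.951 | 0.827 | 0.183 | 0.966 | 0.889 | 0.154 | 0.982
LRT | 0.05 | 0.721 | 0.180 | 0.973 | 0.794 | 0.159 | 0.982 | 0.860 | 0.134 | 0.991
LRT | 0.025 | 0.684 | 0.161 | 0.987 | 0.769 | 0.140 | 0.991 | 0.833 | 0.117 | 0.996
LRT | 0.01 | 0.639 | 0.139 | 0.995 | 0.722 | 0.119 | 0.992 | 0.784 | 0.097 | 0.999
GPS | 1 | 0.650 | 0.163 | 0.979 | 0.745 | 0.179 | 0.984 | 0.792 | 0.185 | 0.990
GPS | 1.5 | 0.164 | 0.025 | 0.994 | 0.360 | 0.047 | 1.000 | 0.529 | 0.047 | 1.000
GPS | 2 | 0.062 | 0.009 | 1.000 | 0.150 | 0.018 | 1.000 | 0.170 | 0.011 | 1.000
GPS | 2.5 | 0.019 | 0.002 | 1.000 | 0.056 | 0.004 | 1.000 | 0.016 | 0.001 | 1.000
GPS | 3 | 0.006 | 0.001 | 1.000 | 0.005 | 0.000 | 1.000 | 0.001 | 0.000 | 1.000
sB | 1 | 0.917 | 0.421 | 0.511 | 0.942 | 0.386 | 0.652 | 0.969 | 0.351 | 0.833
sB | 1.5 | 0.790 | 0.210 | 0.947 | 0.841 | 0.186 | 0.968 | 0.894 | 0.155 | 0.979
sB | 2 | 0.533 | 0.090 | 0.999 | 0.607 | 0.071 | 0.997 | 0.709 | 0.055 | 0.999
sB | 2.5 | 0.210 | 0.025 | 1.000 | 0.276 | 0.021 | 1.000 | 0.300 | 0.012 | 1.000
sB | 3 | 0.051 | 0.005 | 1.000 | 0.056 | 0.003 | 1.000 | 0.048 | 0.002 | 1.000
BCPNN | log2(1) | 0.807 | 0.279 | 0.860 | 0.841 | 0.260 | 0.924 | 0.851 | 0.222 | 0.980
BCPNN | log2(1.5) | 0.655 | 0.140 | 0.991 | 0.716 | 0.127 | 0.995 | 0.747 | 0.100 | 0.997
BCPNN | log2(2) | 0.384 | 0.057 | 0.999 | 0.502 | 0.051 | 0.999 | 0.548 | 0.033 | 1.000
BCPNN | log2(2.5) | 0.150 | 0.017 | 1.000 | 0.208 | 0.015 | 1.000 | 0.201 | 0.007 | 1.000
BCPNN | log2(3) | 0.044 | 0.004 | 1.000 | 0.049 | 0.003 | 1.000 | 0.029 | 0.001 | 1.000
TreeScan | 0.2 | 0.815 | 0.230 | 0.918 | 0.855 | 0.214 | 0.945 | 0.891 | 0.181 | 0.965
TreeScan | 0.1 | 0.776 | 0.202 | 0.948 | 0.833 | 0.187 | 0.966 | 0.852 | 0.156 | 0.982
TreeScan | 0.05 | 0.736 | 0.177 | 0.971 | 0.792 | 0.162 | 0.978 | 0.820 | 0.135 | 0.989
TreeScan | 0.025 | 0.691 | 0.157 | 0.983 | 0.749 | 0.143 | 0.988 | 0.784 | 0.118 | 0.990
TreeScan | 0.01 | 0.632 | 0.133 | 0.992 | 0.693 | 0.120 | 0.991 | 0.748 | 0.098 | 0.994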
ROR, Reporting Odds Ratio; PRR, Proportional Reporting Ratio; IC, Information Component; LRT, Likelihood ratio test; GPS, Gamma Poisson Shrinker; BCPNN, Bayesian Confidence Propagation Neural Network; sB simplified Bayes; TreeScan, Tree-based Scan Statistic; * Cutoff values for the lower bound of the 95% CI for ROR, PRR, IC, BCPNN, and sB, for EB05 for GPS, and for the p-value for LRT and TreeScan.
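Table A4. Summary of performance for each method at various cutoff points when the total sample size = 1,000,000 and $rr \sim U(1.2, 4)$.
True Signal Ratio | | 0.03 | 0.03 | 0.03 | 0.05 | 0.05 | 0.05 | 0.1 | 0.1 | 0.1
Method | Cutoff * | Power | Sensitivity | PPV | Power | Sensitivity | PPV | Power | Sensitivity | PPV
ROR | 1 | 0.993 | 0.677 | 0.266 | 0.996 | 0.662 | 0.401 | 0.999 | 0.629 | 0.633
ROR | 1.5 | 0.993 | 0.541 | 0.512 | 0.996 | 0.520 | 0.621 | 0.999 | 0.477 | 0.774
ROR | 2 | 0.985 | 0.406 | 0.607 | 0.996 | 0.382 | 0.692 | 0.999 | 0.337 | 0.818
ROR | 2.5 | 0.960 | 0.279 | 0.633 | 0.991 | 0.257 | 0.715 | 0.999 | 0.212 | 0.821
ROR | 3 | 0.860 | 0.175 | 0.611 | 0.950 | 0.153 | 0.694 | 0.979 | 0.120 | 0.795
PRR | 1 | 0.993 | 0.677 | 0.266 | 0.996 | 0.662 | 0.401 | 0.999 | 0.629 | 0.633
PRR | 1.5 | 0.993 | 0.541 | 0.513 | 0.996 | 0.520 | 0.621 | 0.999 | 0.476 | 0.774
PRR | 2 | 0.985 | 0.403 | 0.608 | 0.996 | 0.380 | 0.692 | 0.999 | 0.336 | 0.818
PRR | 2.5 | 0.959 | 0.277 | 0.633 | 0.991 | 0.255 | 0.715 | 0.999 | 0.210 | 0.822
PRR | 3 | 0.855 | 0.172 | 0.610 | 0.945 | 0.149 | 0.694 | 0.978 | 0.117 | 0.793
IC | log2(1) | 0.986 | 0.705 | 0.164 | 0.989 | 0.687 | 0.272 | 0.996 | 0.660 | 0.508
IC | log2(1.5) | 0.975 | 0.537 | 0.585 | 0.981 | 0.517 | 0.704 | 0.994 | 0.479 | 0.856
IC | log2(2) | 0.953 | 0.383 | 0.818 | 0.970 | 0.359 | 0.870 | 0.992 | 0.318 | 0.934
IC | log2(2.5) | 0.932 | 0.300 | 0.869 | 0.957 | 0.277 | 0.906 | 0.986 | 0.234 | 0.947
IC | log2(3) | 0.760 | 0.140 | 0.922 | 0.855 | 0.116 | 0.942 | 0.924 | 0.086 | 0.960
LRT | 0.2 | 0.899 | 0.382 | 0.941 | 0.925 | 0.347 | 0.964 | 0.952 | 0.312 | 0.988
LRT | 0.1 | 0.885 | 0.357 | 0.964 | 0.907 | 0.316 | 0.985 | 0.941 | 0.284 | 0.995
LRT | 0.05 | 0.871 | 0.333 | 0.979 | 0.897 | 0.291 | 0.994 | 0.935 | 0.259 | 0.996
LRT | 0.025 | 0.855 | 0.310 | 0.992 | 0.885 | 0.270 | 0.998 | 0.918 | 0.236 | 0.999
LRT | 0.01 | 0.837 | 0.285 | 0.997 | 0.859 | 0.244 | 0.999 | 0.895 | 0.211 | 1.000
GPS | 1 | 0.838 | 0.298 | 0.995 | 0.866 | 0.292 | 0.995 | 0.902 | 0.330 | 0.991
GPS | 1.5 | 0.813 | 0.262 | 0.999 | 0.841 | 0.253 | 0.999 | 0.811 | 0.177 | 0.999
GPS | 2 | 0.768 | 0.191 | 1.000 | 0.771 | 0.156 | 1.000 | 0.583 | 0.064 | 1.000
GPS | 2.5 | 0.362 | 0.049 | 1.000 | 0.372 | 0.032 | 1.000 | 0.136 | 0.006 | 1.000
GPS | 3 | 0.033 | 0.003 | 1.000 | 0.038 | 0.002 | 1.000 | 0.004 | 0.000 | 1.000
sB | 1 | 0.961 | 0.562 | 0.561 | 0.963 | 0.525 | 0.707 | 0.967 | 0.491 | 0.872
sB | 1.5 | 0.892 | 0.338 | 0.967 | 0.917 | 0.300 | 0.983 | 0.944 | 0.261 | 0.996
sB | 2 | 0.764 | 0.179 | 1.000 | 0.797 | 0.138 | 1.000 | 0.855 | 0.106 | 0.999
sB | 2.5 | 0.449 | 0.063 | 1.000 | 0.509 | 0.046 | 0.998 | 0.544 | 0.027 | 1.000
sB | 3 | 0.124 | 0.013 | 1.000 | 0.132 | 0.009 | 1.000 | 0.121 | 0.004 | 1.000
BCPNN | log2(1) | 0.915 | 0.441 | 0.860 | 0.929 | 0.417 | 0.921 | 0.927 | 0.383 | 0.979
BCPNN | log2(1.5) | 0.843 | 0.265 | 0.996 | 0.863 | 0.239 | 0.998 | 0.885 | 0.205 | 1.000
BCPNN | log2(2) | 0.655 | 0.131 | 1.000 | 0.719 | 0.112 | 1.000 | 0.788 | 0.083 | 1.000
BCPNN | log2(2.5) | 0.355 | 0.048 | 1.000 | 0.426 | 0.037 | 1.000 | 0.460 | 0.021 | 1.000
BCPNN | log2(3) | 0.094 | 0.010 | 1.000 | 0.114 | 0.007 | 1.000 | 0.086 | 0.003 | 1.000
TreeScan | 0.2 | 0.913 | 0.378 | 0.956 | 0.932 | 0.346 | 0.967 | 0.955 | 0.313 | 0.986
TreeScan | 0.1 | 0.899 | 0.348 | 0.977 | 0.913 | 0.319 | 0.988 | 0.935 | 0.285 | 0.992
TreeScan | 0.05 | 0.879 | 0.323 | 0.988 | 0.896 | 0.297 | 0.991 | 0.910 | 0.261 | 0.995
TreeScan | 0.025 | 0.854 | 0.300 | 0.993 | 0.877 | 0.273 | 0.996 | 0.905 | 0.241 | 0.997
TreeScan | 0.01 | 0.834 | 0.273 | 0.997 | 0.846 | 0.245 | 0.999 | 0.885 | 0.215 | 0.997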
ROR, Reporting Odds Ratio; PRR, Proportional Reporting Ratio; IC, Information Component; LRT, Likelihood ratio test; GPS, Gamma Poisson Shrinker; BCPNN, Bayesian Confidence Propagation Neural Network; sB simplified Bayes; TreeScan, Tree-based Scan Statistic; * Cutoff values for the lower bound of the 95% CI for ROR, PRR, IC, BCPNN, and sB, for EB05 for GPS, and for the p-value for LRT and TreeScan.

References

  1. Korea Institution of Drug Safety & Risk Management. Guideline for KIDS-Korea Adverse Event Reporting System Database; Korea Institution of Drug Safety & Risk Management: Seoul, Korea, 2017.
  2. Rothman, K.J.; Lanes, S.; Sacks, S.T. The reporting odds ratio and its advantages over the proportional reporting ratio. Pharmacoepidemiol. Drug Saf. 2004, 13, 519–523.
  3. Evans, S.J.; Waller, P.C.; Davis, S. Use of proportional reporting ratios (PRRs) for signal generation from spontaneous adverse drug reaction reports. Pharmacoepidemiol. Drug Saf. 2001, 10, 483–486.
  4. Huang, L.; Zalkikar, J.; Tiwari, R.C. A likelihood ratio test based method for signal detection with application to FDA’s drug safety data. J. Am. Stat. Assoc. 2011, 106, 1230–1241.
  5. DuMouchel, W. Bayesian data mining in large frequency tables, with an application to the FDA spontaneous reporting system. Am. Stat. 1999, 53, 177–190.
  6. Bate, A.; Lindquist, M.; Edwards, I.R.; Olsson, S.; Orre, R.; Lansner, A.; De Freitas, R.M. A Bayesian neural network method for adverse drug reaction signal generation. Eur. J. Clin. Pharmacol. 1998, 54, 315–321.
  7. Norén, G.N.; Bate, A.; Orre, R.; Edwards, I.R. Extending the methods used to screen the WHO drug safety database towards analysis of complex associations and improved accuracy for rare events. Stat. Med. 2006, 25, 3740–3757.
  8. Norén, G.N.; Edwards, I.R. Opportunities and challenges of adverse drug reaction surveillance in electronic patient records. Pharmacovigil. Rev. 2010, 4, 17–20.
  9. Huang, L.; Zalkikar, J.; Tiwari, R.C. Likelihood ratio test-based method for signal detection in drug classes using FDA’s AERS database. J. Biopharm. Stat. 2013, 23, 178–200.
  10. Huang, L.; Guo, T.; Zalkikar, J.N.; Tiwari, R.C. A review of statistical methods for safety surveillance. Ther. Innov. Regul. Sci. 2014, 48, 98–108.
  11. Hu, N.; Huang, L.; Tiwari, R.C. Signal detection in FDA AERS database using Dirichlet process. Stat. Med. 2015, 34, 2725–2742.
  12. Candore, G.; Juhlin, K.; Manlik, K.; Thakrar, B.; Quarcoo, N.; Seabroke, S.; Wisniewski, A.; Slattery, J. Comparison of statistical signal detection methods within and across spontaneous reporting databases. Drug Saf. 2015, 38, 577–587.
  13. Kulldorff, M.; Fang, Z.; Walsh, S.J. A tree-based scan statistic for database disease surveillance. Biometrics 2003, 59, 323–331.
  14. Kulldorff, M.; Dashevsky, I.; Avery, T.R.; Chan, A.K.; Davis, R.L.; Graham, D.; Platt, R.; Andrade, S.E.; Boudreau, D.; Gunter, M.; et al. Drug safety data mining with a tree-based scan statistic. Pharmacoepidemiol. Drug Saf. 2013, 22, 517–523.
  15. Brown, J.S.; Petronis, K.R.; Bate, A.; Zhang, F.; Dashevsky, I.; Kulldorff, M.; Avery, T.R.; Davis, R.L.; Chan, K.A.; Andrade, S.E.; et al. Drug adverse event detection in health plan data using the gamma Poisson shrinker and comparison to the tree-based scan statistic. Pharmaceutics 2013, 5, 179–200.
  16. The Uppsala Monitoring Centre: The WHO Adverse Reaction Terminology—WHO-ART, Terminology for Coding Clinical Information in Relation to Drug Therapy. 2015. Available online: https://www.who-umc.org/vigibase/services/learn-more-about-who-art/ (accessed on 5 August 2020).
  17. Lee, M.Y.; Choi, D.S.; Lee, M.K.; Lee, H.W.; Park, T.S.; Kim, D.M.; Chung, C.H.; Kim, D.K.; Kim, I.J.; Jang, H.C.; et al. Comparison of acarbose and voglibose in diabetes patients who are inadequately controlled with basal insulin treatment: Randomized, parallel, open-label, active-controlled study. J. Korean Med. Sci. 2014, 29, 90–97.
  18. Vichayanrat, A.; Ploybutr, S.; Tunlakit, M.; Watanakejorn, P. Efficacy and safety of voglibose in comparison with acarbose in type 2 diabetic patients. Diabetes Res. Clin. Pract. 2002, 55, 99–103.
  19. Martin, A.E.; Montgomery, P.A. Acarbose: An alpha-glucosidase inhibitor. Am. J. Health-Syst. Pharm. 1996, 53, 2277–2290.
  20. Dabhi, A.S.; Bhatt, N.R.; Shah, M.J. Voglibose: An alpha glucosidase inhibitor. J. Clin. Diagn. Res. 2013, 7, 3023–3027.
  21. Wald, A. Sequential tests of statistical hypotheses. Ann. Math. Stat. 1945, 16, 117–186.
  22. Wald, A. Sequential Analysis; John Wiley & Sons, Inc.: New York, NY, USA, 1947.
  23. Chan, C.L.; Rudrappa, S.; San Ang, P.; Li, S.C.; Evans, S.J. Detecting signals of disproportionate reporting from Singapore’s spontaneous adverse event reporting system: An application of the sequential probability ratio test. Drug Saf. 2017, 40, 703–713.
  24. Chan, C.L.; Soh, S.; Tan, S.H.; Ang, P.S.; Rudrappa, S.; Li, S.C.; Evans, S.J. Quantitative data mining in signal detection: The Singapore experience. Exp. Opin. Drug Saf. 2020, 19, 1–7.
  25. Kulldorff, M.; Davis, R.L.; Kolczak, M.; Lewis, E.; Lieu, T.; Platt, R. A maximized sequential probability ratio test for drug and vaccine safety surveillance. Seq. Anal. 2011, 30, 58–78.
  26. Huang, L.; Zheng, D.; Zalkikar, J.; Tiwari, R. Zero-inflated Poisson model based likelihood ratio test for drug safety signal detection. Stat. Methods Med. Res. 2017, 26, 471–488.
Table 1. Adverse events count for the $i$th adverse event and the $j$th drug.
AE | $j$th drug | All other drugs | Total
$i$th adverse event | $n_{ij}$ | $n_{i.} - n_{ij}$ | $n_{i.}$
All other adverse events | $n_{.j} - n_{ij}$ | $n_{..} - n_{i.} - n_{.j} + n_{ij}$ | $n_{..} - n_{i.}$
Total | $n_{.j}$ | $n_{..} - n_{.j}$ | $n_{..}$
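Table 2. Comparison of type I error rates at various cutoff points when $rr = 1$.
Total Sample Size | | 300,000 | 500,000 | 1,000,000
Method | Cutoff * | Type I Error | Type I Error | Type I Error
ROR | 1 | 1.000 | 1.000 | 0.999
ROR | 1.5 | 1.000 | 0.999 | 0.991
ROR | 2 | 0.999 | 0.979 | 0.914
ROR | 2.5 | 0.978 | 0.931 | 0.790
ROR | 3 | 0.939 | 0.861 | 0.679
PRR | 1 | 1.000 | 1.000 | 0.999
PRR | 1.5 | 1.000 | 0.998 | 0.991
PRR | 2 | 0.999 | 0.978 | 0.910
PRR | 2.5 | 0.974 | 0.929 | 0.786
PRR | 3 | 0.934 | 0.860 | 0.676
IC | log2(1) | 0.995 | 0.998 | 1.000
IC | log2(1.5) | 0.992 | 0.994 | 0.993
IC | log2(2) | 0.828 | 0.717 | 0.546
IC | log2(2.5) | 0.607 | 0.499 | 0.335
IC | log2(3) | 0.284 | 0.212 | 0.121
LRT | 0.2 | 0.241 | 0.215 | 0.207
LRT | 0.1 | 0.124 | 0.107 | 0.117
LRT | 0.05 | 0.068 | 0.063 | 0.053
LRT | 0.025 | 0.044 | 0.039 | 0.029
LRT | 0.01 | 0.031 | 0.012 | 0.020
GPS | 1 | 0.567 | 0.615 | 0.656
GPS | 1.5 | 0.009 | 0.010 | 0.005
GPS | 2 | 0.000 | 0.000 | 0.000
GPS | 2.5 | 0.000 | 0.000 | 0.000
GPS | 3 | 0.000 | 0.000 | 0.000
BCPNN | log2(1) | 0.371 | 0.949 | 0.959
BCPNN | log2(1.5) | 0.024 | 0.113 | 0.078
BCPNN | log2(2) | 0.000 | 0.003 | 0.003
BCPNN | log2(2.5) | 0.000 | 0.000 | 0.001
BCPNN | log2(3) | 0.000 | 0.000 | 0.000
sB | 1 | 0.741 | 0.842 | 0.917
sB | 1.5 | 0.088 | 0.090 | 0.079
sB | 2 | 0.009 | 0.005 | 0.004
sB | 2.5 | 0.000 | 0.000 | 0.000
sB | 3 | 0.000 | 0.000 | 0.000
TreeScan | 0.2 | 0.194 | 0.240 | 0.219
TreeScan | 0.1 | 0.103 | 0.124 | 0.097
TreeScan | 0.05 | 0.052 | 0.050 | 0.047
TreeScan | 0.025 | 0.025 | 0.029 | 0.029
TreeScan | 0.01 | 0.008 | 0.010 | 0.009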
ROR, Reporting Odds Ratio; PRR, Proportional Reporting Ratio; IC, Information Component; LRT, Likelihood ratio test; GPS, Gamma Poisson Shrinker; BCPNN, Bayesian Confidence Propagation Neural Network; sB simplified Bayes; TreeScan, Tree-based Scan Statistic; * Cutoff values for the lower bound of the 95% CI for ROR, PRR, IC, BCPNN, and sB, for EB05 for GPS, and for the p-value for LRT and TreeScan.
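Table 3. Summary of performance for each method at various cutoff points when the total sample size = 300,000 and $rr \sim U(1.2, 10)$.
True Signal Ratio | | 0.03 | 0.03 | 0.03 | 0.05 | 0.05 | 0.05 | 0.1 | 0.1 | 0.1
Method | Cutoff * | Power | Sensitivity | PPV | Power | Sensitivity | PPV | Power | Sensitivity | PPV
ROR | 1 | 0.996 | 0.753 | 0.251 | 0.997 | 0.728 | 0.394 | 0.999 | 0.680 | 0.649
ROR | 1.5 | 0.996 | 0.704 | 0.388 | 0.997 | 0.678 | 0.534 | 0.999 | 0.621 | 0.744
ROR | 2 | 0.996 | 0.655 | 0.503 | 0.997 | 0.624 | 0.628 | 0.999 | 0.561 | 0.800
ROR | 2.5 | 0.996 | 0.608 | 0.587 | 0.997 | 0.569 | 0.692 | 0.999 | 0.498 | 0.833
ROR | 3 | 0.996 | 0.559 | 0.646 | 0.997 | 0.519 | 0.737 | 0.999 | 0.438 | 0.854
PRR | 1 | 0.996 | 0.753 | 0.251 | 0.997 | 0.728 | 0.394 | 0.999 | 0.680 | 0.649
PRR | 1.5 | 0.996 | 0.704 | 0.389 | 0.997 | 0.678 | 0.535 | 0.999 | 0.621 | 0.744
PRR | 2 | 0.996 | 0.654 | 0.504 | 0.997 | 0.623 | 0.629 | 0.999 | 0.560 | 0.800
PRR | 2.5 | 0.996 | 0.607 | 0.588 | 0.997 | 0.568 | 0.693 | 0.999 | 0.497 | 0.833
PRR | 3 | 0.996 | 0.557 | 0.647 | 0.997 | 0.516 | 0.738 | 0.999 | 0.436 | 0.855
IC | log2(1) | 0.991 | 0.687 | 0.309 | 0.995 | 0.660 | 0.479 | 0.995 | 0.613 | 0.748
IC | log2(1.5) | 0.986 | 0.612 | 0.614 | 0.991 | 0.579 | 0.757 | 0.992 | 0.526 | 0.904
IC | log2(2) | 0.980 | 0.541 | 0.825 | 0.984 | 0.507 | 0.881 | 0.990 | 0.448 | 0.951
IC | log2(2.5) | 0.976 | 0.500 | 0.877 | 0.982 | 0.467 | 0.917 | 0.989 | 0.405 | 0.963
IC | log2(3) | 0.963 | 0.413 | 0.938 | 0.977 | 0.375 | 0.956 | 0.986 | 0.311 | 0.978
LRT | 0.2 | 0.939 | 0.462 | 0.962 | 0.956 | 0.417 | 0.983 | 0.973 | 0.338 | 0.990
LRT | 0.1 | 0.929 | 0.432 | 0.981 | 0.945 | 0.388 | 0.990 | 0.969 | 0.312 | 0.995
LRT | 0.05 | 0.915 | 0.409 | 0.990 | 0.930 | 0.365 | 0.992 | 0.961 | 0.289 | 0.996
LRT | 0.025 | 0.901 | 0.387 | 0.994 | 0.922 | 0.341 | 0.995 | 0.947 | 0.267 | 1.000
LRT | 0.01 | 0.881 | 0.359 | 0.997 | 0.908 | 0.314 | 0.999 | 0.932 | 0.240 | 1.000
GPS | 1 | 0.891 | 0.417 | 0.997 | 0.926 | 0.395 | 0.998 | 0.951 | 0.378 | 0.997
GPS | 1.5 | 0.888 | 0.415 | 0.998 | 0.925 | 0.393 | 0.998 | 0.951 | 0.378 | 0.997
GPS | 2 | 0.888 | 0.414 | 0.998 | 0.925 | 0.391 | 0.998 | 0.950 | 0.369 | 0.998
GPS | 2.5 | 0.888 | 0.406 | 0.998 | 0.924 | 0.373 | 0.998 | 0.945 | 0.312 | 1.000
GPS | 3 | 0.886 | 0.369 | 0.999 | 0.911 | 0.315 | 0.999 | 0.913 | 0.204 | 1.000
BCPNN | log2(1) | 0.948 | 0.578 | 0.731 | 0.951 | 0.539 | 0.857 | 0.972 | 0.474 | 0.957
BCPNN | log2(1.5) | 0.914 | 0.443 | 0.984 | 0.924 | 0.398 | 0.992 | 0.950 | 0.323 | 0.998
BCPNN | log2(2) | 0.867 | 0.335 | 0.998 | 0.893 | 0.291 | 0.999 | 0.911 | 0.216 | 1.000
BCPNN | log2(2.5) | 0.808 | 0.246 | 1.000 | 0.831 | 0.204 | 1.000 | 0.837 | 0.139 | 1.000
BCPNN | log2(3) | 0.719 | 0.174 | 1.000 | 0.753 | 0.138 | 1.000 | 0.754 | 0.081 | 1.000
sB | 1 | 0.934 | 0.488 | 0.866 | 0.942 | 0.448 | 0.939 | 0.939 | 0.387 | 0.988
sB | 1.5 | 0.921 | 0.414 | 0.992 | 0.932 | 0.363 | 0.996 | 0.919 | 0.293 | 0.999
sB | 2 | 0.900 | 0.347 | 1.000 | 0.908 | 0.292 | 1.000 | 0.894 | 0.217 | 1.000
sB | 2.5 | 0.864 | 0.288 | 1.000 | 0.875 | 0.231 | 1.000 | 0.856 | 0.156 | 1.000
sB | 3 | 0.807 | 0.241 | 1.000 | 0.826 | 0.180 | 1.000 | 0.797 | 0.110 | 1.000
TreeScan | 0.2 | 0.942 | 0.477 | 0.964 | 0.954 | 0.444 | 0.981 | 0.975 | 0.369 | 0.992
TreeScan | 0.1 | 0.930 | 0.457 | 0.983 | 0.949 | 0.417 | 0.991 | 0.968 | 0.343 | 0.996
TreeScan | 0.05 | 0.917 | 0.437 | 0.990 | 0.942 | 0.393 | 0.997 | 0.955 | 0.322 | 0.998
TreeScan | 0.025 | 0.904 | 0.422 | 0.994 | 0.935 | 0.373 | 0.999 | 0.948 | 0.301 | 1.000
TreeScan | 0.01 | 0.887 | 0.400 | 0.997 | 0.914 | 0.352 | 0.999 | 0.924 | 0.280 | 1.000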
ROR, Reporting Odds Ratio; PRR, Proportional Reporting Ratio; IC, Information Component; LRT, Likelihood ratio test; GPS, Gamma Poisson Shrinker; BCPNN, Bayesian Confidence Propagation Neural Network; sB simplified Bayes; TreeScan, Tree-based Scan Statistic; * Cutoff values for the lower bound of the 95% CI for ROR, PRR, IC, BCPNN, and sB, for EB05 for GPS, and for the p-value for LRT and TreeScan.
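Table 4. Summary of performance for each method at various cutoff points when the total sample size = 300,000 and $rr \sim U(1.2, 4)$.
True Signal Ratio | | 0.03 | 0.03 | 0.03 | 0.05 | 0.05 | 0.05 | 0.1 | 0.1 | 0.1
Method | Cutoff * | Power | Sensitivity | PPV | Power | Sensitivity | PPV | Power | Sensitivity | PPV
ROR | 1 | 0.987 | 0.518 | 0.164 | 0.997 | 0.510 | 0.262 | 0.995 | 0.464 | 0.437
ROR | 1.5 | 0.984 | 0.419 | 0.250 | 0.997 | 0.408 | 0.364 | 0.995 | 0.362 | 0.531
ROR | 2 | 0.971 | 0.323 | 0.316 | 0.996 | 0.313 | 0.437 | 0.995 | 0.268 | 0.581
ROR | 2.5 | 0.934 | 0.243 | 0.348 | 0.984 | 0.231 | 0.468 | 0.994 | 0.190 | 0.597
ROR | 3 | 0.864 | 0.177 | 0.353 | 0.943 | 0.163 | 0.467 | 0.984 | 0.133 | 0.591
PRR | 1 | 0.987 | 0.518 | 0.164 | 0.997 | 0.510 | 0.262 | 0.995 | 0.464 | 0.437
PRR | 1.5 | 0.984 | 0.419 | 0.250 | 0.997 | 0.408 | 0.364 | 0.995 | 0.361 | 0.532
PRR | 2 | 0.971 | 0.322 | 0.317 | 0.996 | 0.312 | 0.437 | 0.995 | 0.267 | 0.582
PRR | 2.5 | 0.933 | 0.241 | 0.349 | 0.984 | 0.229 | 0.469 | 0.994 | 0.189 | 0.597
PRR | 3 | 0.861 | 0.175 | 0.354 | 0.940 | 0.162 | 0.467 | 0.983 | 0.132 | 0.591
IC | log2(1) | 0.944 | 0.472 | 0.182 | 0.984 | 0.469 | 0.290 | 0.984 | 0.413 | 0.486
IC | log2(1.5) | 0.901 | 0.334 | 0.378 | 0.969 | 0.331 | 0.527 | 0.976 | 0.284 | 0.702
IC | log2(2) | 0.835 | 0.222 | 0.582 | 0.942 | 0.217 | 0.696 | 0.965 | 0.179 | 0.811
IC | log2(2.5) | 0.782 | 0.175 | 0.662 | 0.905 | 0.165 | 0.759 | 0.946 | 0.129 | 0.850
IC | log2(3) | 0.569 | 0.087 | 0.744 | 0.716 | 0.077 | 0.820 | 0.828 | 0.056 | 0.882
LRT | 0.2 | 0.673 | 0.149 | 0.867 | 0.785 | 0.136 | 0.917 | 0.859 | 0.111 | 0.955
LRT | 0.1 | 0.602 | 0.122 | 0.911 | 0.725 | 0.111 | 0.948 | 0.815 | 0.089 | 0.981
LRT | 0.05 | 0.554 | 0.104 | 0.939 | 0.670 | 0.091 | 0.975 | 0.748 | 0.071 | 0.988
LRT | 0.025 | 0.509 | 0.088 | 0.966 | 0.609 | 0.077 | 0.984 | 0.679 | 0.057 | 0.992
LRT | 0.01 | 0.444 | 0.071 | 0.977 | 0.534 | 0.060 | 0.987 | 0.612 | 0.044 | 0.999
GPS | 1 | 0.430 | 0.079 | 0.983 | 0.611 | 0.095 | 0.989 | 0.704 | 0.090 | 0.992
GPS | 1.5 | 0.051 | 0.008 | 1.000 | 0.250 | 0.026 | 0.996 | 0.561 | 0.047 | 0.999
GPS | 2 | 0.028 | 0.005 | 1.000 | 0.165 | 0.017 | 1.000 | 0.465 | 0.034 | 0.999
GPS | 2.5 | 0.015 | 0.002 | 1.000 | 0.065 | 0.005 | 1.000 | 0.057 | 0.002 | 1.000
GPS | 3 | 0.003 | 0.000 | 1.000 | 0.008 | 0.001 | 1.000 | 0.003 | 0.000 | 1.000
BCPNN | log2(1) | 0.864 | 0.306 | 0.477 | 0.914 | 0.292 | 0.624 | 0.943 | 0.259 | 0.802
BCPNN | log2(1.5) | 0.640 | 0.135 | 0.915 | 0.745 | 0.121 | 0.937 | 0.824 | 0.098 | 0.976
BCPNN | log2(2) | 0.339 | 0.048 | 0.984 | 0.426 | 0.040 | 0.996 | 0.507 | 0.028 | 1.000
BCPNN | log2(2.5) | 0.113 | 0.013 | 1.000 | 0.140 | 0.010 | 1.000 | 0.166 | 0.006 | 1.000
BCPNN | log2(3) | 0.022 | 0.003 | 1.000 | 0.026 | 0.002 | 1.000 | 0.025 | 0.001 | 1.000
sB | 1 | 0.669 | 0.171 | 0.860 | 0.743 | 0.161 | 0.927 | 0.778 | 0.133 | 0.969
sB | 1.5 | 0.472 | 0.080 | 0.991 | 0.569 | 0.072 | 0.995 | 0.628 | 0.052 | 0.997
sB | 2 | 0.247 | 0.031 | 0.996 | 0.294 | 0.024 | 1.000 | 0.345 | 0.015 | 1.000
sB | 2.5 | 0.084 | 0.009 | 1.000 | 0.094 | 0.006 | 1.000 | 0.098 | 0.003 | 1.000
sB | 3 | 0.016 | 0.002 | 1.000 | 0.021 | 0.001 | 1.000 | 0.018 | 0.001 | 1.000
TreeScan | 0.2 | 0.671 | 0.148 | 0.866 | 0.772 | 0.137 | 0.919 | 0.843 | 0.110 | 0.945
TreeScan | 0.1 | 0.619 | 0.125 | 0.930 | 0.726 | 0.114 | 0.965 | 0.792 | 0.088 | 0.964
TreeScan | 0.05 | 0.573 | 0.107 | 0.961 | 0.671 | 0.096 | 0.982 | 0.729 | 0.073 | 0.983
TreeScan | 0.025 | 0.520 | 0.092 | 0.970 | 0.614 | 0.082 | 0.990 | 0.685 | 0.061 | 0.991
TreeScan | 0.01 | 0.452 | 0.074 | 0.985 | 0.550 | 0.065 | 0.993 | 0.609 | 0.048 | 0.995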
ROR, Reporting Odds Ratio; PRR, Proportional Reporting Ratio; IC, Information Component; LRT, Likelihood ratio test; GPS, Gamma Poisson Shrinker; BCPNN, Bayesian Confidence Propagation Neural Network; sB simplified Bayes; TreeScan, Tree-based Scan Statistic; * Cutoff values for the lower bound of the 95% CI for ROR, PRR, IC, BCPNN, and sB, for EB05 for GPS, and for the p-value for LRT and TreeScan.
Table 5. Causality assessment criteria.
Criterion | Level
The context of administration and use of medicines is reasonable. | Certain, Probable, Possible
It is not described as another medication, chemical, or accompanying illness. | Certain, Probable
In case of administration interruption, there is a clinically reasonable response. | Certain, Probable
In case of readministration, there is a pharmacologically conclusive response. | Certain
It could be described as another medication, chemical, or accompanying illness. | Possible, Unlikely
It is a temporary condition, not related to the administration and use of medicines. | Unlikely
It requires more information to assess or it is under examination. | Unclassified
It is not assessable and cannot be supplemented. | Unassessable
Table 6. Subset of WHO-ART’s hierarchical structure of adverse events.
| Code | Level | Adverse Event |
| --- | --- | --- |
| 100 | SOC | Skin and appendages disorders |
| 100.0001.001 | PT | ACNE |
| 100.0001.003 | IT | ACNEIFORM DERMATITIS |
| 100.0001.004 | IT | RASH ACNEIFORM |
| 100.0001.005 | IT | ACNE CYSTIC |
| 100.0001.006 | IT | ACNE PUSTULAR |
| 100.0001.007 | IT | ACNE AGGRAVATED |
| 100.0001.008 | IT | ACNE CONGLOBATA |
| 100.0002.001 | PT | ALOPECIA |
| 100.0002.003 | IT | HAIR THINNING |
| 100.0002.004 | IT | ALOPECIA AREATA |
| 100.0002.005 | IT | ATRICHIA |
| 100.0002.006 | IT | BALDNESS |
| 100.0002.007 | IT | HAIR LOSS |
| 100.0002.008 | IT | ATRICHOSIS |
| 100.0002.009 | IT | LOSS OF EYELASHES |
| 100.0002.010 | IT | ALOPECIA TOTALIS |
| 100.0002.011 | IT | ALOPECIA SCARRING |
| 100.0002.012 | IT | ALOPECIA UNIVERSALIS |
| 100.0002.013 | IT | DEFLUVIUM |
| 100.0002.014 | IT | LOSS OF EYEBROWS |
| 100.0002.015 | IT | AGGRAVATED HAIR LOSS |
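The nesting implied by the codes in Table 6 can be illustrated with a minimal Python sketch. It assumes, from the pattern of the listed codes only, that the first code segment names the SOC and that the first two segments together identify a PT and its included terms (IT); the function name `build_tree` and the choice of rows are illustrative and are not taken from WHO-ART or from the study's software.

```python
from collections import defaultdict

# A few rows copied from Table 6: (code, level, term).
ROWS = [
    ("100", "SOC", "Skin and appendages disorders"),
    ("100.0001.001", "PT", "ACNE"),
    ("100.0001.003", "IT", "ACNEIFORM DERMATITIS"),
    ("100.0001.004", "IT", "RASH ACNEIFORM"),
    ("100.0002.001", "PT", "ALOPECIA"),
    ("100.0002.003", "IT", "HAIR THINNING"),
    ("100.0002.004", "IT", "ALOPECIA AREATA"),
]


def build_tree(rows):
    """Nest terms as {SOC term: {PT term: [IT terms]}}.

    Assumption (inferred from the code pattern, not stated in the source):
    segment 1 is the SOC; segments 1-2 together identify the PT group.
    """
    soc_names = {}                 # "100" -> SOC term
    pt_names = {}                  # ("100", "0001") -> PT term
    included = defaultdict(list)   # ("100", "0001") -> list of IT terms
    for code, level, term in rows:
        parts = code.split(".")
        if level == "SOC":
            soc_names[parts[0]] = term
        elif level == "PT":
            pt_names[tuple(parts[:2])] = term
        elif level == "IT":
            included[tuple(parts[:2])].append(term)
        else:
            raise ValueError(f"unknown level: {level}")
    tree = {name: {} for name in soc_names.values()}
    for key, pt_term in pt_names.items():
        tree[soc_names[key[0]]][pt_term] = included[key]
    return tree


tree = build_tree(ROWS)
for soc, pts in tree.items():
    print(soc)
    for pt, its in pts.items():
        print("  ", pt, "->", ", ".join(its))
```

A tree of this shape lets reports coded at the IT level be counted at the PT or SOC level, which is the kind of aggregation a hierarchical analysis relies on.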
Table 7. Signal detection criterion for each method.
| Method | Detection Criterion |
| --- | --- |
| ROR, PRR | 95% CI lower bound > 2 |
| IC, BCPNN | 95% CI lower bound > log2(2) |
| GPS | EB05 > 2 |
| BCPNN | 95% CI lower bound > log2(2) |
| sB | 95% CI lower bound > 2 |
| LRT, TreeScan | p-value < 0.05 |
ROR, Reporting Odds Ratio; PRR, Proportional Reporting Ratio; IC, Information Component; LRT, Likelihood ratio test; GPS, Gamma Poisson Shrinker; BCPNN, Bayesian Confidence Propagation Neural Network; sB, simplified Bayes; TreeScan, Tree-based Scan Statistic.
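The criteria in Table 7 amount to a simple threshold rule per method. The following Python sketch encodes them; the names `CRITERIA`, `is_signal`, and `detected_by`, and the example values, are hypothetical and serve only to show how the cutoffs would be applied once each method's statistic (95% CI lower bound, EB05, or p-value) has been computed elsewhere.

```python
import math

# Cutoffs from Table 7.
# "above": the statistic (95% CI lower bound, or EB05 for GPS) must exceed the cutoff.
# "below": the p-value must fall below the cutoff.
CRITERIA = {
    "ROR": ("above", 2.0),
    "PRR": ("above", 2.0),
    "IC": ("above", math.log2(2)),
    "GPS": ("above", 2.0),
    "BCPNN": ("above", math.log2(2)),
    "sB": ("above", 2.0),
    "LRT": ("below", 0.05),
    "TreeScan": ("below", 0.05),
}


def is_signal(method, value):
    """Return True if `value` meets the Table 7 criterion for `method`."""
    direction, cutoff = CRITERIA[method]
    return value > cutoff if direction == "above" else value < cutoff


def detected_by(stats):
    """List the methods whose criterion is met for one drug-event pair."""
    return [method for method, value in stats.items() if is_signal(method, value)]


# Hypothetical statistics for a single drug-event pair (illustrative only).
example = {
    "ROR": 2.4, "PRR": 2.4, "IC": 0.6, "GPS": 1.1,
    "BCPNN": 0.4, "sB": 1.4, "LRT": 0.94, "TreeScan": 0.50,
}
print(detected_by(example))
```

Because the inequalities are strict, a statistic exactly equal to its cutoff is not counted as a signal in this sketch.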
Table 8. Overall detection: the number of signals detected by each method in the 2012–2016 Korea Adverse Event Reporting System (KAERS) database, which contained 1615 kinds of adverse events and 1716 kinds of drugs.
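| Method (# of Pairs) | ROR & PRR | IC | LRT | GPS | BCPNN | sB | TreeScan |
| --- | --- | --- | --- | --- | --- | --- | --- |
| PT levels (2,546,544) | 43,960 | 25,714 | 8324 | 6147 | 5290 | 4397 | 9175 |
| SOC levels (54,912) | 4142 | 2147 | 2238 | 1342 | 1256 | 1163 | 1380 |
| Total (2,601,456) | 48,102 | 27,861 | 10,562 | 7489 | 6546 | 5560 | 10,555 |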
ROR, Reporting Odds Ratio; PRR, Proportional Reporting Ratio; IC, Information Component; LRT, Likelihood ratio test; GPS, Gamma Poisson Shrinker; BCPNN, Bayesian Confidence Propagation Neural Network; sB, simplified Bayes; TreeScan, Tree-based Scan Statistic.
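To compare the methods in Table 8 on a common scale, the counts can be expressed as the share of candidate drug-event pairs that each method flagged. The short Python sketch below does this with the tabulated numbers and also checks that the PT and SOC rows add up to the Total row; the variable and function names are illustrative.

```python
# Counts copied from Table 8.
METHODS = ["ROR & PRR", "IC", "LRT", "GPS", "BCPNN", "sB", "TreeScan"]
PAIRS = {"PT": 2_546_544, "SOC": 54_912}
SIGNALS = {
    "PT": [43_960, 25_714, 8_324, 6_147, 5_290, 4_397, 9_175],
    "SOC": [4_142, 2_147, 2_238, 1_342, 1_256, 1_163, 1_380],
}
TOTAL_ROW = [48_102, 27_861, 10_562, 7_489, 6_546, 5_560, 10_555]


def flagged_share(level):
    """Fraction of candidate pairs at `level` flagged by each method."""
    return {m: n / PAIRS[level] for m, n in zip(METHODS, SIGNALS[level])}


# Consistency check: PT and SOC counts should sum to the Total row.
totals = [pt + soc for pt, soc in zip(SIGNALS["PT"], SIGNALS["SOC"])]
assert totals == TOTAL_ROW

for level in PAIRS:
    for method, share in flagged_share(level).items():
        print(f"{level:3s} {method:9s} {share:.2%}")
```

Expressed this way, the share of flagged pairs is lower at the PT level than at the SOC level for every method, which reflects the much larger number of PT-level pairs.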
Table 9. Detected signals by each method for voglibose and acarbose.
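| Drug | Code | Adverse Event | Obs | Exp | ROR | PRR | IC | LRT | GPS | BCPNN | sB | TreeScan |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Voglibose | 500_165 | Anorexia | 8 | 2.62 | 2.42 * | 2.42 * | 0.59 | 0.940 | 1.12 | 0.35 | 1.37 | 0.504 |
| | 600 | Gastrointestinal system disorders | 115 | 73.73 | 1.83 | 1.83 | 0.32 | 0.001 * | 1.28 | 0.31 | 1.46 | 0.001 * |
| | 600_204 | Constipation | 11 | 3.96 | 2.26 * | 2.26 * | 0.60 | 0.782 | 1.24 | 0.43 | 1.47 | 0.336 |
| | 600_205 | Diarrhea | 12 | 6.43 | 1.33 | 1.33 | 0.06 | 1.000 | 0.92 | 0.00 | 1.23 | 0.980 |
| | 600_268 | Abdominal pain | 10 | 3.63 | 2.20 * | 2.20 * | 0.55 | 0.910 | 1.17 | 0.37 | 1.43 | 0.447 |
| | 600_279 | Dyspepsia | 35 | 7.22 | 5.16 * | 5.15 * | 1.77 * | 0.001 * | 3.11 * | 1.62 * | 3.64 * | 0.001 * |
| | 600_285 | Flatulence | 15 | 0.39 | 40.28 * | 40.00 * | 4.49 * | 0.001 * | 20.89 * | 2.79 * | 10.76 * | 0.001 * |
| | 800 | Metabolic and nutritional disorders | 37 | 5.07 | 8.11 * | 8.10 * | 2.37 * | 0.001 * | 4.67 * | 2.15 * | 6.11 * | 0.001 * |
| | 800_389 | Hypoglycemia | 24 | 0.55 | 48.25 * | 47.86 * | 4.84 * | 0.001 * | 27.47 * | 3.41 * | 17.12 * | 0.001 * |
| | 800_392 | Hyponatremia | 2 | 0.18 | 9.67 * | 9.65 * | 1.44 * | 0.996 | 0.49 | −0.30 | 0.03 | 0.772 |
| | 800_407 | Weight decrease | 2 | 0.21 | 8.19 * | 8.18 * | 1.24 * | 0.998 | 0.47 | −0.34 | 0.04 | 0.860 |
| | 1100 | Respiratory system disorders | 16 | 9.46 | 1.23 | 1.23 | 0.03 | 1.000 | 0.94 | −0.02 | 1.21 | 0.981 |
| | 1100_515 | Epistaxis | 2 | 0.21 | 8.18 * | 8.17 * | 1.24 * | 0.998 | 0.47 | −0.34 | 0.04 | 0.861 |
| | 1100_523 | Pharyngitis | 4 | 0.85 | 3.79 * | 3.79 * | 0.81 | 0.992 | 0.85 | 0.15 | 0.80 | 0.745 |
| | 1810_401 | Edema peripheral | 3 | 0.71 | 3.15 * | 3.15 * | 0.44 | 1.000 | 0.60 | −0.20 | 0.43 | 0.973 |
| Acarbose | 500_172 | Depression | 2 | 0.08 | 25.07 * | 25.02 * | 2.68 * | 0.638 | 0.66 | −0.18 | 0.04 | 0.209 |
| | 600_205 | Diarrhea | 7 | 3.01 | 1.65 | 1.64 | 0.11 | 1.000 | 0.84 | −0.05 | 0.90 | 0.924 |
| | 600_285 | Flatulence | 12 | 0.18 | 72.43 * | 72.02 * | 5.16 * | 0.001 * | 31.93 * | 2.62 * | 10.06 * | 0.001 * |
| | 600_336 | Tooth disorder | 2 | 0.01 | 180.65 * | 177.84 * | 5.43 * | 0.010 * | 1.88 | −0.10 | 0.02 | 0.008 * |
| | 800 | Metabolic and nutritional disorders | 6 | 2.38 | 1.79 | 1.79 | 0.15 | 1.000 | 0.81 | −0.06 | 0.89 | 0.920 |
| | 800_383 | Hyperkalemia | 2 | 0.11 | 17.08 * | 17.06 * | 2.16 * | 0.846 | 0.58 | −0.22 | 0.03 | 0.390 |
| | 800_389 | Hypoglycemia | 3 | 0.26 | 10.77 * | 10.76 * | 1.88 * | 0.632 | 0.91 | 0.23 | 0.64 | 0.202 |
| | 1210 | Red blood cell disorders | 4 | 0.62 | 5.70 * | 5.70 * | 1.26 * | 0.798 | 1.00 | 0.33 | 0.98 | 0.326 |
| | 1210_544 | Anemia | 4 | 0.51 | 7.13 * | 7.13 * | 1.54 * | 0.556 | 1.10 | 0.43 | 1.05 | 0.176 |
| | 1300 | Urinary system disorders | 6 | 2.16 | 2.06 * | 2.06 * | 0.29 | 0.998 | 0.87 | 0.04 | 0.91 | 0.819 |
| | 1300_619 | Renal function abnormal | 2 | 0.11 | 16.52 * | 16.50 * | 2.12 * | 0.860 | 0.57 | −0.23 | 0.03 | 0.396 |
| | 1810_711 | Abdomen enlarged | 2 | 0.08 | 25.76 * | 25.71 * | 2.72 * | 0.632 | 0.66 | −0.18 | 0.02 | 0.197 |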
Obs, Observed count; Exp, Expected count; ROR, Reporting Odds Ratio; PRR, Proportional Reporting Ratio; IC, Information Component; LRT, Likelihood ratio test; GPS, Gamma Poisson Shrinker; BCPNN, Bayesian Confidence Propagation Neural Network; sB, simplified Bayes; TreeScan, Tree-based Scan Statistic; * signal.

Share and Cite

MDPI and ACS Style

Park, G.; Jung, H.; Heo, S.-J.; Jung, I. Comparison of Data Mining Methods for the Signal Detection of Adverse Drug Events with a Hierarchical Structure in Postmarketing Surveillance. Life 2020, 10, 138. https://doi.org/10.3390/life10080138

