Published registry-based pharmacoepidemiologic associations show limited concordance with agnostic medication-wide analyses

Objectives: To assess how the results of published national registry-based pharmacoepidemiology studies (where select associations are of interest) compare with an agnostic medication-wide approach (where all possible drug associations are tested). Study Design and Setting: We systematically searched for publications that reported drug associations with any, breast, colon/colorectal, or prostate cancer in the Swedish Prescribed Drug Registry. Results were compared against a previously performed agnostic medication-wide study on the same registry. Protocol: https://osf.io/kqj8n. Results: Most published studies (25/32) investigated previously reported associations. Of 913 associations, 421 (46%) had statistically significant results. Of the 162 unique drug-cancer associations, 134 could be paired with 70 associations in the agnostic study (corresponding drug categories and cancer types). Published studies reported smaller effect sizes and absolute effect sizes than the agnostic study, and generally used more adjustments. Agnostic analyses were less likely to report statistically significant protective associations (based on a multiplicity-corrected threshold) than their paired associations in published studies (McNemar odds ratio 0.13, P = 0.0022). Among 162 published associations, 36 (22%) showed increased risk signal and 25 (15%) protective signal at P < 0.05, while for agnostic associations, 237 (11%) showed increased risk signal and 108 (5%) protective signal at a multiplicity-corrected threshold. Associations belonging to drug categories targeted by individual published studies vs. nontargeted had smaller average effect sizes; smaller P values; and more frequent risk signals.

Population-wide health registries enable the investigation of adverse outcomes across giant real-world datasets [1,2]. The Swedish Prescribed Drug Register (PDR) is a nationwide registry of prescribed pharmaceuticals, established in 2005, with highly representative coverage [3]. In giant datasets, any investigation involves numerous analytic choices, for example, how to select the study population, classify study variables, and select adjustments; thus the same material can yield very different results ("vibration of effects") [4]. Analytic choices made after examining data [5], unaccounted multiplicities [6], and selective reporting of favorable results ultimately create many false positive discovery claims [7]. However, when analyses target previously discovered and reported associations, selective reporting may operate differently; for example, investigators may be motivated to disprove previously proposed associations with "negative" results [8].
One way to counteract selective reporting is to run and report analyses for all possible drug associations. This process resembles genome-wide association studies that transformed the field of genetics, boosting its validity [9]. While candidate-gene genetics (testing few associations based on biological plausibility, similar to standard practice also in pharmacoepidemiology for drug-cancer associations) had a dismal replication record, "agnostic" genome-wide analyses enjoy high replication success [9–11]. Drugs and gene variants may differ in the mechanistic knowledge available to inform biological plausibility and in the extent of potential confounders [12]. It is interesting to explore how results of the current standard practice of pharmacoepidemiology in large-scale registry databases compare with agnostic massive testing. Patel et al. [6] used an agnostic, exposure-wide approach to investigate associations between all 552 categories of prescribed PDR pharmaceuticals and cancers, adjusting for multiple comparisons.
Here, we aimed to evaluate how drug-cancer associations in the published literature using the PDR compare with agnostic exposure-wide analyses on the same registry. A systematic literature review assessed which drug-cancer associations have been reported in published studies that use the PDR. Furthermore, we assessed whether PDR-based published studies report stronger or weaker signals vs. the agnostic study for the same associations. We also assessed whether associations targeted by published studies were likely to have more statistically significant signals than the nontargeted ones, based on the agnostic study results.

Methods
In this metaresearch study (protocol: https://osf.io/kqj8n; amendments in Appendix 1), the medication-wide study we apply as comparison [13] used a Swedish nationwide cohort of 9 million individuals, resulting from linkage between the PDR and the Cancer Register [3,14] between July 2005 and December 2010. As exposures, it investigated agnostically all 552 drug categories (Anatomical Therapeutic Chemical (ATC) classification level 4, chemical subgroup, e.g., A01AA) [15]. Associations with 4 outcomes were modeled: time to first occurrence of any, female breast, colon, and prostate cancer, adjusted for age, sex, and prescription of any other drugs (for breast and prostate cancer, the cohort was restricted to one sex instead of adjusting). A lag time of 180 days from surveillance start until prescription, plus one additional year, was used for a person to be considered exposed. Cox proportional hazards regressions (CPHRs) with Bonferroni correction for multiplicity were performed in 2 equally large training and testing sets for validation. To demonstrate how findings may change because of modeling choices, case-crossover analysis was also performed, but that model had many methodological biases, as discussed previously [13], yielding only protective drug-cancer associations, a nonplausible pattern. Therefore, here we applied the Cox analysis results from the testing set, with statistical significance claimed for P < 0.05/552 (9 × 10⁻⁵). Overall, the study reported on 2,155 associations between ATC level 4 drug categories and any of the cancer types (excluding 53 associations lacking meaningful data).
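The multiplicity-corrected threshold quoted above follows directly from dividing the family-wise significance level by the number of drug categories tested. The following minimal sketch shows that arithmetic; the function and constant names are illustrative and are not taken from the agnostic study's code.

```python
# Bonferroni-corrected per-test threshold for the medication-wide study.
# Names below are illustrative assumptions, not the original analysis code.

FAMILY_ALPHA = 0.05        # family-wise significance level
N_DRUG_CATEGORIES = 552    # ATC level 4 categories tested agnostically


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Return the per-test significance threshold under Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be a positive integer")
    return alpha / n_tests


threshold = bonferroni_threshold(FAMILY_ALPHA, N_DRUG_CATEGORIES)
print(f"{threshold:.2e}")  # 9.06e-05, reported in the text as 9 x 10^-5
```

The text rounds this value to one significant digit.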

Eligibility criteria
For comparison with the medication-wide study, published studies were eligible if they evaluated associations between pharmaceuticals (assessed with the PDR) and any of 4 cancer categories, defined with International Classification of Diseases version 10 codes [16]: any cancer (C00-C80); female breast cancer (C50.0-C50.9); colon (C18.0-C18.9) or colorectal cancer (C18-C20); or prostate cancer (C61). Studies were included regardless of target population (general or patient-specific) and of whether they used the PDR for sampling (e.g., identifying all Swedish individuals receiving a certain drug) or recruited their sample independently of the registry. Studies had to use the PDR as a source for exposure variables. There were no restrictions on exposure definitions (drug type and categorization). Eligible comparators were, e.g., no exposure to the drug category in question or a general population control. We excluded analyses comparing a drug to itself, e.g., higher vs. lower dose, and studies specifically focused on cancer recurrence. "Any" cancer referred to a composite measurement of all cancers, not just a few specific types. Eligible study designs were observational (e.g., cohort or case-control) or quasiexperimental studies, or randomized trials. We excluded reviews.

Search strategy
We systematically searched PubMed/MEDLINE, Embase, Web of Science, and Google Scholar from July 1, 2005 (inception date of the PDR), to July 17, 2022, for publications that used the PDR to investigate associations between prescribed drugs and eligible cancer types. Search terms were based on a previous systematic search for PDR articles [3] (Appendix 2).

Data variables and extraction
For each eligible drug-cancer association in the study sample, descriptive information was extracted: for example, population, exposure (with explicitly stated or inferred ATC codes [15]), comparison, outcome, study type, sampling strategy, and whether or not a statistical analysis plan was prospectively registered. We also extracted statistical model type, adjustment variables in the most fully adjusted model, and point estimates with 95% confidence intervals (CIs) for effect sizes. We accepted effect measures such as risk ratios, odds ratios (ORs), hazard ratios, incidence rate ratios, or standardized incidence ratios.
From main texts and main figures/tables, all presented eligible drug-cancer associations were extracted, including subgroups and overlapping definitions of population, exposure, comparison, and outcome (PECO). Sensitivity analyses, crude analyses, and analyses in supplementary material that overlapped with another association (e.g., as subgroups) were not extracted, using selection rules outlined below. With this selection, we sought to extract the totality of comparisons considered by the authors when responding to their research question (rather than serving the purpose of testing robustness).

Statistical analysis
We calculated summary statistics on the number of unique eligible associations, and the total number of eligible associations including any overlapping variants, presented in the individual published studies. We made a graphical representation of their mapping onto the associations in the agnostic study (specifically, the agnostic study's CPHRs performed in the testing set [13]), grouping associations at ATC level 2 (therapeutic subgroup, e.g., A01) for interpretation purposes. Summary statistics are presented as median, range, and interquartile range for the effect sizes and P values reported in individual publications and the agnostic study.
We first reclassified the drug categories in the eligible drug-cancer associations according to those used in the agnostic study (ATC level 4) using selection criteria for overlapping associations (Appendix 3). We paired associations presented in individual publications with their agnostic counterparts (for ATC level 4 and cancer type) and compared the distribution of effect sizes and P values, respectively (paired Wilcoxon rank-sum test). In this comparison, agnostic P values were without multiplicity correction. We added a group-wise comparison of associations in individual publications vs. all 2,155 associations in the agnostic study (Mann-Whitney test). Also in this comparison, agnostic P values were without multiplicity correction. All effects were also converted to values > 1 to capture deviations from the null (by inverting effect sizes that were < 1). We evaluated the concordance between results in published studies and their paired agnostic counterparts for showing statistical significance or not, using McNemar OR and chi-squared test. This was done for increased risk or protective associations, using P < 0.05 for the published studies and P < 9 × 10⁻⁵ for agnostic results (multiplicity-adjusted). A group-wise comparison between the proportion of statistically significant results in published studies (at P < 0.05) vs. the agnostic study (at P < 9 × 10⁻⁵) was done with Pearson's chi-squared test.
We also explored whether the drug categories targeted by individual studies, among the 552 available, were more likely to represent those where the agnostic study detected statistically significant signals. Thus, we compared agnostic study associations belonging to drug categories targeted vs. not targeted by individual publications, on the relevant cancer type, for their group-wise distribution of effect sizes and P values (with nonpaired Wilcoxon rank-sum test). We further compared targeted vs. nontargeted drug categories for the proportion of statistically significant signals (at P < 9 × 10⁻⁵).
In sensitivity analyses, we excluded from group-wise comparisons the associations in individual publications with drugs from > 1 ATC level 4 category (for better alignment with the medication-wide study), and repeated analyses per cancer category. R version 4.1.2 [18] was used for analyses.
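The conversion of effect sizes to values above 1 and the two significance thresholds can be illustrated with a short sketch. The helper names and the example estimate are hypothetical; the authors ran their analyses in R, and this Python version only illustrates the stated rules.

```python
# Illustrative helpers (hypothetical names) for two rules stated in the text:
# folding effect sizes to "absolute" deviations from the null, and labeling a
# signal as increased risk or protective at a given significance threshold.

PUBLISHED_ALPHA = 0.05        # threshold applied to published studies
AGNOSTIC_ALPHA = 0.05 / 552   # multiplicity-corrected threshold (about 9e-5)


def fold_effect(estimate: float) -> float:
    """Invert ratio estimates below 1 so that all values lie at or above 1."""
    if estimate <= 0:
        raise ValueError("ratio effect sizes must be positive")
    return estimate if estimate >= 1 else 1 / estimate


def classify_signal(estimate: float, p_value: float, alpha: float) -> str:
    """Label an association by direction if it passes the threshold."""
    if p_value >= alpha or estimate == 1:
        return "none"
    return "increased risk" if estimate > 1 else "protective"


# Made-up example: a ratio of 0.5 with P = 0.001.
print(fold_effect(0.5))                              # 2.0
print(classify_signal(0.5, 0.001, PUBLISHED_ALPHA))  # protective
print(classify_signal(0.5, 0.001, AGNOSTIC_ALPHA))   # none
```

The same estimate and P value count as a protective signal under the published-study threshold but not under the multiplicity-corrected one, which is why the two thresholds matter when comparing proportions of significant results.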

Review of published studies on eligible drug-cancer associations
Thirty-two publications from 2009 to 2021 (25 cohort and 7 case-control studies) were included (Appendix 4, Fig. 1) [19–50]. They presented 162 unique drug-cancer associations (median per study 3, range 1–32; Table 1), in 913 different overlapping variants of, for example, subgroups or drug definitions (median per study 20, range 1–83). Of the 162 unique associations, 134 could be paired with 70 corresponding drug-cancer associations in the agnostic study (excluding those with broader drug categories). Six studies used a nationwide general population sample, 23 studies a nationwide patient sample, and 3 studies a narrower patient sample. Median sample size was 187,000 (range 5,442–8,573,000). Median follow-up time was 7.5 years. Studies investigated any (n = 11), breast (n = 11), colon or colorectal (n = 12), and prostate cancer (n = 13). Outcome data came from the Swedish Cancer Registry in 27/32 studies. Twenty-two studies reported using lag time to account for prevalent usage and/or reverse causation. Only 2 studies were publicly registered [19,34]; 2 others mentioned a protocol without public registration [27,37]. Only 2 studies adopted any multiplicity control (adjusting to P value < 0.01 [40], or post hoc Bonferroni adjustments [48]) and several mentioned multiplicity as a limitation [25,32,41]. A large majority of studies (25/32) reported on previously proposed rather than novel associations, with referenced prior studies showing increased risk (n = 9), protection (n = 7), mixed findings (n = 6), or null findings (n = 3). Of the 913 reported associations, 421 (46%) were statistically significant (346/754 (46%) among those in the 25 studies that reported on previously proposed associations, and 75/159 (47%) among those in the 7 studies that reported on some new associations). Studies used CPHR (n = 16), logistic regression (n = 11), Poisson regression (n = 4), or standardized incidence ratio calculation (n = 2), with very wide diversity in adjustments (Table 2). Across all 913 overlapping association variants, the number of adjustment variables was correlated with a slight decline in effect size, taking into account outcome category (Appendix 5).
Fig. 1. Statistical significance for drug-cancer associations from published individual studies and an agnostic medication-wide study using the same national registry, grouping drugs by Anatomical Therapeutic Chemical level 2. The 32 published individual studies reported 913 associations, 162 of which were unique in terms of population, exposure, comparison, and outcome, the others overlapping. Associations that combined drugs from different therapeutic subgroups (n = 87) were not included in the graph. 37 associations with P = 1 are coded with "positive" sign.
Among the 134 associations, 17 showed statistically significant increased risk signal in both study categories, 16 in only published studies, 22 in only the agnostic study, and 79 in neither (OR 1.37, P = 0.42). Regarding protective signals, 1 association showed statistically significant protective signal in both categories, 16 in only published studies, 2 in only the agnostic study, and 115 in neither (OR 0.13, P = 0.0022).
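The McNemar statistics above depend only on the discordant pairs. As a hedged check, the sketch below computes the odds ratio as agnostic-only over published-only counts and the P value from a continuity-corrected chi-squared test with 1 degree of freedom. Both conventions are assumptions chosen because they reproduce the reported values; the text does not state which variant was used, and the function name is illustrative.

```python
import math

# Sketch of the McNemar comparison from discordant pair counts.
# Assumptions (not stated in the text): OR = agnostic-only / published-only,
# and the chi-squared test uses the continuity correction.


def mcnemar(only_agnostic: int, only_published: int) -> tuple[float, float]:
    """Return (odds ratio, P value) for paired binary outcomes."""
    discordant = only_agnostic + only_published
    if only_published == 0 or discordant == 0:
        raise ValueError("need non-zero discordant counts")
    odds_ratio = only_agnostic / only_published
    chi2 = (abs(only_agnostic - only_published) - 1) ** 2 / discordant
    # Survival function of chi-squared with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return odds_ratio, p_value


risk_or, risk_p = mcnemar(only_agnostic=22, only_published=16)
prot_or, prot_p = mcnemar(only_agnostic=2, only_published=16)
print(f"increased risk: OR {risk_or:.3f}, P {risk_p:.2f}")
print(f"protective:     OR {prot_or:.3f}, P {prot_p:.4f}")
```

Under these assumptions the odds ratios are 22/16 = 1.375 and 2/16 = 0.125, in line with the reported 1.37 and 0.13, and the P values round to the reported 0.42 and 0.0022.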
Sensitivity analyses appear in Appendix 6.

Discussion
We contrasted the findings of 32 individual "hypothesis-driven" studies on drug-cancer associations with a previous medication-wide study, all using the same nationwide pharmaceutical registry. These large registry-based hypothesis-driven studies mostly addressed previously proposed associations rather than making new discoveries. Most found nonstatistically significant ("negative") results and, accordingly, most effect sizes were small or very small. The median deviation was only 1.14 on the OR scale. The published studies targeted mostly associations with small effects, which were nevertheless more likely to have statistically significant risk signals in the agnostic medication-wide analyses than the nontargeted associations.
No previous study has compared findings of hypothesis-driven pharmacoepidemiology studies with a medication-wide analysis. In genomics, the typical pattern has been that hypothesis-driven ("candidate") gene study findings were statistically significant but generally not replicated in genome-wide association studies, with few exceptions [11]. In the current pharmacoepidemiology paradigm, we saw a different pattern, as both hypothesis-driven and agnostic results were mostly "negative". In contrast to genomics, where genome-wide association studies are widely considered a more reliable approach than candidate gene studies, in pharmacoepidemiology one cannot make a similar claim. The 2 approaches both have strengths and weaknesses. For hypothesis-driven published studies, if just a few choice combinations are reported, uncertainty from analytical flexibility remains hidden [51]. However, it remains unclear whether there is selective reporting based on the nature of the results, and, if so, the direction of bias: in favor of statistically significant or nonstatistically significant results. In fact, most national registry-based publications addressed previously discovered signals rather than novel ones. Therefore, there may have been selection in favor of disproving previous claims [8,52], a type of inverse publication bias favoring nonsignificant results. The national registry studies, being mostly validation rather than discovery exercises, found mostly small effects, as previously documented in general for associations and predictive signals upon validation [53]. It is also possible that careful, thoughtful choice of adjustments resulted in shrinking the magnitude of associations that would otherwise have reflected mostly uncontrolled confounding. Conversely, the agnostic approach by definition does not suffer from potential selective reporting: all associations are tested and reported consistently. However, it is prone to large uncontrolled confounding, because the same adjustments are applied routinely and may not be sufficient for all associations. They may be more insufficient for some associations than others, potentially explaining why absolute effects were larger in the agnostic study than in published studies. Medication-wide analysis may be unbiased regarding selective reporting but more biased regarding confounding. Criteria for the credibility of observational associations are heavily contested [54–57]. Therefore, it is difficult to arrive at an unambiguous gold standard of evidence for each claimed association. Most pharmacoepidemiology does not depend on large, national population-level registries but on more focused studies with limited sample size. Thus, large registries can offer a valuable layer of evidence for either discovery or validation purposes. Using different analytical approaches may help understand whether results are susceptible to analytical choices. Medication-wide analyses (like other environment-wide association analyses [58,59]) also have their own analytical choices to make, in particular whether to use multiplicity corrections and/or selection methods (e.g., lasso), and if so, which specific options.
Limitations should be noted. First, methodological differences between hypothesis-driven and medication-wide studies, in particular regarding adjustment variables, unavoidably influence results. Expert consensus regarding adjustment variables is not likely [2]; indeed, the selection among hypothesis-driven studies was heterogeneous. Differences between the medication-wide study and single publications could also reflect the modestly longer follow-up in some single publications and their exploration of dose-effects and/or duration of use. The time lag for cancer development may exceed 5 years. Other methodological variations were observed, for example, in lag time and cancer subtypes (such as more closely specified histological subtypes, although colorectal and prostate malignancies are mostly adenocarcinomas), but these were not consistently different between the 2 approaches.
Second, the risk of false negative findings is likely higher in the medication-wide study because of the stringent multiplicity control [10]. Third, the comparative patterns might have been very different if one were to compare discovery studies that first claimed a pharmacoepidemiologic association vs. an agnostic approach. Very few hypothesis-driven studies in our sample belonged in this category. Fourth, almost every publication presented several eligible associations, often in many variants. Selection criteria, mainly prespecified, were used to increase comparability but are imperfect.
Wider use of national registries in pharmacoepidemiology should be encouraged for discovery, validation, and agnostic analyses. These different approaches may offer complementary insights about the architecture of the distribution and validity of drug risks and benefits.

Declaration of Competing Interest
There are no additional relationships, patents, or other activities to disclose.

Fig. 2. Display of drug-cancer associations in published individual studies (n = 134) and their counterparts in the medication-wide study (n = 70): (A) Effect sizes, (B) Absolute effect sizes, (C) P values: log(1/p). Graphs show group-level descriptive statistics, while statistical comparisons were done pairwise. The P value plot is truncated at log(1/p) = 85, not showing 5 outliers.

Table 1. Characteristics of included studies (n = 32) by study design

Table 2. Statistical models and adjustment variables of included studies (n = 32) by study design