An international genome-wide meta-analysis of primary biliary cholangitis: Novel risk loci and candidate drugs

Backgrounds & Aims Primary biliary cholangitis (PBC) is a chronic liver disease in which autoimmune destruction of the small intrahepatic bile ducts eventually leads to cirrhosis. Many patients have inadequate response to licensed medications, motivating the search for novel therapies. Previous genome-wide association studies (GWAS) and meta-analyses (GWMA) of PBC have identified numerous risk loci for this condition, providing insight into its aetiology. We undertook the largest GWMA of PBC to date, aiming to identify additional risk loci and prioritise candidate genes for in silico drug efficacy screening. Methods We combined new and existing genotype data for 10,516 cases and 20,772 controls from 5 European and 2 East Asian cohorts. Results We identified 56 genome-wide significant loci (20 novel) including 46 in European, 13 in Asian, and 41 in combined cohorts; and a 57th genome-wide significant locus (also novel) in conditional analysis of the European cohorts. Candidate genes at newly identified loci include FCRL3, INAVA, PRDM1, IRF7, CCR6, CD226, and IL12RB1, which each play key roles in immunity. Pathway analysis reiterated the likely importance of pattern recognition receptor and TNF signalling, JAK-STAT signalling, and differentiation of T helper (TH)1 and TH17 cells in the pathogenesis of this disease. Drug efficacy screening identified several medications predicted to be therapeutic in PBC, some of which are well-established in the treatment of other autoimmune disorders. Conclusions This study has identified additional risk loci for PBC, provided a hierarchy of agents that could be trialled in this condition, and emphasised the value of genetic and genomic approaches to drug discovery in complex disorders. Lay summary Primary biliary cholangitis (PBC) is a chronic liver disease that eventually leads to cirrhosis. In this study, we analysed genetic information from 10,516 people with PBC and 20,772 healthy individuals recruited in Canada, China, Italy, Japan, the UK, or the USA. We identified several genetic regions associated with PBC. Each of these regions contains several genes. For each region, we used diverse sources of evidence to help us choose the gene most likely to be involved in causing PBC. We used these ‘candidate genes’ to help us identify medications that are currently used for treatment of other conditions, which might also be useful for treatment of PBC.


Introduction
Primary biliary cholangitis (PBC) is a chronic liver disease in which autoimmune injury to the small intrahepatic bile ducts eventually leads to cirrhosis. Only 2 medications, ursodeoxycholic acid (UDCA) and obeticholic acid (OCA), are licensed for the treatment of PBC. Many patients have inadequate response to both agents, leaving them at risk of progressive liver disease. Notwithstanding recent advances, novel therapies are needed for this condition.
Delineating the genetic architecture of PBC can provide insight into its aetiologyand more specifically, identify potential drug targets. Therefore, over the past decade, our respective groups have undertaken genome-wide association studies (GWAS) of PBC in Canadian-US, 1 Italian, 2 British, 3 Japanese, 4 and Chinese 5 cohorts; and in 2015, we undertook a genome-wide meta-analysis (GWMA) of the Canadian-US, Italian, and British discovery panels. 6 These studies have identified genome-wide significant associations at the human leukocyte antigen (HLA) locus and 42 non-HLA loci.
Our GWMA in 2015 did not include the Japanese or Chinese discovery panels. Furthermore, since 2015, our respective groups have undertaken genome-wide genotyping of substantially expanded Canadian, Italian, UK, and US cohorts. Therefore, we present an updated GWMA of PBC that includes these expanded cohorts, as well as the Japanese and Chinese discovery panels. In this study, we aimed to: i) capitalise on the increased sample size to discover additional risk loci for PBC; ii) explore populationspecific genetic heterogeneity at known and newly identified risk loci; iii) integrate GWMA statistics with publicly available epigenetic, gene expression, and proteomic datasets to prioritise causal variants and candidate genes; and iv) use these candidate genes for in silico drug efficacy screening to identify agents potentially suitable for re-purposing to PBC.

Materials and methods
Participants and genotyping are summarised in Table 1 and detailed in the supplementary information. Written informed consent was obtained from each participant. The research conformed to the ethical guidelines of the 1975 Declaration of Helsinki.

Quality control
For the European and Japanese panels, quality control (QC) checks were performed at Newcastle University, UK, using the software package PLINK. 7 Specific QC thresholds to determine outliers were based on visual inspection and varied by panel. For the European panels, we first removed variants with minor allele frequency (MAF) <0.01; genotype call rate <97% (<95% for the 'old' Italian, WTCCC3, and 'new' US panels); or significant deviation from Hardy Weinberg Equilibrium (HWE) (p <10 −6 ). We then removed samples with rates of missing data >2% (>4% for the new US panel); whole-genome heterozygosity >3.25 standard deviations from the mean; apparent gender discrepancies (based on X-chromosomal heterozygosity >0.2 for men and <0.2 for women); estimated proportion of identity-by-descent sharing with another sample >0.1 (based on subsets of between 38,000 and 97,000 variants pruned for linkage disequilibrium [LD]); or that did not cluster with the CEU HapMap2 population (based on visual inspection of the first 2 principal components). For the Japanese panel, we used the dataset described in Kawashima et al. (2012), 4 except for the additional removal of 4 cases and 10 controls with apparent gender discrepancies.
All samples recruited in China were processed and analysed on Chinese servers to comply with the Regulation of the People's Republic of China on the Administration of Human Genetic Resources. Thus, for the Chinese panel, QC checks were undertaken on a local server in Shanghai, China. Variants were removed with MAF <0.5%, genotype call rate <95%, or deviation from HWE in controls p < − 1x10 −6 . Samples were removed with rates of missing data > − 5% or pairwise identity-by-state, PI_HAT >0.25. Population outliers were identified for exclusion using principal component analysis.

Genome-wide imputation and post-imputation quality control
For the European and Japanese panels, we used the autosomal variants and samples passing QC to carry out genome-wide imputation within each individual panel using the Michigan Imputation Server with Eagle2 phasing, 8 informed by the 1000 Genomes Phase 3 reference panel. Following imputation, we discarded variants with imputation R 2 <0.5; non-unique alleles at the same position; or imputation call rate <90% (based on assigning genotypes according to the most likely genotype call and setting genotypes to missing if the most likely genotype call had posterior probability <0.9). We also used the resulting common set of imputed variants to check for sample duplicates/ relationships across the European panels (based on estimated identity-by-descent sharing using 25,873 variants pruned for LD) and removed 1 person from each of the 137 identified relative pairs.
For the Chinese panel, genome-wide imputation was undertaken on a local server in Shanghai, China, using SHAPEIT 9 and IMPUTE2, 10 and the 1000 Genomes Phase 3 reference panel. Following imputation, we discarded variants with call rates <95% (having set genotypes to missing if the most likely genotype call had posterior probability <0.9), MAF <0.01, or HWE p <1×10 -6 in controls. The resulting imputation summary statistics (log odds ratios [lnORs], standard errors, and p values) were submitted without individual-level data to Newcastle, UK, for meta-analysis with the other panels.

Statistical analysis of European and Japanese cohorts
Within each panel, we performed association analysis of the genome-wide imputed data using logistic regression of disease phenotype on single nucleotide polymorphism (SNP) genotype (coded 0,1,2) in PLINK, 7 with the first 10 principal components (from a pruned set of SNPs with the HLA region removed) included as covariates to correct for population stratification. (The rationale for removing the HLA region was that inclusion of SNPs in this region would risk generating components that explain variation primarily caused by strong HLA-disease association, rather than population stratification.) For all but the new Canadian-UK panel, the resulting genomic control (GC) inflation factor k was modest (<1.026); therefore, we carried out GC correction within each panel by multiplying the standard error (SE) of the estimated lnOR for each SNP by Ok. For the new Canadian-UK panel, k was somewhat inflated at 1.091; therefore, we re-analysed the new Canadian-UK data using a logistic mixed model score test (including the first 10 principal components as covariates) as implemented in the GMMAT package, 11 resulting in a slightly deflated k of 0.971. The SE of the estimated lnOR for each SNP from PLINK was then (conservatively) adjusted to match that implied by the GMMAT test statistic. Specifically, we multiplied the PLINK-derived SE for each SNP by a SNP-specific factor c, where c was chosen so that the resulting s 2 test statistic (lnOR/cSE) 2 for that SNP had a p value equal to the p value from GMMAT. GC correction was also performed for the Chinese summary statistics (k = 1.050) by multiplying the SE of the estimated lnOR for each SNP by Ok.
Meta-analysis of European, Asian, and combined cohorts We used the software package META 12 to perform fixed-effect meta-analysis of the resulting lnORs and adjusted SEs from i) the 5 European panels; ii) the 2 Asian panels; and (3) all 7 panels, in each case restricting the analysis to variants that (following post-imputation QC) appeared within all panels. Within each meta-analysed set (European, Asian, and combined), a further GC correction was performed (to adjust for the inflation factors of 1.041, 1.033, and 1.080, seen within the European, Asian, and combined cohorts, respectively) to produce the final set of genome-wide results. Specifically, as for the individual panels above, the SE of the final lnOR for each SNP was multiplied by Ok, and the test statistic and p value were re-calculated accordingly. This use of "double" GC correction might be considered overly conservative, given that part of the observed inflation could be due to polygenicity. We explored this using LD score regression (LDSR) 13 to compare our original results with those obtained using no GC (or GMMAT-derived) correction at all. We also compared our results from all panels combined with those obtained using trans-ethnic meta-regression analysis as implemented in the software package MR-MEGA 14 (see supplementary information for details).

Prioritisation of candidate causal variants and candidate genes
We used the FINEMAP 15 package and Conditional and Joint Analysis (COJO) 16 implemented within GCTA 17 to refine and look for independent associations within genome-wide significant risk loci. We used FINEMAP to construct 'credible sets' of variants most likely to be causal in PBC. We used the ENSEMBL Variant Effect Predictor, 18 FUMA (Functional Mapping and Annotation) GWAS 19 platform, and reference panels from the Avon Longitudinal Study of Parents and Children (ALSPAC, http://www.bristol. ac.uk/alspac/) 20 and the INTERVAL study (http://www. donorhealth-btru.nihr.ac.uk/studies/interval-study/) 21 for mapping and functional annotation of the first set of 'credible causal variants' at each risk locus.
Adapting the approach of Barbeira et al. (2018), 22 we used the MetaXcan package; our European GWMA summary statistics; and reference panels from ALSPAC, the Genotype-Tissue Expression (GTEx) project (https://gtexportal.org/), 23 and the INTERVAL study to derive genome-wide genetic prediction models of DNA methylation, gene expression, and serum protein levels in cases and controls. We used these models to correlate predicted DNA methylation, gene expression, and serum protein levels with disease status in methylome-wide, transcriptomewide, and serum proteome-wide association studies (MWAS, TWAS, and PWAS, respectively).
We used the moloc package 24 to look for co-localisation of association signals from our GWMA of the European panels with those derived from mapping of methylation, expression, and protein-quantitative trait loci (mQTLs, eQTLs, pQTLs) in ALSPAC, the GTEx project, and the INTERVAL study, respectively. Finally, we used the DEPICT package 25 to prioritise the most likely causal gene at risk loci based on gene function.

Enrichment analysis
We used the STRING Database 26 to look for enrichment of protein-protein interactions and functional annotations amongst candidate genes; and the DAVID resource 27 to look for enrichment of KEGG pathways by genes with minimum p GWMA <0.01.
Network-based in silico drug efficacy screening We employed the approach of Guney et al. (2016) 28 in which known drug targets and candidate genes for a disease are used to estimate a drug-disease proximity measure, z, that quantifies the closeness (or proximity) of the drug and disease gene networks, respectively, correcting for the known biases of the interactome. For this analysis, we used the drug targets listed in DrugBank (https://www.drugbank.com/, accessed January 2021) and candidate genes for PBC prioritised as above. See the supplementary information for details.

Results
GWMA identifies 21 additional genome-wide significant risk loci for PBC Following QC, the European panels consisted of 5,186,747 variants across 8,021 cases and 16,489 controls; Asian panels, 5,347,815 variants across 2,495 cases and 4,283 controls; and all panels combined, 2,817,608 variants across 10,516 cases and 20,772 controls (Table 1). Of note, there was substantial reduction in the number of variants in all panels combined compared to the European or Asian panels. This resulted from limited overlap of variants that passed post-imputation QC in the European compared to the Asian panels, explained by our use of different genotyping platforms across cohorts, and different LD patterns in Europeans compared to Asians. GWMA of the European panels identified 46 loci at genomewide significance (p <5×10 −8 ); GWMA of the Asian panels, 13 loci at genome-wide significance; and GWMA of all panels combined, 41 loci at genome-wide significance (Fig. S1). Altogether, we identified 56 genome-wide significant risk loci in one or other meta-analysis (Table S1, Fig. S2). Using COJO, we identified an additional risk locus at 19p13.11 with genome-wide significance in conditional analysis of European panels (p = 4.66×10 -8 ), having narrowly missed this threshold in the main, unconditional analysis (p = 6.55×10 -8 ) (Fig. S2.57). Thus, a total of 57 genome-wide significant risk loci were identified in the current study. Of these, 21 were not identified in previous studies; and 2, 1q23.1 and 11q24.3, were previously identified at suggestive rather than genome-wide significance (Tables 2A&B). 4,29 At 6 newly identified or newly confirmed risk loci, we considered evidence of association to be conclusive because: i) an unequivocal association signal was evident in both the European and Asian panels; and ii) where the lead variant at the locus was different in the European compared to the Asian panels, permutation testing confirmed the significance of a signal in the validating dataset, located in proximity to the primary signal in the index dataset (p permutation <0.00217, corresponding to p <0.05 Bonferroni-corrected for 23 tests; see supplementary information for details) (Table 2A, Table S1, Fig. S2).
At 17 newly identified or newly confirmed risk loci, we considered evidence of association to be strong but not conclusive because unequivocal association was evident in the European but not the Asian panels, or permutation testing was not significant at p permutation <0.00217 (Table 2B, Table S1, Fig. S2). We note, however, that most of these loci achieved levels of significance in the Asian panels that were suggestive for validation, including 2 loci with suggestive permutation p values (4q24, p permutation = 0.0040; and 5q21.1, p permutation = 0.0032).
We confirmed genome-wide significant associations at 34 of 43 previously identified risk loci for PBCbut not at 9 previously identified risk loci. Seven of these 9 loci nevertheless showed a convincing association signal, albeit at p >5×10 -8 (Table S2, Fig. S3). We found no evidence of association at the 15q25.1 locus (harbouring IL16) that was discovered and validated in the Chinese GWAS by Qiu et al. (2017) 5 ; this is explained by the absence of a signal in the Japanese and European panels. Coverage of the 19p13.2 locus was too sparse to test association.
We compared our original results to those obtained without GC (or GMMAT-derived) correction. As expected, without correction, all loci previously identified as genome-wide significant reached slightly higher levels of significance, while a few loci that did not reach genome-wide significance in our original analysis, now (just) did so ( Fig. S4 and Table S4). We also compared our original results for all panels combined with those obtained using trans-ethnic meta-regression analysis, implemented in MR-MEGA. Results from MR-MEGA were highly concordant with those from our original analysis (Fig. S5), also providing genome-wide significant confirmation of an independent association signal at 7q32.1, which exhibited significant heterogeneity in the direction of effects between the Asian and European cohorts (Table S5 and Fig. S6).

PBC shows genetic correlation with other autoimmune conditions
Recognising that most risk loci for PBC are also risk loci for other autoimmune conditions (Table S6), we used LDSR implemented via LD Hub 30 to evaluate the genetic correlation between PBC (using summary statistics from our European panels) and complex traits with GWAS summary statistics in the LD Hub database. We found significant genetic correlation between PBC and other immune-mediated inflammatory disorders, including systemic lupus erythematosus (SLE, rg = 0.54, p = 2.87×10 -14 ), rheumatoid arthritis (RA, rg = 0.26, p = 3.77×10 -5 ), and inflammatory bowel disease (IBD, rg = 0.23, p = 6.97×10 -5 ) (Table S7). We were unable to test genetic correlation of PBC with autoimmune thyroid disease, Sjögren syndrome, or systemic sclerosis because GWAS summary statistics for these conditions were not available in LD Hub at the time of interrogation (19.09.2019).
The genetic architecture of PBC is broadly shared across European and Asian populations To evaluate consistency between European and Asian signals, we applied permutation testing where warranted and standard meta-analysis measures of heterogeneity to the lead variants at  Results for the lead variant at newly identified or newly confirmed risk loci with p <5×10 -8 in fixed-effect meta-analysis of the European, Asian, or combined panels. (A) Evidence of association was taken to be conclusive because: i) an unequivocal association signal at the same locus was observed in both the European and the Asian panels; and ii) where the lead variant at the locus was different in the European vs. the Asian panels, permutation testing confirmed the significance of the signal in the validating dataset at p perm <0.00217 (see supplementary information and Table S1). (B) Evidence of association was taken to be strong but not conclusive because unequivocal association was evident in the European but not the Asian panels, or permutation testing was not significant at p perm <0.00217. Gene: candidate gene at the risk locus (which is not necessarily the mapped gene). A1, tested allele; A2, alternative allele; BP, base pair position; Chr, chromosome; p perm , permutation p value; OR, odds ratio. *Note that 1q23.1 and 11q24.3 were previously identified at suggestive level of significance in the study by Kawashima et al. (2017).
each of the 56 genome-wide significant risk loci identified or confirmed in the main, unconditional analyses (Table S1). We found concordance between risk loci operating in European and Asian populations, considering i) the much smaller sample size of the Asian panels; and ii) the interrogation of different variants in the European compared to the Asian panels, for reasons given above (for a detailed commentary of each risk locus, see Fig. S2).
With few exceptions, we also found concordance between the lnORs seen in the combined Asian and combined European panels (Fig. S7).
To investigate overall concordance in the genetic basis of PBC between European and Asian populations, we estimated the proportion of trait variance explained (on the liability scale) in the Japanese cohort (for which individual-level genotype data were available) by sets of variants chosen according to their p values in the European GWMA (see supplementary information). Regardless of the p value threshold and the assumed trait prevalence, variants showing some level of association in the European GWMA explained more of the trait variance than an equivalent number of randomly chosen variantsin most instances, significantly moresupporting the conclusion that loci influencing the risk of PBC in Europeans, also influence its risk in Asians (Table S8).
Thus, while equivalently powered cohorts, accurately genotyped at the same set of variants, would be required to fully address the question of population-specific genetic heterogeneity, our results provide preliminary evidence that the genetic architecture of PBC is broadly shared across European and Asian populations.

Co-localisation and DEPICT enable prioritisation of candidate genes
In functional annotation, we found that credible causal variants included missense variants in 21 genes at 14 risk loci; splice variants in 8 genes at 5 risk loci; and stop variants in 2 genes at 2 risk loci (Table S9). Few of these variants were predicted to be deleterious. Credible causal variants at all genome-wide significant risk loci mapped to chromatin interacting regions (CIRs), mQTLs, eQTLs, or pQTLs (Tables S10-12); and in the MWAS, TWAS, and PWAS, we predicted differential methylation, transcription, or translation of genes at and beyond GWMAsignificant loci (Tables S13-15, Fig. S8). These observations suggest that the genetic architecture of PBC confers susceptibility to disease mainly by influencing the regulation of expression of causal genes. Therefore, we sought co-localisation of GWMA with mQTL, eQTL, or pQTL association signals, aiming to pinpoint causal variants and genes across the genome. Using moloc, we identified 251 co-localisation models with posterior probability of association > − 0.80, implicating variants and genes at 60 loci (Table S16, Fig. S8C). Of these, 28 correspond to GWMAsignificant risk loci, where co-localisation models implicate candidate genes such as IL12RB2 (1p31.3), FCRL3 (1q23.1), and INAVA (1q32.1). Association at the other 32 loci did not reach genome-wide significance in the GWMA; co-localisation models nevertheless implicate highly plausible candidate genes at some of these loci, such as CCL21 (9p13.3) and IL2RB (22q12.3).
We found that candidate genes implicated by co-localisation were broadly concordant with those implicated by functional annotation of credible causal variants, and by the MWAS, TWAS, and PWAS. As in previous studies, we also observed that candidate genes at disparate risk loci are evidently related in function, e.g., IL12A (3q25.33), IL12B (5q33.3), IL12RB1 (19p13.11), and IL12RB2 (1p31.3). Therefore, we used DEPICT 25 to prioritise candidate genes at genome-wide significant risk loci based on gene function. In this way, we identified 82 candidate genes with a false discovery rate (FDR) <5% across 48 loci (Table S17). As expected, genes prioritised by DEPICT overlapped with those prioritised by the other approaches (Table S18).
We used the information garnered above to finalise a list of top candidate genes at genome-wide significant risk loci Results for top-ranking agents and current treatments for primary biliary cholangitis, z being a drug-disease proximity measure, defined as z = (d c -l)/r where d c is the average shortest path length between the drug's targets and the nearest disease gene, and l and r are calculated via a randomisation procedure as described in the supplementary information. Guney et al. define a drug to be proximal to a disease if its proximity follows z < − −0.15 (p < − 0.44), and distant otherwise. (Table S18). Using STRING, 26 we found these genes to be highly enriched for protein-protein interactions (p <1.0×10 -16 ), with enrichment at FDR <5% of the following KEGG pathways: T helper (T H )1 and T H 2 cell differentiation, T H 17 cell differentiation, and toll-like receptor (TLR), RIG-I-like receptor (RLR), TNF, NF-jB, and JAK-STAT signalling pathways, amongst others (Fig. S9). For comparison, we undertook enrichment analysis using DAVID 27 of 1,388 genes with minimum p GWMA <0.01, which identified enrichment at FDR <5% of the following KEGG pathways: antigen processing and presentation, FccR-mediated phagocytosis, NK cell-mediated cytotoxicity, and T cell receptor, B cell receptor, PI3K-AKT, FcεRI, JAK-STAT, NF-jB, and MAPK signalling pathways, amongst others (Table S19).
In silico drug efficacy screening identifies agents potentially suitable for re-purposing to PBC In the approach of Guney et al. (2016), 28 the more negative the value of z, the closer the drug and disease gene networks. A cutoff of z < − -0.15 is taken to show that the drug is proximal to the disease and thus, might exert pharmacological effects on it. In our analysis, we identified many agents with z < − -0.15, which are therefore predicted to exert pharmacological effects on PBC ( Table 3, Table S20). Top-ranking drugs that might be predicted to ameliorate PBC included several immunomodulators, such as ustekinumab, an anti-IL-12/23 monoclonal antibody used for psoriasis and Crohn's disease (z = -4.757); belatacept, a CTLA-4 fusion protein used in organ transplantation (z = -4.709); and abatacept, a CTLA-4 fusion protein used for RA, juvenile idiopathic arthritis (JIA), and psoriatic arthritis (z = -4.603). Of interest, other top-ranking agents include the retinoids etretinate and its metabolite acitretin, both of which are used for the treatment of psoriasis (z = -3.879 and z = -4.548, respectively).
Top-ranking drugs that might be predicted to exacerbate PBC included the pharmacological interferons, such as interferon alfa-2a and interferon beta-1b (z = -2.748 and z = -2.688, respectively). Amongst recognised treatments for PBC, fenofibrate scored z = -0.986; bezafibrate, z = -0.866; and OCA, z = -0.737, respectively. Thus, these drugs might be predicted to exert pharmacological effects on PBC. Conversely, UDCA scored z = +0.171, meaning it is not predicted to treat the genetically determined component of disease in PBC.

Discussion
We report the largest GWMA of PBC undertaken to date, with a sample size four times greater than that of our previous study. In this better-powered study, we identified 21 additional genomewide significant risk loci; showed that the genetic architecture of PBC is broadly shared across European and Asian populations; prioritised candidate genes at known and newly identified genome-wide significant risk loci; and used these candidate genes to identify medications predicted to treat the genetically determined component of disease in PBC, which might therefore be suitable for re-purposing to this condition. Candidate genes at newly identified or newly confirmed risk loci provide additional insights into the pathogenesis of PBC (Fig. 1). Thus, INAVA (1q32.1) amplifies pattern recognition receptor (PRR) signalling; DNMT3A (2p23.3), ZC3HAV1 (7q34), and TRIM14 (9q22.33) are each involved in RLR signalling; TET2 (4q24) represses transcription of IL-6; and PVT1 (8q24.21) regulates inflammation via NF-jB and MAPK pathways. Chemokine receptor 6 (CCR6, 6q27) interacts with CCL20 in the chemotaxis of dendritic cells and lymphocytes to inflamed epithelia; ST8SIA4 (5q21.1) is required for the interaction of CCR7 with CCL21 in the trafficking of immune cells to secondary lymphatic organs; and CD226 (18q22.2) participates in lymphocyte and NK cell adhesion and signalling. Fc receptor-like protein 3 (FCRL3, 1q23.1), ID2 (2p25.1), TET2 (4q24), RARB (3p24.2), NDFIP1 (5q31.3), ITGB8 (7p21.1), and CD226 (18q22.2) are each involved in the differentiation of T H 1, T H 17, or regulatory T cells. As expected, enrichment analysis of candidate genes reiterated the importance of PRR, TNF, and NF-jB signalling, and T H 1/T H 17 cell differentiation in this disease. These findings are consistent with functional data emphasising the importance of innate immune cell hypersensitivity, chemokine signalling and immune cell trafficking, and T H 1/T H 17 cell polarisation in PBC pathogenesis, as summarised by Gulamhusein and Hirschfield (2020) 31 in their recent review.
There is considerable current interest in the 'Druggable Genome', i.e., the use of genome-wide approaches to find targets for drug discovery (for example, see the Open Targets initiative at https://www.opentargets.org/). In the current study, having prioritised candidate genes, we used network-based in silico drug efficacy screening to identify agents potentially suitable for repurposing to PBC. Given our other findingsincluding genetic correlation of PBC with SLE, RA, and IBDit is expected that the top-ranking medications should include immunomodulators already approved for the treatment of RA, JIA, IBD, MS, or psoriasis.
The evidence to support re-purposing of these immunomodulators to PBC is circumstantial yet convincingbut circumspection is required. For example, in the current study, LDSR demonstrated genetic correlation with IBD; enrichment analysis showed association with 'T H 1 and T H 2 cell differentiation'; and drug efficacy screening suggested that ustekinumab, an anti-IL-12/23 monoclonal antibody used for treatment of Crohn's disease, might exert pharmacological effects on PBC. Therefore, it is notable that ustekinumab showed minimal effect on PBC in the trial by Hirschfield et al. (2016). 32 Similarly, drug efficacy screening suggested that abatacept, a CTLA-4 fusion protein used for treatment of RA, might be effective for treatment of PBCbut abatacept showed no effect on PBC in the trial by Bowlus et al. (2019). 33 A potential explanation for these discrepant observations, also expounded by Bowlus et al., 33 is that the evaluation of immunomodulators in PBC might require a change in clinical trial design. Thus, immunomodulators might require immunological rather than cholestatic endpoints; might be more effective in early disease, before the cholestatic liver injury predominates; and might require combined treatment of both the autoimmune and cholestatic injuries. Re-design of clinical trials in PBC might be contentious but the use of genomic data to prioritise potential agents for PBC is not, as new treatments for PBC are needed and the druggable genome provides a framework to find them.
It is notable that in drug efficacy screening, UDCAwellestablished as first-line treatment for PBCwas not predicted to be therapeutic in this condition. One possibility is that UDCA serves primarily to treat a cholestatic liver injury that is critical to disease progression but orthogonal to the genetically determined, autoimmune processes that confer risk of disease. Conversely, OCA (a potent FXR agonist) and the fibrates, bezafibrate and fenofibrate (PPAR-a/d/c and PPAR-a agonists, respectively), are expected to have immune-modulatory as well as anti-cholestatic effects. 34,35 We acknowledge 2 major limitations of the study. First, the absence of an independent validation cohort meant we were unable to confirm several newly identified risk loci. Other strategies, such as cross-phenotype meta-analysis, may be required for external validation of these loci. Second, the use of different genotyping platforms across cohorts meant that at many risk loci, the lead variant in the European panels was not represented in the Asian panels, or vice versa. This, together with marked disparity in the sample size of the European vs. the Asian panels, meant that we were unable to fully address the question of population-specific genetic heterogeneity.
In conclusion, our large, trans-ethnic GWMA of PBC has identified additional risk loci; found little evidence for population-specific genetic heterogeneity; and, through functional annotation of credible causal variants and multi-omic analysis, allowed us to prioritise candidate genes, and thereby prioritise drugs potentially suitable for re-purposing to PBC. This study emphasises the value of genomic approaches to provide biological insight and guide the development of novel therapies.