Abstract
Background Coronavirus disease 2019 (COVID-19) is an infectious disease that mainly affects the host respiratory system, with ∼80% asymptomatic or mild cases and ∼5% severe cases. Recent genome-wide association studies (GWAS) have identified several genetic loci associated with severe COVID-19 symptoms. Delineating the underlying genetic variants and genes is important for better understanding the biological mechanisms of severe COVID-19.
Methods We applied integrative approaches, including transcriptome-wide association studies (TWAS), colocalization analysis and functional element prediction analysis, to interpret the genetic risk in lung and immune cells using two independent GWAS datasets. To understand context-specific molecular alterations, we further performed deep learning-based single-cell transcriptomic analyses on a bronchoalveolar lavage fluid (BALF) dataset from moderate and severe COVID-19 patients.
Results We discovered and replicated associations of the genetically regulated expression of CXCR6 and CCR9 with severe COVID-19. CXCR6 expression had a protective effect in lung, whereas CCR9 expression had a risk effect in whole blood. The colocalization analysis of GWAS and cis-expression quantitative trait loci highlighted the regulatory effect on CXCR6 expression in lung and immune cells. In the BALF transcriptomic dataset, lung resident memory CD8+ T (TRM) cells showed a 3.32-fold lower cell proportion and lower CXCR6 expression in severe than in moderate patients. Pro-inflammatory transcriptional programs were highlighted along the TRM cell trajectory from moderate to severe patients.
Conclusions CXCR6 from the 3p21.31 locus is associated with severe COVID-19. CXCR6 tends to have lower expression in lung TRM cells of severe patients, which aligns with the protective effect of CXCR6 observed in the TWAS analysis. We illustrate one potential mechanism by which host genetic factors affect the severity of COVID-19: regulation of CXCR6 expression and of TRM cell proportion and stability. Our results shed light on potential therapeutic targets for severe COVID-19.
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
Reorganized the discussion and supplemental files.
List of abbreviations
- BALF: bronchoalveolar lavage fluid
- BIOS: Biobank-based Integrative Omics Studies
- ChromHMM: chromatin-state hidden Markov model
- COVID-19: coronavirus disease 2019
- CLPP: colocalization posterior probability
- CSEA-DB: cell-type-specific expression database
- DAP: deterministic approximation of posteriors
- DEG: differentially expressed gene
- DICE: database of immune cell expression
- DrivAER: Driving transcriptional programs based on AutoEncoder derived Relevance scores
- eQTL: expression quantitative trait locus
- GReX: genetically regulated expression
- GWAS: genome-wide association study
- HGI: Host Genetics Initiative
- Hi-C: high-throughput chromatin interaction
- LD: linkage disequilibrium
- MAF: minor allele frequency
- MASHR: multivariate adaptive shrinkage in R
- Mb: million base pairs
- MSigDB: molecular signatures database
- PIP: posterior inclusion probability
- PWM: position weight matrix
- SARS-CoV-2: severe acute respiratory syndrome coronavirus 2
- RCP: regional colocalization probability
- SCGG: Severe COVID-19 GWAS Group
- scRNA-seq: single-cell RNA sequencing
- tSNE: t-Distributed Stochastic Neighbor Embedding
- TF: transcription factor
- TRM cells: resident memory CD8+ T cells
- TWAS: transcriptome-wide association study