Sparse Regression Models for Unraveling Group and Individual Associations in eQTL Mapping

  • Protocol
eQTL Analysis

Part of the book series: Methods in Molecular Biology (MIMB, volume 2082)

Abstract

The study of expression quantitative trait loci (eQTL) is a promising tool for dissecting the genetic basis of common diseases and has attracted increasing research interest. Traditional eQTL methods test the associations between individual single-nucleotide polymorphisms (SNPs) and gene expression traits. A major drawback of this approach is that it cannot model the joint effect of a set of SNPs on a set of genes, which may correspond to a biological pathway. To alleviate this limitation, in this chapter we propose geQTL, a sparse regression method that detects both group-wise and individual associations between SNPs and expression traits. geQTL can also correct for the effects of potential confounders. The method uses a computationally efficient technique and therefore scales to large studies. Moreover, it automatically infers the proper number of group-wise associations. We perform extensive experiments on both simulated and yeast datasets to demonstrate the effectiveness and efficiency of the proposed method. The results show that geQTL effectively detects both individual and group-wise signals and outperforms state-of-the-art methods by a large margin. This chapter illustrates that decoupling individual and group-wise associations improves the accuracy of eQTL mapping and of inferring both kinds of association.
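To make the idea of sparse regression for individual SNP-to-gene associations concrete, the following is a minimal sketch of a plain multi-response Lasso fitted by proximal gradient descent (ISTA). It is not geQTL: it models no group-wise associations and applies no confounder correction. The function names, the simulated genotype and expression matrices, and the penalty value `lam` are all assumptions made for illustration.

```python
import numpy as np

def soft_threshold(A, t):
    # Proximal operator of the L1 norm: shrink every entry toward zero by t.
    return np.sign(A) * np.maximum(np.abs(A) - t, 0.0)

def lasso_ista(X, Y, lam, n_iter=500):
    """Minimize 0.5 * ||Y - X W||_F^2 + lam * ||W||_1 by proximal gradient.

    X: (n_samples, n_snps) genotype matrix.
    Y: (n_samples, n_genes) expression matrix.
    Returns W: (n_snps, n_genes) sparse coefficient matrix.
    """
    # Step size 1/L, where L is the Lipschitz constant of the gradient
    # (the squared spectral norm of X).
    step = 1.0 / (np.linalg.norm(X, 2) ** 2)
    W = np.zeros((X.shape[1], Y.shape[1]))
    for _ in range(n_iter):
        grad = X.T @ (X @ W - Y)
        W = soft_threshold(W - step * grad, step * lam)
    return W

# Hypothetical simulated data: 200 samples, 50 binary SNPs, 10 genes,
# with two planted individual SNP-gene associations.
rng = np.random.default_rng(0)
n, p, q = 200, 50, 10
X = rng.integers(0, 2, size=(n, p)).astype(float)
X -= X.mean(axis=0)  # center genotypes so no intercept is needed
W_true = np.zeros((p, q))
W_true[3, 1] = 2.0
W_true[17, 6] = -1.5
Y = X @ W_true + 0.1 * rng.standard_normal((n, q))

W_hat = lasso_ista(X, Y, lam=5.0)
print("non-zero coefficients:", np.count_nonzero(W_hat))
```

Each non-zero entry `W_hat[i, j]` is read as a candidate association between SNP `i` and gene `j`. The L1 penalty is what drives most entries to exactly zero, which is the sense in which such a model is sparse.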

Corresponding author

Correspondence to Wei Cheng.

Copyright information

© 2020 Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Cheng, W., Zhang, X., Wang, W. (2020). Sparse Regression Models for Unraveling Group and Individual Associations in eQTL Mapping. In: Shi, X. (eds) eQTL Analysis. Methods in Molecular Biology, vol 2082. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-0026-9_8

  • DOI: https://doi.org/10.1007/978-1-0716-0026-9_8

  • Publisher Name: Humana, New York, NY

  • Print ISBN: 978-1-0716-0025-2

  • Online ISBN: 978-1-0716-0026-9

  • eBook Packages: Springer Protocols
