Practical Informatics Approaches to Microsatellite and Variable Number Tandem Repeat Analysis

  • Protocol
Genetic Variation

Part of the book series: Methods in Molecular Biology (MIMB, volume 628)

Abstract

After SNPs, polymorphic tandem repeats are the second most common source of genetic variation. Their alleles consist of a variable number of repeated units, which can be small (e.g., CA) or large (up to more than 100 nucleotides in length). There are perhaps over half a million of these in the human genome. They have been implicated as functional promoter polymorphisms acting as common genetic risk factors for complex disorders (such as diabetes and depression) and as pathogenic mutations (spinocerebellar ataxias, Huntington's disease), and they have been used in association mapping, linkage, and forensics. Although they enjoyed much success in early genetic linkage and association studies, they have recently been neglected. While SNPs are markers of great utility in genetic studies, different alleles of a polymorphic tandem repeat represent a very large physical and chemical change to a stretch of DNA sequence. They can act variously as: (a) functional elements binding transcription factors and other proteins that inhibit or promote expression; (b) motif elements affecting the efficiency of mRNA splicing; and (c) elements having physical effects, such as varying the spacing between functional motifs or altering the structure and melting properties of nearby DNA. For these reasons, they are very good a priori functional candidates. Geneticists wishing to work with these polymorphisms need to know how to find them in sequence, use their annotation in genome browsers and online databases, use specialist bioinformatics web tools for their analysis, and analyze them in the lab and for genetic association.
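
To make the idea of "finding them in sequence" concrete, here is a minimal sketch that scans a DNA string for perfect tandem repeats using a regular expression. It is not the method described in the chapter: the function name `find_tandem_repeats` and its parameters (`min_unit`, `max_unit`, `min_copies`) are hypothetical, and the sketch assumes exact copies of the repeat unit only. Dedicated tandem repeat finders tolerate mismatches and indels, which this example does not attempt.

```python
import re


def find_tandem_repeats(seq, min_unit=1, max_unit=6, min_copies=3):
    """Return (start, unit, copies) for each perfect tandem repeat in seq.

    Illustrative only: exact repeats, units of min_unit..max_unit bases,
    at least min_copies consecutive copies. Start positions are 0-based.
    """
    seq = seq.upper()
    # Group 1 is the candidate unit (lazy, so the shortest unit is preferred);
    # the backreference \1 then requires at least min_copies - 1 further copies.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_copies - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits


if __name__ == "__main__":
    # A (CA)5 dinucleotide repeat flanked by non-repetitive sequence.
    print(find_tandem_repeats("GGTCACACACACATTG"))
```

Two alleles of the same locus would differ in the reported copy number (for example, 5 versus 7 copies of CA), which is the kind of length variation the abstract describes.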



Author information

Corresponding author

Correspondence to Gerome Breen.

Copyright information

© 2010 Springer Science+Business Media, LLC

About this protocol

Cite this protocol

Breen, G. (2010). Practical Informatics Approaches to Microsatellite and Variable Number Tandem Repeat Analysis. In: Barnes, M., Breen, G. (eds) Genetic Variation. Methods in Molecular Biology, vol 628. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-60327-367-1_10

  • DOI: https://doi.org/10.1007/978-1-60327-367-1_10

  • Publisher Name: Humana Press, Totowa, NJ

  • Print ISBN: 978-1-60327-366-4

  • Online ISBN: 978-1-60327-367-1

  • eBook Packages: Springer Protocols
