Abstract
The site-frequency spectrum, representing the distribution of allele frequencies at a set of polymorphic sites, is a commonly used summary statistic in population genetics. Explicit forms of the spectrum are known for models both with and without selection, provided that the sites are assumed to be independent. The availability of these explicit forms has allowed for maximum likelihood estimation of selection, developed first in the Poisson random field model of Sawyer and Hartl, which is now the primary method for estimating selection directly from DNA sequence data. The independence assumption, which amounts to assuming free recombination between sites, is, however, a limiting case for many population genetics models. Here, we extend the site-frequency spectrum theory to consider the case where the sites are completely linked. We use a diffusion approximation to calculate the joint distribution of the allele frequencies of linked sites for models without selection and for models with equal-coefficient selection. The joint distribution is derived by first constructing Green’s functions corresponding to multiallele diffusion equations. We show that the site-frequency spectrum is highly correlated between frequencies that are complementary (i.e., sum to 1), and that the correlation is significantly elevated by positive selection. The results presented here can be used to extend the Poisson random field model so that selection can be estimated from correlated sites. More generally, the Green’s function construction should aid in studying the genetic drift of multiple alleles in other cases.
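For orientation, the explicit independent-sites forms mentioned in the abstract can be evaluated numerically. The sketch below is not the linked-sites result of this article. It assumes the standard Poisson random field density for the derived-allele frequency x, theta * (1 - exp(-2*gamma*(1 - x))) / ((1 - exp(-2*gamma)) * x * (1 - x)), which reduces to theta / x when gamma = 0. The scaling of theta and gamma, the function names, and the midpoint-rule integration are illustrative assumptions, since conventions differ between papers.

```python
import math


def selection_weight(x, gamma):
    """(1 - exp(-2*gamma*(1 - x))) / (1 - exp(-2*gamma)); equals 1 - x when gamma == 0."""
    if gamma == 0.0:
        return 1.0 - x
    # expm1 keeps the ratio numerically stable for small |gamma|.
    return math.expm1(-2.0 * gamma * (1.0 - x)) / math.expm1(-2.0 * gamma)


def expected_sample_sfs(n, theta, gamma=0.0, steps=2000):
    """Expected number of sites with i derived copies (i = 1..n-1) in a sample of size n.

    Assumes independent sites. Each entry integrates the binomial sampling
    probability against the assumed frequency density with a midpoint rule.
    """
    h = 1.0 / steps
    sfs = []
    for i in range(1, n):
        total = 0.0
        for k in range(steps):
            x = (k + 0.5) * h
            # x**i * (1-x)**(n-i) times the density; the 1/(x*(1-x)) factor
            # is cancelled analytically so the integrand stays bounded.
            total += x ** (i - 1) * (1.0 - x) ** (n - i - 1) * selection_weight(x, gamma)
        sfs.append(theta * math.comb(n, i) * total * h)
    return sfs
```

With gamma = 0, the entries of `expected_sample_sfs(n, theta)` approximate the neutral values theta / i. A positive gamma raises the high-frequency classes relative to the low-frequency ones, which is the single-site signal that the Poisson random field likelihood relies on.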
References
Abramowitz, M., Stegun, I., 1965. Handbook of Mathematical Functions: with Formulas, Graphs, and Mathematical Tables. Courier Dover, New York.
Adams, A., Hudson, R., 2004. Maximum-likelihood estimation of demographic parameters using the frequency spectrum of unlinked single-nucleotide polymorphisms. Genetics 168(3), 1699.
Barbour, A., Ethier, S., Griffiths, R., 2000. A transition function expansion for a diffusion model with selection. Ann. Appl. Probab., 123–162.
Baxter, G., Blythe, R., McKane, A., 2007. Exact solution of the multi-allelic diffusion model. Math. Biosci. 209(1), 124–170.
Braverman, J., Hudson, R., Kaplan, N., Langley, C., Stephan, W., 1995. The hitchhiking effect on the site frequency spectrum of DNA polymorphisms. Genetics 140(2), 783.
Bustamante, C., Wakeley, J., Sawyer, S., Hartl, D., 2001. Directional selection and the site-frequency spectrum. Genetics 159(4), 1779.
De, A., Durrett, R., 2007. Stepping-stone spatial structure causes slow decay of linkage disequilibrium and shifts the site frequency spectrum. Genetics 176(2), 969.
Drake, J., Bird, C., Nemesh, J., Thomas, D., Newton-Cheh, C., Reymond, A., Excoffier, L., Attar, H., Antonarakis, S., Dermitzakis, E., et al., 2006. Conserved noncoding sequences are selectively constrained and not mutation cold spots. Nat. Genet. 38(2), 223–227.
Durrett, R., 2008. Probability Models for DNA Sequence Evolution. Springer, Berlin.
Etheridge, A., Griffiths, R., 2009. A coalescent dual process in a Moran model with genic selection. Theor. Popul. Biol.
Evans, S., Shvets, Y., Slatkin, M., 2007. Non-equilibrium theory of the allele frequency spectrum. Theor. Popul. Biol. 71(1), 109–119.
Ewens, W., 1979. Mathematical Population Genetics. Springer, New York.
Fay, J., Wu, C., 2000. Hitchhiking under positive Darwinian selection. Genetics 155(3), 1405.
Fisher, R., 1930. The distribution of gene ratios for rare mutations. In: Proc. R. Soc. Edinb., vol. 50, pp. 204–219.
Fu, Y., 1995. Statistical properties of segregating sites. Theor. Popul. Biol. 48(2), 172–197.
Griffiths, R., 1979. A transition density expansion for a multi-allele diffusion model. Adv. Appl. Probab. 11(2), 310–325.
Griffiths, R., 2003. The frequency spectrum of a mutation, and its age, in a general diffusion model. Theor. Popul. Biol. 64(2), 241–251.
Griffiths, R., Li, W., 1983. Simulating allele frequencies in a population and the genetic differentiation of populations under mutation pressure. Theor. Popul. Biol. 23(1), 19.
Griffiths, R., Tavaré, S., 1998. The age of a mutation in a general coalescent tree. Stoch. Models 14(1), 273–295.
Hill, W., Robertson, A., 2009. The effect of linkage on limits to artificial selection. Genet. Res. 8(03), 269–294.
Karlin, S., Taylor, H., 1981. A Second Course in Stochastic Processes. Academic Press, New York.
Kim, Y., Stephan, W., 2002. Detecting a local signature of genetic hitchhiking along a recombining chromosome. Genetics 160(2), 765.
Kimura, M., 1955. Random genetic drift in multi-allelic locus. Evolution 9(4), 419–435.
Kimura, M., 1956. Random genetic drift in a tri-allelic locus; exact solution with a continuous model. Biometrics 12(1), 57–66.
Kimura, M., 1964. Diffusion models in population genetics. J. Appl. Probab. 1(2), 177–232.
Kimura, M., 1969. The number of heterozygous nucleotide sites maintained in a finite population due to steady flux of mutations. Genetics 61(4), 893.
Li, W., 1977. Maintenance of genetic variability under mutation and selection pressures in population. Proc. Natl. Acad. Sci. USA 74(6), 2509–2513.
Littler, R., 1975. Loss of variability at one locus in a finite population. Math. Biosci. 25(1–2), 151–163.
Littler, R., Fackerell, E., 1975. Transition densities for neutral multi-allele diffusion models. Biometrics 31(1), 117–123.
Marth, G., Czabarka, E., Murvai, J., Sherry, S., 2004. The allele frequency spectrum in genome-wide human variation data reveals signals of differential demographic history in three large world populations. Genetics 166(1), 351.
Nei, M., Maruyama, T., Chakraborty, R., 1975. The bottleneck effect and genetic variability in populations. Evolution 29(1), 1–10.
Nielsen, R., 2005. Molecular signatures of natural selection. Annu. Rev. Genet. 39, 197–218.
Ohta, T., Kimura, M., 1969. Linkage disequilibrium at steady state determined by random genetic drift and recurrent mutation. Genetics 63(1), 229.
Polanski, A., Kimmel, M., 2003. New explicit expressions for relative frequencies of single-nucleotide polymorphisms with application to statistical inference on population growth. Genetics 165(1), 427.
Przeworski, M., 2002. The signature of positive selection at randomly chosen loci. Genetics 160(3), 1179.
Przeworski, M., Wall, J., Andolfatto, P., 2001. Recombination and the frequency spectrum in Drosophila melanogaster and Drosophila simulans. Mol. Biol. Evol. 18(3), 291.
Roach, G., 1982. Green’s Functions. Cambridge University Press, Cambridge.
Sawyer, S., Hartl, D., 1992. Population genetics of polymorphism and divergence. Genetics 132(4), 1161.
Shimakura, N., 1977. Equations differentielles provenant de la genetique des populations. Tohoku Math. J. 29, 287.
Tajima, F., 1989. The effect of change in population size on DNA polymorphism. Genetics 123(3), 597.
Tavaré, S., 1984. Line-of-descent and genealogical processes, and their applications in population genetics models. Theor. Popul. Biol. 26(2), 119–164.
Wakeley, J., Nielsen, R., Liu-Cordero, S., Ardlie, K., 2001. The discovery of single-nucleotide polymorphisms and inferences about human demographic history. Am. J. Hum. Genet. 69(6), 1332–1347.
Watterson, G., 1977. Heterosis or neutrality? Genetics 85(4), 789.
Wright, S., 1938. The distribution of gene frequencies under irreversible mutation. Proc. Natl. Acad. Sci. USA 24(7), 253.
Wright, S., 1942. Statistical genetics and evolution. Bull. Am. Math. Soc. 48(4), 223–246.
Rights and permissions
Open Access This is an open access article distributed under the terms of the Creative Commons Attribution Noncommercial License (https://creativecommons.org/licenses/by-nc/2.0), which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
About this article
Cite this article
Xie, X. The Site-Frequency Spectrum of Linked Sites. Bull Math Biol 73, 459–494 (2011). https://doi.org/10.1007/s11538-010-9534-3
DOI: https://doi.org/10.1007/s11538-010-9534-3