The need for high-quality oocyte mitochondria at extreme ploidy dictates germline development

Selection against severe mitochondrial mutations is facilitated by germline processes, lowering the risk of genetic diseases. How selection works is disputed: experimental data are conflicting and previous modelling work has not clarified the issues. Here we develop computational and evolutionary models that compare the outcome of selection at the level of individuals, cells and mitochondria. Using realistic de novo mutation rates and germline development parameters, the evolutionary model accurately predicts the observed prevalence of mitochondrial mutations and diseases in human populations. We show that biogenesis of high-quality mitochondria at extreme ploidy in mature oocytes can only be achieved under realistic parameters through selective pooling of mitochondria into the Balbiani body. The principal mechanisms debated in the literature, bottlenecks and follicular atresia, fail to predict these clinical data, because neither process effectively eliminates mitochondrial mutations under realistic conditions. Our findings explain the major features of female germline architecture, notably the longstanding paradox of over-proliferation of primordial germ cells followed by massive germ cell loss. The near-universality of these processes across animal taxa makes sense in light of the need to maintain mitochondrial quality at extreme ploidy in mature oocytes, in the absence of sex and recombination.


INTRODUCTION
In mammals, mitochondrial gene sequences diverge at 10-30 times the mean rate of nuclear genes [1,2]. This difference is typically ascribed to a faster underlying mutation rate and limited scope for purifying selection on mitochondrial genes, given negligible recombination and high ploidy [3]. At face value, weak selection against mitochondrial mutations might seem to be consistent with the high prevalence of mitochondrial mutations (~1 in 200 people) [4] and diseases (~1 in 5000 births) [5] in the general population. But it is not consistent with the strong signal of purifying selection [6], evidence of adaptive change [7] and codon bias [8] in mitochondrial genes, nor with the low transmission rate of severe mitochondrial mutations between generations [9][10][11]. Despite the high rate of sequence divergence, female germline processes apparently facilitate selection against mitochondrial mutations, but the mechanisms are disputed and poorly understood [12].
Here we develop computational and evolutionary models that compare three hypotheses of germline mitochondrial inheritance and selection: (i) mitochondrial bottlenecks and selection at the individual level; (ii) follicular atresia, which imposes selection on primordial germ cells during development; and (iii) selective transfer of mitochondria into the Balbiani body. Our approach incorporates plausible modes of selection against mitochondrial mutations, given the high (but not constant) ploidy of mitochondrial DNA (mtDNA) through all stages of germline development. We also incorporate important factors neglected in earlier work, in particular the input of de novo mitochondrial mutations and their segregation over multiple rounds of germ-cell division. This provides a realistic model of mutation, segregation and selection allowing the three hypotheses to be tested against the observed levels of mitochondrial mutation and disease in human populations [4,5].
'nurse cells' during the genesis of primary oocytes [32]. In either case, differential oocyte loss offers scope for between-cell selection. However, the basis for between-cell selection has long been questioned, on the grounds that it seems unlikely that 70-80% of oocytes have low fitness as a result of mitochondrial mutations [33]. We therefore test whether selection against oocytes with higher loads of mitochondrial mutations during follicular atresia is capable of giving rise to the distribution of mutations observed in humans.
A more recent interpretation of germ-cell loss links it to the formation of the Balbiani body [34], a prominent feature of female germlines across invertebrates and vertebrates [32,35,36], including humans [34]. In the mouse and other mammals, proliferating germ cells form clusters of 5-8 cells that establish cytoplasmic bridges, through which around half the mitochondria from each nurse cell are streamed into the Balbiani body of the primary oocyte [34]. Cytoplasmic transfer is an active cytoskeletal process that depends in part on the membrane potential of discrete mitochondria [37,38], offering scope for purifying selection through the preferential exclusion of dysfunctional mitochondria. The remaining nurse cells, now denuded of half their mitochondria, undergo apoptosis [34]. We consider the consequence of different strengths of selection at the level of mitochondrial function in the production of the Balbiani body.
To systematically distinguish between the predictions of these three different hypotheses, under a range of reasonable parameter values, we use a computational model to evaluate the patterns of mutation load generated over a single generation in each case. We then use an evolutionary model to compare the predictions of our computational model with clinical data on the prevalence of mutations and disease from human studies. Our results show that germline bottlenecks and follicular atresia cannot alone explain the observed prevalence of mitochondrial mutations or disease. Only the selective pooling of high-quality mitochondria into the Balbiani body is likely to account for the clinical data. This process also pleasingly clarifies the longstanding paradox of germ-cell over-proliferation followed by massive loss that is almost universally conserved in the female germline of animal taxa.
bioRxiv preprint, doi: https://doi.org/10.1101/2020.09.03.280628; this version posted September 3, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. It is made available under a CC-BY 4.0 International license.

Computational model including realistic de novo mutational input
The computational model follows the distribution of mitochondrial mutations across a single generation, using model parameters derived from human data [39] (Figure 1). The zygote is assumed to have ~500,000 copies of mitochondrial DNA (exact number 2^19), which are randomly partitioned to the daughter cells at each cell division. We assume independent segregation of mitochondria with one mtDNA per mitochondrion, and do not consider complications that arise from the packaging of multiple mtDNA copies per mitochondrion [14].
Mitochondrial replication is not active during early embryo development [40], so the mean mitochondrial number per cell approximately halves with each division (Figure 1B). After 12 cell divisions a random group of 32 cells forms the primordial germ cells (PGCs) [41], with a mean of 128 mitochondria per PGC. Mitochondrial replication resumes at this point [39,40].
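The dilution arithmetic in this paragraph can be checked directly. The following is a minimal sketch, assuming that the mean copy number halves exactly at each division in the absence of replication; the function and constant names are illustrative and are not taken from the model code.

```python
ZYGOTE_COPIES = 2 ** 19  # 524,288 mtDNA copies, the "~500,000" quoted in the text


def mean_copies_per_cell(zygote_copies: int, divisions: int) -> float:
    """Mean mtDNA copies per cell after `divisions` rounds without replication."""
    return zygote_copies / 2 ** divisions


# Twelve divisions without replication give the stated mean of 128 copies per PGC.
print(mean_copies_per_cell(ZYGOTE_COPIES, 12))  # 128.0
```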
Each mitochondrion (and mtDNA) doubles prior to random partitioning at each cell division. With probability µ, one of the daughter mitochondria acquires a new deleterious mutation through a copying error. We consider µ in the range 10⁻⁹ to 10⁻⁷ per base pair per cell division, consistent with the range of estimates for the female germline, and assume no […] [42]. Mitotic proliferation of PGCs gives rise to ~8 million oogonia, which are reduced to ~1 million primary oocytes during late gestation (Figure 1) [39,40]. Proliferation is followed by a quiescent phase during which the mitochondria in primary oocytes are not actively replicated. Mutations accumulate far more slowly during this phase, which persists over decades in humans [40,43]. For simplicity, we assume no mutational input during this period (not marked in Figure 1). At puberty, the primary oocytes mature through clonal amplification of mitochondria back to the extreme ploidy of mature oocytes (~500,000 copies; Figure 1B) [44]. The same copying-error mutation rate µ is applied during this process.
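The replication-and-partition step described in this paragraph can be sketched as a single cell cycle. This is an illustration under stated assumptions rather than the model's implementation: each mtDNA is treated as simply wildtype or mutant, the argument u is a hypothetical per-copy probability of a copying error (the text quotes the rate per base pair, and no conversion is attempted here), and each daughter cell is assumed to receive exactly half of the copies, drawn without replacement.

```python
import random


def replicate_and_divide(n_wt: int, n_mut: int, u: float, rng: random.Random):
    """One cell cycle: copy every mtDNA, mutate some new copies, split at random.

    u is a hypothetical per-copy probability that the new copy of a wildtype
    mtDNA carries a deleterious mutation. Returns the (wildtype, mutant)
    counts inherited by one daughter cell.
    """
    # Each wildtype template yields one new copy, which mutates with probability u.
    new_mutants = sum(rng.random() < u for _ in range(n_wt))
    total_wt = 2 * n_wt - new_mutants
    total_mut = 2 * n_mut + new_mutants
    # Assumed partition rule: the daughter takes exactly half of all copies.
    pool = [0] * total_wt + [1] * total_mut
    daughter = rng.sample(pool, (total_wt + total_mut) // 2)
    daughter_mut = sum(daughter)
    return len(daughter) - daughter_mut, daughter_mut


rng = random.Random(1)
print(replicate_and_divide(120, 8, 1e-3, rng))
```

Repeated calls reproduce the two sources of between-cell variation that the model tracks: segregation of existing mutants and the chance input of new ones.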
We consider three different forms of selection on mitochondria: selection at the level of the organism, cells, or mitochondria. We apply selection at the level of the organism on the zygotic mutation load. Selection at the level of cells or mitochondria is applied during culling at late gestation when primary oocytes are produced. Each of these processes can be captured by modifications of the computational model, allowing easy comparison between them. The model extends earlier modelling work that considered segregational variation of a fixed burden of existing mutations [13,19-21] but neglected the input of new mutations during PGC proliferation and oocyte maturation, as well as the loss of germ cells during late gestation. The analysis here shows the importance of considering these additional processes governing the population of mitochondria in germline development.

Germline bottleneck increases variance but introduces more de novo mitochondrial mutations
The effect of a bottleneck was assessed in the model by allowing extra rounds of cell division without replication during early embryonic development (e.g., two extra rounds shown in Figure 2A). Each additional cell division halves the mean number of mitochondria in PGCs compared to the base model. We then held mitochondrial numbers at this lower value through the period of PGC proliferation, consistent with some views of the bottleneck [45]. Tighter bottlenecks at this early developmental stage generate greater segregational variance in mutation load between cells (Figure 2B). This increase in variance persists and is enhanced through PGC proliferation to the production of primary oocytes and ultimately mature oocytes (Figure 2B). The bottleneck not only creates a wider spread of mutation number per cell, but also the possibility that cells can be mutation free even when derived from a zygote that contains significant numbers of mutations (Figure 2B). Bottlenecks in themselves do not change the mean mutation load, as they occur before the start of mitochondrial replication (i.e. at PGC specification; Fig. 2B) [40]. But oocyte maturation requires the expansion of mitochondrial number back to half a million. Cells starting with lower numbers must therefore undergo more rounds of replication, and hence accumulate more de novo mutations. So the mean mitochondrial mutation load in mature oocytes increases as the bottleneck tightens, although this effect is small at low mutation rates (µ = 10⁻⁸; Fig. 2B). Nonetheless, the tension between variance and mean determines the overall selective consequence of the bottleneck.
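The cost side of this trade-off follows from simple arithmetic. The sketch below assumes that amplification back to 2^19 copies proceeds by synchronous doublings from the bottleneck value; the names are illustrative only.

```python
import math

MATURE_COPIES = 2 ** 19  # mtDNA copies required in the mature oocyte


def doublings_to_maturity(start_copies: float) -> float:
    """Rounds of doubling needed to expand start_copies to MATURE_COPIES."""
    return math.log2(MATURE_COPIES / start_copies)


# Each halving of the bottleneck size adds one further round of replication.
for bottleneck in (128, 64, 32, 16, 8):
    print(bottleneck, doublings_to_maturity(bottleneck))
```

Under this assumption the tightest bottleneck considered (8 copies) requires 16 rounds of doubling rather than the 12 needed from 128 copies, and each extra round is a further opportunity for copying errors.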
The advantage that the bottleneck brings depends on how selection acts against the mutation load carried by an individual. Based on the observed dependence of mitochondrial diseases on mutation load, in which more serious phenotypes typically manifest only at high mutant loads of >60% [46][47][48], it is thought that individual fitness is defined by a concave fitness function, indicative of negative epistasis (Figure 2C). This assumes that each additional mitochondrial mutation causes a greater reduction in fitness than expected from independent effects. In other words, low mutation loads have a relatively trivial fitness effect, whereas higher mutation loads produce a steeper decline in fitness.
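To make the idea of a concave fitness function concrete, the sketch below uses one simple functional form, fitness = 1 - load^xi. Both the form and the parameter name xi are assumptions chosen for illustration; the text specifies only that the function is concave under negative epistasis and linear in the limiting case.

```python
def fitness(load: float, xi: float) -> float:
    """Illustrative fitness for a mutation load in [0, 1].

    xi = 1 gives linear selection; xi > 1 gives a concave curve
    (negative epistasis), so low loads are nearly neutral.
    """
    return 1.0 - load ** xi


# Fitness cost of the same 0.1 increase in load, at low and at high load.
for xi in (1, 2, 5):
    low_cost = fitness(0.0, xi) - fitness(0.1, xi)
    high_cost = fitness(0.6, xi) - fitness(0.7, xi)
    print(xi, low_cost, high_cost)
```

With xi above 1 the cost of an extra increment of load is much larger at high load than at low load, which is the behaviour described for mitochondrial disease thresholds.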
The change in mutation load over a single generation after individual selection was measured for five mean bottleneck sizes (128, 64, 32, 16 and 8 mitochondria), three initial mutation loads and three mutation rates (µ). The bottleneck shows an ambiguous relationship with fitness, dependent on the inherited mutation load. For the estimated mutation rate (µ = 10⁻⁸) there is always an increase in mutation load in individuals who inherit low or medium mutation loads (initial load of 0.001 or 0.01; Figure 2D). This increase in load becomes more deleterious with a tighter bottleneck (Figure 2D). The bottleneck only confers a benefit among individuals who inherit a high mutation load (initial load of 0.1; Figure 2D […] S1B). In sum: even though bottlenecks generate greater variance, they impose the need for additional rounds of replication during oocyte maturation, resulting in greater de novo mutational input. This makes tight bottlenecks advantageous only for individuals who inherit high mutation loads, but not for the great majority of the population, where the prevalence of mitochondrial mutations is generally between 0.001 and 0.01 [4,14].

Follicular atresia cannot be explained by realistic selection against cells with high mitochondrial mutation loads
In the analysis of bottlenecks above, the culling of ~8 million oogonia to 1 million primary oocytes at the end of PGC proliferation was assumed to be a random process (Figure 2A).
This loss has a minimal effect on the mean and variance in frequency of mitochondrial mutations in germ cells, given the large numbers involved (and no effect at all when averaged over a population). However, the loss of ~80% of oocytes via follicular atresia during late gestation has long been puzzling and could arguably reflect selection against cells with higher mutation loads.
To analyse follicular atresia, cell-level selection was applied to oogonia at the end of PGC proliferation (Figure 3A). PGCs vary in mutation frequency due to both the random segregation of mutants during the multiple cell divisions of proliferation and the chance input of new mutations during mtDNA replication. We assume that between-cell selection is governed by a negative epistatic fitness function (Figure 3B) similar to that thought to apply at the individual level, and vary the epistasis parameter from linear selection (a value of 1) through weak (2) to strong epistasis (5). Positive epistasis (values below 1), whereby a single point mutation produces a steep loss of fitness but additional mutations have less impact (i.e. mutations are less deleterious in combination), seems biologically improbable, so we do not consider it here.
The effect of cell selection during follicular atresia was calculated as the change in mutation frequency over a single generation for individuals carrying different initial mutation loads, given standard values for the de novo mutation rate (µ = 10⁻⁸) and mean bottleneck size (128 mitochondria).
Under strong negative epistasis (epistasis parameter of 5), only the few cells with very high mutation loads (generated by segregation) are eliminated. Cell-level selection does not reduce mutation load, even for individuals with a high initial frequency of mutations (initial load of 0.1; Figure 3C).
Cell-level selection is more effective with weak epistasis (parameter of 2) or linear selection (parameter of 1), as this makes cells with lower mutation loads more visible to selection, and has a greater benefit in individuals carrying higher initial mutation loads (Figure 3C). However, in individuals who inherit a low or medium mutation load (initial load of 0.001 or 0.01), cell selection offers a minimal constraint against mutation input. The only case in which cell selection produces a benefit is with a high mutation load (initial load of 0.1) under linear selection (Figure 3C). This pattern holds for a lower mutation rate (µ = 10⁻⁹; Figure S2A), while there is no benefit at all at a higher mutation rate (µ = 10⁻⁷; Figure S2B).
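The culling step analysed in this section can be sketched as fitness-weighted sampling of surviving cells. This is a toy illustration rather than the model itself: the fitness form 1 - load^xi and the name xi are assumptions, survivors are drawn with replacement for simplicity, and the starting distribution of loads is arbitrary.

```python
import random


def fitness(load: float, xi: float) -> float:
    """Illustrative cell fitness; xi = 1 is linear, xi > 1 is concave."""
    return 1.0 - load ** xi


def cull_cells(loads, n_keep: int, xi: float, rng: random.Random):
    """Draw n_keep surviving cells with probability proportional to fitness."""
    weights = [fitness(load, xi) for load in loads]
    return rng.choices(loads, weights=weights, k=n_keep)


rng = random.Random(0)
# Arbitrary toy population of 8,000 oogonia with mostly low mutation loads.
oogonia = [rng.betavariate(1, 20) for _ in range(8000)]
for xi in (1, 2, 5):
    survivors = cull_cells(oogonia, 1000, xi, rng)
    print(xi, sum(survivors) / len(survivors))
```

Because concave fitness is close to 1 for all but the most heavily loaded cells, the sampling weights are nearly uniform when xi is large, which is the intuition behind the weak effect of cell-level selection under strong epistasis.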

The Balbiani body pools high-quality mitochondria and restricts de novo mutation input
An alternative interpretation of atresia lies in the formation of the Balbiani body, a nearly universal feature of female germlines in animals [49]. We model the developmental process giving rise to the Balbiani body by assuming that cysts of 8 oogonia form at the end of PGC proliferation (Figure 4A), as is typical in mammalian development [32,50]. Cells within a cyst are derived from a common ancestor (i.e. via 3 consecutive cell divisions). At the 8-cell stage, intercellular bridges form between the oogonia. These allow cytoplasmic transfer of a fixed proportion of mitochondria from each cell to join the Balbiani body of the single cell destined to become the primary oocyte (Figure 4A). The mitochondria that undergo cytoplasmic transfer are sampled at random (without replacement), with different weights for wildtype and mutant mitochondria, until that proportion has moved to the Balbiani body. The oogonia that donate their cytoplasm to the primary oocyte are now defined as nurse cells, and undergo programmed cell death (atresia; Figure 4A).
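The weighted sampling without replacement described here can be sketched for a single donor cell as follows. The function and argument names are hypothetical, and drawing one mitochondrion at a time with probability proportional to weight times remaining count is one straightforward reading of the description, not necessarily the procedure used in the model.

```python
import random


def transfer_to_balbiani(n_wt: int, n_mut: int, fraction: float,
                         w_wt: float, w_mut: float, rng: random.Random):
    """Move a fraction of one cell's mitochondria, one at a time.

    At each draw a wildtype mitochondrion is chosen with probability
    proportional to w_wt times the wildtype count still in the cell,
    and likewise for mutants with w_mut.
    Returns the (wildtype, mutant) counts transferred.
    """
    n_move = round(fraction * (n_wt + n_mut))
    moved_wt = moved_mut = 0
    for _ in range(n_move):
        weight_wt = w_wt * (n_wt - moved_wt)
        weight_mut = w_mut * (n_mut - moved_mut)
        total = weight_wt + weight_mut
        if total == 0:
            break  # nothing left that is eligible for transfer
        if rng.random() * total < weight_wt:
            moved_wt += 1
        else:
            moved_mut += 1
    return moved_wt, moved_mut


rng = random.Random(0)
print(transfer_to_balbiani(100, 28, 0.5, 1.0, 1.0, rng))   # unbiased transfer
print(transfer_to_balbiani(100, 28, 0.5, 1.0, 0.25, rng))  # mutants disfavoured
```

Setting the mutant weight below the wildtype weight leaves a larger share of mutants behind in the nurse cell, which is then lost through apoptosis.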
The model shows that two benefits accrue from cytoplasmic transfer. The first is that pooling increases the number of mitochondria in primary oocytes. As the proportion of mitochondria transferred increases towards the estimated rate of 50% [32], the number of mitochondria in primary oocytes increases 4-fold. Pooling therefore cuts the number of rounds of replication needed to reach the extreme ploidy required by mature oocytes, which decreases the input of new mutations from replication errors during oocyte maturation. This benefit accrues whatever the initial mutation load, and more dramatically with a higher mutation rate (Figure S3).
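The pooling benefit can be expressed as a reduction in the number of doublings needed during oocyte maturation. The sketch below follows the accounting used in the text, in which a cyst of eight cells each contributing half of its mitochondria yields a 4-fold increase; the function names and the assumption of synchronous doublings are illustrative.

```python
import math

MATURE_COPIES = 2 ** 19  # mtDNA copies required in the mature oocyte


def pooled_copies(copies_per_cell: float, cyst_size: int, fraction: float) -> float:
    """Copies in the Balbiani body if every cyst cell contributes `fraction`."""
    return cyst_size * fraction * copies_per_cell


def doublings_needed(start_copies: float) -> float:
    """Rounds of doubling from start_copies up to MATURE_COPIES."""
    return math.log2(MATURE_COPIES / start_copies)


without_pooling = doublings_needed(128)
with_pooling = doublings_needed(pooled_copies(128, cyst_size=8, fraction=0.5))
print(without_pooling, with_pooling)  # 12.0 10.0
```

On these assumptions a 4-fold gain in starting copy number removes two rounds of replication, and with them two rounds of potential copying errors.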
The second benefit arises from selective transfer of mitochondria. Preferential exclusion of mutant mitochondria (a lower transfer weight for mutant than for wildtype mitochondria), as suggested by experimental evidence [32], lowers the mutation load in primordial oocytes (Figure 4B). The difference between the two weights determines the extent to which the mutation load is reduced, with stronger exclusion of mutant mitochondria (a lower mutant weight) reducing the number of mutations when the inherited load is medium or high (initial load of 0.01 or 0.1), albeit with a negligible effect at low initial mutation load (0.001; Figure 4C). The same effect is seen with lower and higher mutation rates (Figure S4). Nurse cells retain a higher fraction of mutant mitochondria but undergo apoptosis, removing mutants from the pool of germ cells and explaining the need for an extreme loss of germ cells during late gestation. This effect acts in concert with pooling, leading to a reduction in both the mean and variance of mitochondrial mutation load (Figure 4B).

Evolutionary model
By iterating the patterns of germline inheritance and selection, the equilibrium mutation distribution was calculated across a range of mutation rates and bottleneck sizes. The accuracy of the three models was then assessed as the likelihood of reproducing the observed levels of mitochondrial mutations in the human population (Fig. 5). Specifically, we used estimated values of 1/5000 for mitochondrial disease (>60% mutant) and 1/200 for carriers of mitochondrial mutants (2-60% mutant), so that ~99.5% of individuals are 'mutation free' (i.e. carry <2% mutants, the threshold for detection in these estimates of mutation frequency) (REFS). Recent deep-sequencing estimates using a mutation detection threshold of >1% [14] show that a minor allele frequency of 1-2% is relatively common in selected human PGCs, but this does not alter earlier population-level estimates of the proportion of carriers not suffering from overt mitochondrial disease, defined as the 2-60% mutation load used here. […] Figure 5D).
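The three clinical categories used for this comparison can be written down explicitly. The sketch below classifies individual mutation loads with the 2% and 60% thresholds stated above and tabulates class frequencies alongside the observed targets; the names are illustrative, the handling of loads exactly at a threshold is an assumption, and the likelihood calculation itself is not reproduced.

```python
# Observed population targets quoted in the text.
OBSERVED = {"disease": 1 / 5000, "carrier": 1 / 200}
OBSERVED["mutation free"] = 1 - OBSERVED["disease"] - OBSERVED["carrier"]


def classify(load: float) -> str:
    """Assign a mutation load (fraction mutant, 0 to 1) to a clinical class."""
    if load > 0.60:
        return "disease"
    if load >= 0.02:
        return "carrier"
    return "mutation free"


def class_frequencies(loads):
    """Fraction of individuals falling in each class."""
    counts = {"disease": 0, "carrier": 0, "mutation free": 0}
    for load in loads:
        counts[classify(load)] += 1
    return {name: n / len(loads) for name, n in counts.items()}


# Toy example with four individuals; real use would pass the equilibrium
# distribution of loads produced by a model.
print(class_frequencies([0.0, 0.0, 0.1, 0.9]))
print(OBSERVED)
```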

DISCUSSION
How selection operates on mitochondria has long been controversial. At the heart of this problem is the paradox that mtDNA accumulates mutations faster than nuclear genes, yet is under stronger purifying selection. Mitochondrial mutations accumulate through Muller's ratchet, as mtDNA is exclusively maternally inherited, and does not undergo recombination through meiosis [3]. In addition, mitochondrial genes are highly polyploid, which obscures the relationship between genotype and phenotype, hindering the effectiveness of selection on individuals. Despite these constraints, deleterious mitochondrial mutations seem to be eliminated effectively [6][7][8][9][10][11], facilitated by female germline processes that have long been mysterious. These include: the excess proliferation of primordial germ cells (PGCs) [51]; the germline mitochondrial bottleneck (when mitochondrial numbers are reduced to a disputed minimum in PGCs) [13][14][15]; the formation of the Balbiani body ('mitochondrial cloud') in primary oocytes [32,38]; the atretic loss of 70-80% of germ cells during late gestation [26,30]; the extended oocyte quiescence until puberty or later (during which time mitochondrial activity and replication are suppressed) [43,52]; and the generation of around half a million copies of mtDNA in mature oocytes [44]. The key question is: how do these processes facilitate the maintenance of mitochondrial quality over generations?
In this study, we introduced a computational model that considers these germline processes from the perspective of mitochondrial proliferation, segregation and selection, using realistic estimates of parameter values, drawn from the human literature [39,40]. Most work to date [13,15,45,53,54] has focused on the mitochondrial bottleneck as a means of generating variation in mitochondrial content between oocytes and by extension zygotes (Figure 2B), furnishing the opportunity for selection to act on individuals in the following generation. These studies have been unable to reconcile serious differences in experimental estimates of mitochondrial numbers during PGC proliferation, inciting inconclusive debates over the tightness of the bottleneck [13,15,45,53,54]. More significantly, this earlier work neglects an important germline feature, the introduction of de novo mitochondrial mutations produced by copying errors [55] rather than damage by reactive oxygen species [42,56]. These accumulate during PGC proliferation and, equally importantly, during the mass-production of mtDNAs in the mature oocyte. Tighter bottlenecks are disadvantageous as they impose the need for more rounds of mitochondrial replication, which means a greater input of de novo mutations. Our modelling shows that for most individuals the mean mutation load shows little meaningful change (Figure 2D), regardless of whether the mutation rate is set low or high (Figure S1), and in fact increases with tighter bottleneck size (Figure 2D). Most individuals have low mutation loads (~99.5% in human populations [4,5]), and for them, the normal process of repeated segregation during cell division generates sufficient variance in itself.
Any marginal increase in variance caused by bottlenecks is more than offset by increased mutational input. Tighter bottlenecks only benefit individuals who already carry high mutation loads (i.e. an inherited load of 0.1 or more; Figure 2D). For them, there is benefit in further reductions in bottleneck size, as this increases the fraction of mature oocytes with significantly reduced mutation load (Figure 2D).
These results show that the popular idea that a germline mitochondrial bottleneck facilitates selection against mitochondrial mutations is misconstrued. The value of a bottleneck depends on the unforeseen trade-off between increasing genetic variance and mutation input. In fact, the reduction in mitochondrial copy numbers from zygote to primordial germ cells should be thought of as the reestablishment of a typical copy number at the start of cellular differentiation, which commences after multiple cell divisions without mtDNA replication. What counts as a bottleneck are the 'extra' rounds of cell division reducing mitochondrial number below the 'normal' number, and the incremental increase in variance this induces. Most critically, the bottleneck needs to be understood in relation to oogamy, the massively exaggerated mitochondrial content of the female gamete. This is a characteristic of metazoan gametogenesis [44]. Previous work has shown it is beneficial in animals with mutually interdependent organ systems [44]. The extreme ploidy in the zygote allows early rounds of cell division to occur without replication, and hence without de novo mutational input. These initial cell divisions generate few between-cell differences, as segregational variance is weak when numbers are high (e.g. Fig. 1B before PGC specification). So at the point of cellular differentiation (~12 cell divisions) there is homogeneity in the mutation load among the different organ systems and no one system is likely to fail, which would massively lower the fitness of the whole organism [44].
This contrasts with organisms that have modular growth, such as plants and basal metazoa (sponges, corals, placozoa), which neither sequester a recognizable germline nor have oocytes with massively expanded mitochondrial numbers [44,57].
Follicular atresia is another female germline feature examined in our modelling, in which there is over-proliferation of PGCs followed by ~80% loss early in development, before oocyte maturation [26,30]. This massive reduction in germ cell number has long been enigmatic, for it is unlikely to be random, yet does not obviously serve a selective function, as it seems unlikely that such a high proportion of germ cells could have low fitness [23][24][25].
The model confirms this intuition. Selection among PGCs at the end of the period of proliferation has little effect in reducing mutation load (Figure 3C). Assuming a concave fitness function (Figure 3B), which seems reasonable by extension from the severity of mitochondrial diseases [46,48], between-cell selection is ineffective, as it only eliminates PGCs with very high mutational numbers. This has little effect in constraining the burgeoning of lower mutation loads. Linear selection does better, even if it seems unrealistic, as it will act against a broader range of mutational states. But as with bottlenecks, it is only beneficial in individuals already carrying significant mutation loads (i.e. an inherited load of 0.1 or more; Figure 3C). We conclude that cell-level selection produces little measurable reduction in mutation load and so is unlikely to be responsible for follicular atresia.
A more recent explanation of PGC loss relates to the formation of the Balbiani body in primary oocytes [32,34]. In many metazoa, including clams [36], insects [35,58], mice [50] and probably humans [34], the over-proliferation of PGCs culminates in their organization into germline cysts of multiple oogonia connected by cytoplasmic bridges [32,50,58]. These connections allow the transfer of mitochondria and other cytoplasmic constituents by active attachment to microtubules, into what becomes the primary oocyte [32]. The surrounding oogonia that transferred their mitochondria, now termed nurse cells, die by apoptosis [32].
The plethora of terms should not mask the key point that nurse cell death accounts for a considerable fraction of the germ cell loss usually ascribed to follicular atresia. We modelled selective mitochondrial transfer into the Balbiani body, perhaps in part reflecting membrane potential [35,38]. This achieves two complementary things: it purges mutations and pools high-quality mitochondria in a single cell. If the germline cyst is composed of eight cells that contribute half of their mitochondria to the Balbiani body, then the primary oocyte gains four times as many mitochondria, cutting the need for additional rounds of mtDNA copying, and so reducing de novo mutations during oocyte maturation. Selective transfer and pooling lower the mutation load across a wide range of mutation rates and inherited loads (Figure 4C, Figures S3 and S4). This process differs from mitophagy, the main route used in somatic cells for maintaining mitochondrial quality [59,60], as it not only removes mutant mitochondria, but crucially also increases mitochondrial numbers, a key requirement for prospective gametes. The requirement for pooling of mitochondria to lower the mutation load from copying errors also aligns with experimental observations of active spindle-associated mitochondrial migration to the generative oocyte in the formation of polar bodies during meiosis I of oogenesis [61]. We predict that selection for mitochondrial quality occurs during this process (i.e. polar bodies retain mutant mitochondria) but have not dealt with that explicitly in the model.
These insights depend in part on the parameter values used in the modelling, many of which are uncertain. We have examined variation around the most representative values drawn from the literature [2,14,62,63], and aimed to be conservative wherever possible. We considered mutation rates across two orders of magnitude, around 10⁻⁸ per bp as the standard [62], and a similar range of bottleneck sizes (means of 8 to 128 mitochondria). Strong selective pooling of mitochondria into the Balbiani body predicts the observed prevalence of mitochondrial mutations and diseases in human populations [4,5] under a wide range of mutation rates and bottleneck sizes (Figure 5). Selection at the level of individuals or cells is a much more constrained explanation, although we do not rule out some role for these processes (Figure 5). In general, higher mutation rates (10⁻⁷ per base pair) strengthen the conclusions discussed here (Figures S1-S4), whereas the lowest mutation rates are more commensurate with weaker forms of evolutionary constraint generated by selection on individuals or cells.
Plainly, weaker selection approximates best to clinical data when the mutation input tends towards zero (Figure 5). However, such low mutation rates are not consistent with the 10-30-fold faster evolution rates of mtDNA compared with nuclear genes [1,2], or with the strong signatures of purifying [6] and adaptive [7] selection on mitochondrial genes. In addition, we have ignored the contribution of oxidative damage caused by reactive oxygen species. While this is likely low compared with copying errors [42,55], oxidative mutations may accumulate over female reproductive lifespans [56], perhaps contributing to the timing of the menopause [64]. As primary oocytes contain ~6000 mitochondria [64], expansion up to ~500,000 copies in the mature oocyte will amplify any mutations acquired during oocyte arrest at prophase I, potentially over decades [55]. The metabolic quiescence of oocytes can best be understood in light of the need to repress mitochondrial mutation accumulation during the extended period before reproduction [43,52]. In any case, our assumption that zero mutations are caused by oxidative damage is plainly conservative.
We have addressed here a simple paradox at the heart of mitochondrial inheritance. Like Gibbon's Decline and Fall of the Roman Empire, mitochondrial DNA is often portrayed as being in continuous and implacable decline through Muller's ratchet [3]; yet like the Empire, which endured for another millennium, mitochondrial DNA has persisted and has been at the heart of eukaryotic cell function for over a billion years [65]. Strong evidence for purifying and adaptive selection implies that the female germline facilitates selection for mitochondrial quality, but the mechanisms have remained elusive. We have modelled segregation and selection of mitochondrial DNA at each stage of germline development, and shown that direct selection for mitochondrial function during transfer into the Balbiani body is the most likely explanation of the observed prevalence of mitochondrial mutations and diseases in human populations. More remarkably, this mitochondria-centric model elucidates the complexities of the female germline. It explains why mature oocytes are crammed with mitochondria [44], whereas sperm mitochondria are typically destroyed, giving rise to two sexes [22]; why germ cells over-proliferate during early germline development; why oogonia organize themselves into germline cysts, forming the Balbiani body; why the majority of germ cells then perish by apoptosis as nurse cells; why primary oocytes enter metabolic quiescence, sometimes for decades; and even why polar bodies channel most of their mitochondria into a single mature oocyte.
Most fundamentally, this perspective challenges the claim that complex multicellularity requires passage through a single-celled, haploid stage to constrain the emergence of lower-level, selfish genetic elements [66,67]. This is true for nuclear genes in oocytes, whose quality is maintained by sexual exchange and recombination [67], but is not the case for mitochondria, which are transmitted uniparentally, without sexual exchange or recombination. In animals, the oocyte cytoplasm is not derived from a single cell, but instead requires the selective pooling of mitochondrial DNA from clusters of progenitor cells, which together generate high-quality mitochondria at extreme ploidy in mature gametes.
bioRxiv preprint doi: https://doi.org/10.1101/2020.09.03.280628; this version posted September 3, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. It is made available under a CC-BY 4.0 International license.

Initial conditions
We use a computational model to follow the distribution of mitochondrial mutations in the female germline over a single generation from zygote to a new set of mature oocytes, as set out in the developmental history given in the main text (Fig. 1A). The initial state of the system is a zygote containing $M_0 = 2^{19}$ = 524,288 copies of mtDNA, of which $m_0$ carry a deleterious mutation. Three specific models are considered: bottleneck, follicular atresia and cytoplasmic transfer. A list of terms and parameter values is given in Table 1, which also applies to the evolutionary model considered below.

Parameters and variables
Symbols and values

[...] again are randomly drawn from a binomial distribution (as described above).

Specific models of selection
We consider three specific models in the main text with modifications to the base model described above.
The first model adds a bottleneck stage at the time of PGC determination (Fig. 2). As before, [...]

A second model considers non-random death during the cull of oogonia as these cells transition to being primary follicles (Fig. 3). Selection in this case is applied at the cell level.
Cell fitness is expressed as $f(m) = 1 - (m/M)^{\xi}$, where $m$ is the number of mutant mitochondria among the $M$ copies in the cell. The parameter $\xi$ determines the strength of epistatic interactions (Fig. 3B).
As in other models, the number of cells is reduced from $N$ = 8,388,608 to $N/8$ = 1,048,576. This is achieved by sampling the surviving cells at random without replacement, with weights proportional to cell fitness (i.e., every cell has a probability of survival proportional to its fitness).
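The fitness-weighted cull described above can be sketched as follows. This is a minimal illustration on a small population of individually tracked cells, not the implementation used for the results: the population size, the copy number per cell, the exponent value and the names `cell_fitness`, `cull` and `xi` are all assumptions made for the example.

```python
import numpy as np

def cell_fitness(m, M, xi):
    """f(m) = 1 - (m / M)**xi for a cell carrying m mutants among M mtDNA copies."""
    return 1.0 - (np.asarray(m, dtype=float) / M) ** xi

def cull(mutant_counts, n_survivors, M, xi, rng):
    """Pick surviving cells at random without replacement,
    with sampling weights proportional to cell fitness."""
    weights = cell_fitness(mutant_counts, M, xi)
    idx = rng.choice(len(mutant_counts), size=n_survivors,
                     replace=False, p=weights / weights.sum())
    return mutant_counts[idx]

rng = np.random.default_rng(0)
M = 128                                    # hypothetical mtDNA copies per cell
cells = rng.binomial(M, 0.3, size=8192)    # toy population of mutant counts
cells[0] = M                               # one fully mutant cell, fitness zero
survivors = cull(cells, len(cells) // 8, M, xi=2.0, rng=rng)
```

A cell with zero fitness has zero sampling weight and so can never survive the cull, while cells with fewer mutants are proportionally more likely to be kept.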
The coefficient $\binom{k}{j}\binom{2M-k}{M-j} / \binom{2M}{M}$ models the probability of transitioning from a state with $k$ mutants and $2M-k$ wild type to a state with $j$ mutants and $M-j$ wild type via the segregation of $2M$ mitochondria into two daughter cells with $M$ mitochondria each; the remaining part of the equation models the probability of reaching a state with $k$ mutant mitochondria through replication and mutation of $M$ mitochondria, some of which are already mutant (this corresponds to the probability of introducing the required number of new mutations). The system is updated by applying this transition matrix to the frequency distribution of mutation loads once per round of PGC cell division, taking the distribution at stage (1) to the distribution at stage (2). We then apply particular processes to capture the effects of the bottleneck, follicular atresia and cytoplasmic transfer.
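The segregation step described above is a hypergeometric draw, and can be sketched as below. The index names (`j`, `k`, `M`) and the function name are our own choices for illustration and may differ from the notation of the original equation; the replication and mutation term is not included.

```python
from math import comb

def segregation_prob(j, k, M):
    """Probability that a daughter cell receives j mutants when a parent holding
    k mutants among 2M mtDNA copies divides into two daughters of M copies each."""
    if not 0 <= j <= M:
        return 0.0
    # comb() returns 0 when asked for more items than are available,
    # so impossible outcomes get probability zero automatically.
    return comb(k, j) * comb(2 * M - k, M - j) / comb(2 * M, M)

M, k = 4, 3
probs = [segregation_prob(j, k, M) for j in range(M + 1)]
mean_mutants = sum(j * p for j, p in enumerate(probs))
```

The probabilities sum to one and the expected number of mutants per daughter is k/2, so segregation alone redistributes mutants between cells without changing the mean load.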
As before, we model the bottleneck as extra rounds of segregation before the onset of mtDNA replication, following Eq(1) with $b$ additional cell divisions. This has no effect on the mean mutational number but increases mutational variance between the resulting PGCs. The transition between oogonia and primary oocytes occurs at random, and so does not alter the frequency distribution of mutants. Finally, during oocyte maturation, the mtDNA content of each cell doubles at every time step until the initial ploidy $M_0$ is restored. The transition matrix, Eq(3), is analogous to the first term of Eq(2), incorporating replication and mutation, but without segregation (the last term of Eq(2)). Eq(3) models the probability of moving from one number of mutants to a larger one, which is equivalent to the probability that exactly the required number of wild-type copies acquire a deleterious mutation. As the bottleneck reduces mtDNA copy number per cell, extra rounds of mtDNA replication are needed during oocyte maturation. Hence, the transition coefficient is applied $b + 12$ times in the bottleneck model, to restore the number of mtDNA copies per oocyte to the original ploidy level $M_0$; the distribution at stage (3) is the product of these per-step transition matrices applied to the distribution at stage (2).
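The statement that extra rounds of segregation leave the mean unchanged but increase the variance between cells can be checked with a small simulation. This is a sketch under stated assumptions: cells are tracked individually rather than as a frequency distribution, the starting values (128 copies, 32 mutants) are arbitrary, and `b` is used here for the number of extra segregation rounds.

```python
import numpy as np

def segregate_without_replication(mutants, copies, rounds, rng):
    """Halve the copy number `rounds` times; each daughter draws its mtDNA
    at random (hypergeometric sampling) from the parent cell."""
    for _ in range(rounds):
        half = copies // 2
        mutants = rng.hypergeometric(mutants, copies - mutants, half)
        copies = half
    return mutants, copies

rng = np.random.default_rng(1)
start_copies, start_mutants = 128, 32      # illustrative values only
cells = np.full(20_000, start_mutants)     # every cell starts at 25% mutant load

stats = {}
for b in (0, 2, 4):
    m, c = segregate_without_replication(cells.copy(), start_copies, b, rng)
    load = m / c
    stats[b] = (load.mean(), load.var())   # (mean load, variance between cells)
```

In this toy run the mean mutation load stays close to 25% for every value of `b`, while the variance between cells grows as the bottleneck tightens.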
At the end of the maturation phase, for the bottleneck model, selection is applied on individual fitness using a vector whose elements are equal to the corresponding fitness, $f(m) = 1 - (m/M_0)^{5}$. This causes a change in the population mutation load as the system is updated according to Eq(4), where $I$ is the identity matrix.
In the model of follicular atresia, an extra step is included to reflect selection that operates when the population of oogonia is culled to produce the primary oocytes. This causes a change in the population mutation load analogous to that described in Eq(4).

For all three models (bottleneck, follicular atresia and cytoplasmic transfer), the frequency distribution of mutation loads after these steps is used as the starting point for the next generation.

Evolutionary dynamics and model accuracy
The processes described above are iterated until the Kullback-Leibler divergence (a theoretical measure of how two probability distributions differ from each other [69]) between the new and the old distribution is smaller than a threshold of $10^{-9}$. We then assume that the system has reached a stationary state, i.e. one without significant changes in the overall distribution of mutation loads between generations (mutation-selection balance).
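The stopping rule above can be sketched as a generic fixed-point iteration. The one-generation update is represented here by a hypothetical `step` function, and the three-state transition matrix is a toy stand-in rather than the germline model; the direction of the divergence (new relative to old) and the floor used to avoid division by zero are our assumptions.

```python
import numpy as np

def kl_divergence(p, q, tiny=1e-300):
    """Kullback-Leibler divergence D(p || q) for discrete distributions.
    Terms with p == 0 contribute nothing; q is floored to avoid dividing by zero."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / np.maximum(q[mask], tiny))))

def iterate_to_equilibrium(step, p0, tol=1e-9, max_generations=100_000):
    """Apply `step` once per generation until the divergence between the new
    and the old distribution falls below `tol`."""
    p = np.asarray(p0, dtype=float)
    for generation in range(1, max_generations + 1):
        p_new = step(p)
        if kl_divergence(p_new, p) < tol:
            return p_new, generation
        p = p_new
    raise RuntimeError("distribution did not converge")

# Toy column-stochastic matrix standing in for one generation of the model.
T = np.array([[0.90, 0.20, 0.10],
              [0.08, 0.70, 0.30],
              [0.02, 0.10, 0.60]])
p_eq, generations = iterate_to_equilibrium(lambda p: T @ p, [0.0, 0.0, 1.0])
```

The returned distribution is, to within the tolerance, unchanged by a further generation, which is the sense in which the text treats it as stationary.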
To compare the predictions of the model with the clinical data, we use the equilibrium distribution to calculate two quantities: the fraction of the population that carries a detectable load of mitochondrial mutations but does not manifest any detrimental phenotype, and the fraction of individuals affected by mitochondrial disease. A threshold of 60% mutation load is used to discriminate between carrier and disease status.
Individuals with a mutation load below the detection threshold of 2% are counted as mutation free [4].
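The two summary fractions can be read off an equilibrium distribution as sketched below. The distribution is indexed by the number of mutant copies, and the treatment of loads falling exactly on a threshold (counted in the higher class) is our assumption, as are the function name and the toy distribution.

```python
import numpy as np

def carrier_and_disease_fractions(dist, detect=0.02, disease=0.60):
    """Split a distribution over mutant counts 0..M into the carrier fraction
    (detectable load but below the disease threshold) and the affected fraction."""
    dist = np.asarray(dist, dtype=float)
    M = len(dist) - 1
    load = np.arange(M + 1) / M            # mutation load for each mutant count
    carriers = dist[(load >= detect) & (load < disease)].sum()
    affected = dist[load >= disease].sum()
    return float(carriers), float(affected)

# Toy distribution over 0..100 mutant copies, for illustration only.
toy = np.zeros(101)
toy[0] = 0.90     # mutation free
toy[1] = 0.05     # 1% load: below detection, counted as mutation free
toy[10] = 0.04    # 10% load: carrier
toy[80] = 0.01    # 80% load: disease
carriers, affected = carrier_and_disease_fractions(toy)
```

These two numbers are the quantities compared with the clinical prevalence of carriers and of mitochondrial disease.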

Estimation of the deleterious mutation rate
The parameter values for the deleterious mutation rate we investigate reflect data collected from a number of species. Estimates of mtDNA point mutation rates in the crustacean Daphnia pulex range between $1.37 \times 10^{-7}$ and $2.28 \times 10^{-7}$ per site per generation [70].
Assuming this rate applies to humans and that there are ~20 cell divisions before oocyte maturation, this gives a range between $0.68 \times 10^{-8}$ and $1.14 \times 10^{-8}$ per site, per cell division. Analysis of Caenorhabditis elegans mtDNA leads to a similar estimate of ~$1.6 \times 10^{-7}$ per site, per generation [71], which corresponds to a rate of $0.8 \times 10^{-8}$ per site, per cell division. [...] [72]. Finally, analysis of human mtDNA point mutation rates gives a mutation rate of 0.0043 per genome per generation [62], corresponding to ~$1.3 \times 10^{-8}$ mutations per site, per cell division.
These values do not take into account a number of processes likely to remove mutants, and are therefore conservative estimates. The loss of mutations would mean that the actual mutation rate is higher than the estimates above. But unlike nuclear DNA, the compact structure of mtDNA, where intergenic sequences are absent or limited to a few bases, means that the rate of point mutations is probably not much higher than the rate of deleterious mutations. Therefore, for this study we consider a broad interval of possible deleterious mutation rates, ranging between $10^{-9}$ and $10^{-7}$.
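The per-division rates quoted in this section follow from dividing each per-generation estimate by the assumed ~20 cell divisions. The short check below reproduces that arithmetic; the human mtDNA length of 16,569 bp is not stated in the text and is supplied here as an assumption to convert the per-genome rate to a per-site rate.

```python
DIVISIONS_PER_GENERATION = 20   # ~20 cell divisions before oocyte maturation
HUMAN_MTDNA_BP = 16_569         # assumption: length of the human mtDNA reference sequence

def per_site_per_division(rate_per_site_per_generation,
                          divisions=DIVISIONS_PER_GENERATION):
    """Convert a per-site, per-generation rate to a per-site, per-division rate."""
    return rate_per_site_per_generation / divisions

daphnia_low = per_site_per_division(1.37e-7)
daphnia_high = per_site_per_division(2.28e-7)
c_elegans = per_site_per_division(1.6e-7)
# The human estimate is given per genome, so divide by genome length first.
human = per_site_per_division(0.0043 / HUMAN_MTDNA_BP)
```

All four values fall close to 10^-8 per site, per cell division, which is why that figure sits in the middle of the interval explored.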