The limits to parapatric speciation 3: evolution of strong reproductive isolation in presence of gene flow despite limited ecological differentiation

Gene flow tends to impede the accumulation of genetic divergence. Here, we determine the limits for the evolution of postzygotic reproductive isolation in a model of two populations that are connected by gene flow. We consider two selective mechanisms for the creation and maintenance of a genetic barrier: local adaptation leads to divergence among incipient species due to selection against migrants, and Dobzhansky–Muller incompatibilities (DMIs) reinforce the genetic barrier through selection against hybrids. In particular, we are interested in the maximum strength of the barrier under a limited amount of local adaptation, a challenge that many incipient species may initially face. We first confirm that with classical two-locus DMIs, the maximum amount of local adaptation is indeed a limit to the strength of a genetic barrier. However, with three or more loci and cryptic epistasis, this limit holds no longer. In particular, we identify a minimal configuration of three epistatically interacting mutations that is sufficient to confer strong reproductive isolation. This article is part of the theme issue ‘Towards the completion of speciation: the evolution of reproductive isolation beyond the first barriers’.

We assume that the loci A, B and C are in this order along the chromosome. The recombination rate between loci A and B is given by r AB and between B and C by r BC . In addition, we provide a decomposition of the system of equations, using . x 2 as an example and emphasizing which term corresponds to which biological mechanism, equation (S3).
For loose linkage, the system of equations given in eq. (S1) reduces to equation (S4).

S 1.2 2-locus model
For the two locus model in loose linkage, the expression for m Ab max has been previously derived (see Bank et al. [2012], equation 11) for negative epistasis( AB < 0). Below, we provide this expression using our notation: (S5)

S 1.2.1 Extension to positive epistasis
We now consider two epistatically interacting loci, A and B in loose linkage. All results for maximal migration rates can be derived explicitly (for negative epistasis, see Bank et al. [2012].) β, α, α + AB ]. However, if AB is the best haplotype on the island, there is no 2-locus genetic barrier. Therefore, w(Ab) > w(AB) is a necessary condition for the existence of a genetic barrier, leading to β + AB < 0. Given that AB > 0, we can deduce that β < 0 and −β > AB .
This means that we can safely ignore the third term in the definition of Λ Ab max . Under these conditions, the definition of the maximum amount of local adaptation in the 2-locus model remains unchanged.
We have established in the main manuscript that m Ab max ≤ Λ Ab max , independently of the sign of epistasis. If we compare the 2-locus barrier to its equivalent without epistasis, the genetic barrier is strengthened whenever there is some asymmetry in the selection strength at both loci (α = −β) and when there is epistasis between the two derived alleles, AB , according to the following conditions: From equation (S7), we can deduce that if selection against the continental allele B is weaker than selection for allele A (−β < α), then allele B can become fixed at a lower migration rate than allele A. Negative epistasis can strengthen the genetic barrier (in comparison to the case without epistasis). Vice-versa, if α < −β, A is lost first, and positive epistasis can strengthen the genetic barrier (in comparison to the case without epistasis), as illustrated in Figure S1.
In the former case (α > −β), A and B form a DMI ( AB < 0) and we obtain the same upper bound m Ab max ≤ Λ Ab max 2 as in the case with non-interacting loci. For −β > α, the maximal barrier strength can exceed the threshold for non-interacting loci and reaches m Ab max = Λ Ab max in the limit α → 0, β → −Λ Ab max and AB → Λ Ab max .
For a given (α, β), we can estimate the strength of epistasis for which the genetic barrier is maximal. This condition is given in equation (S8) and the expression for the corresponding strength of the genetic barrier is provided in the main manuscript, equation (2).

S 1.2.2 Strongest genetic barrier for a given value of Λ Ab max
One can wonder when the strongest barrier is formed for a given value of Λ Ab max . From the previous paragraph, if we only allow for negative epistasis, the absence of epistasis and a fair repartition of local adaptation is the best solution.
If we now assume that epistasis is positive, then β < 0 is a necessary condition for the existence of the 2-locus genetic barrier, leading to Λ Ab max = α − β. Then, assuming α = Λ Ab max + β and using equation (S6), one can infer when m Ab max is maximized. If α + β + AB < 0, then m Ab  Figure S1: Range of parameters where the presence of epistasis increases the strength of the genetic barrier m Ab max . The X axis corresponds to the selective advantage of allele B on the island and the Y axis corresponds to the strength of epistasis between A and B. The filled area indicates that epistasis between A and B leads to a stronger genetic barrier than the barrier formed if A and B do not interact. More precisely, yellow indicates that allele B can rescue allele A and orange that allele A can prevent the fixation of allele B. Red indicates that the genetic barrier only exists in the presence of epistasis. It is an illustration of equation (S7). The black dashed line indicates the strongest genetic barrier for a given {α, β}. Its expression is given in equation (S8).
is an increasing function of AB and reaches its maximum for AB = −Λ Ab max − 2β, leading to and therefore reaches its maximum The optimal choice of parameters in this case corresponds to α → 0, β → −Λ Ab max and AB → Λ Ab max . Figure S2 illustrates the dynamics of the system under these conditions (α = 0, β = −Λ Ab max and AB = Λ Ab max ). With this parameter combination, there is no single point equilibrium. The solution is an isocline to which all trajectories converge.
This section technically covers all cases, including the case of two island or two continental adaptations since the system is fully parameterized with 2 selective advantage parameters and one epistatic term. Nevertheless, we will detail these two cases in the following sections.

S 1.2.3 Epistasis between two island adaptations
We now consider two mutations A i and A j appearing on the island, therefore α i > 0 and α j > 0. If A i and A j are in tight linkage, then m max is always true in this setting. We now assume that A i and A j are in loose linkage. The amount of local adaptation is then given by Λ is not compatible with the existence of the 2-locus genetic barrier. By applying the transformation given in Table S1, we obtain that the genetic barrier will reach max . This corresponds to maximizing the selective advantage of both alleles A i and A j when they are rare, making them more resistant to swamping. Table S1: Fitness table and equivalence between two island adaptations A i and A j interacting with each other, and an island adaptation A k interacting with a continental adaptation B l . Here, the continental haplotype has a fitness of zero on the island. Then, the two parametrizations are equivalent with the following change: Figure S3 illustrates the parameter space for which epistasis can strengthen a genetic barrier by A i and A j (in comparison with the case without epistasis), while maintaining the amount of local adaptation constant. Epistasis will strengthen the genetic barrier if it is negative, therefore allowing every allele to have a higher marginal fitness when rare, relative to the case without 0.5 0.6 0.7 0.8 0.9 1.0 Figure S3: Range of parameters for which epistasis between 2 island adaptations improves the genetic barrier, for a fixed amount of local adaptation. The X axis corresponds to the selective advantage of allele A i and the Y axis to the selective advantage of A j . The blue area corresponds to parameters for which having epistasis between A i and A j generates a stronger genetic barrier compared to the absence of epistasis. The amount of local adaptation is identical for any combination of α i and α j .

epistasis.
Without loss of generality, we can assume Λ leading to the following expression for the strength of the 2-locus genetic barrier: Because A i and A j are equivalent, we only need to determine the maximum of m First we consider m In addition, for this specific expression of the genetic barrier (third term), α i + α j ≤ 1 is a necessary condition. Therefore, for a given α i ∈]0, 1[, the strongest genetic barrier is obtained for α j = 1 − α i , implying A i A j = 0, leading to α i = α j = 1 2 . We now consider m .5] and a decreasing function of α j ∈ [.5, 1]. In addition, m The genetic barrier formed with epistasis is stronger than the one formed without epistasis. In particular, if α i = 1, α j = 1 and A i A j → −1, the genetic barrier is equal to the maximum amount of local adaptation, despite the two loci being in loose linkage.
max is an increasing function of α i , the maximum is reached for α i = .5 and therefore the optimal situation corresponds to A i A j = 0.
if α i > .5, then the maximum of m Therefore, the optimal way to distribute Λ A i A j max between α i , α j and A i A j is the following: , then the dynamical system does not admit a single point equilibrium, but an isocline . All trajectories starting inside the simplex converge to this line.

S 1.2.4 Epistasis between two continental adaptations
This case is equivalent to the case presented above, with the substitutions presented in Table S2. Therefore the optimal genetic barrier for a given Λ b k b l max is given by β k = 0, β l = 0 In this scenario, the role of the genetic barrier is to prevent the fixation of alleles B k and B l . With a limited amount of local adaptation available, the most effective approach to prevent this is to have strong selective pressure against B k and B l when they are close to fixation.
The definition of Λ b k b l max is more cumbersome than in the previous scenarios: . β l < 0 and β k < 0 are necessary conditions for the existence of the two-locus genetic barrier, leading to If Λ b k b l max is given by −β k , i.e. β k < β l and β l + B k B l > 0, then (using Mathematica), it is . Per symmetry, if Λ b k b l max is given by −β k , we obtain the same result.
If Λ b k b l max is given by −(β k + β l + B k B l ), then this case is identical to the 2 island adaptations case, the genetic barrier is the largest when β k → 0, β l → 0 and The best parameter combination to optimize the genetic barrier with a fixed Λ max is β k = 0, S 1.3 3-locus model S 1.3.1 One island mutation and two continental ones.
We now focus on the three-locus model. In the main manuscript, we demonstrated that Λ max > m AbC max , with allele C appearing on the island, assuming that there were only two epistatic interactions between the derived alleles. In the following paragraph, we will focus on the case of allele C appearing on the continent. increases the marginal fitness of allele B and C. As a consequence, allele B and C will fix on the island at a lower migration rate. We therefore obtain m Abc max < m bc|a max . Since the two-locus barrier is a subset of the three-locus barrier, we have Λ bc|a max ≤ Λ Abc max . Therefore we obtain m Abc max < Λ Abc max .
• We now assume that one interaction is positive and the other negative. We assume that allele A interacts negatively with B and positively with C. In the absence of allele B, we know that for two loci with positive epistasis m Ac|b max ≤ Λ Ac|b max . If we now assume that allele B exists on the island, then the marginal fitness of allele A is reduced due its association with allele B. We therefore obtain m Abc max ≤ m Ac|b max . Since the two-locus barrier is a subset of the three-locus barrier, we have Λ Ac|b max ≤ Λ Abc max . Therefore we obtain m Abc max < Λ Abc max .
• Finally, we assume that alleles A and B interact negatively and alleles B and C positively.
In the absence of allele B, we know that for two loci without epistatic interaction, m  Figure S4: Fitness graphs for the different scenarios with only two pairwise epistatic interactions a) Allele C appears on the continent and there is negative epistasis between A and B and A and C. b) Alleles A and B interact negatively. There is positive epistasis between allele A and C or B and C but it is weak with regards to AB . b) Alleles A and B interact negatively. There is positive epistasis between allele A and C or B and C and its strength is equivalent to AB . d) The two epistatic interactions are positive (if C appears on the island, AB > 0 and bc > 0; if C appears on the continent, AB > 0 and AC > 0.
Λ Ac|b max . If we now assume that allele B exists on the island, then the marginal fitness of allele A is reduced due its association with allele B. We therefore obtain m Abc max ≤ m Ac|b max .
Since the two-locus barrier is a subset of the three-locus barrier, we have Λ Ac|b max ≤ Λ Abc max .
Therefore we obtain m Abc max < Λ Abc max .

S 1.3.2 A genetic barrier can reach Λ max even when all three loci are in loose linkage
We assume that there are three island adaptations A 1 , A 1 and A 3 . The direct effect of all adaptations is α. Pairwise epistasis between derived alleles is negative and given by . Threelocus positive epistasis exists between the three derived alleles and is given by − . The whole fitness table is given in Table S3.
Using this model, the definition of the maximum amount of adaptation is: Unfortunately, this system has very complex equilibrium solutions so we could only calculate the strength of the genetic barrier numerically. Fig. S5 illustrates the strength of the resulting as a function of epistasis. The X axis corresponds to the strength of pairwise epistasis between the different island adaptations and the Y axis to the ratio of the strength of the genetic barrier to the amount of local adaptation.
genetic barrier relative to the amount of local adaptation. First, the genetic barrier is always smaller than the amount of local adaptation; however The genetic barrier is always the same and given by m A 1 A 2 A 3 max = α, despite the increasing negative epistasis.
Epistasis does not affect the strength of the genetic barrier but the frequencies of the A i alleles as illustrated in Fig. S6. Indeed epistasis plays a role only when the derived alleles are at high frequencies and therefore it has no effect on the genetic barrier, since the barrier is defined when those alleles are rare. The trend in Fig. S5 is due to the reduction of Λ A 1 A 2 A 3 max as epistasis increases and does not reflect a change in the strength of the genetic barrier. If = −α, then the solution of the system is a stable plane (1 negative eigenvalue and 2 eigenvalues equal to 0 Because the three-locus system is fully parameterized with 7 parameters, we obtain the equivalent result for different evolutionary histories: 2 island adaptations and a continental one, 2 continental adaptations and 1 island one, or three continental adaptations, see Table   (S4). The first line for each model corresponds to the full parameterization of each model. As explained above, for the three islands adaptations, for simplicity we reduce the model to two parameters, selection (α) and epistasis ( ), making the model completely symmetric. Then we report the equivalent of this model for all other linkage architectures (2nd line for each model).
Lastly, we focus on the condition to observe One can notice that, if we reconstruct the fitness of each haplotype, we observe the following pattern: all haplotypes except the continental one have the exact same fitness, and the difference between the continental one  In the main manuscript, we present the model for three loci, with epistasis existing between all derived alleles. Below, we present a general formulation for an arbitrary number of loci.

and all the others is Λ
We consider diallelic loci classified into two classes of derived mutations, depending on their origin: • Type-A mutations are locally adaptive on the island, but deleterious on the continent.
Specifically, mutant allele A i has a selective advantage α i over the ancestral a i allele on the island, with i ∈ {1, n A } and n A the number of loci of type A. Type-A mutations always appear on the island and are absent from the continent.
• Type-B mutations are beneficial on the continent, where they appear and fix. On the Epistasis can occur between any combination of derived alleles and is denoted by , with the involved alleles indicated as subscript. For example, A i A j B k denotes 3-way epistasis between two island-derived alleles A i and A j and one continental-derived allele B k . As an example, the fitness table for a pair of A i and B j loci is given in Table S5.
Due to the complexity of equation (S10), it is not possible to explicitly integrate this expression into a general formula beyond equation (1). However, if we assume that epistasis is only pairwise, then we can derive a more detailed expression for the whole system when all loci are in loose linkage:

S 2.2 Tight linkage case
As mentioned in the main manuscript, when all loci are in tight linkage, the system is equivalent to a single-locus model with 2 n A +n B alleles.
If all B-type mutations are deleterious on the island and only negative epistasis between continental and island adaptations exists Conversely, if we assume that there is at least one B k mutation that is advantageous on the island compared to b k and only negative epistasis between continental and island adaptations exists, the genetic barrier will always be smaller than the amount of local adaptation In that case, all B k , β k > 0 mutations must generate some negative epistasis with at least one A i mutation, otherwise the genetic barrier ..
not exist. In this situation, stronger negative epistasis between B k and A i will never weaken the genetic barrier, but m . Indeed, in the absence of B k mutations, we recover the results above. Adding B k mutations reduces the fitness difference between the continental and the best island haplotypes; however Λ remains unchanged as it includes any subset of the genetic barrier, including the one without B k mutations.
We now assume that we have A i , i ∈ I = {1, 2, .., n I } island adaptations and B j , j ∈ J = {1, 2, .., n J } continental adaptations. Continental adaptations can be advantageous or not on the island.
Epistasis can only exist between any pair of derived alleles. If all loci are in tight linkage, this means that haplotype ..A i ..b j .., i ∈ I, j ∈ J is the best haplotype on the island.
We define as V ⊂ I a subset of island adaptations and as O ⊂ J a subset of continental . is the best haplotype, if for any subset V and O, the fitness of ., leading to equation (S12).
In particular, if epistasis only happens between A i and B j loci and assuming V = ∅, equation (S12) reduces to equation (S13). If all B j adaptations are deleterious and epistasis is only negative, it is obvious that equation (S13) is true.

S 2.3 Island adaptations in the absence of positive epistasis
We now consider that all loci are in loose linkage.
Any island adaptation A i that is not involved in positive interactions will be swamped by gene flow if m > α i [Blanckaert and Hermisson, 2018, Appendix C, eq. C15]. Indeed, using (S11) as a starting point, it is clear that if m > α, the only solution to this equation is . The maximum amount of local adaptation is equal to or larger than α i (because the fitness advantage of a haplotype that differs by a single island adaptation from the ancestral haplotype is always included in the maximum local adaptation). Therefore we obtain m

S 2.4 Continental adaptations in the absence of negative epistasis
We consider that all loci are in loose linkage. If allele B k is advantageous on the island, it will fix on the island as long as a single copy of the allele B k is introduced. Therefore, under these conditions, no genetic barrier can be built including locus B k . Any continental adaptation B j , deleterious on the island and not involved in negative interactions, will fix on the island for m > −β j Blanckaert and Hermisson [2018]. Indeed, using similarly equation (S11), we can rewrite this expression as m * with Ep corresponding to the various epistatic terms (Ep ≥ 0). If m > −β j , then it is clear that the previous expression is positive for any value of p B j in [0, 1[. Therefore allele B j will always fix, regardless of its starting for any barrier with at least a single locus of this type.

S 2.5 Only island adaptations
Assume next that divergence is entirely due to type-A mutations that originate on the island, with arbitrary interactions among the derived alleles. In loose linkage, all haplotypes that can be constructed from ancestral and derived alleles at the barrier loci, {a i , A i }, still exist on the island in the final state of the barrier (all A i mutations have appeared on the island). The maximum amount of local adaptation Λ max is the fitness difference between the continental haplotype (a 1 ...a n ) and the fittest of these haplotypes. It follows that the marginal fitness of any derived allele A i that is part of the fittest haplotype can be, in the best case scenario, equal to the maximum amount of local adaptation Λ max and otherwise must be smaller than Λ max , independently of the population composition. Since an allele is swamped from the island when gene flow is larger than its marginal fitness, we conclude that Λ max is once again an upper bound to the barrier strength.

S 2.6 Only continental adaptations
Let us now consider a barrier formed only by type-B mutations such that the island haplotype b 1 ..b n B has the highest intrinsic fitness value (this implies that all B i are deleterious on the island in the ancestral background, β j < 0). If all loci are in tight linkage, we have m .
Due to recombination, allele b i will no longer be associated with the best haplotype 100% of the time and will be found in other haplotypes of lower fitness. Its marginal fitness is therefore decreased, leading to the fixation of allele B i at a lower migration rate. Therefore Λ max is again an upper bound to the barrier strength.

S 2.7 Only one type of epistasis
Let us now consider a genetic barrier formed by an arbitrary number of A i and B j mutations, all in loose linkage, and let us assume there is at least one mutation of each type (the case with no A i or B j has been covered above, sections S 2.5 and S 2.6).
We now assume that all epistatic interactions are negative. In that case, there is at least one allele A i , that is not generating any positive epistasis. Based on the results derived above (section S 2.3), we can conclude that m The maximum amount of local adaptation is equal to or larger than α i (because the fitness advantage of a haplotype that differs by a single island adaptation from the ancestral haplotype is always included in the definition of the maximum amount of local adaptation). Therefore, we obtain m We now assume that all epistatic interactions are positive. In that case, there is at least one allele B j , that is deleterious on the island and not generating any negative epistasis. Based on the results derived above (section S 2.4), we can then conclude that m as mentioned above, the maximum amount of local adaptation is equal to or larger than −β j (because the fitness advantage of a haplotype that differs by a single continental adaptation from the ancestral haplotype is also always included in the definition of the maximum amount of local adaptation). Therefore, we obtain m

S 2.8 Negative epistasis between 2 sets
We now assume that there are n A island adaptations with arbitrary interactions among them and n B continental adaptations, with only negative epistasis between the type-A and type-B mutations. Based on the previous paragraph, in the absence of the continental adaptations, we have m . Considering now that continental adaptations are present, the marginal fitness of the island adaptations is reduced due to the negative epistasis generated with the immigrating B j alleles. Therefore, the genetic barrier is reduced, m However, since having all b j fixed is a subset of the general model, we have Λ We therefore demonstrate that Λ max is once again an upper bound to the strength of the genetic barrier.
If there are two (or more) disjoint sets of barrier loci that are mutually unlinked and not connected by epistasis of any order, the maximum amount of local adaptation Λ max of the union of the two sets is always given by the sum of the Λ max of the two subsets. In addition, the genetic barrier, m max , is determined by the weaker of the two subset barriers as the two sets of loci do not interact with each other (no linkage disequilibrium or any form of indirect selection).
Therefore, if we are interested in the conditions under which a strong barrier exists despite limited local adaptation, we can always assume that all derived alleles are connected either by linkage or by epistasis (i.e., there should always be an unbroken path of interactions between any two derived alleles).

S 2.9 Illustration for 6 loci
To illustrate the general rules described above, we simulated the case of a 6-locus genetic barrier. With 6 diallellic loci, there are a total of 64 possible haplotypes, leading to a system of 63 equations (based on equation (1) in the main text). Such a system is impossible to solve (even numerically using Mathematica). We therefore used a custom C++ program to find the equilibrium state of a given population starting from a secondary contact situation (i.e. haplotype A 1 A 2 A 3 b 1 b 2 b 3 starts fixed on the island). We iterate the continuous time equations until the sum of all absolute frequeny changes (for all 64 haplotypes) between two time steps becomes less than 10 −10 . We start at a low migration rate and increase until we either end up with an equilibrium at which all 6 loci are not polymorphic or if the program did not converge after a million time steps (convergence can become slow when the leading eigenvalues are close to zero, indicating that a change in stability is likely to happen for a value of m very close to the currently investigated one). We assumed that the different loci are found in the following order along the chromosome: A 1 , B 1 , B 2 , A 2 , A 3 and B 3 . We then consider two epistatic scenarios: A 1 interacts with B 1 , A 2 with B 2 and A 3 with B 3 (Figure S7 left panels) and A 1 interacts with B 3 , A 2 with B 1 and A 3 with B 2 (Figure S7 right panels). We start the simulations with haplotype A 1 A 2 A 3 b 1 b 2 b 3 fixed on the island, as it corresponds to a secondary scenario case, and the most likely starting point to find a potential internal stable equilibrium. The resulting genetic barriers are shown in Figure S7. We recover our theoretical result: the strength of the genetic barrier cannot exceed the maximum amount of local adaptation. The source code can be found at https://gitlab.com/evoldyn/strong_ri.
The black lines correspond to the limits derived in the main manuscript, with the solid line corresponding to all loci in tight linkage (

) and the dashed line to all loci in loose linkage (Λ
The colored lines correspond to simulations for 6 loci (100 points equally distributed on a log scale). For the left panels, pairwise negative epistasis happens between alleles of the same index (e.g., A 1 interacts with B 1 , A 2 with B 2 and A 3 with B 3 ). For the right panels, pairwise negative epistasis happens between alleles A 1 and B 3 , A 2 and B 1 , and A 3 and B 2 . All loci are equidistant and the recombination rate between adjacent loci scaled by Λ max corresponds to the X axis. Each panel corresponds to a different sets of parameters chosen such as Λ S 2.10 The strength of genetic barrier can reach Λ max with all loci in loose linkage In this section, we show that a genetic barrier of independent loci (loci in loose linkage) can be as least as strong as a barrier of tightly linked loci for a given maximum amount of local adaptation Λ max , given the right pattern of epistatic interaction among the derived alleles. This is a generalization of the example provided above for three loci (Section S 1.3.2). We have seen above that m max = Λ max is an upper limit for the gene flow from the continent to the island to keep n tightly linked loci polymorphic. Below, we show that for any m < Λ max also a barrier of n loci in loose linkage can be constructed that prevents swamping at these loci.
Barriers of this type can be constructed for any number of continental and island adaptations, but the proof is most straightforward if all adaptations (derived alleles) are on the island.
Consider n loci in loose linkage, with ancestral alleles a i fixed on the continent and derived alleles A i segregating on the island with frequency p i , 1 ≤ i ≤ n. We assume the following fitness scheme: The fitness of the ancestral haplotype (a 1 , . . . , a n ) on the island is normalized to 0, the fitness of the haplotype with all island alleles, (A 1 , . . . , A n ), is Λ max and the fitness of all other haplotypes is Λ max − for some small < Λ max . Obviously, Λ max is the maximum fitness deficit of continental invaders relative to optimal island individuals, conforming to our definition given in the model section of the main manuscript. We obtain a marginal fitness for all island alleles and a mean fitness in LĒ

The dynamical equation for p i readṡ
At an equilibrium, each p i can either be zero or take some positive value. For the symmetric fully polymorphic equilibrium p i = p > 0 we have which has a unique solution 0 < p < 0.5 for m < Λ max − . Stability at this equilibrium follows from the Jacobian with off-diagonal elements The eigenvalues of this Jacobian are J ii − J i =j = − p n−1 < 0 (as (n − 1)-fold eigenvalue) and nJ i =j < 0 for small . For any m < Λ max , we thus find a sufficiently small < Λ max − m such that the fully polymorphic equilibrium is stable, which proves the claim. We note that for any fixed value of the monomorphic equilibrium p i = 0 is stable for m > Λ max − .
Via simple reparametrization of the fitness landscape, we obtain fully polymorphic equilibria for any combination of continental and island adaptations (and interactions only among derived alleles) up to the same limit Λ max . All these architectures require a combination of positive and negative epistasis of high order.

S 3 Part III: Cryptic epistasis
We now consider the scenario with cryptic epistasis with three loci, A, B and C, as introduced in the main text. In this part, we use the alternative notation introduced in the main manuscript: allele C generates a negative 3-locus epistasis ( abC < 0) with the ancestral background ab with a selective advantage γ . Indeed, let us assume that such strong genetic barrier is also possible for a globally sta- the mechanisms behind the genetic barrier: selection against hybrids in that case is the major force that shapes the genetic barrier. Or selection against hybrids is frequency-dependent, and therefore any equilibrium will be only locally stable, as the most common type represses the rare one.
Using Mathematica, we calculate the necessary conditions to observe m Ab|C max > Λ Ab|C max given in equation (5) in the main manuscript. Figure S8 shows that allele A remains the majority allele in all cases when the genetic barrier is stronger than the maximum amount of local adaptation. For the three locus epistatic interaction, we assume that there is potential epistasis as soon as an ancestral allele at locus A or B and a derived one at locus C are in contact, i.e. allele C is in contact with allele a or b. The fitness scheme is given in Table S6. The term abCijk corresponds to this 3-locus epistasis component, with i, j and k counting the number of derived alleles at the loci, A, B, and C.

S 3.2.1.2 Conditions under which the system of equations is analytically solvable
First, we assume that A and B are in loose linkage. We derive the conditions, given in equation (S17) under which we can analytically solve the full system. These conditions can be summarized as follows: if A (resp. B) is homozygous, then going from one derived allele B (resp. A) to none increases the amount of epistasis generated by a factor 2. In addition, the epistasis generated by the double heterozygote AaBb has to be equal to the mean epistasis generated by the two single homozygotes (i.e. the epistasis generated by AaBB and Aabb as well as the epistasis generated by aaBb and AABb In particular, assuming that the DMI between A and B is codominant (but this is not a necessary condition), this system reduces to the classical codominant 2-locus model, presented in Bank et al. [2012].
Next, we consider that both A and B are in tight linkage. We apply the same reasoning. We derive the conditions for this system to reduce to a two-locus model with loci in tight linkage.
These conditions are given in equation (S18).

S 3.2.1.3 Recessivity and codominance
Whereas we define the model for arbitrary levels of dominance, we focus on two specific cases of dominance that are biologically relevant: • A recessive case, where the 3-locus incompatibility is expressed if there are two copies of allele C and at least one of alleles a and b or two copies of alleles a and b and at least one of allele C , i.e. all abCijk = 0 but abC002 = 2 abC , abC102 = abC012 = abC001 = abC .
The interaction between A and B is given by η AB = 0 and θ AB = 2 AB . Both the tight linkage and loose linkage models reduce to the diploid recessive model presented in Bank et al. [2012] with the proper substitutions.
• A codominant case, which, under the assumption of loose linkage, is equivalent to the haploid case. The list of substitutions needed to obtain the codominant model is given in equation (S19). Both the tight linkage and loose linkage models reduce to the model presented in Bank et al. [2012] with the appropriate substitutions.

S 3.2.1.4 Definition of the amount of local adaptation
Similar to the haploid model, we need to define the amount of local adaptation for the diploid model. The amount of local adaptation is obtained by comparing the optimal genotype (which is homozygous since we do not have any overdominance or heterozygote advantage) to the continental genotype corrected by ploidy. We exclude the three incompatible homozygous genotypes: abC/abC, ABc/ABc and ABC/ABC. We obtain the following expression for the amount of local adaptation: Λ . This is the exact same expression as in the haploid model,

S 3.2.2 Tight linkage and codominant interactions
Equation (S21) details under which condition we can observe a genetic barrier, the strength of which is larger than the maximum amount of local adaptation, for a diploid codominant incompatibility with A and B in tight linkage.
If cond 8 abC < 1 7 (6α + AB + 2β) If cond 9 abC < 2α + 7 AB + 6β Fixation of allele C on the island, for the 3-locus epistasis model For the scenario with cryptic epistasis, we always assume that C appears on the continent and fixes rapidly on the island. Here we derive and illustrate the condition for C to indeed fix on the island. Fixation of allele C depends on the position of allele C in the genome. Here we will provide these conditions for all linkage architectures and dominance schemes.
Equation (S22) gives the minimal selective advantage γ min for an allele C to fix on the island in the codominant model, with all loci in loose linkage. It also corresponds to the haploid case; as the two models share the same dynamics.
For the recessive case, with all loci in loose linkage, the fixation condition for allele C is given in equation (S23). However, we did not obtain an analytical expression for the equilibrium of the 2 locus system and therefore relied on numerical evaluation.
If the epistatic interactions are recessive, equation (S27) gives the minimal value of γ to ensure fixation of allele C on the island. As mentioned above, if A and B are in loose linkage and epistasis is recessive, we did not obtain an analytical expression for the internal equilibrium and therefore, we relied on numerical solutions to estimate p A and p B . The first condition corresponds to allele c invading in the background of allele b and the second one to invasion of allele c in the background of B, if B is advantageous enough.
Assuming A and C are in tight linkage and B is in loose linkage, equation (S28) gives the minimal value of γ to ensure fixation of C on the island if the epistatic interactions are codominant. The first condition corresponds to allele c invading in the background of allele a and the second one to invasion of allele c in the background of allele A. If the epistasis between A and B is extremely strong, being associated with A comes with a cost, as c will suffer from the epistasis between A and B.
Assuming A and C are in tight linkage and B is in loose linkage, equation (S29) gives the minimal value of γ to ensure fixation of C on the island if the epistatic interactions are recessive.
As mentioned above, if A and B are in loose linkage with recessive epistasis, we do not have an analytical expression of the equilibrium and therefore, we relied on numerical solutions to estimate p A and p B . The first condition corresponds to allele c invading in the background of allele a and the second one to invasion of allele c in the background of allele A. If epistasis between A and B is extremely strong, being associated with A comes with a cost, as c will suffer from the epistasis between A and B.
In the absence of migration, allele C is protected from invasion of allele c as long as it is advantageous on the island. This is true for all linkage architectures and dominance schemes detailed here.
In the main text, we always assume that C can "easily" fix on the island and the only condition we reported was one in which C does not contribute to local adaptation, i.e. γ > 0 and β < 0 in the case illustrated in Fig. S9 We can see here that assuming that C can "easily fix" is quite reasonable: γ does not require a large amount of adaptation to fix on the island. At the same time, if C can fix while being deleterious, it does so only if it is slightly deleterious (and therefore we do not remove most of the state space with our restriction on Λ Ab|C max ).
For the strength of the genetic barrier, the position of C in the genome is not important since it is fixed. Therefore, it is important to check for each scenario (AB codominant, A-B codominant and A-B recessive) whether C can fix in the first place. This means that at least one element among each of the following groups ((cod.: ABC,AB-C), (cod.:AC-B, A-BC, A-B-C), (rec.:AC-B, A-BC, A-B-C)) has to allow for fixation of C for a slightly positive γ .
As illustrated by Figure S9, migration has two opposite effects on γ min . On the first hand, migration favors fixation of allele C, as migrating individuals carry this allele, therefore as migration increases, γ min decreases. On the other hand, strong migration implies that haplotype ab is more frequent in the population than in a weak migration case. Therefore, the marginal fitness of allele C is reduced (due to the incompatibility being expressed), leading to γ min increasing as m increases.
In Figure S9(a), Λ Ab|C max is equal to 1. Therefore, Figure S9  In the previous section, we have derived the minimum value of γ that prevents allele c from invading on the island regardless of the migration rate. Alternatively, we can derive the migration rate at which the genetic barrier will collapse for a given value of γ > 0 with regards to allele c by looking at d( dp C dt )/d(p C ) evaluated at the stable equilibrium (A and B polymorphic) and with p C = 1. Using Mathematica, we can calculate when the genetic barrier will be lost due to allele c invading the equilibrium: if α(β + AB )( AB + 2 abC ) + AB abC (β + AB ) + 4γ ( AB + abC ) 2 < 0 or AB < 2 abC and α(β + AB )( AB + 2 abC ) + AB abC (β + AB ) + 4γ ( AB + abC ) 2 ≤ 0 or AB ≥ 2 abC and α AB + 2α abC + AB abC > 0 Ab|C max derived under the assumption that allele C is advantageous enough to always remain fix on the island (gray line) and compare it to the case where allele C is only moderately advantageous (lighter blue, γ α = 0, blue γ α = 0.2, dark blue γ α = 0.4). If allele C is moderately advantageous, it is possible that the genetic barrier is lost due to allele c being able to invade (when the blue line is below the gray line). In these cases, it remains possible to form a strong genetic barrier that will collapse due to allele c being able to invade this equilibrium (note that the final equilibrium in that case is the fixation of haplotype aBC, so invasion of allele c is temporary). If allele c is not able to invade the equilibrium, then the barrier collapses due to hybrid load.

S 3.4 Discrete model
We have seen in the main manuscript that strong incompatibilities generate strong genetic barriers to gene flow. Our results are obtained using the continuous time model, for which the equivalence between the continuous time approximation and the discrete time model is no longer automatic. Therefore, we investigated to which extent the results of the continuous time model for strong incompatibilities hold for the corresponding discrete time model. We implemented the equivalent discrete model assuming the following life cycle for haploids: reproduction (zygote formation, recombination and gametes production) followed by migration and then selection. For the haploid model, we explored the dynamics of the discrete time system given in equation(S31) with lethal incompatibilities, as recombination precedes selection allowing allele C to pass from the continental haplotype to an island one.      p A (n + 1) = α d (1−m d )p A (n)(−r d p B (n)+p A (n)+p B (n)) β d (m d −(m d −1)p B (n)(−r d p A (n)+p A (n)+p B (n)))−α d (m d −1)p A (n)(−r d p B (n)+p A (n)+p B (n)) p B (n + 1) = β d (m d −(m d −1)p B (n)(−r d p A (n)+p A (n)+p B (n))) β d (m d −(m d −1)p B (n)(−r d p A (n)+p A (n)+p B (n)))−α d (m d −1)p A (n)(−r d p B (n)+p A (n)+p B (n)) (S31) This system is solvable and admits 3 solutions: an obvious one, fixation of the continental type and 2 potential internal equilibria. Assuming that the loci are on different chromosomes, we can derive the maximum migration rate for the formation of a stable internal equilibrium,   Figure S11: Genetic barrier stronger than adaptation for the discrete model a) We represent the genetic barrier (blue) and the maximum amount of adaptation (red) for a haploid discrete time model with lethal incompatibilities. Here we used α d = 0.5. b) Parameter space (purple) for which the strength of the genetic barrier can be larger than amount of local adaptation, for a haploid discrete model with lethal incompatibilities.
continental and the island haplotype. In the absence of local adaptation, this ratio is 1. If the continental type is lethal on the island, this ratio is 0. Therefore, we consider 1-the minimum of the ratio calculated along all possible evolutionary histories: This also means that the transformation, between the discrete and continuous time dynamics, applied to the maximum amount of local adaptation is identical to the one applied to the maximum migration rates, which is reasonable since the quantities are comparable.
The genetic barrier can exceed the maximum amount of local adaptation if allele A is not too advantageous and B not too deleterious, see equation (S34) and Figure S11.