Nash Equilibrium in Evolutionary Competitive Models of Firms and Workers under External Regulation

The object of this paper is to study the labor market using evolutionary game theory as a framework. The entities of this competitive model are firms and workers, and the model is studied with and without external regulation. Firms can either innovate or not, while workers can either be skilled or not. Under the simplest model, called the normal model, the economy rests in a poverty trap, where workers are not skilled and firms are not innovative. This Nash equilibrium is stable even when both entities follow the optimum strategy in an on-off fashion. This fact suggests the need for an external agent that promotes the economy so that it does not fall into a poverty trap. Therefore, an evolutionary competitive model is introduced, where an external regulator provides loans to encourage workers to become skilled and firms to innovate. This model includes poverty traps but also another Nash equilibrium, where firms and workers are jointly innovative and skilled. The external regulator, in a three-phase process (loans, taxes and inactivity), achieves common wealth: a prosperous economy with innovative firms and skilled workers.


Introduction
Poverty is a highly complex phenomenon, defined by multiple factors. This work is focused on poverty as the result of a rational response from economic agents in a downturn. In other words, the decisions made by the rest of the economy force a rational agent to conduct an inefficient activity in order to maximize its gain or, equivalently, reduce its losses. This approach can be formally studied in a game-theoretic framework. Specifically, we are interested in the dynamic behavior of different agents, so this paper will consider evolutionary game theory. The goal is to mathematically understand and fully characterize poverty traps, find an alternative global optimum in the overall strategy space, and steer the economy towards this global optimum.
The essential contribution of this paper is the introduction of an external regulator that prevents firms and workers from evolving to a poverty trap. The external regulator is able to define both incentive and tax policies. The successful procedure works in three stages. In the first stage, the regulator provides grants to innovative firms. As a consequence, non-innovative firms are encouraged to change their strategy. The key to success is to give workers incentives to educate and train themselves in order to take innovative jobs with top technology, while firms invest in technology and in awards to trained workers. Once a "critical mass" of innovative firms and trained workers is achieved, such that the economy will not turn back to a poverty trap, the regulator introduces a tax policy. Its goal is to collect a return sufficient to cancel the debt inherited from the first stage. Once the goal is reached, the action of the regulator ceases, and the economy sustains itself in a joint innovation/modernization state, which is efficient in two respects: incomes and stability under perturbations.
More specifically, the contributions of this work are summarized in the following items:
• The definition of a non-cooperative network game between firms and workers, whose rules are inspired by real-life economics. The game extends prior works [3,1,2].
• The expected dynamic evolution of the game using an intuitive fluid limit.
• A stochastic process that captures the evolution when the population is infinite.
• A full analysis of rest points and Nash equilibrium of the previous games, including both pure strategies and mixtures. This point includes a full characterization of poverty traps, and the fact that the economy evolves to the poverty trap under this model.
• An extended game with a new player, an External Regulator, whose strategies involve taxes, loans or inactivity.
• A three-phase dynamic strategy of the External Regulator that forces the system to escape from the poverty trap and reach a Pareto-efficient Nash equilibrium. This addresses the future work anticipated in [2].
• Numerical results that highlight the matching between the stochastic process and fluid limit model when the population size grows, and the effectiveness of the External Regulator under different scenarios. These results are in harmony with the underlying theory here developed for the specific game under study.
This article is structured in the following manner. Section 2 includes the related terminology coming from economic game theory to obtain mathematical models of interaction between firms and workers. Classical propositions from the area are cited, including authoritative references. Additionally, previous work on poverty traps is discussed, covering both economic and evolutionary game theory approaches. Section 3 formally presents the mathematical models. A competitive model between firms and workers is introduced, with and without external action. Those games are analyzed in normal form and as evolutionary stochastic processes, together with the deterministic dynamics to which the stochastic process tends for infinite populations.
A brief numerical analysis is included in Section 5. The results highlight the agreement between the theoretical predictions and simulations, carried out using the classical Runge-Kutta method to solve the ordinary differential equations from the models. Concluding remarks and trends for future work are presented in Section 6.

Background
In this section we outline the terminology and classical results that support the models developed in this article. Subsections 2.1 and 2.2 present key concepts coming from economic theory and evolutionary game theory, respectively.

Economical Theory
In [4], the authors provide a definition for poverty trap, which we consider an excellent point of departure.

Definition 2.1 (Poverty Trap) A poverty trap is any self-reinforcing mechanism that endures poverty.
The notion of poverty trap is aligned with the conception that in certain circumstances poverty is far beyond the control of its economic agents. This is in contrast with the idea that poverty is a result of non-proactivity. The authors use the term "poverty circles" for a recurrent vice faced in several economies, strictly related to the reinforcing of poverty traps. The authors present a first non-cooperative model between firms and workers where a poverty trap is identified. Starting from a non-innovative economy, innovative agents cannot sustain their strategy, since their incomes are discouraging. The main characteristic of this model is that the poverty trap is a Nash equilibrium, stable from an evolutionary viewpoint and Pareto-inefficient. This model has been further extended in [1,2]. These works add relevant economic parameters and propose the introduction of an external regulator as future work. The aim of this paper is precisely the introduction of an external regulator, in order to tackle the poverty trap.
Poverty traps have been recognized in different scales and contexts. A concept derived from such different scales is that of fractal poverty traps, introduced in [5]. There, the authors present an informal theory where multiple dynamic equilibria occur simultaneously in different scales. They argue that there is no equilibrium under an efficient or even high level of operation. On the contrary, all scales operate on a low or inefficient level. The dynamic analysis is based on economic growth models. In the following paragraphs, we briefly comment on these dynamic schemes.
In growth models a certain economic variable $x$ associated with wealth is studied, such as income, expenses or capital. The dynamic evolution $x_t$ of the economic variable is studied, assuming a discrete-time model $t \in \mathbb{N}^+$. A functional relation $x_{t+1} = F(x_t)$ is called growth function. A stationary state is a fixed point $\alpha = F(\alpha)$ of the growth function. Such a state is stable whenever a small perturbation $\epsilon$ from the fixed point $\alpha$ (i.e. $x_0 = \alpha + \epsilon$) does not affect the limit $x_t \to \alpha$ as $t \to \infty$. A formal definition of stability is provided in the book [9]. In global terms, if the growth function is a contraction mapping on a whole Banach space, $F : S \to S$, the Banach fixed-point theorem guarantees the convergence to the unique fixed point $\alpha \in S$, no matter the initial condition $x_0 \in S$. In that case $\alpha$ is globally stable in the growth model. Local stability is assured if $F$ is a contraction near a fixed point $\alpha$ [6].
Figure 1: Resulting Dynamics in a Growth Model
Figure 1 shows a pictorial example of a growth function with two stable fixed points and one unstable fixed point. The lowest stable point $x_P$ is a poverty trap, while the highest one, $x_E$, is an efficient equilibrium. Between them, we can find an unstable fixed point $x_C$, such that the evolution converges to $x_E$ whenever $x_0 > x_C$, but converges to $x_P$ if $x_0 < x_C$. Two sample trajectories that rest in the poverty trap $x_P$ are represented in blue, while two other trajectories that tend to the efficient equilibrium $x_E$ are represented in red.
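The threshold behavior described above can be reproduced with a short numerical sketch. The growth function used here is purely hypothetical (a cubic perturbation of the identity, with parameters `k` and `c` chosen for illustration); it is not the function of Figure 1, but it shares its qualitative shape: two stable fixed points (0 and 1, playing the roles of $x_P$ and $x_E$) and one unstable fixed point ($c$, playing the role of $x_C$).

```python
def growth(x, k=0.5, c=0.4):
    """Hypothetical growth function F with fixed points 0, c and 1.

    F'(0) = 1 - k*c < 1 and F'(1) = 1 - k*(1 - c) < 1, so 0 and 1 are stable;
    F'(c) = 1 + k*c*(1 - c) > 1, so c is unstable.
    """
    return x + k * x * (x - c) * (1.0 - x)


def iterate(x0, steps=500):
    """Iterate x_{t+1} = F(x_t) from x0 and return the final state."""
    x = x0
    for _ in range(steps):
        x = growth(x)
    return x


low = iterate(0.3)   # starts below the unstable point c = 0.4
high = iterate(0.5)  # starts above the unstable point
print(low, high)
```

Under these assumptions, the trajectory started below the threshold settles in the low fixed point, while the one started above it reaches the high fixed point.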

Evolutionary Game Theory

Normal Games and Nash Equilibria
We will follow the terminology from the book [13], which we consider an excellent point of departure to explore the world of evolutionary game theory. We will focus on multi-player non-cooperative games in normal form, and the dynamic selection associated to them. Let $I = \{1, \ldots, n\}$ be the player set. Each player $i$ has a finite set of pure strategies $S_i = \{1, 2, \ldots, m_i\}$, with $m_i \geq 2$. The strategy selection of all players can be summarized in a vector $s = (s_1, s_2, \ldots, s_n)$, with $s_i \in S_i$ for all $i \in \{1, \ldots, n\}$. The space of pure strategies is the Cartesian product of the individual strategy sets, $S = \prod_i S_i$, so $s \in S$.
Each player has an income function for pure strategies. Such a function defines a preference of player $i$ for strategy $s_i$, given that the other players choose other pure strategies. The income function for pure strategies of player $i$ is $\pi_i : S \to \mathbb{R}$. These functions are grouped to define the income function for pure strategies $\pi : S \to \mathbb{R}^n$, which assigns an income $\pi(s) = (\pi_1(s), \pi_2(s), \ldots, \pi_n(s))$ to each pure strategy profile $s \in S$. A game $G$ in its normal form is defined by the triple $G = (I, S, \pi)$. Now, we will consider mixed strategies. The mixed strategy for player $i$ is a probability distribution over the set of pure strategies $S_i$. We will store that distribution in a stochastic vector $x_i = (x_{i1}, x_{i2}, \ldots, x_{i m_i})$. Therefore, $x_{ik} \in [0, 1]$ is the probability that player $i$ chooses pure strategy $k$.
The simplex of mixture strategies for player $i$, denoted by $\Delta_i$, is the set of all possible mixture strategies:
$$\Delta_i = \left\{ x_i \in \mathbb{R}^{m_i}_{+} : \sum_{h=1}^{m_i} x_{ih} = 1 \right\}.$$
Pure strategies from player $i$ are the vertices, or extremal set, of $\Delta_i$. Mixture strategies are the convex hull of pure strategies, represented by the canonic vectors $e_i^1 = (1, 0, 0, \ldots, 0)$, $e_i^2 = (0, 1, 0, \ldots, 0)$, ..., $e_i^{m_i} = (0, 0, \ldots, 1)$:
$$x_i = \sum_{h=1}^{m_i} x_{ih}\, e_i^h.$$
The set of all mixture strategies of all players is expressed in a vector, $x$, called the profile of mixture strategies. Specifically, $x = (x_1, x_2, \ldots, x_n)$ is a vector from the space of mixture strategies of the game, $\Theta = \prod_i \Delta_i$.
The following notation will be useful, where $z = (x_i, y_{-i}) \in \Theta$ is defined by $z_i = x_i$ and $z_j = y_j$ for all $j \neq i$. This vector represents the profile of mixture strategies where player $i$ applies mixture strategy $x_i$ but all other players apply their strategies from $y \in \Theta$. We are able to define the expected incomes of all players. The definition assumes that the selection events from different players are independent. Given a profile of pure strategies $s = (s_1, s_2, \ldots, s_n) \in S$, the probability to select $s$ given a profile of mixture strategies $x \in \Theta$ is $P(s) = \prod_{i=1}^{n} x_{i s_i}$. The expected income of player $i$ for the profile $x \in \Theta$ is:
$$u_i(x) = \sum_{s \in S} P(s)\, \pi_i(s).$$
The set of expected incomes from all players can be summarized in a function called expected incomes of the game. This function is defined by $u : \Theta \to \mathbb{R}^n$, such that $u(x) = (u_1(x), u_2(x), \ldots, u_n(x))$.
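As an illustration of the expected income, the following minimal sketch evaluates $u(x)$ by enumerating all pure strategy profiles and weighting each income by the product probability $P(s)$. The function name `expected_incomes`, the dictionary layout and the numerical incomes are assumptions made for this example only.

```python
from itertools import product


def expected_incomes(payoff, profile):
    """Return u(x) = sum over pure profiles s of P(s) * pi(s).

    payoff  : dict mapping a pure profile (tuple of strategy indices)
              to a tuple with one income per player
    profile : list of probability vectors, one per player
    """
    n = len(profile)
    u = [0.0] * n
    for s in product(*(range(len(x_i)) for x_i in profile)):
        prob = 1.0
        for i, s_i in enumerate(s):
            prob *= profile[i][s_i]  # independence: P(s) = prod_i x_{i, s_i}
        for i in range(n):
            u[i] += prob * payoff[s][i]
    return u


# Hypothetical two-player game with two pure strategies each.
payoff = {
    (0, 0): (3.5, 4.0), (0, 1): (1.5, 2.0),
    (1, 0): (2.5, 1.0), (1, 1): (2.0, 3.0),
}
u = expected_incomes(payoff, [[0.5, 0.5], [0.25, 0.75]])
print(u)
```

The enumeration grows exponentially with the number of players, so this direct evaluation is only meant for small games such as the two-player model studied later.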
Given a profile of strategies $y \in \Theta$, the best answer for player $i$ is the one with the best income. The correspondence of best answers in pure strategies, $\beta_i : \Theta \to S_i$, is:
$$\beta_i(y) = \left\{ h \in S_i : u_i(e_i^h, y_{-i}) \geq u_i(e_i^k, y_{-i}) \ \ \forall k \in S_i \right\}.$$
Analogously, the correspondence of best answers in mixture strategies is $\tilde{\beta}_i : \Theta \to \Delta_i$ such that:
$$\tilde{\beta}_i(y) = \left\{ x_i \in \Delta_i : u_i(x_i, y_{-i}) \geq u_i(z_i, y_{-i}) \ \ \forall z_i \in \Delta_i \right\}.$$
It is useful to group those best answers, in both pure and mixture strategies, at the level of the game. The correspondences are $\beta(y) = \prod_i \beta_i(y) \subset S$ and $\tilde{\beta}(y) = \prod_i \tilde{\beta}_i(y) \subset \Theta$, respectively.
We are now in a position to define the key concept of non-cooperative game theory, namely, the Nash equilibrium. In words, a Nash equilibrium is met whenever all players are simultaneously using one of their best answers to the strategies of the other players. Formally, a profile $x \in \Theta$ is a Nash equilibrium iff $x \in \tilde{\beta}(x)$. In [11], Nash formally proves that the set of Nash equilibria is nonempty for any finite game $G$. A Nash equilibrium is strict if the inequalities from the definition of best answers hold strictly. This means that each player chooses a strategy with a strictly larger income than any other alternative.
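For finite games, pure Nash equilibria can be found by checking every pure profile against all unilateral deviations, which is a direct translation of the best-answer condition. The sketch below assumes the same dictionary layout for incomes as a convenience of this example; the coordination game used to exercise it is hypothetical.

```python
from itertools import product


def pure_nash_equilibria(payoff, sizes):
    """List the pure profiles where no player gains by deviating alone.

    payoff : dict mapping a pure profile (tuple of indices) to a tuple of incomes
    sizes  : number of pure strategies of each player
    """
    equilibria = []
    for s in product(*(range(m) for m in sizes)):
        stable = True
        for i, m_i in enumerate(sizes):
            for k in range(m_i):
                deviation = s[:i] + (k,) + s[i + 1:]
                if payoff[deviation][i] > payoff[s][i]:
                    stable = False  # player i strictly prefers strategy k
        if stable:
            equilibria.append(s)
    return equilibria


# Hypothetical coordination game: both players prefer to match strategies.
payoff = {(0, 0): (2, 2), (0, 1): (0, 0), (1, 0): (0, 0), (1, 1): (1, 1)}
equilibria = pure_nash_equilibria(payoff, [2, 2])
print(equilibria)
```

In this example both matching profiles are equilibria, although one of them gives a lower income to both players, which anticipates the structure of a poverty trap.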

Evolutionary Stable Equilibrium
The concept of evolutionary stable equilibrium is introduced in [10]. The idea comes from symmetric games between two players (i.e. where the two players have identical sets of pure strategies). Informally, a "resident strategy" $x \in \Theta$ is evolutionary stable if, after an arbitrarily small perturbation $y \in \Theta$, the post-mutation profile $w = x(1 - \epsilon) + \epsilon y$ has a worse result than the resident profile (in terms of expected incomes). The idea can be extended to several players with different sets of pure strategies.

Definition 2.3 (Evolutionary Stable Equilibria ESE)
The profile of mixture strategies $x \in \Theta$ is an evolutionary stable equilibrium iff for each $y \in \Theta$, $y \neq x$, there exists $\epsilon_y \in (0, 1)$ such that for all $\epsilon \in (0, \epsilon_y]$, with $w = x(1 - \epsilon) + y\epsilon$, the following inequality holds for some $i \in I$:
$$u_i(x_i, w_{-i}) > u_i(y_i, w_{-i}).$$
Parameter $\epsilon_y$ is called the invasion barrier of mutation $y$, since it bounds the size below which the mutation is not enough to move the resident strategy.
It is worth noticing that the inequality is not required for all $i \in I$, but only for some $i \in I$. However, the following proposition holds even for this weak definition: the profile of mixture strategies $x \in \Theta$ is an ESE if and only if $x$ is a strict Nash equilibrium.
An ESE discards all Nash equilibria but the strict ones.

General Dynamic Selection
Let us assume an infinite population of individuals, each programmed to play a certain strategy during its lifetime. These individuals are grouped by class, so each player-class $i \in I$ will have a certain number of individuals. Within those classes, individuals are divided into sub-populations $h \in S_i$ associated with the pure strategies of each player. The population state at time $t$ is $x(t) = (x_1(t), \ldots, x_n(t)) \in \Theta$.
The population state will be governed by a system of first-order ordinary differential equations (ODEs). Let us call dynamic selection such a system of ODEs. For each player-class $i \in I$ and sub-population $h \in S_i$ we get that:
$$\dot{x}_{ih} = q_{ih}(x)\, x_{ih}.$$
The functions $q_{ih} : \Theta \to \mathbb{R}$ are called rate functions. Rate functions must be Lipschitz in some open set $X \subset \mathbb{R}^m$ such that $\Theta \subset X$, with $m = \sum_i m_i$. They are grouped by player-class and at population level, as $q_i(x) = (q_{i1}(x), \ldots, q_{i m_i}(x))$ for $i \in I$ and $q(x) = (q_1(x), q_2(x), \ldots, q_n(x))$, respectively.
Rate functions share certain properties, known as regularity, income monotonicity and positiveness. A rate function $q(x)$ is regular iff $\sum_{h \in S_i} q_{ih}(x)\, x_{ih} = 0$ for all $i \in I$ and $x \in \Theta$. Regularity implies that the population sizes of the different player-classes remain constant, since $\sum_h \dot{x}_{ih} = 0$.
Definition 2.6 A rate function $q(x)$ verifies income monotonicity if the following equivalence holds for all $i \in I$, $x \in \Theta$ and $h, k \in S_i$:
$$u_i(e_i^h, x_{-i}) > u_i(e_i^k, x_{-i}) \iff q_{ih}(x) > q_{ik}(x).$$
Income monotonicity implies that if for player-class $i$ strategy $h \in S_i$ has a higher expected income than another strategy $k \in S_i$, then sub-population $h$ has a higher rate than sub-population $k$.
Definition 2.7 A rate function $q(x)$ is positive in the incomes iff the following equality holds for all $i \in I$, $x \in \Theta$ and $h \in S_i$:
$$\mathrm{sgn}\left[ q_{ih}(x) \right] = \mathrm{sgn}\left[ u_i(e_i^h, x_{-i}) - u_i(x) \right].$$
The last property implies that if a certain player-class $i$ gets a higher expected income choosing strategy $h$ than under the current profile $x$, then the rate function for sub-population $h$ must be positive. As a consequence, sub-population $h$ will increase with time. The expected income for player $i$ given the current profile $x$ is an average of the expected incomes for all pure strategies, $u_i(x) = \sum_{h \in S_i} x_{ih}\, u_i(e_i^h, x_{-i})$. An alternative way to express this property is that pure strategies that have incomes above (below) the average must have positive (negative) rates.
There exist positive rate functions that are not monotonous in the incomes. However, in dynamic selection with two strategies per player-class (m i = 2 for all i ∈ I) monotonicity and positiveness are equivalent. Proposition 2.8 can be found in [13].

Game in Normal Form
In the game in normal form G = {I, S, π} players are firms and workers: I = {W, F }, and S represents the set of pure strategies. Firms and workers interact in a labor market. They must decide the optimum profile in terms of their incomes.
The strategies of workers are to acquire skills or not. A worker is skilled when he invests in specialized education in order to gain access to qualified jobs in innovative firms. The other option is not to invest in education, giving up the possibility of access to qualified jobs. A skilled worker must make a continuous effort in specialized education. He can abandon this strategy and become a non-skilled worker. These strategies are denoted by $S_W = \{S, NS\}$, where the acronyms stand for Skilled Worker and Non-Skilled Worker.
Analogously, firms have two strategies. They can either be innovative or not. In the former case, firms invest in technology and benefit from skilled workers, while in the latter, firms do not invest in technology and hire non-skilled workers. Innovative firms are at the forefront, and they can choose to abandon the technological race at any moment. These strategies are denoted by $S_F = \{I, NI\}$, where the acronyms stand for Innovative Firm and Non-Innovative Firm.
Let us define the incomes for each scenario. The set of all possible pure strategy profiles is $\{(S, I), (NS, I), (S, NI), (NS, NI)\}$. Denote by $\pi_W(\cdot, \cdot)$ and $\pi_F(\cdot, \cdot)$ the incomes of workers and firms, respectively. They depend on economic parameters defined as follows.

Skilled-worker (S):
• $\bar{s}$: This is the basic salary perceived by a skilled worker, no matter whether the firm is innovative (I) or not (NI).
• $\bar{p}$: This is the bonus perceived by a skilled worker when hired by an innovative firm (I).
• CE: This is the cost of a worker to acquire knowledge and to be updated.

Non-skilled worker (N S):
• $s$: This is the basic salary perceived by a non-skilled worker, no matter whether the firm is innovative (I) or not (NI).
• $p$: This is the bonus perceived by a non-skilled worker when hired by an innovative firm (I). This bonus is lower for non-skilled workers than for skilled ones ($\bar{p} > p$).

Innovative Firm (I):
• B I (S): This is the gain of an innovative firm (I) when a skilled worker (S) is hired.
• B I (N S): This is the gain of an innovative firm (I) when a non-skilled worker (N S) is hired.
• CI: This is the cost in technology in order to be an innovative firm.

Non-Innovative Firm (N I):
• B N I (S): This is the gain of a non-innovative firm (N I) when a skilled worker (S) is hired.
• $B_{NI}(NS)$: This is the gain of a non-innovative firm (NI) when a non-skilled worker (NS) is hired.
Innovative firms offer bonuses to both skilled and non-skilled workers. On the other hand, non-innovative firms do not have this incentive policy. In order to reduce the number of parameters involved in the model, costs are charged to skilled workers and innovative firms exclusively. They can be understood as the differential cost between a skilled and a non-skilled worker, or between an innovative firm and a non-innovative one.
The global incomes can be expressed as a function of the previous parameters. This is performed for each pair of strategies of firms and workers. Thus, we define the incomes for pure strategies of workers and firms as $\pi_W$ and $\pi_F$, respectively.
Incomes from Workers:
$$\pi_W(S, I) = \bar{s} + \bar{p} - CE, \qquad \pi_W(S, NI) = \bar{s} - CE, \qquad \pi_W(NS, I) = s + p, \qquad \pi_W(NS, NI) = s.$$
Incomes from Firms: $\pi_F(S, I)$, $\pi_F(NS, I)$, $\pi_F(S, NI)$ and $\pi_F(NS, NI)$ are obtained from the gains $B_I(\cdot)$ and $B_{NI}(\cdot)$, with the cost $CI$ charged to innovative firms only.
The incomes for pure strategies can be presented as usual in a bi-matrix:
$$\begin{array}{c|cc} & I & NI \\ \hline S & (\pi_W(S, I),\ \pi_F(S, I)) & (\pi_W(S, NI),\ \pi_F(S, NI)) \\ NS & (\pi_W(NS, I),\ \pi_F(NS, I)) & (\pi_W(NS, NI),\ \pi_F(NS, NI)) \end{array}$$
The parameters respect certain inequalities based on economic arguments. These relations are called Strategic Complements. They were also present in prior works by Accinelli et al. [1,2]. We will further extend those works:
• The benefit of a skilled worker (S) hired by an innovative firm (I) is higher than that of a non-skilled worker:
$$\pi_W(S, I) > \pi_W(NS, I), \quad \text{i.e.} \quad \bar{s} + \bar{p} - CE > s + p. \qquad (7)$$
• The benefit of a non-skilled worker (NS) hired by a non-innovative firm (NI) is higher than that of a skilled worker:
$$\pi_W(NS, NI) > \pi_W(S, NI), \quad \text{i.e.} \quad s > \bar{s} - CE. \qquad (8)$$
• The benefit of an innovative firm (I) is higher than that of a non-innovative firm (NI) when a skilled worker (S) is hired:
$$\pi_F(S, I) > \pi_F(S, NI). \qquad (9)$$
• The benefit of a non-innovative firm (NI) is higher than that of an innovative firm (I) when a non-skilled worker (NS) is hired:
$$\pi_F(NS, NI) > \pi_F(NS, I). \qquad (10)$$
• The benefit of an innovative firm (I) that hires a skilled worker (S) is higher than the benefit of a non-innovative firm (NI) that hires a non-skilled worker (NS):
$$\pi_F(S, I) > \pi_F(NS, NI). \qquad (11)$$
The five inequalities have a direct impact on the Nash equilibria of the game under study.

Nash Equilibrium
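Nash equilibria in pure strategies are found by studying the best responses to pure strategies. From (7) and (8):
$$\beta_W(I) = S, \qquad \beta_W(NI) = NS.$$
From (9) and (10):
$$\beta_F(S) = I, \qquad \beta_F(NS) = NI.$$
Therefore, each of the profiles $(S, I)$ and $(NS, NI)$ is a Nash equilibrium. Now we look for a mixed Nash equilibrium. Players are assumed to follow mixture strategies $x = (x_S, 1 - x_S) \in \Delta_W$ and $y = (y_I, 1 - y_I) \in \Delta_F$. First, we find the expected incomes:
$$u_W(e_S, y) = y_I\, \pi_W(S, I) + (1 - y_I)\, \pi_W(S, NI), \qquad u_W(e_{NS}, y) = y_I\, \pi_W(NS, I) + (1 - y_I)\, \pi_W(NS, NI),$$
$$u_F(x, e_I) = x_S\, \pi_F(S, I) + (1 - x_S)\, \pi_F(NS, I), \qquad u_F(x, e_{NI}) = x_S\, \pi_F(S, NI) + (1 - x_S)\, \pi_F(NS, NI).$$
We will state the following equalities, which force players to be indifferent to changes in their mixture strategies and hence result in a mixed Nash equilibrium:
$$u_W(e_S, y) = u_W(e_{NS}, y), \qquad u_F(x, e_I) = u_F(x, e_{NI}).$$
From the previous system of equations the following expressions hold for the mixed Nash equilibrium:
$$x_S^* = \frac{\pi_F(NS, I) - \pi_F(NS, NI)}{\pi_F(NS, I) - \pi_F(NS, NI) - \left[ \pi_F(S, I) - \pi_F(S, NI) \right]}, \qquad (14)$$
$$y_I^* = \frac{s - \bar{s} + CE}{\bar{p} - p}. \qquad (15)$$
Since the mixture strategy is a probability, we need $x_S^* \in [0, 1]$ and $y_I^* \in [0, 1]$. From Inequality (10), the numerator of Equation (14) is negative. Using Inequality (9), the denominator in (14) is strictly lower than the numerator. Therefore, $0 < x_S^* < 1$. Analogously, from the definition of the bonuses $\bar{p}$ and $p$, the denominator of Equation (15) is positive. Using (8) we know that the numerator of Equation (15) is positive. Finally, from Inequality (7) we know that the denominator is higher than the numerator in (15). Therefore, $0 < y_I^* < 1$. As a consequence, we get two Nash equilibria in pure strategies, $(S, I)$ and $(NS, NI)$, and one Nash equilibrium in mixed strategies, $(x^*, y^*)$.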
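The indifference conditions can be checked numerically. The sketch below uses hypothetical parameter values, chosen only so that the five Strategic Complements hold. The worker incomes follow the parameters $\bar{s}$, $\bar{p}$, $CE$, $s$ and $p$, while the firm incomes are entered directly as illustrative numbers rather than derived from $B_I$, $B_{NI}$ and $CI$.

```python
# Hypothetical parameter values (illustrative only).
s_bar, p_bar, CE = 3.0, 2.0, 1.5  # skilled worker: salary, bonus, education cost
s, p = 2.0, 0.5                   # non-skilled worker: salary, bonus

pi_W = {("S", "I"): s_bar + p_bar - CE, ("S", "NI"): s_bar - CE,
        ("NS", "I"): s + p, ("NS", "NI"): s}
# Firm incomes are assumed numbers, not derived from B_I, B_NI and CI.
pi_F = {("S", "I"): 4.0, ("S", "NI"): 2.0, ("NS", "I"): 1.0, ("NS", "NI"): 3.0}

# Strategic Complements (7)-(11).
assert pi_W[("S", "I")] > pi_W[("NS", "I")]
assert pi_W[("NS", "NI")] > pi_W[("S", "NI")]
assert pi_F[("S", "I")] > pi_F[("S", "NI")]
assert pi_F[("NS", "NI")] > pi_F[("NS", "I")]
assert pi_F[("S", "I")] > pi_F[("NS", "NI")]

# Mixed equilibrium: each class makes the other class indifferent.
num_x = pi_F[("NS", "I")] - pi_F[("NS", "NI")]
x_star = num_x / (num_x - (pi_F[("S", "I")] - pi_F[("S", "NI")]))
y_star = (s - s_bar + CE) / (p_bar - p)


def u_W(strategy, y_I):
    """Expected income of a worker playing a pure strategy against y_I."""
    return y_I * pi_W[(strategy, "I")] + (1 - y_I) * pi_W[(strategy, "NI")]


def u_F(strategy, x_S):
    """Expected income of a firm playing a pure strategy against x_S."""
    return x_S * pi_F[("S", strategy)] + (1 - x_S) * pi_F[("NS", strategy)]


print(x_star, y_star)
```

With these numbers the mixed equilibrium lies strictly inside the unit square, and both classes are indifferent between their two pure strategies at that point.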

Poverty Trap
In this paragraph we will show that the strategy profile $\{NS, NI\}$ behaves like a poverty trap, in the sense that three conditions are met:
i) Players do not have incentives to move to another strategy.
ii) There is another equilibrium where at least one of the players has a better income and the other does not decrease its income.
iii) It is robust under mutations. In other words, it is not possible to escape from this trap through small perturbations.
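Since $\{NS, NI\}$ is a Nash equilibrium (see 3.1.1), the first condition is met. Additionally, the Nash equilibrium $\{NS, NI\}$ is Pareto dominated by $\{S, I\}$:
$$\pi_W(S, I) > \pi_W(NS, NI), \qquad (16)$$
$$\pi_F(S, I) > \pi_F(NS, NI). \qquad (17)$$
Indeed, bonuses are positive, so $s + p > s$. Using (7) we get $\bar{s} + \bar{p} - CE > s + p > s$, which is equivalent to (16). Inequality (17) is equivalent to (11). For the third condition, since an ESE is a strict Nash equilibrium, it suffices to verify that $\{NS, NI\}$ is strict, which is expressed by inequalities (18) and (19). By definition, $u_W(e_{NS}, y) = y_I\, \pi_W(NS, I) + (1 - y_I)\, \pi_W(NS, NI)$, and an analogous expression holds for $u_F(x, e_{NI})$. Inequalities (18) and (19) can be re-written:
$$\pi_W(NS, NI) > \pi_W(S, NI), \qquad \pi_F(NS, NI) > \pi_F(NS, I).$$
These inequalities hold, since they are equivalent to Strategic Complements (8) and (10). As a conclusion, the strategy profile $\{NS, NI\}$ is an ESE, and behaves like a poverty trap.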

Replicator Stochastic Process
We will present a stochastic process of population dynamics between the two player-classes {W, F }, divided in sub-populations {S, N S} and {I, N I}.
The stochastic dynamics is governed by the concept of replication of successful agents, an approach based on [13]. In this model, agents decide to check their strategies from time to time, in a random and different fashion for each subpopulation. Once the agent decides to check its strategy, he chooses another agent from his own class uniformly at random, and switches to that strategy only if he considers that the contacted player is more successful than himself. A measure of success is given in terms of the expected incomes perceived by different agents.
Here we will describe the elements that jointly compose the stochastic processes in sub-populations. First, the firm and worker population sizes are denoted by $N_F$ and $N_W$, respectively. Their ratio is denoted by $\eta = N_F / N_W$. In order to fix ideas, let us assume $\eta < 1$.
Player-classes are divided in sub-populations. From now on, we will denote stochastic processes in bold, while capital letters are reserved for random vectors. The stochastic processes that provide the size of different sub-populations S, NS, I and NI are: Since the interest resides in the evolution of different sub-populations, a normalization is useful: We can summarize the previous normalized processes in a vectorial stochastic process $\mathbf{X}(\omega, t) : \Omega \times \mathbb{R}^+ \to S$, being $S = G_W^2 \times G_F^2 \subset \mathbb{R}^4$ the state-space of the vectorial process.
It is mandatory to define the transitions and their respective rates. Let us denote by $\tau_{A,B}$ the transition from $A \in S$ to $B \in S$:
The respective rates respect the following conditions: i) The revision of the strategies from individual agents are governed by independent Poisson processes of rate r.
ii) Switchings are obtained by a division of the revision process into two parts, with a Bernoulli variable with success p. Therefore, once the strategy is under revision, the agent switches the strategy with probability p.
As a consequence, switchings follow a Poisson process with rate $p\,r$. By the sum rule for Poisson processes, if there are $n$ independent identical agents, the number of switchings is a Poisson process with rate $n\,p\,r$. Transition rates consider this last observation. We will use the following notation: $x = (x_S(t), x_{NS}(t)) \in G_W^2$ and $y = (y_I(t), y_{NI}(t)) \in G_F^2$. The transition rates between different states are functions $q : S \times S \to \mathbb{R}^+$. We define $q(\tau, X)$ as the transition rate between states $X \in S$ and $X + \tau \in S$.
Now, we define the values of the revision rates $r_S(y)$, $r_{NS}(y)$, $r_I(x)$ and $r_{NI}(x)$, given $x \in \Delta_W$ and $y \in \Delta_F$. In the replication model of successful agents, the revision rates of the different sub-populations decrease as a function of the income perceived by the agent. The aim is to model the fact that an agent that perceives a high income will revise its strategy less frequently than another with a lower income. We will assume a linear model for rates, inspired by [13].
The following relations should be met in order to have non-negative rates that decrease in the expected incomes: We must define switching probabilities, given that a strategy revision has been performed. Let us consider first agents from sub-population S. They decide to change their strategy to NS if they choose an agent from class NS and the expected income from that agent is better: The other probabilities are found analogously: It suffices to define the probabilities of the events corresponding to a comparison of expected incomes. The aim is to represent the fact that agents do not know a priori the expected income perceived by the current population, nor the one from the contacted agent. This fact comes from different possible sources, such as incomplete or imperfect information on the expected incomes, agent preferences, or estimation errors. The uncertainty in the perceived expected incomes differs between firms and workers, but not between sub-populations of the same class. This is a simplification of real life, where different sub-populations might commit distinct errors when they estimate expected incomes. Nevertheless, this simplified model includes the uncertainty effect during the comparison of expected incomes, and the analysis can be further generalized to models with more uncertainty parameters in the perception of expected incomes.
If ε ∼ N (0, σ 2 W ) are independent identically-distributed random variables, for workers we will assume that: The probabilities of comparison for expected incomes are summarized in Equations (43): Switching rates are fully characterized from Equations (43): Given such transitions and their rates, it is possible to directly simulate the vectorial stochastic process, from an arbitrary starting point.
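A direct simulation of a process of this kind can be sketched with the standard Gillespie algorithm (exponential waiting times, next event chosen proportionally to its rate). Everything specific below is an assumption made for illustration and is not taken from the model equations: the incomes are hypothetical numbers, the revision rate is taken as `ALPHA - BETA * u` (linear and decreasing in the income), and the comparison probability is modeled with a Gaussian CDF of the income difference, which is one natural reading of independent Gaussian perception errors.

```python
import math
import random

# Hypothetical incomes (illustrative numbers only).
PI_W = {("S", "I"): 3.5, ("S", "NI"): 1.5, ("NS", "I"): 2.5, ("NS", "NI"): 2.0}
PI_F = {("S", "I"): 4.0, ("S", "NI"): 2.0, ("NS", "I"): 1.0, ("NS", "NI"): 3.0}
ALPHA, BETA = 5.0, 1.0       # assumed linear revision rate r(u) = ALPHA - BETA * u
SIGMA_W, SIGMA_F = 1.0, 1.0  # assumed noise levels of perceived incomes


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def transition_rates(n_s, n_i, n_w, n_f):
    """Rates of the four switchings S->NS, NS->S, I->NI, NI->I."""
    x, y = n_s / n_w, n_i / n_f
    u_s = y * PI_W[("S", "I")] + (1 - y) * PI_W[("S", "NI")]
    u_ns = y * PI_W[("NS", "I")] + (1 - y) * PI_W[("NS", "NI")]
    u_i = x * PI_F[("S", "I")] + (1 - x) * PI_F[("NS", "I")]
    u_ni = x * PI_F[("S", "NI")] + (1 - x) * PI_F[("NS", "NI")]
    # Assumed probability that the non-skilled (non-innovative) income looks better.
    f_w = normal_cdf((u_ns - u_s) / (math.sqrt(2.0) * SIGMA_W))
    f_f = normal_cdf((u_ni - u_i) / (math.sqrt(2.0) * SIGMA_F))
    return [
        n_s * (ALPHA - BETA * u_s) * (1 - x) * f_w,            # S -> NS
        (n_w - n_s) * (ALPHA - BETA * u_ns) * x * (1 - f_w),   # NS -> S
        n_i * (ALPHA - BETA * u_i) * (1 - y) * f_f,            # I -> NI
        (n_f - n_i) * (ALPHA - BETA * u_ni) * y * (1 - f_f),   # NI -> I
    ]


def simulate(n_s, n_i, n_w, n_f, t_end, rng):
    """Gillespie simulation; returns the final fractions (x_S, y_I)."""
    jumps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    t = 0.0
    while True:
        q = transition_rates(n_s, n_i, n_w, n_f)
        total = sum(q)
        if total == 0.0:  # absorbing state: nobody can switch
            break
        t += rng.expovariate(total)
        if t > t_end:
            break
        k = rng.choices(range(4), weights=q)[0]
        n_s += jumps[k][0]
        n_i += jumps[k][1]
    return n_s / n_w, n_i / n_f


x_end, y_end = simulate(160, 80, 200, 100, 5.0, random.Random(1))
print(x_end, y_end)
```

Passing the random generator explicitly keeps runs reproducible. States where each class is concentrated on a single strategy are absorbing in this sketch, since agents only imitate strategies that are present in their own class.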

Replicator Dynamics
We will use non-bold symbols for deterministic functions. For instance, a deterministic approach for process $\mathbf{X}(t)$ will be denoted by $X(t)$.
The system of ordinary differential equations for the replicator dynamics is a limit, for an infinite population, of the replicator stochastic process. In Section 3.3.1 we obtain, starting from the stochastic process 23, the system of ODEs that is its limit for an infinite population. Then, in Section 3.3.2 we study the main properties of this system and its relation with the game. Finally, a mode of convergence of the stochastic process to the solution of the system of ODEs is proved in Section 3.3.3.

System of ODE
As a first approach, we will present the flow balance to find the system of ODEs from the replicator model for an infinite number of agents. This method is simple and intuitive. The key is to consider transitions from infinitely many agents as a deterministic fluid model.
An argument for the equivalence between the stochastic flows and the deterministic ones is that, by the Strong Law of Large Numbers, rates for infinite populations converge almost surely to the expected number of transitions in a given interval.
The balance equations are specified as follows. The input flow in sub-population S consists of switchings from agents in class NS (that decide to switch their strategy). From 3.2, the normalized flow rate (normalized with respect to population $N_W$) equals $r_{NS}(y)\, x_S (1 - x_S) (1 - f_W(y))$. The rate, also normalized, corresponding to agents leaving sub-population S equals $r_S(y)\, x_S (1 - x_S) f_W(y)$. Combining the previous flows, we can write the following ODE:
$$\dot{x}_S = x_S (1 - x_S) \left[ r_{NS}(y) \left( 1 - f_W(y) \right) - r_S(y)\, f_W(y) \right].$$
From Equation (48) we observe that the input flow in sub-population I consists of agents from NI that decide to switch their strategy. The normalized rate with respect to population $N_F$ equals $r_{NI}(x)\, y_I (1 - y_I) (1 - f_F(x))$. The output flow, given by agents from I that switch to NI, has rate $r_I(x)\, y_I (1 - y_I) f_F(x)$.
Combining the previous flows we get the following ODE:
$$\dot{y}_I = y_I (1 - y_I) \left[ r_{NI}(x) \left( 1 - f_F(x) \right) - r_I(x)\, f_F(x) \right].$$
Sub-populations $x_{NS}$ and $y_{NI}$ are the complements of $x_S$ and $y_I$, respectively. Therefore, we can find the former sub-populations as a function of the latter. The system of ODEs for the dynamic replicator under stochastic process 23 is denoted by DR.
We present a second approach to retrieve DR starting from the stochastic process. The key idea is to re-write such process in its integral form, and then to take the limit when the population size $N_W$ grows to infinity in order to reach a deterministic integral equation. This deterministic integral equation is precisely the dynamic replicator ODE. We will denote by $\mathbf{X}(t)$ the normalized process with respect to the population sizes.
First, we need to define the drift vector for each state $v \in S$.

Definition 3.1 (Drift)
The drift of stochastic process {X(t)} t∈R + ⊂ S is a function γ : S → R 4 such that: Being q(v, v ′ ) the transition rate from state v ∈ S to state v ′ ∈ S. Next, we use Definition 3.1 to write the normalized process from Section 3.2 in its integral form.
Equation (50) defines process M(t), and X 0 is a starting point. In Section 3.3.3 we prove the convergence in L 1 space, uniform in compacts of the stochastic process X(t) to function X(t), which is the solution of DR.
Informally, we can think that if N W → ∞, the integral stochastic equation is reduced to a deterministic integral equation, since M(t) is a Martingale that tends to 0, and the drift tends to a deterministic function that is the vector-field of the resulting system of ODE. The deterministic integral equation that is the limit of the stochastic one for infinite populations is: If Equation (51) is derived with respect to t, the system of ODE in its differential form is obtained: where the vector-field of the system of ODE is (for all X ∈ R 4 ): We wish to determine the vector-field of DR for the stochastic process presented in Section 3.2. For that purpose we need to find its drift using Definition 3.1, together with rates and transitions of the previously mentioned process.
Taking X = (x_S, x_{NS}, y_I, y_{NI})^T ∈ R^4.
It is worth remarking that the function γ : R^4 → R^4 does not depend on N_W. Then g(X) = γ(X) for all X ∈ R^4, and we obtain the expression for the vector field of DR. We can express the replicator dynamics for all components of the populations: In order to simplify the numerical solution of the replicator dynamics ODE, we reduce the dimension of the system to two. To do so, we relate the unknowns for all t ∈ R_+. Summing the first two components of (53) we see that ẋ_S(t) + ẋ_{NS}(t) = 0. Since x_S(0) + x_{NS}(0) = 1, we get x_S(t) + x_{NS}(t) = 1 for all t ∈ R_+, and analogously y_I(t) + y_{NI}(t) = 1. These relations allow us to take x = (x_S(t), 1 − x_S(t)) and y = (y_I(t), 1 − y_I(t)). The equivalent two-dimensional system is: The system is precisely the one obtained by flow balance.
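As an illustration of the reduced two-dimensional system, the following Python sketch integrates a replicator field of the standard two-population form, ẋ_S = x_S (1 − x_S) (u_W(e_S, y) − u_W(e_NS, y)) and ẏ_I = y_I (1 − y_I) (u_F(x, e_I) − u_F(x, e_NI)). This form, the payoff matrices A and B, and the fixed-step fourth-order Runge-Kutta integrator are assumptions made for the example; the actual system (54) may carry additional rate factors, and the paper's parameters are not reproduced here.

```python
# Hypothetical incomes (NOT the paper's parameters).
# Rows: worker strategy (S, NS). Columns: firm strategy (I, NI).
A = [[4.0, 1.0], [2.0, 2.0]]  # worker incomes
B = [[5.0, 3.0], [1.0, 2.0]]  # firm incomes


def field(x, y):
    """Assumed replicator vector field on (x_S, y_I)."""
    # Income gap of S over NS for a worker facing the firm mix (y, 1 - y).
    gap_w = (A[0][0] - A[1][0]) * y + (A[0][1] - A[1][1]) * (1 - y)
    # Income gap of I over NI for a firm facing the worker mix (x, 1 - x).
    gap_f = (B[0][0] - B[0][1]) * x + (B[1][0] - B[1][1]) * (1 - x)
    return x * (1 - x) * gap_w, y * (1 - y) * gap_f


def rk4_step(x, y, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1x, k1y = field(x, y)
    k2x, k2y = field(x + 0.5 * h * k1x, y + 0.5 * h * k1y)
    k3x, k3y = field(x + 0.5 * h * k2x, y + 0.5 * h * k2y)
    k4x, k4y = field(x + h * k3x, y + h * k3y)
    return (x + h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0,
            y + h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6.0)


def integrate(z0, t_end=50.0, h=0.01):
    """Return the state (x_S, y_I) at time t_end starting from z0."""
    x, y = z0
    for _ in range(int(round(t_end / h))):
        x, y = rk4_step(x, y, h)
    return x, y


if __name__ == "__main__":
    print(integrate((0.8, 0.8)))  # starts near the (S, I) corner
    print(integrate((0.1, 0.1)))  # starts near the poverty trap
```

With these illustrative incomes, both pure profiles (S, I) and (NS, NI) are strict equilibria, so the long-run state depends on the starting point, which is the qualitative behavior discussed in the text.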

Properties of the System of ODE
The vector field from Equation (53) is continuously differentiable in R^4, hence Lipschitz continuous on the compact set [0, 1]^4. By Picard's Theorem (see for instance Section 8.3 in [9]), existence and uniqueness of the solutions of DR are guaranteed.
Second, the growth rate of the replicator dynamics is both regular and monotone in the incomes (see 2.4). Regularity holds since we check in Section 3.3.1 that the derivative of the sum of the sub-populations of workers, and of firms, is null. Monotonicity in the incomes is based on finding the differences: If the perceived income of one strategy is higher than that of another, their respective rate functions satisfy the same relation. This is reflected in the sign of the differences (55) and (56). We verify those relations for sub-populations S and NS as an example.
Let us assume that u_W(e_S, y) > u_W(e_{NS}, y). We can observe that from Definitions 43 and 32: Then: Or equivalently: An identical calculation provides the equivalence between u_F(x, e_I) > u_F(x, e_{NI}) and q_{F,I}(X) > q_{F,NI}(X). Monotonicity in the incomes is thus proved.
Using Proposition 2.8 and the fact that firms and workers have only two strategies to choose from, the rate function is both monotone and positive in the incomes as well (see Appendix A for a complete proof). As a consequence, the dynamics satisfies the assumptions of the following result:

Theorem 3.2 (Weibull 1997) A strict Nash equilibrium is asymptotically stable in every selection dynamics whose growth rate is positive in the incomes.
In Section 3.1.2 we observe that the Nash equilibrium called Poverty Trap is strict. We can conclude that the point {x_S = 0, y_I = 0} is asymptotically stable under the replicator dynamics. This motivates the study of its attraction set by means of the numerical analysis performed in Section 5.
An important property of DR is that the Nash equilibria obtained in Section 3.1.1 are rest points of that set of ODEs (a general analysis of this point can be found in Chapter 5 of [13]). We verify this statement for each Nash equilibrium of the game.

The proof is analogous to that of Proposition 3.3.
Proof Combining Equations (14) and (15) for x*_S and y*_I, it suffices to recall that the expected incomes satisfy Equations (12) and (13). From these conditions, we get that: Replacing the previous conditions in Equation (53), we obtain that ẋ_S(t) = 0 and ẏ_I(t) = 0. Therefore, the mixed Nash equilibrium is also a rest point of DR.
Remark It is worth noticing that there are other rest points of DR that are not Nash equilibria: {x_S = 0, y_I = 1} and {x_S = 1, y_I = 0}.
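The distinction between rest points and Nash equilibria can be checked numerically on a toy instance. The sketch below assumes the standard replicator form ẋ_S = x_S (1 − x_S) Δ_W(y), ẏ_I = y_I (1 − y_I) Δ_F(x), where Δ_W and Δ_F are the income gaps of S over NS and of I over NI; the income matrices are hypothetical and are not the paper's parameters. It confirms that the four corners and the mixed point are rest points, while only three of them pass a best-response test.

```python
# Hypothetical incomes (NOT the paper's parameters).
# Rows: worker strategy (S, NS). Columns: firm strategy (I, NI).
A = [[4.0, 1.0], [2.0, 2.0]]  # worker incomes
B = [[5.0, 3.0], [1.0, 2.0]]  # firm incomes


def gaps(x, y):
    """Income gaps (S minus NS for workers, I minus NI for firms)."""
    gap_w = (A[0][0] - A[1][0]) * y + (A[0][1] - A[1][1]) * (1 - y)
    gap_f = (B[0][0] - B[0][1]) * x + (B[1][0] - B[1][1]) * (1 - x)
    return gap_w, gap_f


def field(x, y):
    """Assumed replicator vector field on (x_S, y_I)."""
    gap_w, gap_f = gaps(x, y)
    return x * (1 - x) * gap_w, y * (1 - y) * gap_f


def is_rest_point(x, y, tol=1e-12):
    fx, fy = field(x, y)
    return abs(fx) <= tol and abs(fy) <= tol


def is_nash(x, y, tol=1e-9):
    """Each population must already play a best response to the other."""
    gap_w, gap_f = gaps(x, y)
    worker_ok = (abs(gap_w) <= tol or (gap_w > 0 and x == 1)
                 or (gap_w < 0 and x == 0))
    firm_ok = (abs(gap_f) <= tol or (gap_f > 0 and y == 1)
               or (gap_f < 0 and y == 0))
    return worker_ok and firm_ok


# Mixed equilibrium of this toy game, obtained from the indifference conditions.
MIXED = (1.0 / 3.0, 1.0 / 3.0)

if __name__ == "__main__":
    for point in [(0, 0), (1, 1), MIXED, (0, 1), (1, 0)]:
        print(point, is_rest_point(*point), is_nash(*point))
```

At the corners (0, 1) and (1, 0) the factor x_S (1 − x_S) or y_I (1 − y_I) vanishes, so the field is zero even though one population would gain by switching strategy.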

Convergence of the Replicator Stochastic Process
We analyze the convergence of the stochastic process to the solution of the system DR. The main reason for this analysis is to provide a formal derivation of DR using the drift.
The second reason is to formally justify that, for large but finite populations, studying the stochastic replicator model by means of the deterministic model is satisfactory.
The following result is inspired by the prior work [7] and the book [12], where similar processes are studied.
Theorem 3.6 The stochastic process X(t) from (23) converges in L 1 norm and uniformly on any compact set [0, T ], to X(t), the solution of the system (53).
where ‖·‖ denotes the Euclidean norm.

Proof The stochastic process is first written in its integral form: Equation (61) plays the role of the definition of the process M(t). The ODE can also be written in integral form: Taking the difference between (61) and (62) and applying the Euclidean norm: From the calculations in Section 3.3.1 we know that γ(v) = g(v), ∀v ∈ R^4. We also know from Section 3.
We define the random function f(s) = sup_{t ∈ [0,s]} ‖X(t) − X(t)‖. Inequality (65) also holds when taking the supremum over t ∈ [0, T]: Taking expected values on both sides: We must bound the expected value of the supremum of ‖M(t)‖. For that purpose, we bound the square of the expected value using the Cauchy-Schwarz inequality.
Defining α : S → R_+ as: In [7] the authors prove that M(t) is a martingale and that the following inequality holds: Using Expressions (69) and (71) and exchanging the integral with the expected value, we get the following inequalities: The following bound for E[α(X(t))] is obtained in Appendix C: where c ∈ R_+. Returning to Inequality (67) and using (74): Finally, from Inequality (75), Gronwall's Lemma allows us to conclude that: Taking the limit as N_W → ∞ yields the result.
Using Markov's inequality, the result can be expressed in probabilistic terms: Pr( sup_{t ∈ [0,T]} ‖X(t) − X(t)‖ > ε ) ≤ E[f(T)] / ε for any given ε ∈ R_+, with f as defined in the proof.

Competitive Model with External Regulator
We introduce a modification of the game G from Section 2.2. Now, there is an external regulator that is capable of providing incentives and taxes to both firms and workers. The goal of the regulator is to facilitate firms to be innovative and workers to be skilled, in order to have a developed economy. In other words, the regulator will try to avoid poverty traps in the economy.

Definition
The game is denoted by G^E = {I, S, π^E}, where the superscript E stands for external regulation. The player set I is precisely the one from the game G defined in Section 2.2, and so is the strategy set S. The incomes from pure strategies, π^E, are what distinguish the two games. Details of the strategies {S, NS} for workers W and {I, NI} for firms F can be found in Section 2.2. We also keep the incomes from pure strategies from that section, and add parameters that determine taxes and incentives.

Skilled Worker (S):
• m_W : the loan received by a skilled worker, hired either by an innovative (I) or a non-innovative (NI) firm.
• I_W : the tax paid by a skilled worker, hired either by an innovative (I) or a non-innovative (NI) firm.

Innovative Firm (I):
• m_F : the loan received by an innovative firm, regardless of the class of worker hired by this firm.
• I_F : the tax paid by an innovative firm, regardless of the class of worker hired by this firm.
As in the previous game, we can define the incomes for pure strategies, now taking taxes and incentives into consideration. The incomes for workers are π^E_W : S_W × S_F → R, while for firms π^E_F : S_W × S_F → R. Incomes for Workers: Incomes for Firms: Once the parameters for firms and workers are defined, we can find Strategic Complements as in Section 2.2. In the game G^E we keep Constraints (7), (8), (9) and (10), which were defined for the game G. Two intuitive relations for incentives are added.
Specifically, an aggressive incentive policy makes it irrational to choose any strategy other than innovating and acquiring skills. In terms of the game G^E, this is the case where there is a single Nash equilibrium in pure strategies, given by {S, I}. We are not interested in this case, since it requires an intensive incentive policy. On the contrary, we are concerned with cases of moderate incentives. The following conditions are added: As stated before, these conditions have a direct impact on the Nash equilibria of the game G^E.

Finding Nash Equilibrium
We first find the pure Nash equilibria. For that purpose, we obtain the best-response functions under pure strategies, BR_W(s_F) with s_F ∈ S_F and BR_F(s_W) with s_W ∈ S_W, using the incomes with external regulation. From (7), (8) and (77), for workers we have that: While using (9), (10) and (78), for firms: Using the best responses for pure strategies and the definition of Nash equilibrium, we conclude that Θ^{NE}_1 = {S, I} and Θ^{NE}_2 = {NS, NI}. Now we look for mixed Nash equilibria. Again, we assume that players use mixed strategies; hence their strategies are x = (x_S, 1 − x_S) ∈ ∆_W and y = (y_I, 1 − y_I) ∈ ∆_F.
In order to find a mixed Nash equilibrium we need the expected incomes of the players. We identify these incomes with the superscript E: We impose the equalities that make each player indifferent between its pure strategies, which lead to the mixed Nash equilibrium.
Solving the linear system (79) and (80), we obtain the following expression for the unique mixed Nash equilibrium of the game G^E, Θ^{NE}_3 = {x^{E*}, y^{E*}}.
Again, the mixed Nash equilibrium must satisfy x^{E*}_S ∈ [0, 1] and y^{E*}_I ∈ [0, 1], since they represent probabilities. The deduction is analogous to the one in Section 3.1.1, and uses the Complementary Strategies (7), (8), (9) and (10) together with the incentive constraints (4.1.1). We invite the reader to complete the proof.
Summarizing, the competitive model between firms and workers with external regulation has the following Nash equilibria: Θ^{NE}_1 = {S, I}, Θ^{NE}_2 = {NS, NI} and the mixed equilibrium Θ^{NE}_3 = {x^{E*}, y^{E*}}.
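The procedure of this section (best responses for the pure equilibria, indifference conditions for the mixed one) can be sketched in Python on a toy instance. The base incomes A and B are hypothetical, and the sketch assumes that the regulator's transfers enter additively: a skilled worker receives m_W − I_W and an innovative firm receives m_F − I_F on top of the unregulated income. The control vector is ordered as (m_W, m_F, I_W, I_F).

```python
STRATS_W = ("S", "NS")
STRATS_F = ("I", "NI")

# Hypothetical unregulated incomes (NOT the paper's parameters).
# Rows: worker strategy (S, NS). Columns: firm strategy (I, NI).
A = [[4.0, 1.0], [2.0, 2.0]]  # worker incomes
B = [[5.0, 3.0], [1.0, 2.0]]  # firm incomes


def regulated_incomes(phi):
    """Add net transfers to skilled workers (row S) and innovative firms (column I)."""
    m_w, m_f, i_w, i_f = phi
    a = [[A[0][0] + m_w - i_w, A[0][1] + m_w - i_w], [A[1][0], A[1][1]]]
    b = [[B[0][0] + m_f - i_f, B[0][1]], [B[1][0] + m_f - i_f, B[1][1]]]
    return a, b


def pure_nash(a, b):
    """Strict pure equilibria: no player gains by a unilateral deviation."""
    found = []
    for i in range(2):
        for j in range(2):
            if a[i][j] > a[1 - i][j] and b[i][j] > b[i][1 - j]:
                found.append((STRATS_W[i], STRATS_F[j]))
    return found


def mixed_nash(a, b):
    """Interior equilibrium (x_S, y_I) from the two indifference conditions."""
    # y_I makes workers indifferent between S and NS.
    y_i = (a[1][1] - a[0][1]) / ((a[0][0] - a[1][0]) + (a[1][1] - a[0][1]))
    # x_S makes firms indifferent between I and NI.
    x_s = (b[1][1] - b[1][0]) / ((b[0][0] - b[0][1]) + (b[1][1] - b[1][0]))
    return x_s, y_i


if __name__ == "__main__":
    for phi in [(0.0, 0.0, 0.0, 0.0), (0.4, 0.4, 0.0, 0.0), (0.0, 0.0, 0.5, 0.5)]:
        a, b = regulated_incomes(phi)
        print(phi, pure_nash(a, b), mixed_nash(a, b))
```

In this toy instance, moderate loans keep both pure equilibria and move the mixed equilibrium toward the origin, while taxes move it away from the origin.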

Replication Stochastic Process
As in Section 3.2, we have finite populations of firms and workers that follow successful agents, and the stochastic process X(t) from Section 3.2 also applies to the current competitive model. The new expected incomes perceived by agents force us to update the transition rates defined in (29), (30), (31) and (31), since both the revision rates and the comparison probabilities depend on these expected incomes. Once the expected incomes are adjusted, the remaining definitions of rates and transitions hold, and the process is fully characterized.

Replicator Dynamics
We will find the system of ODE of replicator dynamics for the competitive model between firms and workers with external regulation, based on Section 3.3.1. The derivation is completely analogous to 3.3.1, either by flow balance or drift.
On the other hand, in order to describe the dynamic behavior of the external regulator easily, we introduce the following definition of the control vector Φ, which is composed of both incentives (m_W, m_F) and taxes (I_W, I_F).
The control vector must respect Constraints (78) and (77) for incentives and taxes. It is important to remark that the economy without external regulation (model from Section 2.2) is precisely the case with Φ = 0. Furthermore, any function that depends on the expected incomes will be a parametric function of Φ.
Now, we will find the system of ODE for the replicator dynamics with external regulation explicitly as a function of Φ.
The vector of unknowns is Z(t) ∈ R^2 such that: The vector field, parametric in Φ, from (87), is F : Therefore, the system of ODE (87) can be written as follows: Another step needed to express the behavior of the external regulator is to define, for DR^E, the attraction region of the Nash equilibrium with innovative firms and skilled workers.
First, let ξ(t, Z_0, Φ) denote the family of solutions of ODE (88), indexed by the starting point, so that ξ(0, Z_0, Φ) = Z_0 for all Z_0 ∈ [0, 1]^2. The attraction region A(Φ) collects the starting points from which the economy converges to the equilibrium with skilled workers and innovative firms, that is, lim_{t→∞} ξ(t, Z_0, Φ) = (1, 1)^T for all Z_0 ∈ A(Φ). We are now in a position to state the rules that the external regulator follows in order to drive the economy to a Pareto-efficient equilibrium.
At time t = 0 the regulator requests a loan from an external institution. From t = 0, and using this loan, it subsidizes both innovative firms and skilled workers at a constant rate until t_i > 0. Then, the regulator stops the incentives and imposes taxes on both firms and workers until t_ii > t_i, in order to recover the initial loan. Once time t_ii is reached, the activity of the external regulator ceases.
The regulator defines times t_i, t_ii and the control parameters Φ_i and Φ_ii for each phase, as follows:
1. Loan stage: Observe that (m^i_W, m^i_F) is found by identifying the mixed Nash equilibrium (81) with the starting point Z_0.
2. Tax stage: See Figure 2. At time t_i ∈ R_+ the incentives are discontinued. Point E is defined by: The control vector for this stage is Φ_ii = (0, 0, I^ii_W, I^ii_F)^T. The external regulator is free to choose the taxes. It is also necessary to define δ_t > 0, the time that ensures that the system reaches the attraction region A(Φ_ii).
3. Ending stage: See Point F in Figure 2. At time t_ii ∈ R_+ the external regulator disappears from the system: At t_ii, the regulator has collected exactly the loans offered during the first stage.
In this way we simplify the economic problem, since we do not add interest rates. In this model we get that: Summarizing, the control parameters for the three phases are: Finally, ODE (88) and control function (97) provide the complete dynamics for the competitive model between firms and workers under external regulation.
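The three-phase rule can be illustrated with a short Python sketch. Everything specific here is an assumption made for the example: the hypothetical incomes A and B, the additive way in which loans and taxes shift the income gaps, the fixed switching time t_i (in the text it is tied to reaching A(Φ_ii) plus δ_t), the reading of η as N_F / N_W, and the per-worker debt accounting in which the regulator pays m_W x_S + η m_F y_I per unit of time and collects I_W x_S + η I_F y_I. As in the text, no interest is charged, and t_ii is taken as the first time at which the debt is repaid.

```python
# Hypothetical incomes (NOT the paper's parameters).
# Rows: worker strategy (S, NS). Columns: firm strategy (I, NI).
A = [[4.0, 1.0], [2.0, 2.0]]  # worker incomes
B = [[5.0, 3.0], [1.0, 2.0]]  # firm incomes
ETA = 0.1                     # assumed meaning: firms per worker, N_F / N_W
NO_CONTROL = (0.0, 0.0, 0.0, 0.0)


def field(x, y, phi):
    """Assumed replicator field with control phi = (m_W, m_F, I_W, I_F)."""
    m_w, m_f, i_w, i_f = phi
    gap_w = (A[0][0] - A[1][0]) * y + (A[0][1] - A[1][1]) * (1 - y) + m_w - i_w
    gap_f = (B[0][0] - B[0][1]) * x + (B[1][0] - B[1][1]) * (1 - x) + m_f - i_f
    return x * (1 - x) * gap_w, y * (1 - y) * gap_f


def rk4_step(x, y, phi, h):
    k1x, k1y = field(x, y, phi)
    k2x, k2y = field(x + 0.5 * h * k1x, y + 0.5 * h * k1y, phi)
    k3x, k3y = field(x + 0.5 * h * k2x, y + 0.5 * h * k2y, phi)
    k4x, k4y = field(x + h * k3x, y + h * k3y, phi)
    return (x + h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0,
            y + h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6.0)


def run(z0, phi_i, phi_ii, t_i, t_end, h=0.01):
    """Loans until t_i, taxes until the debt is repaid (t_ii), then no control."""
    x, y = z0
    t, debt, t_ii = 0.0, 0.0, None
    while t < t_end:
        if t < t_i:
            phi = phi_i
        elif t_ii is None:
            phi = phi_ii
        else:
            phi = NO_CONTROL
        m_w, m_f, i_w, i_f = phi
        # Per-worker net outlay of the regulator over one step (rectangle rule).
        debt += h * ((m_w - i_w) * x + ETA * (m_f - i_f) * y)
        x, y = rk4_step(x, y, phi, h)
        t += h
        if t >= t_i and t_ii is None and debt <= 0.0:
            t_ii = t  # taxes have paid back the loans: the regulator withdraws
    return (x, y), t_ii, debt


if __name__ == "__main__":
    start = (0.25, 0.25)
    print(run(start, (0.4, 0.4, 0.0, 0.0), (0.0, 0.0, 0.5, 0.5), 10.0, 40.0))
    print(run(start, NO_CONTROL, NO_CONTROL, 10.0, 40.0))
```

In this toy instance the starting point evolves to the poverty trap without control, whereas the three-phase policy takes it to the equilibrium with skilled workers and innovative firms and leaves no outstanding debt.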

Numerical Analysis
The evolution of firms and workers is numerically studied as a function of time, with and without external regulation. We find the attraction regions for both cases, and measure the gap between the stochastic process and the fluid model as a function of the population size. The ODE has been numerically solved using the Dormand-Prince algorithm, an embedded Runge-Kutta method of orders 5 and 4 with adaptive step size [8].

Model without regulation
We carried out simulations of the dynamic model introduced in Section 3.2. The goal is to analyze the gap with respect to the deterministic dynamics for different population sizes.
The evolution strongly depends on the starting point, which determines whether the economy tends to the Pareto-optimal equilibrium (S, I) or to the poverty trap (NS, NI). Finally, the attraction region is numerically found for the deterministic model. As a first case, we consider the starting point x_{S0} = 0.3 and y_{I0} = 0.6. Figures 3, 4 and 5 present 10 independent runs of the stochastic process with respective populations N_F = 20, N_F = 100 and N_F = 1000, with ratio η = 0.1. The reader can appreciate that the economy converges to the poverty trap in the three cases.
The component y_I(t) presents higher variability than x_S(t). The cause is that η = 0.1, so there are more workers than firms, and the behavior of workers is closer to the asymptotic one than that of firms. We present three other cases with x_{S0} = 0.2 and y_{I0} = 0.85, where the evolution tends to the Pareto-optimal equilibrium (x_S, y_I) = (1, 1). Now, we present numerical solutions of the replicator system of ODE (54). First, we provide deterministic solutions with the same starting conditions used for the stochastic simulations. Then, we study the attraction region. Figure 9 shows the solution when (x_{S0}, y_{I0}) = (0.3, 0.6). There is a clear agreement between the deterministic solution and the stochastic process for N_F = 1000.
When we consider the starting point (x_{S0}, y_{I0}) = (0.85, 0.2), we again observe a close match between the stochastic process for N_W = 1000 and the ODE replicator dynamics. Figure 11 presents the attraction regions of both the poverty trap and the Pareto-optimal equilibrium. It is worth mentioning that the mixed equilibrium lies precisely on the border between both attraction regions, and the vector field of the ODE flows to the other equilibrium. Black diamonds represent the starting conditions selected in the previous simulations. Now, we contrast the results with the model under regulation. We take the same parameters as in the model without regulation, control parameters Φ_i, Φ_ii and starting point (x_{S0}, y_{I0}) = (0.3, 0.6).
Table 2: Parameters for the model with regulation.

Loans for Stage 1 were chosen in such a way that the mixed Nash equilibrium lies inside the rectangle delimited by the points (0, 0) and (x_{S0}, y_{I0}). With this selection the dynamics evolves to the Pareto-optimal equilibrium (1, 1). Taxes for Stage 2 were chosen of the same order of magnitude as the incentives. Figure 12 shows the attraction region for the dynamics of Stage 1, with the starting point marked by a black diamond. The starting point lies in the attraction region of the Pareto-efficient equilibrium in the dynamics with incentives. We remark that loans increase the area of the Pareto-efficient attraction region, which is a relevant aspect of regulation. This means that fewer starting points evolve to the poverty trap. Figure 13 shows the attraction region for Stage 2. A tax increase reduces the area of the attraction region of the Pareto-efficient equilibrium.
The dynamic evolution for the parameters from Table 1 and 5.1 is presented in Figure 14. There, x_S(t), y_I(t) and (x_S(t), y_I(t)) are represented in red, blue and black respectively. The borders of the attraction regions for loans (magenta) and taxes (green) are also included. We can identify the three stages, each one with a different vector field. The extra time δ_t during which loans are kept after the border of the attraction region is reached helps the economy settle in the Pareto-optimal equilibrium. Table 3 summarizes the sizes of the sub-populations and the time required for the different stages, where the symbols correspond to Figure 14.
Table 3: Results with external regulation, (x_S(0), y_I(0)) = (0.3, 0.6)
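The effect of loans and taxes on the size of the attraction region can be reproduced qualitatively with a grid experiment. The sketch below is only indicative: the incomes are hypothetical, the replicator form and the additive net transfers (m − I) are assumptions, and the region is estimated by integrating from the centers of a coarse 10 × 10 grid with a fixed-step Runge-Kutta scheme rather than with the adaptive solver used for the figures.

```python
# Hypothetical incomes (NOT the paper's parameters).
# Rows: worker strategy (S, NS). Columns: firm strategy (I, NI).
A = [[4.0, 1.0], [2.0, 2.0]]  # worker incomes
B = [[5.0, 3.0], [1.0, 2.0]]  # firm incomes


def field(x, y, net_w, net_f):
    """Assumed replicator field; net_w = m_W - I_W and net_f = m_F - I_F."""
    gap_w = (A[0][0] - A[1][0]) * y + (A[0][1] - A[1][1]) * (1 - y) + net_w
    gap_f = (B[0][0] - B[0][1]) * x + (B[1][0] - B[1][1]) * (1 - x) + net_f
    return x * (1 - x) * gap_w, y * (1 - y) * gap_f


def rk4_step(x, y, net_w, net_f, h):
    k1x, k1y = field(x, y, net_w, net_f)
    k2x, k2y = field(x + 0.5 * h * k1x, y + 0.5 * h * k1y, net_w, net_f)
    k3x, k3y = field(x + 0.5 * h * k2x, y + 0.5 * h * k2y, net_w, net_f)
    k4x, k4y = field(x + h * k3x, y + h * k3y, net_w, net_f)
    return (x + h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0,
            y + h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6.0)


def ends_high(z0, net_w, net_f, t_end=40.0, h=0.05):
    """True if the path from z0 ends with both shares above one half."""
    x, y = z0
    for _ in range(int(round(t_end / h))):
        x, y = rk4_step(x, y, net_w, net_f, h)
    return x > 0.5 and y > 0.5


def high_basin_fraction(net_w, net_f, n=10):
    """Share of grid-cell centers that evolve toward the (S, I) corner."""
    hits = 0
    for i in range(n):
        for j in range(n):
            z0 = ((i + 0.5) / n, (j + 0.5) / n)
            hits += ends_high(z0, net_w, net_f)
    return hits / (n * n)


if __name__ == "__main__":
    print("loans:", high_basin_fraction(0.4, 0.4))
    print("none :", high_basin_fraction(0.0, 0.0))
    print("taxes:", high_basin_fraction(-0.3, -0.3))
```

In this toy instance the estimated share of starting points that reach the equilibrium with skilled workers and innovative firms is largest under loans and smallest under taxes, in line with the behavior described for Figures 12 and 13.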

Conclusions
The main contribution of this article is a means to avoid poverty-trap equilibria with the assistance of an external regulator. The regulator acts in three phases: loans, taxes and inactivity, in chronological order. The time the system needs to escape from the poverty trap and converge to the Pareto-efficient equilibrium is of the same order as the time scale of the evolution without external regulation. Therefore, the results of our incentive mechanism become evident in a reasonable amount of time.
During the first phase, agents are encouraged to innovate, and the poverty trap is mitigated. On the other hand, the poverty trap is exacerbated when taxes are in place. These results are consistently verified both by numerical simulation and by finding the attraction regions analytically.
Numerical results highlight a close match between the stochastic process for large populations and the limit for infinite populations given by the system of ordinary differential equations. This idea is formally proved in Theorem 3.6. Therefore, the aggregate behavior of agents in large populations is close to the deterministic fluid limit for infinite populations.
As future work, we would like to develop stochastic differential equations that approximate the replicator dynamics. Then, by means of central limit theorems, confidence intervals around the deterministic evolution will help to understand the variability of the evolution.
A second research line is to implement the model in a real-life economy, by means of a pointwise estimation of its relevant parameters and providing a thorough predictive analysis of its evolution.
Monotonicity in the incomes is by definition: From Equation (99) we obtain that the left-hand side of (101) is equivalent to: On the other hand, from Equation (100), we conclude that the right-hand side of (101) is equivalent to: Thus, from (102) and (103), monotonicity is equivalent to: as desired.

B: Switching Probabilities
We find the probability of a strategy switch in a revision. Let Φ(t) denote the standard normal distribution function. We will find Pr(û_W(e_{NS}, y) > û_W(e_S, y)) using the definition of the expected incomes perceived by agents (û_W(·, ·) and û_F(·, ·)) given in Section 3.2.

C: Bounding E[α(X(t))]
We find a bound for E[α(X(t))]. The function α : S → R_+ is defined by: Since E[α(X(t))] lies in the range of α(v), the following inequality holds: Then, we find for transition vector and rates from 3. : Replacing (107) in (106) and combining we get that The transition norms are found: