Self-Managing Terror: Resolving Agency Problems With Diverse Teams

I examine a principal-agents model of subversion with externalities and identify a novel explanation for how diversity can be valuable to organizations: teams of diverse agents can self-manage and mitigate their own agency problems. More generally, the model explores how and when integrating fringe or ideologically extreme agents can align incentives between the principal and all agents. This technique is shown to function better, relative to other contracting techniques, in settings that are bureaucratic and low-information. Self-managing teams are explored in the context of Islamist terror groups that use foreign fighters. Because foreign and domestic fighters have conflicting preferences over what types of activities the group should be conducting, if foreign and domestic fighters are integrated onto a team, then the team may self-regulate with efficiency gains for the principal. This model explains variation in agency problems and foreign fighter usage in major insurgent groups, including al Qaeda in Iraq, the Haqqani Network, and the Islamic State.

In the early 1980s, the Haqqani Network faced an existential challenge. The Haqqani Network was one of the major actors in the multi-party insurgency against the Soviet-backed government of Afghanistan. To survive, the Haqqani Network would need to fight both the government and rival local groups through complex and disciplined operations over a vast geographic area, all the while facing intense counterinsurgency pressure. In response to these challenges, the Haqqani Network became a more diverse organization. During this conflict, large numbers of Arab fighters traveled to Afghanistan and fought independently against the Soviet Union as mujahideen. While these foreign fighters were viewed cautiously by many Afghan insurgent groups due to their extreme ideology, the Haqqani Network was the first group to recognize the foreign fighters' value and to create integrated fighting columns of Afghans and foreign fighters (Hamid and Farrall, 2015, 65-167; Brown and Rassler, 2013, 189-190). And, as an integrated organization, the Haqqani Network has done remarkably well; the Haqqanis have persevered despite nearly four decades of attempts by local actors and global superpowers (the Soviet Union and United States) to destroy the group. In one of the least developed and most conflict-prone areas in the world, the Haqqani Network discovered the value of a diverse workforce.
Organizational economics has explanations for how diversity could have been valuable to the Haqqani Network. Following Lazear (1999) or Hong and Page (2001), foreign fighters could have introduced new skills or new perspectives on problem solving to the organization. Alternatively, foreign fighters could have provided more manpower to the group or been better fighters than domestic agents. Or, in light of ally-principle-type results in the literature on agency problems (Bendor et al., 2001), foreign fighters could have been more allied with the Haqqani Network's leadership than domestic agents. If any of these explanations were correct, we would expect similar militant groups to welcome foreign fighters. Instead, there is significant variation in foreign fighter use among prominent violent jihadist groups. In 2007, al Qaeda in Iraq (AQI) began turning foreign fighters away due to internal dysfunction (CTC, 2007a). In 2015, al Shabaab's leadership tolerated its local fighters killing off its foreign fighter members (Scahill, 2015). And, in 2015, when AQI re-emerged as Daesh (commonly referred to as the Islamic State or ISIS), the group undertook the largest recruitment of foreign fighters in history and took pains to integrate foreign fighters into all levels of the organization (Weiss, 2015; Fishman, 2016).
The variation in foreign fighter use raises the following question: why and when is diversity valuable to militant groups?
Diverse preferences among agents are valuable because they present a solution to a critical organizational design problem: insurgent leadership must design effective teams from imperfect agents to operate in environments where it is difficult for the leadership to discern what actions are appropriate. The Haqqanis discovered that integrated teams of domestic and foreign agents can self-manage their agency problems more effectively than homogeneous teams of domestic agents, surprisingly, even when foreign agents have preferences that are less aligned with the preferences of the principal relative to domestic agents. Put another way, in contrast to standard ally-principle-type results, the Haqqanis discovered that by adding worse agents, strategic interactions between different types of imperfect agents can lead to more efficient teams. This paper analyzes this organizational design problem in the context of a principal-agents model of subversion with externalities between agents.
To expand on the organizational design problem: to succeed in an insurgency, an insurgent group must balance attacking government actors and asserting dominance over local civilians and (at times) rival insurgent groups over a large geographic area. However, what precisely agents should be doing in a given location is determined by local circumstances. Because observing these local circumstances is risky for the leadership, embedded teams of agents would be cognizant of local circumstances, but insurgent leadership could not discern what agents should be doing without risking capture or death. This opens the possibility for subversion, where agents may conduct the operations that they like rather than what the principal would want them to conduct, without the principal knowing outright that the agents misbehaved. For example, if rival insurgent groups were attacking Haqqani Network agents, the Haqqani Network's leadership would want its agents to respond and engage these rivals; but, if the leadership observes its agents attacking local actors, the leadership would not easily know if the agents were being attacked and responding or if the agents were pursuing local power at the expense of local actor-insurgent relations.
To conduct operations, the leadership of jihadist militant groups uses teams of imperfect agents. Domestic fighters are imperfect because their preferences are shaped by their connection to the local population. That local fighters act out and pursue greed, grievance, or personal security is consistent with extensive empirical evidence and historical anecdotes (Weinstein, 2006; Kalyvas, 2006; Enders and Jindapon, 2010; Shapiro, 2013; Schram, 2019).¹ Islamist foreign fighters were also imperfect, but for different reasons. Foreign fighters traveled and fought because they believed it was their religious duty to protect the Muslim nation (the umma) when it faced external threats (Malet, 2010; Hegghammer, 2010). In contrast to local fighters, foreign fighters were enthusiastic to engage government or non-Islamist forces, but they were also naive extremists who lacked a stake in the long-run success of the insurgent group and were less willing to attack co-religious rival actors, which was necessary to ensure that the insurgent group emerged dominant (Hafez, 2010; Hegghammer, 2010; McChrystal, 2013; Brown and Rassler, 2013; Schram, 2019). Altogether, at a given point in time, leadership could have preferences that were more aligned with one type of agent over the other, but this would depend on complex local circumstances and could change over time. Faced with this operating environment, the leadership must effectively design teams, or the leadership risks that agents will misbehave to the detriment of the group.²
1 Trotsky (1971)
2 I include a more thorough discussion of actors' preferences below in the "Related Literature" section.
In this setting, two factors drive the result that diverse teams can self-manage their subversion. The first, which was discussed above, is that agents have partially misaligned preferences with the leadership, and that these preferences are misaligned in different ways. The second factor is the externality structure surrounding the agent's actions. When an agent subverts, that agent's like-minded teammates benefit.³ The agent-agent externality structure used here is different from existing models of shirking in terror groups (where agents exert less effort or allocate less funds than what the principal would prefer), in which agency problems have negative spillovers on proximate agents (Baccara and Bar-Isaac, 2008; Enders and Jindapon, 2010). That an agent's actions have different effects on different types of proximate agents is critical for the results below. For a homogeneous team of agents, agents are collectively incentivized to subvert, as each agent benefits from their like-minded teammates' misbehavior. In contrast, for a heterogeneous team, agents possess internally misaligned preferences over the actions that they want to pursue. Because the agents' preferences for subversion on a heterogeneous team pull in different directions, agents may be willing to collectively forgo misbehaving and instead do what is best for the insurgent group. This dynamic can be illustrated in a simple model of agents interacting within homogeneous and heterogeneous teams.
Consider an infinite-horizon game in which a two-agent team conducts operations. Let $t \in \{1, 2, 3, \ldots\}$ denote periods. In each period, nature selects the state of the world $\omega_t \in \{d, f\}$, then each agent observes $\omega_t$ and selects action $x_t \in \{d, f\}$. The state of the world identifies the action that the leadership wants the agents to undertake. When $\omega_t = d$ ($\omega_t = f$), the leadership prefers that agents select $x_t = d$ ($x_t = f$). Nature selects $\omega_t = d$ with probability 0.6. Agents have type $\tau \in \{D, F\}$, where agents of type $\tau = D$ ($\tau = F$) most prefer action $x_t = d$ ($x_t = f$). Together, nature selecting $\omega_t = d$ with probability 0.6 implies that the leadership's preferences are more aligned with the preferences of type D agents. Agents receive 2 utils when they undertake their most preferred action, 2 utils when their teammate undertakes that same action (the agent's own most preferred action), and 1 util when they undertake the action that the leadership wants them to undertake.
3 Practically, because pursuing greed or grievance (for local fighters) or engaging government or Western security forces (for foreign fighters) has spillover effects for proximate agents, agents will benefit when their like-minded teammates subvert.
Below I depict the normal forms of the per-period game for a homogeneous team of agents of type D. Each normal-form game references the per-period game under different states of the world.
In both states of the world, it is a Nash equilibrium for both type D agents to select the action that matches their type, setting $x_t = d$ for all $t \in \{1, 2, 3, \ldots\}$. While sometimes these agents are acting in the interests of the leadership (when $\omega_t = d$), at other times these agents are subverting (when $\omega_t = f$). A similar result holds for two type F agents always setting $x_t = f$. However, when a diverse team is formed, a new dynamic can arise. Below is the normal form of the per-period game for a heterogeneous team.
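The per-period payoffs stated above are enough to tabulate each normal form and check its pure-strategy Nash equilibria. The following Python sketch does so; the function names are illustrative and not part of the original analysis, and the payoff rule encodes one reading of the text (an agent earns 2 utils when the teammate takes the agent's own most preferred action).

```python
from itertools import product

ACTIONS = ("d", "f")
PREFERRED = {"D": "d", "F": "f"}  # each type's most preferred action


def payoff(own_type, own_action, mate_action, state):
    """Per-period utils for one agent, following the payoffs stated in the text."""
    favorite = PREFERRED[own_type]
    utils = 0
    if own_action == favorite:
        utils += 2  # the agent takes its own most preferred action
    if mate_action == favorite:
        utils += 2  # the teammate takes the agent's most preferred action
    if own_action == state:
        utils += 1  # the agent takes the action the leadership wants
    return utils


def normal_form(type1, type2, state):
    """Map each action profile (x1, x2) to the payoff pair (u1, u2)."""
    return {
        (x1, x2): (payoff(type1, x1, x2, state), payoff(type2, x2, x1, state))
        for x1, x2 in product(ACTIONS, repeat=2)
    }


def pure_nash(type1, type2, state):
    """Return the pure-strategy Nash equilibria of the per-period game."""
    game = normal_form(type1, type2, state)
    equilibria = []
    for x1, x2 in game:
        best1 = all(game[(x1, x2)][0] >= game[(y, x2)][0] for y in ACTIONS)
        best2 = all(game[(x1, x2)][1] >= game[(x1, y)][1] for y in ACTIONS)
        if best1 and best2:
            equilibria.append((x1, x2))
    return equilibria


for team in (("D", "D"), ("D", "F")):
    for state in ACTIONS:
        print(team, "state", state, normal_form(*team, state), pure_nash(*team, state))
```

Under this reading, each per-period game has a single pure-strategy equilibrium in which every agent plays its own most preferred action, regardless of the state.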
Under both states of the world, it is a Nash equilibrium for agents to select their most preferred action (type D sets $x_t = d$ and type F sets $x_t = f$) for all $t$. In this equilibrium, agents select the actions they most prefer to the detriment of their partner and the group. For a sufficiently high discount factor ($\delta \geq 0.83$), [...] (Dressler, 2010; Brown and Rassler, 2013, 189-190) [...] (Weiss, 2015; Gates and Podder, 2015).
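The cutoff of roughly 0.83 can be checked numerically. The sketch below assumes (this is my reading, not a statement from the text) that the cooperative path has both agents matching $\omega_t$ in every period, and that any deviation triggers permanent reversion to the per-period Nash equilibrium in which each agent plays its own most preferred action. The helper names are hypothetical.

```python
from fractions import Fraction

P_D = Fraction(3, 5)  # probability that nature selects state d
PREFERRED = {"D": "d", "F": "f"}
OTHER_TYPE = {"D": "F", "F": "D"}


def payoff(own_type, own_action, mate_action, state):
    """Per-period utils for one agent, following the payoffs stated in the text."""
    favorite = PREFERRED[own_type]
    return (2 * (own_action == favorite)
            + 2 * (mate_action == favorite)
            + 1 * (own_action == state))


def expected(own_type, own_rule, mate_rule):
    """Expected per-period payoff when each action is a function of the state."""
    total = Fraction(0)
    for state, prob in (("d", P_D), ("f", 1 - P_D)):
        total += prob * payoff(own_type, own_rule(state), mate_rule(state), state)
    return total


def min_discount_factor(own_type):
    """Smallest delta at which this agent will not deviate from matching the state."""
    mate_type = OTHER_TYPE[own_type]
    favorite = PREFERRED[own_type]
    v_coop = expected(own_type, lambda s: s, lambda s: s)
    v_punish = expected(own_type, lambda s: favorite, lambda s: PREFERRED[mate_type])
    # The only tempting state is the one in which the leadership wants the
    # action this agent dislikes; the teammate is assumed to comply there.
    bad_state = PREFERRED[mate_type]
    gain = (payoff(own_type, favorite, bad_state, bad_state)
            - payoff(own_type, bad_state, bad_state, bad_state))
    # No deviation iff gain <= delta / (1 - delta) * (v_coop - v_punish).
    ratio = gain / (v_coop - v_punish)
    return ratio / (1 + ratio)


thresholds = {agent: min_discount_factor(agent) for agent in ("D", "F")}
print(thresholds, "binding threshold:", max(thresholds.values()))
```

Under these assumptions the type F agent's constraint binds at $\delta = 5/6 \approx 0.83$, while the type D agent requires only $\delta \geq 5/9$, since the state favors type D more often.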
While the emphasis of this paper is jihadist militant groups, the operating [...]
That teams of diverse agents can self-regulate is fundamentally different from existing explanations for the value of diversity, where production complementarities (Lazear, 1999) or new perspectives on problem solving (Hong and Page, 2001) offset the costs incurred when diverse agents interact. And, while results like this exist in legislative signaling settings (for examples, see Battaglini (2002) and Hirsch and Shotts (2015)), I show that diversity can also be a simple, hands-off tool for addressing moral hazard.
Second, within the context of self-managing teams, I assess the natural intuition that the principal wants agents whose preferences are more aligned with the preferences of the principal. I find this intuition does not hold. Rather, teams self-manage best when agents' preferences offset one another, or when agents' ideal points are equidistant from the principal's expected most preferred policy.
In practice, the principal may recruit an agent whose preferences are less like the principal's to better counterbalance the preferences of that agent's teammate.

Related Literature
The finding presented here, that foreign fighters can help resolve agency problems in insurgent groups, has implications for the organizational economics of militant organizations. Since Crenshaw (1987) and Chai (1993) pioneered an organizational approach to terror groups, a growing literature discusses how terror and insurgent groups mitigate their agency problems (Gates, 2002; Weinstein, 2006; Shapiro and Siegel, 2007; Baccara and Bar-Isaac, 2008; Berman and Laitin, 2008; Enders and Jindapon, 2010; Shapiro, 2013) [...]
This paper is similar to a series of legislative signaling models that suggest diversity can be exploited for efficiency gains for the principal (Gilligan and Krehbiel, 1989; Dewatripont and Tirole, 1999; Battaglini, 2002; Hirsch and Shotts, 2015). This paper differentiates itself in two ways. First and most clearly, this paper examines subversion in a delegation setting rather than a signaling setting. Second, in the models above, after the principal organizes a diverse team of agents, the principal plays a critical role in realizing the efficiency gains. In Gilligan and Krehbiel (1989), Dewatripont and Tirole (1999), and Hirsch and Shotts (2015), the principal is able to assess the quality of a given policy, and in Battaglini (2002) the principal interprets a multidimensional message to construct an optimal policy. Here, in contrast, after the principal organizes a diverse team, the principal can rely on agents to do much of the management.⁵
This paper is more similar to work on the politics of organizational decision making (see Gibbons et al. (2013) for a review), where organizational or institutional factors play a significant role in determining how teams behave. For example, Bonatti and Rantakari (2016) describe how strategic interactions among agents with different policy preferences, together with the costs associated with developing policies, can at times lead to greater allocations of effort into producing projects. This paper also speaks to a broad, largely case-based literature on what is needed for self-managing teams to function, which commonly emphasizes cooperation and communication among teammates and the value of staffing teams with actors possessing minority views (Beyerlein and Johnson, 1994; Yeatts and Hyten, 1998).

Motivating Actors' Preferences
This discussion provides some brief background on insurgent groups to justify the utility functions in the general model. For a more thorough treatment, see Schram (2019). Within jihadist militant groups participating in multi-party insurgencies, there are three distinct groups of actors that each possess distinct preferences: the leadership, foreign agents, and domestic agents.
To be successful, the leadership must run an organization that balances attacking (at times Western-backed) government actors and asserting dominance over civilian and rival insurgent groups (Whiteside, 2016) [...] to care more about gaining a local monopoly on power in the short term than did the leadership. Domestic fighters possess a pre-existing social network and connection to the local population. Through the common practices of radical Islamist insurgent groups, like enforcing Sharia law and managing smuggling and racketeering (Moghadam and Fishman, 2010; Shapiro, 2013), domestic members of these groups could settle old grievances, protect themselves and their social network, and pursue wealth to an extent that ideologically driven outsiders (foreign fighters) or the group's leadership neither could nor would. This perspective, that local fighters may pursue greed, grievance, or personal security at the expense of the insurgency movement, is consistent with anecdotal evidence within AQI and the Haqqani Network, as well as existing literature on agency problems within domestic insurgencies (Weinstein, 2006; Kalyvas, 2006; Hamid and Farrall, 2015; CTC, 2007a) [...] such as the one posed by Western forces or Western-backed governments (Malet, 2010; Hegghammer, 2010). For that reason, foreign fighters would rather engage Western security forces or non-radical Islamist governments (like the Afghan government under Hamid Karzai or the Iraqi government under Nouri al-Maliki) than engage co-religious militants or civilians (Hafez, 2010). Secondary documents on foreign fighter ideology and recruitment patterns (Felter and Fishman, 2007; Hafez, 2010; Kirdar, 2011), messages to would-be and existing foreign fighters (al Zarqawi, 2004), and internal documents discussing the motivations and religious devotion of foreign fighters (CTC, 2007a) all support this view. Of course, this is not to say that no foreign fighters were willing to declare other Islamists as apostates and to attack or kill these individuals. Rather, foreign fighters preferred to engage Western forces or non-Islamist-backed government forces more than they wished to engage coreligionists and to become involved in local political disputes.
Thus, leadership possessed preferences that were sometimes more in line with the preferences of foreign fighters and sometimes more in line with those of local fighters, depending on what was occurring locally. Overall, however, I also assume leadership preferences were closer to those of domestic fighters because both groups shared a desire to consolidate their power in the country after competing parties were defeated. In contrast, foreign fighters are generally viewed as less interested in securing a group's long-run success and more interested in engaging Western or apostate government forces (Hegghammer, 2010). Should the militant group be successful, foreign fighters would move on to the next battle zone. Furthermore, foreign fighters, relative to domestic fighters, are younger and less experienced, are commonly viewed as more ideologically rigid, and are more interested in supporting the insurgent movement than in the local politicking necessary to win an insurgency. This assumption has empirical significance: because foreign fighters are less aligned with the preferences of the leadership, if forced to choose, leadership will work with local fighters rather than foreign fighters (as happened in AQI and al Shabaab).

Model
This is a principal-agents model of subversion with externalities between agents.
The model has two stages: a first stage where the principal designs the organization, and a second stage where agents repeatedly conduct operations.
In the first stage, the principal can define utility transfers to agents and can oversee the formation of a terror cell. Regarding utility transfers, the principal can transfer utility to the agents based on the agents' actions (an incentive contract) or at a flat rate. In order to condition transfers on the agents' actions, the principal must set $m = 1$ and incur "monitoring" cost $\zeta > 0$. If the principal sets $m = 0$, the principal does not incur a cost but can still offer agents a flat utility transfer. Incentive contracts will be described in more detail below. Regarding cell formation, agent 1 is assumed to be a domestic type,⁶ and the principal sets $o_p \in \{d, f, u\}$ to designate agent 2 as a domestic [...]
Agents can accept or reject being in the group by setting $b_i = a$ or $b_i = r$ (respectively). If either agent selects $r$, then the game terminates and all actors receive their reservation utilities, denoted by $R_p$ and $R_a$ for the principal and agents (respectively).
The second stage is an infinite-horizon game where agents repeatedly conduct operations. Time is discrete and indexed by $t \in \{1, 2, 3, \ldots\}$. At the start of each period $t$, nature draws a realization of $\omega_t \in [-1, 1]$, which represents what actions the principal wants the agents to perform. Each $\omega_t$ is drawn independently from a continuous distribution function $F$ with full support where $E(\omega_t) = 0$. The distribution $F$ is common knowledge and the agents observe $\omega_t$, but the principal does not. After $\omega_t$ is realized, both agents simultaneously select actions $a_{i,t} \in \mathbb{R}$ with $i \in \{1, 2\}$. The convexity of the action space captures that in a given period, agents allocate their time to some mixture of activities and that the principal has some most preferred mix of activities (represented by $\omega_t$). In the insurgency setting, agents can [...]
For agent $i \in \{1, 2\}$ that is type $\tau \in \{d, f\}$, and letting $j \in \{1, 2\}$ with $i \neq j$, summed across periods, agent $i$ has utility function (1) [...]
I let $\delta \in (0, 1)$ denote the common discount factor and $\alpha > 0$, $\beta > 0$, and $\gamma > 0$ denote constants. I assume agents incur disutility when they and their partners select actions that deviate from their ideal points, as represented in the linear⁷ loss terms $-\alpha|a_{i,t} - \chi_\tau|$ and $-\beta|a_{j,t} - \chi_\tau|$. Practically, these terms imply that domestic agents incur disutility when they and their teammates are not pursuing local power, and foreign agents incur disutility when they and their teammates are not engaging Western forces. Agents would be expected to value their own actions over the actions of their teammates, so I let $\alpha > \beta$.
6 This is a simplifying assumption, as the principal weakly prefers that agent 1 is the type of agent whose preferences are more in line with the leadership. I discuss this more in the expanded section on agents' preferences, but because foreign fighters lack a long-run stake in the insurgent group, I assume domestic fighters have preferences that are more aligned with leadership.
Also, I assume that when agents deviate from what the principal would want them to do, they incur disutility, as represented in the $-\gamma|a_{i,t} - \omega_t|$ term. Practically, when agents subvert, they are inappropriately attacking actors, fostering new hostilities, or generally undertaking actions that have negative ramifications for the group, which would have a negative impact on the misbehaving agents.⁸ I assume $\alpha > \gamma$, which implies agents are motivated to subvert. The $G_{i,t}$ function denotes the per-period utility transfer from the principal to agent $i$. I limit the analysis to contracting schedules that, for a transfer in period $t$, do not rely on events or information outside of what occurred in period $t$. The transfer function can then be defined as mapping [...]
The principal has utility function [...]
The principal most prefers that both agents set their actions $a_{i,t} = a_{j,t} = \omega_t$. Additionally, the principal may choose to pay incentive contracting and oversight costs, as represented in the $G_{1,t}$, $G_{2,t}$, $m\zeta$, and $\mathbb{1}_{o_p \in \{d, f\}}\kappa$ terms.
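For reference, the agent-side terms described in the text can be collected into a single expression. This is a reconstruction from the surrounding prose rather than the original displayed equation: the discounting exponent $\delta^{t-1}$ (with periods starting at $t = 1$) and the additive form of the transfer $G_{i,t}$ are assumptions.

```latex
U_i \;=\; \sum_{t=1}^{\infty} \delta^{\,t-1}
\Big[ -\alpha\,\lvert a_{i,t} - \chi_\tau \rvert
      \;-\; \beta\,\lvert a_{j,t} - \chi_\tau \rvert
      \;-\; \gamma\,\lvert a_{i,t} - \omega_t \rvert
      \;+\; G_{i,t} \Big],
\qquad \alpha > \beta > 0,\quad \alpha > \gamma > 0 .
```

The first two terms penalize own and teammate deviations from the agent's ideal point $\chi_\tau$, and the third penalizes the agent's own deviation from the principal's preferred action $\omega_t$. Because $\alpha > \gamma$, an agent acting alone and without transfers prefers its ideal point to $\omega_t$.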
For ease, Assumption 0 summarizes the above assumptions on preferences.
7 I assume linear utility functions to allow the principal to achieve the first-best outcome when using incentive contracts. Linear utilities here stack the deck in favor of using incentive contracts, which makes them a more competitive benchmark to creating self-managing teams.
8 It might also be expected that one agent's subverting would hurt that agent's teammate. Making this assumption would strengthen the success of heterogeneous teams resolving agency problems.
7. Utilities are realized and the game repeats starting at step 5, updating the period to t = t + 1.
I limit my analysis to subgame perfect equilibria. Even so, multiple equilibria can exist in the repeated second stage. To further limit the set of subgame perfect equilibria, I introduce three criteria for equilibrium selection. First, I will only consider equilibria supported by Nash reversion. Second, I will only consider a type of subgame perfect equilibrium that I refer to as a shading equilibrium.
Definition: A subgame perfect equilibrium is a shading equilibrium if,¹⁰ [...]
The assumption that agents only select values between their ideal points and $\omega_t$ (that is, between $\chi_d$ and $\omega_t$ or between $\chi_f$ and $\omega_t$) is in place to simplify analysis. I relax this in the Additional Questions section, and the results do not substantively change.
As the third criterion, I assume that agents select the shading equilibria where agents shade the most towards the principal's most preferred actions ω t .
Regardless of whether a domestic-domestic or domestic-foreign team is formed, when the principal is not using utility transfers, both agents setting $z_1 = z_2 = 0$ always constitutes a shading equilibrium. However, this may not be the only shading equilibrium. The third criterion resolves this, and all three criteria are summarized in Assumption 1.
Assumption 1: Agents will select the shading equilibrium that is supported by Nash reversion and that is characterized by [...]
9 For example, agents select $z_i = 0.6$ on even periods and $z_i = 0.4$ on odd periods.
10 For example, when $\omega_t \leq 0$ domestic agents select $a_{i,t} = \omega_t - 0.5(\omega_t - \chi_d)$ and when $\omega_t > 0$ domestic agents select $a_{i,}$ [...]
Being cognizant of Folk Theorem type results in repeated games, readers may be worried that Assumption 1 induces agents to select an odd equilibrium where the equilibrium's peculiarities are necessary for heterogeneous self-managing teams to function. I discuss each criterion in turn. Limiting analysis to shading equilibria imposes a simple structure on the equilibrium analysis. Limiting analysis to equilibria supported by Nash reversion eliminates potentially implausible equilibria that rely on extreme off-path punishments; this criterion also means that any selected equilibrium will be weakly better for the agents than the $z_1 = z_2 = 0$ shading equilibrium (where agents match their actions to their ideal points). And, while limiting analysis to equilibria where agents select the actions that are closest to the principal's ideal point may seem strong, this is like assuming that the principal can nudge agents deciding between multiple equilibria into the one that is good for the organization (and that is Pareto improving for the agents relative to the $z_1 = z_2 = 0$ shading equilibrium); practically, by virtue of being the leader of a large, successful militant group, leadership probably has some managerial ability to convince agents not to play destructive equilibria. Also, in the Additional Questions section, I relax Assumption 1 by considering both non-shading equilibria and the case when agents select actions that maximize the team's joint utility;¹¹ these changes do not substantively change the results.
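To make the shading levels concrete, the sketch below maps a shading level $z \in [0, 1]$ into an action. The linear form is inferred from the surrounding text, where $z = 0$ means an agent plays its ideal point, $z = 1$ means it plays $\omega_t$, and the example $a_{i,t} = \omega_t - 0.5(\omega_t - \chi_d)$ corresponds to $z = 0.5$. All numerical values are hypothetical and chosen only to satisfy $\alpha > \beta$ and $\alpha > \gamma$.

```python
def shaded_action(omega, ideal_point, z):
    """Return the action of an agent who shades at level z in [0, 1].

    z = 0 plays the agent's ideal point; z = 1 plays omega.
    The linear interpolation is an assumption inferred from the text.
    """
    if not 0.0 <= z <= 1.0:
        raise ValueError("shading level z must lie in [0, 1]")
    return omega - (1.0 - z) * (omega - ideal_point)


def per_period_utility(own_action, mate_action, omega, ideal_point,
                       alpha, beta, gamma):
    """Agent's per-period utility without transfers (three linear loss terms)."""
    return (-alpha * abs(own_action - ideal_point)
            - beta * abs(mate_action - ideal_point)
            - gamma * abs(own_action - omega))


# Hypothetical parameter values, not taken from the paper.
ALPHA, BETA, GAMMA = 1.0, 0.8, 0.5
CHI_D, CHI_F = -0.8, 0.8  # hypothetical ideal points for the two types
omega = 0.2               # one hypothetical realization of the state

for z in (0.0, 0.5, 1.0):
    a_d = shaded_action(omega, CHI_D, z)
    a_f = shaded_action(omega, CHI_F, z)
    u_d = per_period_utility(a_d, a_f, omega, CHI_D, ALPHA, BETA, GAMMA)
    print(f"z={z:.1f}  a_d={a_d:+.2f}  a_f={a_f:+.2f}  u_d={u_d:+.3f}")
```

In this example the domestic agent's own-action losses grow as it shades toward $\omega_t$, while the loss from its teammate's action shrinks as the foreign agent shades too, which is the trade-off that supports mutual shading on a heterogeneous team.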
I derive $\bar{z}_1$ and $\bar{z}_2$ in the Appendix. To provide some intuition for these values, I plot $\bar{z}_1$ and $\bar{z}_2$ relative to $\chi_d$ below. Note that the expressions $\bar{k}_f$ and $\bar{k}_d\bar{k}_f$ are both decreasing in $\chi_d$.
[Figure 1. Axis label: Principal's utility.]
As I show in the Appendix, agents are willing to shade up to levels $z_1 \leq \min\{1, z_2\bar{k}_d\}$ and $z_2 \leq \min\{1, z_1\bar{k}_f\}$. Due to the maximization condition within Assumption 1, these inequalities will hold with equality. The feature that each agent's willingness to shade is an increasing function of their teammate's shading level creates the three-part structure of the shading levels; to illustrate why, it is useful to compare the case when $\bar{k}_d < 1$ and $\bar{k}_f < 1$ to the case when $\bar{k}_f \geq 1$ (which, by Assumption 0, implies that $\bar{k}_d \geq 1$). When $\bar{k}_f \geq 1$ (the portion of Figure 1 to the left of $\bar{k}_f = 1$), each agent is willing to shade at a level weakly greater than that of their teammate, resulting in $\bar{z}_1 = \bar{z}_2 = 1$ as the selected equilibrium shading levels. In contrast, when $\bar{k}_d < 1$ and $\bar{k}_f < 1$ (which occurs for the smallest values of $\chi_d$ within the portion of Figure 1 to the right of $\bar{k}_d\bar{k}_f = 1$), each agent is only willing to shade a fraction of their teammate's selected level of shading, making $\bar{z}_1 = \bar{z}_2 = 0$ the only possible shading equilibrium. The equilibrium behavior between these parameter spaces is dictated by whether $\bar{k}_d\bar{k}_f \geq 1$ or $\bar{k}_d\bar{k}_f < 1$, which is the cut point where non-zero shading levels can (or cannot) be supported. Thus, referencing the bullet points: when $\bar{k}_f \geq 1$, agents set $\bar{z}_1 = 1$ and $\bar{z}_2 = 1$, meaning they are completely self-managing their agency problems; when $\bar{k}_d\bar{k}_f \geq 1$ and $\bar{k}_f < 1$, agents set $\bar{z}_1 = 1$ and $\bar{z}_2 = \bar{k}_f$, meaning agent 1 matches their action to the principal's most preferred action, but agent 2 only partially self-manages; [...]
• [...] $\bar{k}_d\bar{k}_f \geq 1$ (or from $\bar{k}_d\bar{k}_f \geq 1$ to $\bar{k}_d\bar{k}_f < 1$), then the principal's expected utility is strictly increasing (or strictly decreasing) in that variable.
• the expression $\bar{k}_d\bar{k}_f$ is increasing in $\chi_f$. If a change in $\chi_f$ induces a change from $\bar{k}_d\bar{k}_f < 1$ to $\bar{k}_d\bar{k}_f \geq 1$ or from $\bar{k}_d\bar{k}_f \geq 1$ to $\bar{k}_d\bar{k}_f < 1$, the effects on the principal's utility are ambiguous.
• within the region where $\bar{k}_d\bar{k}_f < 1$, the principal's expected utility is unchanging in $\alpha$, $\beta$, and $\gamma$, strictly increasing in $\chi_d$, and strictly decreasing in $\chi_f$.
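The case analysis of the selected shading levels can be written as a small function. In the sketch below, `k_d` and `k_f` stand in for $\bar{k}_d$ and $\bar{k}_f$; their closed forms are derived in the Appendix and are not reproduced here, so the inputs are hypothetical numbers. The function also assumes, as the case analysis does, that $\bar{k}_f \geq 1$ only arises together with $\bar{k}_d \geq 1$.

```python
def selected_shading(k_d, k_f):
    """Selected shading levels (z1, z2) for given values of k_d and k_f.

    Implements the three cases described in the text. k_d and k_f are
    treated as given inputs; the case k_f >= 1 assumes k_d >= 1 as well.
    """
    if k_f >= 1.0:
        return (1.0, 1.0)   # both agents fully self-manage
    if k_d * k_f >= 1.0:
        return (1.0, k_f)   # agent 1 matches omega, agent 2 shades partially
    return (0.0, 0.0)       # no shading can be supported


def binds_with_equality(z1, z2, k_d, k_f, tol=1e-12):
    """Check z1 = min(1, z2 * k_d) and z2 = min(1, z1 * k_f)."""
    return (abs(z1 - min(1.0, z2 * k_d)) <= tol
            and abs(z2 - min(1.0, z1 * k_f)) <= tol)


# Hypothetical (k_d, k_f) pairs, one per case.
for k_d, k_f in ((1.5, 1.2), (2.0, 0.6), (1.2, 0.5)):
    z1, z2 = selected_shading(k_d, k_f)
    print(k_d, k_f, (z1, z2), binds_with_equality(z1, z2, k_d, k_f))
```

In each case the returned pair satisfies both willingness-to-shade conditions with equality. Zero shading always satisfies them as well, so the function returns the highest-shading pair consistent with the maximization condition in Assumption 1.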
The most surprising result is that, when $\bar{k}_d\bar{k}_f \geq 1$, the principal's expected utility is weakly decreasing in $\chi_d$. As shown in Figure 1, when $\chi_d$ decreases (that is, when the domestic agent's ideal point is further from the set of actions that the principal wants the agent to conduct), the team will weakly shade more towards the principal's ideal actions, with weak utility gains for the principal. This result contrasts with standard ally-principle-type results and shows that the closer agent 1's ideal point is to the action the principal wants the agents to conduct, the weakly worse the principal does.
The intuition for why decreasing $\chi_d$ can be better for the principal is as follows. Consider what a decrease in $\chi_d$ does when $\bar{k}_d\bar{k}_f \geq 1$ and $\bar{k}_f < 1$. Decreasing $\chi_d$ makes the Nash reversion punishment of $\bar{z}_1 = \bar{z}_2 = 0$ worse for the foreign agent (agent 2) because $\chi_d$ becomes further from $\chi_f$. Because deviations from equilibrium behavior become worse, agent 2 is willing to remain in a broader set of non-zero shading equilibria, which is reflected in the increase in $\bar{k}_f$. Due to the maximization condition in Assumption 1, the increase in $\bar{k}_f$ is reflected in equilibrium behavior where $\bar{z}_2 = \min\{\bar{k}_f, 1\}$. Of course, decreasing $\chi_d$ also affects agent 1. As a first-order effect, decreasing $\chi_d$ makes setting $\bar{z}_1 = 1$ worse for agent 1. However, as a second-order effect, decreasing $\chi_d$ increases $\bar{z}_2$, which makes remaining on the equilibrium path better for agent 1. In aggregate, decreases in $\chi_d$ help support non-zero shading equilibria; taking first-order conditions of the $\bar{k}_d\bar{k}_f$ expression shows that a decrease in $\chi_d$ increases $\bar{k}_d\bar{k}_f$, meaning that a decrease in $\chi_d$ will never break the $\bar{k}_d\bar{k}_f \geq 1$ condition, which implies that agent 1 is willing to remain at $\bar{z}_1 = 1$. Altogether, when $\bar{k}_d\bar{k}_f \geq 1$ and $\bar{k}_f < 1$, decreasing $\chi_d$ results in agent 1 remaining at $\bar{z}_1 = 1$ and agent 2 selecting a greater level of shading, which is good for the principal. For similar reasons, when $\bar{k}_d\bar{k}_f < 1$, a decrease in $\chi_d$ may flip the $\bar{k}_d\bar{k}_f < 1$ inequality to $\bar{k}_d\bar{k}_f \geq 1$, resulting in agents changing from setting $\bar{z}_1 = \bar{z}_2 = 0$ to $\bar{z}_1 = 1$ and $\bar{z}_2 > 0$, which is good for the principal.
In contrast to the results on $\chi_d$, the closer agent 2's ideal point is to the set of actions that the principal wants the agent to conduct (smaller $\chi_f$), the better the principal does. Taken together, the comparative statics on $\chi_d$ and $\chi_f$ suggest that, conditional on $\bar{k}_d\bar{k}_f \geq 1$, heterogeneous teams are most effective for the principal the closer $|\chi_d|$ and $|\chi_f|$ are to one another; essentially, it is best when agents' ideal points are closer to symmetric.
There is empirical evidence of insurgent leadership seeking agents with symmetric and offsetting preferences. In ISIS, insurgent leadership likely had little control over the preferences of foreign fighters who traveled to fight for their cause; for heterogeneous teams to work most effectively, conditional on the extreme preferences of foreign fighters, the theory predicts that ISIS' leadership should recruit domestic agents that are fairly extreme, though in different ways from the foreign agents. ISIS accomplished this by recruiting former members of the Arab Socialist Ba'ath Party, whose members historically possessed a strong political ideology rather than religious identity (Fishman, 2016). Of course, ISIS is not the only group that started as a combination of individuals possessing distinct identities. The Red Commandos, a violent criminal organization that operates in Brazilian favelas, originated when leftist guerrillas joined forces with robbers and murderers during their shared time occupying high-security prisons in the 1970s-1980s (Grillo, 2016, 29-43).
Additionally, Observation 1 reveals that the more agents know of and care about the actions of their teammates (greater β), the Nash reversion punishment phase (agents setting a 1,t = χ d and a 2,t = χ f for all t following the defection) becomes worse, which in turn can support a greater range of productive (for the principal) shading equilibria. Consistent with theoretical expectations, militant groups do seem care about raising agents' intraorganizational awareness, which is one interpretation of β. For example, the Daesh newsletter and twitter account typically distributed information about group member's activities, ranging from providing public goods to committing beheadings. While communications that raise intra-organizational awareness may be benecial outside of selfmanaging teams for example, internal newsletters could breed useful competition among agents this model provides a new explanation for why raising the salience of others' activities within an organization can lead to greater productivity.
I include a discussion of other comparative statics in the Online Appendix.

How the Principal Behaves
As a final simplifying assumption, I assume that all agents want to be in the militant group. Verbally, I am assuming that the worst equilibrium outcome for the agents in the second stage is still better than their reservation utility.
I will relax Assumption 2 in the Additional Questions section.
Definition: In the Heterogeneous Teams with Incentive Contracts Technique, the Principal sets $m = 1$, $o_p = f$, and offers transfers, $G_{1,t}(a_1) =$

Heterogeneous Teams Technique
When the principal forms a heterogeneous team, agents will sometimes self-manage with efficiency gains for the principal. How the agents behave was characterized in the definitions of $\tilde{z}_1$ and $\tilde{z}_2$, and comparative statics were explored in Observation 1. I summarize the actions and expected utilities in Proposition 1.

Hands-Off Technique
When the principal lets the team operate independently, agent 1 will form a homogeneous team, and agents will subvert.
Observation 2: Within the Hands-Off Technique, the principal's expected utility does not change with $\alpha$, $\beta$, or $\gamma$. The principal's expected utility is increasing in $\chi_d$ and is unchanging in $\chi_f$.
The Hands-Off Technique results in some standard ally-principle type results. Within this technique, so long as $\alpha > \gamma$ (as assumed by Assumption 0), agents want to subvert. Thus, the closer $\chi_d$ is to the principal's expected most-preferred action ($E(\omega_t) = 0$), the better the principal will do.

Incentive Contracts Technique
When the principal offers agents $G_{i,t} = (\alpha - \gamma)(a_{i,t} - \chi_d)$, agent 1 will partner with a domestic agent, and agents will not subvert.
Proof: See Appendix.
Under the transfers defined above, two domestic type agents are indifferent over shading levels $z_i \in [0, 1]$. Here agent 1 does best selecting a domestic type partner, and the principal achieves the first-best contracting outcome where agents do not subvert.
Observation 3: Within the Incentive Contracts Technique, the principal's expected utility is strictly decreasing in $\alpha$, unchanging in $\beta$, and strictly increasing in $\gamma$. The principal's expected utility is strictly increasing in $\chi_d$ and is unchanging in $\chi_f$.
The Incentive Contracts Technique also results in ally-principle type results.
As $\chi_d$ increases, $\alpha$ decreases, and $\gamma$ increases, the agents' preferences are closer to those of the principal, making it less costly to buy good behavior through utility transfers.

Heterogeneous Teams with Incentive Contracts Technique
When the principal forms a heterogeneous team, oversees the agents' actions, and offers agents some optimal incentive contract, agents may shade towards the state of the world. I limit analysis to incentive contracts that adopt the common form of rewarding agents for deviating from their ideal points.
There is evidence that the Haqqani Network (from the 1980s to the present) and ISIS (circa 2016) used a version of self-managing teams. Both groups benefited from the existence of a safe haven in Pakistan or in Syria that reduced organizational oversight costs $\kappa$. And both groups took pains to integrate foreign fighters into operations, whether it was the Haqqani Network embedding small teams of foreign fighters alongside their regular forces or creating integrated ranks (Hamid and Farrall, 2015, 65-167; Brown and Rassler, 2013, 189-190) or it was ISIS making sure teams consisted of agents from multiple backgrounds (Weiss, 2015; Zelin, 2018). This is not to say that foreign and domestic agents got along; consistent with the discussion on preferences and theoretical expectations, 15 foreign and domestic fighters regularly clashed for ideological reasons (Brown and Rassler, 2013, 147-174).
However, both ISIS and the Haqqani Network were well managed despite being staffed by a diverse set of foreign and local agents (Lilleby, 2013; Gates and Podder, 2015).
14 Admittedly, treating organizing as a one-off cost is likely an underestimation; a principal may initially form a heterogeneous team, but once operating, some teammates may undermine the team structure. In practice, dictating organizational structure likely comes with more than a one-off cost, but would require less involvement than monitoring the day-to-day actions of the teams.
15 On a heterogeneous team, agents receive less utility than they would on a homogeneous team.

It is not clear if any insurgent groups actually use the Heterogeneous Teams with Incentive Contracts Technique. It is possible to imagine a young or small militant group that is large enough to merit the principal-agents treatment while still being small enough that the principal can monitor agents' activities, provide flexible transfers, know the recruits' types, and change the organizational structure at will. In the licit sector, this is more common, with young companies or sports teams combining dictated organizational structure with performance-based incentives.

Considering the Perfectly Aligned Agent
The anti-ally-principle type results thus far have considered limited changes in agents' ideal points. A natural intuition is that the principal could form a better self-managing team if one of the misaligned agents were replaced by a subordinate whose preferences are fully in line with the preferences of the principal. However, this intuition does not hold. Put another way, if the principal had a choice between an agent who valued exactly what the principal valued and a foreign fighter to act as teammate to a domestic fighter, in many cases the principal can do strictly better selecting the foreign fighter. As intuition, creating a team of a domestic agent and a perfectly aligned agent removes much of the useful strategic tension that exists between foreign and domestic teammates. Adding the perfectly aligned agent can be valuable, but its value is derived largely from ally-principle type results rather than from the strategic interactions between teammates.
I will consider a perfectly aligned agent, who has utility function and will select actions $a_{pa,t} = (1 - z_{pa})\omega_t + z_{pa}\chi_d$. When $z_{pa} = 0$, the perfectly aligned agent is selecting their most preferred action (which is also the principal's most preferred action), and when $z_{pa} > 0$, the perfectly aligned agent is selecting actions closer to their domestic partner's ideal point. In equilibrium, a perfectly aligned agent and a domestic agent will set $\check{z}_1$ and $\check{z}_{pa}$, which I introduce then describe below.
Using a Nash reversion punishment phase following deviations from equilibrium behavior, agent 1 is willing to shade up to $z_1 \leq \min\{1, z_{pa}\check{k}_d\}$, and the perfectly aligned agent is willing to shade up to $z_{pa} \leq \min\{1, z_1\check{k}_{pa}\}$. Because here each agent's level of shading is an increasing function of their teammate's level of shading, when the perfectly aligned agent selects $z_{pa} > 0$, it can induce agent 1 to select an action that is closer to the principal's ideal point to an extent that may outweigh the disutility that the principal receives from $z_{pa} > 0$. Thus, selecting $z_{pa} > 0$ can follow from the maximization criterion on
Proposition 5: Assume the principal forms a heterogeneous team with one domestic and one perfectly aligned agent.
• Agents set $a_{1,t} = \check{z}_1\omega_t + (1 - \check{z}_1)\chi_d$ and $a_{pa,t} = (1 - \check{z}_{pa})\omega_t + \check{z}_{pa}\chi_d$ for all $t$,
16 This condition is derived in the Appendix and follows from taking first-order conditions of the principal's utility function with respect to agent 1's level of shading.
To compare the foreign-domestic team to the domestic-perfectly-aligned team (comparing Proposition 5 to Proposition 1), I must consider three distinct cases. First, when $\tilde{k}_f \geq 1$ (with $\tilde{k}_f$ defined preceding Proposition 1), the domestic-foreign team is fully self-managing, with both agents setting $a_{i,t} = \omega_t$. When this occurs, the foreign-domestic team always outperforms the domestic-perfectly-aligned team, which never sets $a_{1,t} = a_{2,t} = \omega_t$. Second, when $\tilde{k}_d\tilde{k}_f < 1$, there is no productive shading occurring within the domestic-foreign team, meaning that replacing a foreign agent with a perfectly aligned agent will produce efficiency gains through ally-principle type results.
Finally, when $\tilde{k}_d\tilde{k}_f \geq 1$ and $\tilde{k}_f < 1$, sometimes the domestic-perfectly-aligned team outperforms the domestic-foreign team, while at other times it does not.

Agents Maximize Joint Utility (Modifying Assumption 1)
A natural concern with Assumption 1 is that it selects one equilibrium that is particularly good for the principal. As a simple alternative, I consider the case where agents maximize their team's per-period expected utility. This is equivalent to assuming that agents could side-contract with one another and not be concerned with hold-up problems. I compare the Incentive Contracts
• Within the Heterogeneous Teams Technique, if $1 \leq \beta/(\alpha - \gamma)$, agents select $a_{i,t} = \omega_t$ and the principal receives expected payoff $EU_p = -\kappa$; otherwise, agents select $a_{1,t} = \chi_d$ and $a_{2,t} = \chi_f$ and the principal receives
Proof: See Appendix.
For a heterogeneous team under Assumption 1, for any shading to occur, the condition $\tilde{k}_d\tilde{k}_f \geq 1$ must hold (as described in Proposition 1). For a heterogeneous team under the assumptions here, agents will match their action to the principal's ideal point when $1 \leq \frac{\beta}{\alpha-\gamma}$ holds, which, based on conditions described in Assumption 0, is both easier to satisfy than $\tilde{k}_d\tilde{k}_f \geq 1$ and generates a more favorable degree of shading for the principal. Put another way: even if agents are disregarding what the principal wants, by maximizing their joint utility they will, under a broader parameter set, do precisely what is best for the principal.
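The threshold in this comparison is easy to evaluate directly. The sketch below is illustrative only: it assumes $\alpha > \gamma$ as in Assumption 0, the function and argument names are hypothetical, and it simply returns the actions a heterogeneous team selects under joint-utility maximization (both agents match the state when $1 \leq \beta/(\alpha-\gamma)$, and both play their ideal points otherwise).

```python
def joint_utility_actions(alpha, beta, gamma, chi_d, chi_f, omega):
    """Actions (a_1, a_2) of a domestic-foreign team that maximizes joint utility.

    Hypothetical helper: encodes only the threshold rule stated in the text.
    """
    if alpha <= gamma:
        # Assumption 0 in the paper requires alpha > gamma.
        raise ValueError("expected alpha > gamma")
    if beta / (alpha - gamma) >= 1:
        # Condition holds: both agents match the state of the world.
        return (omega, omega)
    # Condition fails: each agent plays its own ideal point.
    return (chi_d, chi_f)


if __name__ == "__main__":
    # beta / (alpha - gamma) = 0.8 / 0.3 > 1, so both agents match omega.
    print(joint_utility_actions(1.0, 0.8, 0.7, -2.0, 5.0, omega=0.3))
    # beta / (alpha - gamma) = 0.2 / 0.3 < 1, so agents revert to ideal points.
    print(joint_utility_actions(1.0, 0.2, 0.7, -2.0, 5.0, omega=0.3))
```

The rule depends only on $\alpha$, $\beta$, and $\gamma$; the ideal points affect what happens when the condition fails, not whether it holds.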
In contrast, changing from Assumption 1 to the assumption that agents maximize their joint utility makes Incentive Contracts worse for the principal. Comparing Proposition 3 to Proposition 6, here the principal must pay each agent $i$ an additional $\beta(a_{j,t} - \chi_d)$ (with $j \neq i$) to get agents to match their actions to the state of the world. While a transfer of $(\alpha - \gamma)(a_{i,t} - \chi_d)$ will make agent $i$ indifferent over any action $a_{i,t} \in [\chi_d, \omega_t]$, agent $i$ can still benefit when their teammate selects action $\chi_d$ (relative to action $\omega_t$). Here the additional $\beta(a_{j,t} - \chi_d)$ transfer is necessary to make the team of agents jointly indifferent over any action $a_{i,t} \in [\chi_d, \omega_t]$.

Expanding the Agents' Action Sets (Modifying Assumption 1)
Referring back to the definition of a shading equilibrium, I previously limited $z_i \in [0, 1]$. Here I assume $z_i \geq 0$, meaning agents still follow the same shading structure as earlier, but can now select actions beyond the state of the world relative to their ideal point. I will refer to shading levels $z_i > 1$ as overshading because, ceteris paribus, it describes the case where agent $i$ shades beyond the level that the principal would most prefer that agent $i$ undertakes ($z_i = 1$). To summarize what is to come, letting $z_i \geq 0$ opens a new degree of freedom in the maximization criterion within Assumption 1, which can lead to better outcomes for the principal. In equilibrium, allowing for overshading, a team with a domestic and foreign type agent will select shading levels $\tilde{z}_1$ and $\tilde{z}_2$, which I introduce then describe below.
Definition: $\tilde{z}_1$ and $\tilde{z}_2$ are defined as
Using a Nash reversion punishment phase following deviations from equilibrium behavior, agent 1 is willing to overshade (in other words, so long as $z_1 > 1$) at levels z 1 ≤ 2γ α+γ + z 2 , and agent 2 is willing to shade (in other words, so long as $z_2 \leq 1$) at levels z 2 ≤ z 1 18 Because each agent's willingness to shade is an increasing function of their teammate's level of shading, when the domestic agent selects $z_1 > 1$, it can induce the foreign agent to select an action that is closer to the principal's ideal point to an extent that may outweigh the disutility the principal receives from $z_1 > 1$. Thus, selecting $z_1 > 1$ can follow from the maximization criterion in Assumption 1, and this occurs when
18 Solving these expressions for one another yields the $\tilde{k}_d$ and $\tilde{k}_f$ terms.
Allowing for overshading sometimes does not induce any change in behavior (the first two bullet points), while at other times it can produce efficiency gains for the principal (the remaining bullet points). For the first bullet point, overshading is not productive for the principal. When $\frac{\beta\delta\chi_f}{(\alpha-\gamma)(\chi_f+1-\delta)} \leq 1$, the expression $-|a_{1,t}(z_1)-\omega_t| - |a_{2,t}(z_2)-\omega_t|$ is not maximized through overshading, and when $\tilde{k}_f \geq 1$, overshading is unnecessary because both agents are willing to always set $a_{i,t} = \omega_t$ when placed on a heterogeneous team. For the second bullet point, overshading would be productive ($\frac{\beta\delta\chi_f}{(\alpha-\gamma)(\chi_f+1-\delta)} > 1$ and $\tilde{k}_f < 1$), but no feasible level of overshading is possible ($\tilde{k}_d \leq 1$). For the third bullet point, overshading is productive and agent 1 is willing to overshade, but agent 1 is unwilling to overshade to the degree such that agent 2 will match their action to the state of the world ($\tilde{z}_1 = \tilde{k}_d < \frac{1}{\tilde{k}_f}$, which induces $\tilde{z}_2 = \tilde{k}_f < 1$). In the fourth bullet point, overshading is productive, and agent 1 is willing to overshade to the point where agent 2 matches their actions to the state of the world ($\tilde{z}_1 = \frac{1}{\tilde{k}_f}$, which induces $\tilde{z}_2 = 1$).
In the final bullet point, overshading is productive, and the final inequality implies that $\frac{-\beta^2\delta^2\chi_f\chi_d}{(\alpha+\gamma)(1-\delta-\chi_d)(\alpha-\gamma)(\chi_f+1-\delta)} \geq 1$; when this is the case, agent 1 will always be willing to overshade to the level where $\tilde{z}_1 = \frac{1}{\tilde{k}_f}$. I define equilibrium behavior and the principal's payoffs in Proposition 7.
Proposition 7: Assume $z_i \geq 0$. Under the Heterogeneous Teams Technique,
• Agents set $a_{1,t} = \tilde{z}_1\omega_t + (1 - \tilde{z}_1)\chi_d$ and $a_{2,t} = \tilde{z}_2\omega_t + (1 - \tilde{z}_2)\chi_f$ for all $t$,
19 This condition is derived in the Appendix and follows from taking first-order conditions of the principal's utility function with respect to agent 1's level of shading.
Proof: See Appendix.
As an important follow-up to Proposition 7, the Appendix shows that increasing χ d can result in worse outcomes for the principal. Also in the Appendix, I include a discussion on shading equilibria.

Raising the Reservation Utility (Modifying Assumption 2)
In the Appendix, I also analyze a model where

Conclusion
Overall, the model presents a simple intuition for how self-managing teams can function. When left to their own devices, agents exhibit homophily and will at times subvert to the mutual benefit of their like-minded teammates.
In contrast, when the principal requires that agents with different preferences work together, agents suffer when their different-type teammates subvert. On a diverse team, agents may find a mutually beneficial point of compromise by not subverting. While organizing a diverse team is not always possible (sometimes management cannot feasibly reach out to agents to ensure diverse teams form), forming diverse teams represents a low-cost way to mitigate agency problems. And, as the analysis above shows, this result is robust to a variety of assumptions and modeling technologies.
In some cases, what encourages different types of agents to self-manage is surprising. The key result, as described in Observation 1 and explored in the Perfectly Aligned Agent example, suggests that the principal will not always seek out agents that are the most aligned with themselves; rather, in contrast to standard ally-principle type results, the principal can achieve efficiency gains by utilizing fringe agents that can offset the preferences of other agents within the organization.
The results here apply to subversion settings where constrained leadership must design effective teams from imperfect agents to operate in complex environments. The result that diversity is valuable may not apply to similar settings where leadership is less restricted. For example, many Western militaries deploy a hierarchy where leadership empowers one or several closely aligned agents to monitor proximate agents and to recommend rewards or punishments.
This model better describes cases where leaders face external, bureaucratic, or organizational constraints: factors like counterinsurgency pressure, explicit rules on how the leadership can interact with agents, or a leader overseeing a massive organization where the primary interaction for agents is between similarly empowered teammates rather than between the agent and a defined leader. While a thorough discussion of alternate cases of the model is beyond the scope of this paper, this model could also describe the behavior of various government agencies staffed by multiple types of agents (for example, the economists and lawyers in the FTC; see Wilson 1989). Alternatively, this model could describe how to create more effective teams of police forces or aid workers. Additionally, this model could describe the agency problems in large corporations when managers want to fund specific pet projects, or in multinational corporations when managers want to over-invest in capital development in regions to which they have ties.
My analysis suggests several avenues for future work. One possibility is to analyze how the principal can allocate funds to specific agents to make their actions have more impact, and what effect this has on the efficiency of self-managing teams. Another is to consider how the principal may experiment with team composition in settings where agents possess unknown utility functions.

Online Appendix
A Full Equilibrium Strategies
I describe full equilibrium behavior within all techniques and provide a more detailed discussion of the Heterogeneous Teams with Incentive Contracts Technique.

A.1 Heterogeneous Teams Technique
In the first stage, the principal sets $o_p = f$, $m = 0$, and $G_1 = G_2 = 0$. Also in this stage, both agents set $b_i = a$. In the second stage, in period $t = 1$, each agent $i$ who is type $\tau$ selects action $a_{i,t} = \tilde{z}_i\omega_t + (1 - \tilde{z}_i)\chi_\tau$, with $\tilde{z}_1$ and $\tilde{z}_2$ defined in the text. For periods $t > 1$, if in period $t - 1$ agents select the actions characterized by $\tilde{z}_1$ and $\tilde{z}_2$, then in period $t$ agent $i$ selects the action characterized by $\tilde{z}_i$. For periods $t > 1$, if in period $t - 1$ either agent deviates from selecting the actions characterized by $\tilde{z}_1$ or $\tilde{z}_2$, then agent $i$ selects the actions characterized by $z_i = 0$ in period $t$ and all future periods.
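The second-stage strategy described above is a grim-trigger rule, and its path of play can be traced with a few lines of code. This is a minimal sketch under stated assumptions: the function names are hypothetical, the equilibrium shading levels are passed in as given numbers rather than derived, and the only deviation modeled is agent 1 reverting to its ideal point (shading level 0) in a single chosen period.

```python
def action(z, omega, chi):
    """Shaded action: weight z on the state omega, 1 - z on the ideal point chi."""
    return z * omega + (1 - z) * chi


def play_path(z_eq, periods, deviation_period=None):
    """Shading levels (z_1, z_2) chosen in each period under Nash reversion.

    z_eq is the pair of equilibrium shading levels. If deviation_period is set,
    agent 1 plays z = 0 in that period (an illustrative deviation), after which
    both agents play z = 0 forever.
    """
    path = []
    punishing = False
    for t in range(1, periods + 1):
        if punishing:
            z = (0.0, 0.0)               # punishment phase: both play ideal points
        elif t == deviation_period:
            z = (0.0, z_eq[1])           # agent 1 deviates, agent 2 still cooperates
        else:
            z = z_eq                     # on the equilibrium path
        path.append(z)
        if z != z_eq:
            punishing = True             # any departure triggers permanent reversion
    return path


if __name__ == "__main__":
    for t, z in enumerate(play_path((1.0, 0.9), periods=5, deviation_period=3), 1):
        print(t, z)
```

Because the punishment state is never reset, a single deviation removes all shading from the following period onward, which is what makes the threat in the text a permanent one.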

A.2 Hands-Off Technique
In the first stage, the principal sets $o_p = u$, $m = 0$, and $G_1 = G_2 = 0$. Also in this stage, Agent 1 sets $o_a = d$, and both agents set $b_i = a$. In the second stage, both agents set $a_{i,t} = \chi_d$ for all $t$ ($z_1 = z_2 = 0$).

A.3 Incentive Contracts Technique
In the first stage, the principal sets $o_p = u$, $m = 1$, and $G_i(a_{i,t}) = (\alpha - \gamma)(a_{i,t} - \chi_d)$ for each agent $i$ for all $t$. Also in this stage, Agent 1 sets $o_a = d$, and both agents set $b_i = a$. In the second stage, both agents set $a_{i,t} = \omega_t$ for all $t$ ($z_1 = z_2 = 1$).
I will refer to $g_1$ and $g_2$ as the transfer constants. Also in this stage, both agents set $b_i = a$. In the second stage, in period $t = 1$, each agent $i$ who is type $\tau$ selects action $a_{i,t} = \hat{z}_i\omega_t + (1 - \hat{z}_i)\chi_\tau$, with $\hat{z}_1$ and $\hat{z}_2$ defined in the appendix.
For periods $t > 1$, if in period $t - 1$ agents select the actions characterized by $\hat{z}_1$ and $\hat{z}_2$, then in period $t$ agent $i$ selects the action characterized by $\hat{z}_i$. For periods $t > 1$, if in period $t - 1$ either agent deviates from selecting the actions characterized by $\hat{z}_1$ or $\hat{z}_2$, then agent $i$ selects the actions characterized by $z_i = 0$ in period $t$ and all future periods.
Given these actions, I modify expression (2) to define the set of transfer constants $(\hat{g}^*_1, \hat{g}^*_2)$ that the principal will select from:
Because the principal's optimization function is neither continuous nor optimized over a closed interval, a natural concern is that the principal's optimization problem does not attain its maximum. However, it does.
20 It is straightforward to show that the Principal would never want to make offers $g_1 \geq \alpha - \gamma$ or $g_2 \geq \alpha - \gamma$.
With Lemma 1 in place, the principal's and agents' actions can be described.
To examine which equilibria can be sustained, I consider the cases when agents shade towards a state of the world that is furthest from their ideal point. These are the cases that present the greatest incentive for agents to defect. For agent 1 this is $\omega_t = 1$ and for agent 2 it is $\omega_t = -1$. I first define several values.
Agent 1's worst one-period payoff ($\omega_t = 1$) for remaining on the equilibrium path is
Agent 1's expected per-period utility for remaining on the equilibrium path is
Agent 1's utility from an optimal deviation from $\omega_t = 1$ is
Agent 1's expected per-period utility from being in the Nash reversion punishment phase is
For agent 1 to remain on the equilibrium path, it must be that
which can be simplified to .
A similar expression can be identified for the limits of $z_2$, which comes from considering agent 2 facing $\omega_t = -1$. This is .
These expressions are used to produce $\tilde{z}_1$ and $\tilde{z}_2$ for the Heterogeneous Teams Technique, and $\hat{z}_1$ and $\hat{z}_2$ for the Heterogeneous Teams with Incentive Contracts Technique. It follows from the agents' utility functions and reservation utilities that agents will both select $b_i = a$.
There are two items to note here. First, as $g_1$ and $g_2$ approach $\alpha - \gamma$, the right-hand sides of both expressions become greater than 1, meaning that transfers close to $\alpha - \gamma$ will not induce additional shading; in Lemma 1 I show that the principal does strictly worse using transfer values close to $\alpha - \gamma$. Second, so long as $0 \leq g_i < \alpha - \gamma$, $z_1$ and $z_2$ are always positive.

C.2 Proving Proposition 2
If agent 1 selected a foreign type agent, in the repeated second stage, agents would select the strategies dened in the Heterogeneous Teams Technique.
Selecting into a heterogeneous team produces a lower expected utility for agent 1 than selecting a domestic type partner (comparing agent 1's utilities in Proposition 1 and Proposition 2).
It is straightforward to see that a team of domestic type agents that does not receive transfers does best setting $a_{1,t} = a_{2,t} = \chi_d$, and that the utilities from these actions exceed each agent's reservation utility (making $b = a$ equilibrium behavior).

C.3 Proving Proposition 3
With the offered transfer schedule $G_i(a_{i,t}) = (\alpha - \gamma)(a_{i,t} - \chi_d)$ for both agents $i$, if agent 1 selected a foreign type agent, the foreign type agent would always set $a_{i,t} = \chi_f$. This is strictly worse for agent 1 than selecting a domestic type agent 2.
When agent 1 and agent 2 are domestic type agents and are offered transfers, they are indifferent over all actions $a_{i,t} \in [\chi_d, \omega_t]$ (put another way, they are indifferent over all shading levels $z_i \in [0, 1]$), which makes any set of actions within that range an equilibrium. By the maximization criterion in Assumption 1, agents will select $z_1 = z_2 = 1$. It is straightforward to see that the utilities from setting $z_1 = z_2 = 1$ exceed each agent's reservation utility (making $b = a$ equilibrium behavior).

C.4 Proof of Lemma 1
I proceed by cases. In Cases 1 and 2, I define a closed set of (g 1 , g 2 ) and show that all transfer constants outside of the set are either infeasible or strictly worse for the principal than values inside the closed set. I can then address any discontinuities in the principal's optimization function within the domain of the defined closed set, and I can show that in all cases a maximum still exists. In Case 3, I show that when the set I defined in the first case is empty, a unique maximum exists.
Case 1: which, by the assumption of the case, is nonempty. Throughout the proof, I use values where, by construction, (g 1 , g 2 ) ∈ G. As defined, g 1 is a useful value because when the principal sets G 1,t (a 1 ) = g 1 * (a 1,t − χ d ) and G 2,t (a 2 ) = 0, then at these transfer values k d * k f ≥ 1. Thus, any payment to Agent 1 greater than g 1 is over-paying because it will not change the agents' actions. A similar logic holds for g 2 = g 2 and g 1 = 0.
To show that values of (g 1 , g 2 ) that fall outside of G are strictly worse for the principal requires a fairly tedious discussion of multiple cases. Before getting into the necessary casework, I introduce some notation. I define these transfer value pairs as (ḡ 1 , ḡ 2 ). I will abuse notation and let χ d = χ 1 and χ f = χ 2 as, within this case, agent 1 is domestic and agent 2 is foreign. Also, throughout this section, I define i, j ∈ {1, 2}, where i ≠ j. Before proceeding, one final note: I will consider cases where the principal overpays the agents. Were it not for Assumption 1 (limiting to shading equilibria), there (a) would be open set issues where agents try to select the largest or smallest action in an unbounded set, or (b) domestic agents may select actions larger than ω t and foreign agents may select actions smaller than ω t . In both cases, relaxing Assumption 2 would modify the process of the proof, but not the results.
When ḡ i ≥ α − γ and ḡ j ≥ α − γ, the principal's transfers will induce agents to set a i,t = a j,t = ω t for all t. At transfer values g i and g j , agents set a i,t = a j,t = ω t for all t (equivalent actions) at a transfer rate that, by definition, is less than that defined in (ḡ 1 , ḡ 2 ).
When ḡ i > α − γ and ḡ j ∈ [0, α − γ), the principal's transfers induce agent i to select a i,t = ω t and will eliminate agent i's ability to use the Nash reversion punishment, 21 which results in agent j setting a j,t = χ j . At transfer values g i and ḡ j , agent i will select a i,t = ω t and agent j will shade some degree 0 ≤ ẑ j ≤ 1 (weakly more favorable actions) at a transfer rate that, by definition, is less than that defined in (ḡ 1 , ḡ 2 ).
When ḡ i ∈ (g i , α − γ] and ḡ j ∈ [0, α − γ), the principal's transfers induce agent i to select a i,t = ω t while still allowing agent i the possibility of the Nash reversion punishment, which results in agent j selecting some shading level 0 ≤ ẑ j ≤ 1. At transfer values g i and ḡ j , agent i will select a i,t = ω t and agent j will shade some degree 0 ≤ ẑ j ≤ 1 (equivalent actions) at a transfer rate that, by definition, is less than that defined in (ḡ 1 , ḡ 2 ).
21 At these transfer values, it is no longer a Nash equilibrium to set a i,t = 0.
The examples above cover all possible transfer values falling outside of G.
Having shown that all points outside of G are strictly worse for the principal, the original optimization problem is equivalent to optimizing over the closed set
This function possesses one discontinuity at k d * k f = 1. At this value, agents jump from not shading to some degree of shading; because the principal provides transfers when agents shade, based on the selected g 1 and g 2 , at k d * k f = 1 the function could increase or decrease at the discontinuity.
The principal's expected utility increases when the jump from not paying transfers (because agents set ẑ 1 = 0 and ẑ 2 = 0, the principal does not pay transfers) to paying transfers is productive, and decreases when it costs more than it is worth. I denote the set G as all pairs (g 1 , g 2 ) such that k d (g 1 ) * k f (g 2 ) = 1. There are three sub-cases to consider here. First, consider if for all (g 1 , g 2 ) ∈ G EU P (g 1 = 0, g 2 = 0) ≤ EU P (g 1 = g 1 , g 2 = g 2 ).
Note that the principal's expected utility from g 1 = 0 and g 2 = 0 is the same as the principal's utility from any (g 1 , g 2 ) where g 1 ≤ g 1 and g 2 ≤ g 2 , with one inequality holding strictly.
In the first sub-case, the principal's optimization is upper semi-continuous and therefore attains its maximum over a closed set. Second, consider if some (g 1 , g 2 ) ∈ G have the property EU P (g 1 = 0, g 2 = 0) > EU P (g 1 = g 1 , g 2 = g 2 ). Here the function is not upper semi-continuous, but the principal can either (a) select the (g 1 , g 2 ) pair that does attain the maximum or (b) select the (g 1 , g 2 ) where k d (g 1 ) * k f (g 2 ) > 1 that attains the maximum. Third, consider if for all (g 1 , g 2 ) ∈ G EU P (g 1 = 0, g 2 = 0) > EU P (g 1 = g 1 , g 2 = g 2 ). Here the function is not upper semi-continuous, but the principal can either (a) select g 1 = 0 and g 2 = 0, which attains the maximum, or (b) select the (g 1 , g 2 ) where k d (g 1 ) * k f (g 2 ) > 1 that attains the maximum.
Case 2: In this case, any transfer values g 1 > 0 and g 2 > α − γ + βδχ d are counterproductive. Thus the principal is optimizing a continuous function over a closed set, implying that a maximum exists.
Case 3: When these hold, agents both setting a i,t = ω t is supported as an equilibrium without transfers. Thus, a maximum exists at g 1 = g 2 = 0.
k d * k f ≥ 1, Agents change from setting a 1,t = χ d and a 2,t = χ f to a 1,t = ω t and a 2,t = χ f − z 2 (χ f − ω t ). This shift always implies that agents are now closer to matching the principal's ideal actions. However, when χ f increases, for example, from χ f to χ f with χ f < χ f , and this results in a change from k d * k f < 1 to k d * k f ≥ 1, Agents change from setting a 1,t = χ d and a 2,t = χ f to a 1,t = ω t and a 2,t = χ f − z 2 (χ f − ω t ). This shift can lead to worse outcomes for the principal because if χ f is sufficiently large, the new action χ f − z 2 (χ f − ω t ) can be further from the principal's ideal point than χ f was.

E.1 Full Equilibrium Strategy
In period $t = 1$, the domestic agent (agent 1) selects $a_{1,t} = \check{z}_1\omega_t + (1 - \check{z}_1)\chi_d$ and the perfectly aligned agent selects $a_{pa,t} = (1 - \check{z}_{pa})\omega_t + \check{z}_{pa}\chi_d$, with $\check{z}_1$ and $\check{z}_{pa}$ defined in the text. For periods $t > 1$, if in period $t - 1$ agents select the actions characterized by $\check{z}_1$ and $\check{z}_{pa}$, then in period $t$ the domestic or perfectly aligned agent selects the action characterized by $\check{z}_1$ or $\check{z}_{pa}$ (respectively). For periods $t > 1$, if in period $t - 1$ either agent deviates from selecting the actions characterized by $\check{z}_1$ and $\check{z}_{pa}$, then the domestic or perfectly aligned agent selects the action characterized by $\check{z}_1 = 0$ or $\check{z}_{pa} = 0$ (respectively) in period $t$ and all future periods.

E.2 Proving Proposition 5
In equilibrium, agents shade by z 1 ∈ [0, 1] and z pa ∈ [0, 1], and deviations from the equilibrium path are met with the grim-trigger punishment phase of agents setting a 1,t = χ d and a pa,t = ω t for all t. The modication to Assumption 1 no longer implies that agents will select the largest degree of shading; rather they will select the degree of shading that benets the principal the most. If the perfectly aligned agent selects z pa = 0, then this will not induce any additional shading by the domestic agent. However, it can be possible for the perfectly aligned agent to move closer to agent 1's ideal point (set z pa > 0) to induce agent 1 to shade closer to the state of the world in such a way that will benet the principal.
Redening terms used earlier, Agent 1's worst 1 period payo (ω t = 1) for remaining on the equilibrium path is Agent 1's expected per-period utility for remaining on the equilibrium path is Agent 1's utility from an optimal deviation from ω t = 1 is Agent 1's expected per-period utility from being in the Nash reversion punishment phase is which can be simplied to .
A similar expression can be identied for the limits on z pa , which comes when the perfectly aligned agent faces a realization of ω t = 1. This is the worst-case for the perfectly aligned agent because the equation for shading implies that any z pa > 0 here will result in the largest move away from ω t . Disregarding the terms associated with β in the rst period, the perfectly aligned agent's worst 1 period payo (ω t = 1) for remaining on the equilibrium path is Agent 1's expected per-period utility for remaining on the equilibrium path is Agent 1's utility from an optimal deviation from ω t = 1 is U OF F,W pa = 0.
Agent 1's expected per-period utility from being in the Nash reversion punishment phase is For agent 1 to remain on the equilibrium path, it must be that

Here I provide a more detailed discussion of when the principal would prefer the foreign-domestic team over the perfectly aligned agent-domestic team. For ease, I refer to the domestic-foreign agent team as the D-F team and the domestic-perfectly aligned agent team as the D-PA team. I compare expected per-period utilities.
When $k_f \geq 1$, the D-F team sets $z_1 = z_2 = 1$, which grants the principal a greater expected utility than anything the D-PA team does.
When $k_d k_f < 1$, the D-F team sets $z_1 = z_2 = 0$, which implies that, for ally-principle-type reasons, the principal can do strictly better using the D-PA team.
For parameters where $k_d k_f \geq 1$ and $k_f < 1$, whether D-F teams or D-PA teams are better for the principal depends on which of two cases holds.
Case 1: The D-F team is better for the principal when $-(1 - k_f)\chi_f \geq \chi_d$, which can be re-written as To offer some intuition on this condition, this inequality can hold or break depending on $\chi_f$. Logically, when the foreign type agent is very extreme (possessing a large $\chi_f$), shading can still occur, but the foreign fighter's shading will not result in a selected action close to $\omega_t$. For example, when $\alpha = 1$, $\beta = 0.8$, $\gamma = 0.7$, $\chi_d = -2$, $\delta = 0.9$ and $\chi_f = 5$, the principal's per-period expected utility from the D-F team is $\approx -0.29$ (with $z_1 = 1$ and $z_2 \approx 0.94$) and the principal's per-period expected utility from the D-PA team is $-2$ (with $\check{z}_1 = \check{z}_{pa} = 0$). However, keeping all parameters but $\chi_f$ the same, when $\chi_f = 10$, the principal's per-period expected utility from the D-F team is $\approx -5.24$ (with $z_1 = 1$ and $z_2 \approx 0.48$) and the per-period expected utility from the D-PA team is still $-2$.
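The numerical example in Case 1 can be checked with a minimal sketch. It rests on assumptions that are not stated at this point in the text: the foreign agent's shading level is taken to be the bound $-z_1\beta\delta\chi_d / [(\alpha-\gamma)(\chi_f + 1 - \delta)]$ from the proof of Proposition 7, evaluated at $z_1 = 1$ and capped at 1; the principal's per-period expected utility from the D-F team is taken to be $-(1 - z_2)\chi_f$, the left-hand side of the Case 1 condition; and the D-PA utility is taken to be $\chi_d$, the right-hand side. The function names are hypothetical.

```python
def foreign_shading_bound(alpha, beta, gamma, delta, chi_d, chi_f, z1=1.0):
    """Upper bound on agent 2's shading (assumed to equal k_f when z1 = 1)."""
    return -z1 * beta * delta * chi_d / ((alpha - gamma) * (chi_f + 1 - delta))


def compare_teams(alpha, beta, gamma, delta, chi_d, chi_f):
    """Return (z2, D-F utility, D-PA utility) under the stated assumptions."""
    z2 = min(1.0, foreign_shading_bound(alpha, beta, gamma, delta, chi_d, chi_f))
    u_df = -(1 - z2) * chi_f   # assumed: left-hand side of the Case 1 condition
    u_dpa = chi_d              # assumed: D-PA team with no shading
    return z2, u_df, u_dpa


if __name__ == "__main__":
    for chi_f in (5, 10):
        z2, u_df, u_dpa = compare_teams(1.0, 0.8, 0.7, 0.9, -2.0, chi_f)
        print(f"chi_f={chi_f}: z2={z2:.4f}, D-F={u_df:.4f}, D-PA={u_dpa:.1f}")
```

Under these assumptions the sketch gives $z_2 \approx 0.94$ and a D-F utility of about $-0.29$ at $\chi_f = 5$, so the D-F team is preferred, and $z_2 \approx 0.48$ with a D-F utility of about $-5.25$ at $\chi_f = 10$, so the D-PA team is preferred. The last figure differs slightly from the reported $-5.24$, which may reflect rounding or a difference between the assumed expressions and the ones used in the text.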
Case 2: The D-F team is better for the principal when which can be re-written as Similar to the previous case, this inequality can hold or break depending on $\chi_f$.

F Agents Maximize Joint Utility

F.1 Proving Proposition 6
By matching action to the state of the world, a team of domestic agents receives joint expected utility $2(\alpha + \beta)\chi_d$. By matching action to their ideal points, a team of domestic agents receives joint expected utility $2\gamma\chi_d$. Therefore, to properly motivate agents to match actions to the state of the world, the principal must transfer $G_{i,t} = (\alpha - \gamma)(a_{i,t} - \chi_d) + \beta(a_{j,t} - \chi_d)$ to both agents, which combined is an expected per-period transfer of $2(\alpha + \beta - \gamma)\chi_d$.
By matching action to the state of the world, a team of one domestic and one foreign agent receives joint expected utility $(\alpha + \beta)(\chi_d - \chi_f)$. By matching action to their ideal points, a team of domestic agents receives joint expected Through algebra, the condition $1 \leq \beta/(\alpha - \gamma)$ must hold for a diverse team to fully self-manage.
G Expanding the Agent's Action Sets

G.1 Equilibrium Behavior
The equilibrium behavior is nearly identical to that for a heterogeneous team, only now with actions characterized by $z_1$ and $z_2$.
G.2 Proving Proposition 7
For reasons described in Proposition 1, agent 2's willingness to shade is $z_2 \leq \frac{-z_1 \beta\delta\chi_d}{(\alpha - \gamma)(\chi_f + 1 - \delta)}$. When agent 1 selects a shading level $z_1 > 1$ (overshading), removing the $\beta$ term and shading associated with it in the first period,$^{23}$ agent 1's worst one-period payoff ($\omega_t = 1$) for remaining on the equilibrium path is Agent 1's expected per-period utility for remaining on the equilibrium path is Agent 1's utility from an optimal deviation from $\omega_t = 1$ (after removing the $\beta$ term and shading associated with it) is $U^{OFF,W}_1 = -\gamma(1 - \chi_d)$.
Agent 1's expected per-period utility from being in the Nash reversion punishment phase is $U^{OFF,EU}$

23. Because this is the one-period deviation payoff, agent 1 receives the same payoff stemming from agent 2's actions whether or not agent 1 remains on the equilibrium path.

G.4 Thoughts On Overshading
Empirically, it is difficult to know what to make of overshading equilibria.
Among the three techniques examined here, the Hands-Off Technique requires the smallest level of transfers. When $z_1 = 1$ and $z_2 = 1$, Heterogeneous Teams requires a greater transfer amount than Incentive Contracts. However, when $k_d k_f \geq 1$ and $k_f < 1$, Heterogeneous Teams sometimes requires a smaller expected per-period transfer. When $k_f < 1$, agent 2 is selecting an action that is closer to agent 2's ideal point, and therefore does not need to be compensated as much to match their reservation utility.
The key take-away from Proposition 8 is that even with a high reservation utility, the principal may still use self-managing teams. While the principal always pays more in the Heterogeneous Teams Technique than in the Hands-Off Technique, the agents will behave better when the principal uses Heterogeneous Teams, which can justify the costs. While the principal sometimes pays a larger expected per-period transfer in the Heterogeneous Teams Technique than in the Incentive Contracts Technique, the principal does not need to pay $\zeta$ each period, which can make Heterogeneous Teams cheaper overall.
Ultimately, while different agents do not want to work together without being provided with greater compensation, paying out greater compensation can be worth the costs.