Get Complicated: The Effects of Complexity on Conversations over Potentially Intractable Moral Conflicts

Conflicts over important moral differences can divide communities and trap people in destructive spirals of enmity that become intractable. But these conflicts can also be managed constructively. We present two laboratory studies investigating the underlying social-psychological dynamics of more tractable versus intractable moral conflicts; the studies tested a core proposition derived from a dynamical systems theory of intractable conflict. The theory portrays more intractable conflicts as those that have lost the complexity inherent to more constructive social relations and have collapsed into overly simplified, closed patterns of thinking, feeling, and acting that resist change. Employing our Difficult Conversations Lab paradigm, in which participants engage in genuine discussions over moral differences, we found that higher levels of cognitive, emotional, and behavioral complexity were associated with more tractable conversations. Whereas in a pilot study we examined conflicts that naturally became more/less intractable, in our main experiment, high versus low levels of cognitive complexity were induced.
Although research on protracted conflict and polarization is extensive, our understanding of their causes and remedies remains fragmented and piecemeal. This article presents two empirical studies that test a basic proposition from a new dynamical systems theory of intractable conflict, which offers an integrative platform for comprehending the basic underlying psychological dynamics of such conflicts, and it presents an innovative method for investigating them in a controlled setting: the Difficult Conversations Lab.

The model portrays intractable conflicts as those that have lost the complexity inherent to more constructive social relations, evidencing more simplified, coherent, closed patterns of thinking, feeling, and acting, which therefore become resistant to change. It employs insights from dynamical social psychology (Nowak & Vallacher, 1998; Vallacher et al., 1994) to conceptualize basic patterns that emerge over time, which result in more conformity and resistance to change. In other words, the model suggests that there are qualitatively different patterns of underlying psychosocial dynamics (the cognitions, emotions, and behaviors that people experience and evidence during a conflict) that lead to more tractable versus intractable conflicts. This view provides an integrative platform for conceptualizing (prior) research on complexity and conflict in terms of one basic mechanism: a collapse of complexity.
To illustrate, most relationships in which conflicts emerge are characterized by compatibilities and incompatibilities with respect to various issues and will therefore be multidimensional and nuanced (Katz & Kahn, 1978; Kelley & Thibaut, 1978; Schelling, 1960). Think of disputes that arise between you and your family, friends, or coworkers. These tend to be more nuanced relationships with many different moments, issues, and points of contention. The complexity of such relationships offers room for tradeoffs and complementary solutions, which address the needs of all parties. Conflict resolution in these instances is similar to problem-solving where both parties are open to exchanges and attempts to find solutions that best address their respective concerns (Curhan, Elfenbein, & Xu, 2006; Fisher, Ury, & Patton, 1991; Johnson, Johnson, & Tjosvold, 2006). The pattern of cognitions, emotions, and behaviors in these situations would therefore evidence higher levels of complexity.
In contrast, intractable conflicts are those in which, paradoxically, the conflict itself becomes increasingly complex (involving new issues, circumstances, and disputants over time), but disputants' perceptions and experiences of the conflict become steadily more simplistic (us vs. them, good vs. evil), stable, and resistant to attempts at resolution. Think of the current tensions between Always-Trumpers and Never-Trumpers in the USA. Over time, both sides have become more certain, steadfast, and simplistic in their views on the issues and the people involved. This oversimplification is then further fueled when information that is new or contradictory is ignored or otherwise not processed (cf. Coleman et al., 2007; Suedfeld, 2010; Vallacher et al., 2010, 2013). Conflicts displaying these dynamics begin to evidence more simplified, coherent, and stable patterns of thinking, feeling, and acting.
In sum, the dynamical system model of intractability suggests that a basic underlying difference between conflicts with more tractable versus more intractable outcomes is that the latter collapse into over-simplified, closed patterns of experience and action, while more tractable conflicts remain more nuanced and open to new information and change. This leads to our basic proposition: Differences in the levels of complexity in the underlying dynamics of cognitions, emotions, and behaviors of disputants are associated with different conflict outcomes; more complex dynamics are associated with more tractable outcomes, while more simple and constrained dynamics are associated with more intractable outcomes.
The work on the dynamical model of intractable conflict to date has been largely theoretical, even metaphorical (cf. Coleman et al., 2007; Vallacher et al., 2010). In the current studies, we empirically test a set of parameters gleaned from various strands of research suitable to measuring the dynamics of cognitive, emotional, and behavioral complexity of disputants engaged in a potentially intractable moral conflict.

Research Parameters for Assessing Complexity in Potentially Intractable Moral Conflict
Having laid the conceptual groundwork for our main proposition, we next specified the parameters and hypotheses needed to operationalize and empirically test it. First, for our studies in which two disagreeing individuals discussed a moral topic, we operationalized tractable versus intractable conflicts in terms of the outcomes of each of the conversations. Tractable conflicts were those in which participants were able to generate a joint position statement, which evidenced a common understanding of the issue and an advanced level of reasoning, indicative of a conflict that is likely to be resolved (Golec, 2002; Rosenberg, 1988). Intractable conflicts were those in which participants were unable to generate a joint position statement or produced one that showed a poor level of reasoning, indicative of a conflict resistant to resolution. We note that this operationalization of (in)tractability is rather narrow (focusing solely on the aspect "resistance to resolution"), a compromise we made in order to conduct a laboratory study. Accordingly, we refer to (in)tractable outcomes henceforth.
To assess the complexity of psychological dynamics underlying moral conflicts, we employed concepts used previously to measure differences in the complexity of cognitive processing, emotional experiences, and behavioral interactions, three basic attitudinal components (Breckler, 1984; Hilgard, 1980) that have been found to be directly implicated in more intractable conflict dynamics (Coleman, 2003).
Second, to conceptualize and operationalize cognitive complexity, we employed integrative complexity, a concept and measure of the degree to which cognitive processing involves recognizing multiple perspectives and possibilities and integrating them into a coherent view. In other words, it is a measure of the degree to which a complex problem is first differentiated, or analyzed into its component parts, and then integrated, or synthesized into an understanding that incorporates the interrelations of its parts (cf. Baker-Brown et al., 1992; Suedfeld et al., 1992). Individuals in a state of higher integrative complexity tend to both differentiate issues (comprehend them from different points of view) and integrate this information into a coherent, global level of understanding. In contrast, lower levels of integrative complexity represent states of dichotomous, black-and-white thinking, in which contradictions and ambiguities are ignored.
Hypothesis 1. In conflicts over moral differences, higher levels of integrative complexity are associated with more tractable conflict outcomes than lower levels of integrative complexity, which are associated with more intractable outcomes.
Third, given the likelihood of negative emotions during discussions of difficult moral conflicts (Coleman, Goldman, & Kugler, 2009), a sufficiently high level of positivity has been found necessary to buffer the deleterious effects of negative emotions and to allow sufficient openness to the other party to learn and improve relations (Gottman, Murray et al., 2002). Therefore, we suggest that, especially in difficult moral conflicts, a higher ratio of positive-to-negative emotions is beneficial and likely to be associated with more tractable conflict outcomes.
Hypothesis 2. In conflicts over moral differences, higher positive-to-negative emotional ratios are associated with more tractable conflict outcomes than lower levels of these ratios, which are associated with more intractable outcomes.
Fourth, we investigated differences in the levels of behavioral complexity during moral conflicts, that is, the array of competing behaviors that the participants displayed (Lawrence, Lenk, & Quinn, 2009). Specifically, we focused on the competing behaviors of inquiry versus advocacy while engaging in conflict. Inquiry refers to the act of asking questions that explore the other's positions and needs in conflict, whereas advocacy describes the act of stating or defending one's own positions and needs (Losada, 1999; Losada & Heaphy, 2004; Senge, 1990, 1993). Whereas advocating for one's own point of view is a dominant behavior in many conflict situations, inquiry is especially crucial to learn about the other's point of view in order to incorporate different perspectives into a more complex understanding of the issue (Suedfeld, 1992), or to find common ground (Lax & Sebenius, 1986; Senge, 1993). In research on work teams, Losada found that inquiry actions acted as a counterweight to advocacy and were associated with more effective action and team performance (Losada, 1999; Losada & Heaphy, 2004). In the current research, we examined how ratios of inquiry-to-advocacy behaviors related to more and less tractable conflict outcomes.
Hypothesis 3. In conflicts over moral differences, higher levels of inquiry-to-advocacy behavioral ratios are associated with more tractable conflict outcomes than lower levels of these ratios, which are associated with more intractable outcomes.

The Difficult Conversations Lab (DCL)
In order to systematically study genuine moral conflicts in the laboratory that have a high potential to become intractable, we established the Difficult Conversations Lab (DCL) paradigm. The DCL is modeled after John Gottman's Love Lab (Gottman, 2020), which studies the dynamics of marital conflict. In the DCL, we invited two individuals to discuss and reach consensus on a potentially polarizing moral topic (e.g., abortion or euthanasia) on which the two individuals disagreed. Moral conflicts are typically based on incommensurate worldviews of complex issues and are hence hard to resolve and potentially lead to intractable outcomes (Coleman, 2003; Fiol et al., 2009; Pearce & Littlejohn, 1997). Consensus could be demonstrated by jointly generating a position statement on which both individuals agreed and with which they both felt sufficiently comfortable. A more advanced position statement (defined below) was employed as a proxy for more tractable conflict outcomes; no written statement or a poorly reasoned statement was a proxy for more intractable conflict outcomes. Given that moral differences were discussed in the laboratory, we were able to assess cognitions directly affected by the discussions as well as the moment-to-moment emotional experiences and behaviors (i.e., underlying psychological dynamics) as they developed and resulted in more tractable versus intractable outcomes. Thus, we based our analyses on coding of actual ongoing conversations and written statements (Ericsson & Simon, 1980; Gottman et al., 1999; Shiffman, Stone, & Hufford, 2008).
Volume 13, Number 3, Pages 211-230

General Procedure
First, participants received an online questionnaire assessing their opinions on a diverse set of sociopolitical topics (different topics were required to simplify the matching of individuals into oppositional dyads and to ensure that participants would come to the session not expecting to talk about a specific topic). About one week later, we invited two participants at a time to a 1.5-hr laboratory session; the two participants held opposing views about one topic, which was to be discussed during the laboratory session (they were initially unaware of each other's views on the topic). During the laboratory session, each dyad discussed a moral issue on which they disagreed for approximately 20 min and tried to reach consensus by writing a joint position statement on which they both agreed and with which they felt comfortable. They were informed that the statement was to be shared anonymously with a "Dialogue Forum of [the University]". The presence or absence of a position statement as well as the statement's level of political reasoning (Golec, 2002; Rosenberg, 1988) was used to assess whether conflicts turned out to be more or less (in)tractable. The discussion was audio-recorded and a facilitator was present, who did not intervene or speak unless necessary (it was not necessary in any of the discussions).
After the discussion, participants individually responded to a short questionnaire including an open-ended question assessing their momentary level of integrative complexity as affected by the discussion. (Note that we also included several scales for exploratory purposes addressing participants' experiences during the discussion; the items are not considered in this paper, but will be shared upon request; a sample item is: "How fair was the conversation?"). Next, participants listened to the audio recording of the discussion and coded their moment-to-moment emotional experiences during the discussion. Finally, participants were debriefed. Later on, trained coders listened to the audio recordings and coded participants' utterances with respect to inquiry and advocacy.

Measures
We assessed participants' opinions on the moral issues using scales from opinion poll research (details on the scales are provided in each of the studies' methods sections). Three additional questions addressed participants' concerns for the topics (e.g., "How concerned are you about the issue of [moral issue]?").
Conflict outcomes were assessed based on the jointly generated position statements. First, we considered whether or not participants managed to complete a joint position statement. Second, the quality of the statements they composed was assessed by coding their level of developmental advancement in political reasoning using the framework by Rosenberg (1988) and the coding manual by Golec (2002). Rosenberg defines a developmental sequence in political reasoning with 5 levels: On Level 1 (i.e., sequential level), political reasoning happens in concrete terms based on tangible aspects of observed reality without generalizing beyond the here and now and without understanding what unites or divides political groups in abstract, ideological terms. Level 3 (i.e., linear level) includes simple generalizations and abstractions as well as rules to establish categories and causal hypotheses to understand politics; evaluations happen on the basis of norms that are treated as immutable and absolute. On Level 5 (i.e., systematic level) political reality is described in abstract and complex terms: it is depicted as a result of interrelated factors and viewed in terms of both broader political principles and specific configurations having led to its specific occurrence. Levels 2 and 4 represent transition states between the three main levels.
Integrative complexity was coded based on the statements that participants wrote individually, immediately following their discussion in the dyad. The prompt was as follows: "Please take a few minutes to think about the topic that you just discussed and write down all the thoughts which seem to be relevant to you". The coders followed the manual by Baker-Brown et al. (1992; for details on the IC coding scheme, we refer the reader to the following document available on the Internet: www2.psych.ubc.ca/ psuedfeld/MANUAL), which distinguishes seven levels of integrative complexity. Level 1 indicates no differentiation and no integration (i.e., "only one way of looking at the world is considered legitimate", Baker-Brown et al., 1992, p. 408); Level 3 indicates differentiation but no integration ("recognition of alternative perspectives or different dimensions, and the acceptance of these being relevant, legitimate, justifiable or valid", Baker-Brown et al., 1992, p. 411); Level 5 indicates differentiation and integration ("alternative perspectives or dimensions are not only held in focus simultaneously but also are viewed interactively", Baker-Brown et al., 1992, p. 414); Level 7 indicates differentiation and higher order integration ("an overarching viewpoint is presented, which contains an explanation of the organizing principles [e.g., temporal, causal, theoretical] of the problem or concept", Baker-Brown et al., 1992, p. 416). Levels 2, 4, and 6 form intermediate levels.
Participants coded their own emotional experiences during the discussion by listening to the audio recording of the conversation after the session. Using the mouse paradigm (Nowak & Vallacher, 1998; Vallacher et al., 1994), they indicated their experiences of their emotions from moment to moment on a continuum from very negative to neutral to very positive. This allowed us to measure the dynamics of the participants' experiences over time. The mouse paradigm is a computer program that registers the position of the mouse on the computer screen every second. While participants saw a black screen, the arrow of the mouse, and a white circle in the middle of the screen (i.e., neutral emotions), they were instructed to move the mouse more or less to the left/right when having experienced more or less negative/positive emotions. An index for negativity/positivity was calculated as the area that arose between the neutral middle and the mouse when moved to the left/right over the course of time. For analyses, we used the emotional ratio = ln (index for positivity/index for negativity). (The logarithmic transformation resulted in a symmetrical distribution from ]−∞; ∞[ with 0 representing a 1:1 ratio.) This index accounts for both the valence of the emotions (positive − negative) and the arousal (degree of negativity and positivity; Feldman Barrett, 1998).
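The emotional ratio described above can be sketched in a few lines. This is a minimal illustration, not the authors' software: it assumes that each one-second sample is stored as a signed horizontal offset from the neutral center (negative = left, positive = right) and approximates the area with a simple rectangle rule. The function name `emotional_ratio` and the data format are hypothetical.

```python
import math

def emotional_ratio(offsets, dt=1.0):
    """Return ln(positivity index / negativity index) for one participant.

    offsets: signed distances of the mouse from the neutral center, one per
             sample (assumed format: negative = left, positive = right).
    dt:      sampling interval in seconds (the text reports one sample per second).
    """
    # Area between the neutral middle and the trace, accumulated per side
    # (rectangle rule: |offset| * dt summed over the samples).
    positivity = sum(x * dt for x in offsets if x > 0)
    negativity = sum(-x * dt for x in offsets if x < 0)
    if positivity == 0 or negativity == 0:
        # ln(0) and division by zero are undefined; treat the ratio as missing.
        return None
    return math.log(positivity / negativity)

# Hypothetical trace: 60 units of positive area, 10 units of negative area.
trace = [0, 10, 20, 20, -5, -5, 0, 10]
print(emotional_ratio(trace))  # ln(60 / 10)
```

A value of 0 corresponds to equal positive and negative areas, and the sign indicates which side dominates.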
Participants' ratios of inquiry-to-advocacy behaviors during the conversation were coded second by second after the sessions by trained coders listening to the discussions and using the mouse paradigm (see above). In contrast to the emotional coding, which was rated on a continuum, we differentiated between three categories of behaviors: inquiry utterances (left third of the screen), advocacy utterances (right third of the screen), and neither (middle third of the screen). Specifically, the coders were instructed to code "inquiry utterances" when hearing a question addressed to the other disputant with the purpose of better understanding or clarifying the other's point of view or finding out about the other's opinion. The coders were instructed to code "advocacy utterances" when hearing a sentence that illustrated, explained, or confirmed the person's own point of view or repeated their point of view. To train our two coders, we chose five conversations, which both coders individually coded. The coding of the conversations was compared, and differences were discussed until agreement over a common understanding of the coding was reached. For reasons of practicality, the remaining conversations were coded by one coder individually. If questions arose, they were discussed among coders and with the authors of the paper. For analyses, we calculated the inquiry-to-advocacy ratio = ln (% of inquiry/% of advocacy). (Again, the logarithmic transformation resulted in a symmetrical distribution from ]−∞; ∞[ with 0 representing a 1:1 ratio.)
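As a rough illustration of the behavioral index, the sketch below computes ln(% of inquiry/% of advocacy) from a second-by-second sequence of category labels. The label strings and the function name `inquiry_advocacy_ratio` are assumptions made for this example; the original coding was done with the mouse paradigm, not with this code.

```python
import math
from collections import Counter

def inquiry_advocacy_ratio(codes):
    """Return ln(% inquiry / % advocacy) for one participant.

    codes: one label per coded second, each 'inquiry', 'advocacy', or
           'neither' (hypothetical label names).
    """
    counts = Counter(codes)
    total = len(codes)
    if total == 0 or counts["inquiry"] == 0 or counts["advocacy"] == 0:
        # The ratio is undefined if a category never occurs (ln(0) or
        # division by zero), so the value is treated as missing.
        return None
    pct_inquiry = 100.0 * counts["inquiry"] / total
    pct_advocacy = 100.0 * counts["advocacy"] / total
    return math.log(pct_inquiry / pct_advocacy)

# Hypothetical coding: 20% inquiry, 40% advocacy, 40% neither.
codes = ["inquiry"] * 2 + ["advocacy"] * 4 + ["neither"] * 4
print(inquiry_advocacy_ratio(codes))  # ln(20 / 40), a negative value
```

Negative values indicate more advocacy than inquiry; a participant who never inquired yields a missing value rather than a number.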

Pilot Study
Using the DCL paradigm, we first conducted a pilot study with 59 dyads. The data from this study were also used for other analyses (see Kurt, Kugler, Coleman, & Liebovitch, 2014). For the pilot study, we assumed that over the course of 59 conversations, some would naturally lead to more tractable versus intractable outcomes. These were designated post hoc and divided into extreme groups: 12 dyads having written an advanced statement (coded 4 or 5 for political reasoning) and 11 dyads having written no statement or a poor statement (coded 1 for political reasoning). The two extreme groups were then compared with regard to differences in integrative complexity, positive-to-negative emotional ratio, and inquiry-to-advocacy behavioral ratio. Henceforth, the analyses refer to the extreme groups.
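The extreme-group rule just described can be written out explicitly. The following is an illustrative sketch only: the function name `outcome_group` and the use of `None` to represent a dyad without a joint statement are assumptions for this example.

```python
def outcome_group(reasoning_level):
    """Assign a dyad to an extreme group from its statement's coding.

    reasoning_level: None if the dyad wrote no joint statement (assumed
                     encoding), otherwise the political-reasoning level 1-5.
    """
    if reasoning_level is None or reasoning_level == 1:
        return "intractable"   # no statement or a poor statement
    if reasoning_level in (4, 5):
        return "tractable"     # advanced statement
    return None                # levels 2 and 3 fall outside both extreme groups

# Hypothetical codings for five dyads.
for level in (None, 1, 3, 4, 5):
    print(level, outcome_group(level))
```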

Method
Given that the general procedure of the DCL was described above, only study-specific information is provided here.

Sample
The pilot study was conducted at a large Northeastern University in the USA. We recruited students via postings, announcements, and mailing lists.

Measures
Participants discussed one of the following topics: death penalty, euthanasia, affirmative action, or abortion. Prior to the discussion, in the prequestionnaire, we assessed participants' attitudes on these issues using published scales. Attitudes toward the death penalty and euthanasia were assessed by items from Gallup (death penalty: Jones, 2006; euthanasia: Carroll, 2007), abortion with items from the General Social Surveys (published by Scott & Schuman, 1988), and affirmative action by various items that were compiled into a scale by Swim and Miller (1999). Extreme groups with respect to intractable versus tractable outcomes were formed based on the generation of the jointly written statements as well as the level of political reasoning of statements. Interrater reliability for coding political reasoning (Golec, 2002) was ICC = .95 (discrepancies were resolved after discussion). Dyads with no statement or a statement coded 1 were considered as ending with a more intractable outcome; dyads with a statement coded 4 or 5 were considered as ending with a more tractable outcome.
Cognitive complexity was assessed by coding individually written paragraphs generated after each session for integrative complexity (Baker-Brown et al., 1992; interrater reliability: ICC = .95; discrepancies were resolved after discussion). Emotional complexity was calculated by the emotional ratio = ln (index for positivity/index for negativity), based on participants' self-coding. Behavioral complexity was assessed by the behavioral ratio = ln (% of inquiry/% of advocacy), based on the coding of trained coders. Initial interrater agreements for inquiry and advocacy of "training conversations" ranged from κ = .69 to κ = .76; differences were discussed until a general agreement was reached and coding was continued individually.
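For readers unfamiliar with the agreement statistic, the sketch below shows how a kappa coefficient for two coders can be computed from their second-by-second labels. It assumes Cohen's kappa for two raters, which the text does not state explicitly; the function name `cohens_kappa` and the example labels are hypothetical.

```python
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders who labeled the same sequence of units."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty sequence")
    n = len(coder_a)
    # Observed agreement: share of units with identical labels.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: product of the coders' marginal label proportions.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both coders use a single label")
    return (observed - expected) / (1.0 - expected)

# Hypothetical labels for six seconds: i = inquiry, a = advocacy, n = neither.
a = ["i", "i", "a", "a", "n", "n"]
b = ["i", "i", "a", "n", "n", "n"]
print(cohens_kappa(a, b))
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it is lower than the simple proportion of matching labels.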

Data analysis
To test our hypotheses, we compared the levels of cognitive, emotional, and behavioral complexity of the dyads whose conversations resulted in more tractable outcomes with those that resulted in more intractable outcomes (i.e., the extreme groups). However, we could not simply compare the means of all individuals of each extreme group, as our data had a multi-level structure: the individuals were nested in dyads. Also, our data were measured on different levels: whereas integrative complexity, positive-to-negative emotional ratios, and inquiry-to-advocacy behavioral ratios were measured on the individual level, conflict (in)tractability was assessed on the dyadic level.
For the variables assessed on the individual level, we had to assume that individuals within one dyad were not independent of each other, which was supported by preliminary analyses showing an agreement within dyads of up to ICC(2) = .63 (LeBreton & Senter, 2008). An ICC(2) of .63 is considered a large effect (cf. LeBreton & Senter, 2008) and is close to the average agreement reported in the literature (Woehr et al., 2015), supporting the non-independence within our data. If not addressed, non-independence biases significance tests and may lead to misinterpretation of results (Kenny, Kashy, & Cook, 2006).
As suggested by Kenny et al. (2006), we addressed this issue by treating dyads (instead of individuals) as the unit of analysis. Thus, instead of directly comparing the means of the extreme groups, we first averaged the individuals' scores for each dyad (i.e., aggregating the variables on the level of the dyad), before we compared the average scores of the two groups using t-tests (with N = 23 dyads).
To aggregate our data on the level of the dyad, we used each dyad's mean (Kenny et al., 2006). In case of missing values of one of the individuals, only the value of the other individual was used (missing values for integrative complexity n = 9 individuals, due to individuals not having written a statement or statements that were "un-codable"; missing values for positive-to-negative emotional ratios n = 2 individuals, due to technical problems with the mouse paradigm; missing values for inquiry-to-advocacy behavioral ratios n = 1 individual, due to one person not having inquired and ln (0) not being defined).
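The aggregation and group comparison can be sketched as follows. This is a minimal illustration under stated assumptions: missing individual scores are represented by `None`, and the comparison uses an independent-samples t statistic with pooled variance (the text says only that t-tests were used, so the exact variant is an assumption). The function names and the data are hypothetical.

```python
import math
from statistics import mean, variance

def dyad_score(a, b):
    """Mean of both partners' scores; if one is missing (None), use the other."""
    present = [v for v in (a, b) if v is not None]
    return mean(present) if present else None

def pooled_t(group1, group2):
    """Independent-samples t statistic with pooled variance; returns (t, df)."""
    n1, n2 = len(group1), len(group2)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(group1) - mean(group2)) / se, df

# Hypothetical individual scores, one tuple per dyad.
tractable = [dyad_score(a, b) for a, b in [(3, 5), (4, None), (5, 5)]]
intractable = [dyad_score(a, b) for a, b in [(1, 3), (2, 2), (None, 3)]]
t, df = pooled_t(tractable, intractable)
print(tractable, intractable, t, df)
```

Because each dyad contributes exactly one value, the degrees of freedom reflect the number of dyads rather than the number of individuals, which is the point of treating the dyad as the unit of analysis.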
Given that we hypothesized different emotional, cognitive, and behavioral patterns of complexity between the more and less tractable conversations, we assumed that emotions, cognitions, and behaviors would be related to each other. Exploratory analyses showed that dyads' positive-to-negative emotional ratios were significantly positively correlated with their integrative complexity (r = .46, p = .028, N = 23) and their inquiry-to-advocacy behavioral ratios (r = .44, p = .035, N = 23). Dyads' levels of integrative complexity were not significantly correlated with their inquiry-to-advocacy behavioral ratios (r = .34, p = .109, N = 23), even though the effect was medium (Cohen, 1992), which might be due to the small sample size.
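Why a medium-sized correlation can miss significance with N = 23 dyads becomes clear when r is converted to a t statistic with N − 2 degrees of freedom, t = r·sqrt(N − 2)/sqrt(1 − r²). The sketch below applies this standard conversion to the three reported coefficients; the function name `t_from_r` is hypothetical, and the two-tailed critical value of about 2.080 for 21 degrees of freedom at α = .05 comes from a standard t table.

```python
import math

def t_from_r(r, n):
    """t statistic (df = n - 2) for testing a Pearson correlation against zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

T_CRIT_DF21 = 2.080  # two-tailed critical value, alpha = .05, df = 21

for r in (0.46, 0.44, 0.34):
    t = t_from_r(r, 23)
    print(f"r = {r:.2f}: t(21) = {t:.2f}, exceeds critical value: {t > T_CRIT_DF21}")
```

With this sample size, a correlation needs to be somewhat above .41 to reach the conventional threshold, so r = .34 falls short despite being a medium effect.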
In sum, individuals ending their discussions with tractable outcomes showed more complexity in their thinking, feeling and behaving than individuals ending their discussions with intractable outcomes. However, given that we compared post hoc extreme groups, causal conclusions could not be drawn.

Main Experiment
In our main experiment, we attempted to manipulate different levels of participants' complexity at the onset of the conflict discussion in order to influence the subsequent dynamics and outcomes. More specifically, individuals read a text prior to the discussion in the DCL, which contained basically the same information, but was written according to the standards of high versus low integrative complexity (see below). We hypothesized that individuals in the high-complexity versus low-complexity condition (both individuals of a dyad were assigned to the same condition) would show higher versus lower levels of integrative complexity (manipulation check), positive-to-negative emotional ratios, inquiry-to-advocacy behavioral ratios during the discussion, and finally, more tractable outcomes.

Method
In addition to the general procedure of the DCL described above, participants' levels of integrative complexity were manipulated prior to the discussion; also, we assessed levels of integrative complexity in the prequestionnaire, which allowed us to explore the change in integrative complexity caused by the manipulation and discussion.

Manipulation of integrative complexity
To induce high and low levels of integrative complexity, participants individually read a text about the topic of the discussion immediately prior to the discussion. While the amount and basic content of the information were the same in both conditions, the manner in which it was presented was fundamentally different: It was written according to the standards of either high or low integrative complexity (Baker-Brown et al., 1992). More specifically, the texts were structured as follows: Participants were first told that they would be given background information on the topic of [moral issue] as "it is important to consider different perspectives" (high-complexity condition) or "it is important to have a clear perspective" (low-complexity condition). Second, the legal framework regarding [moral topic] was presented (the text was the same for both conditions). Third, participants read several points on which opinions on the topic diverge: In the high-complexity condition, opposing opinions were differentiated and integrated; in the low-complexity condition, opposing opinions were contrasted (depending on the participant's view, pro versus con, their own opinion was featured and the opposing view was framed as a counterargument). Thus, we prepared three different texts for each topic: one text for individuals in the high-complexity condition, one text for individuals in the low-complexity condition holding a pro-opinion, and one text for individuals in the low-complexity condition holding a con-opinion. (Note that each dyad consisted of two individuals with opposing views on a topic.)
Example sentences from the high-complexity condition for the topic euthanasia are (note that the study was conducted in Germany): "In the Article 1 of the German Basic Law is written: 'Human dignity shall be inviolable. To respect and protect it shall be the duty of all state authority.'... Is dignity violated if a person suffers a fatal illness and is in great pain? Is dignity violated, if ultimately other people decide about one's own life or death?... Regarding the decision to live or to die, an individual's reasons underlying the potential will to die should be understood and viewed in the light of preserving life." Example sentences from the low-complexity manipulation (pro-euthanasia) are: "In the Article 1 of the German Basic Law is written: 'Human dignity shall be inviolable. To respect and protect it shall be the duty of all state authority.'... Dignity is violated if a person suffers a fatal illness and is in great pain. This is the case even if ultimately other people decide about one's own life or death. Regarding the decision to live or to die, the individual's will to die should be respected beyond preserving life." (Text translated by the authors.)
The effects of the manipulation were piloted with 61 individuals: After having read the manipulation texts, individuals showed significantly different levels of integrative complexity in a written statement, t(59) = −2.04, p = .046, d = 0.52.

Sample
We recruited 88 participants (44 dyads) at a large university in Germany; each participant was paid 20 euros (one of the original 45 dyads had to be excluded due to technical problems with the recording). We recruited students via postings, announcements, and mailing lists. We assigned 22 dyads to each condition. The sample was 47.7% female, had a mean age of M = 24.00 years (SD = 4.51), and was predominantly (90.9%) German.

Measures
Participants discussed one of the topics: euthanasia, abortion, or punishment of sexual offenders; participants' opinions were assessed with the same questions used in the pilot study (punishment of sexual offenders was a new topic for which items were generated by the authors).
The degree of tractability of outcomes was assessed by whether or not dyads had reached sufficient consensus to write a joint statement. If a statement was written, its level of political reasoning was coded by trained coders (Golec, 2002; interrater reliability: ICC = .96, discrepancies were resolved after discussion).
Cognitive complexity was assessed twice on the basis of individually written statements: prior to the discussion in the prequestionnaire and directly after the discussion. The statements were coded for integrative complexity (Baker-Brown et al., 1992; interrater reliability: ICC = .93, discrepancies were resolved after discussion). Emotional complexity was assessed by the emotional ratio = ln(index for positivity / index for negativity), based on participants' self-coding using the mouse paradigm after the discussion. Behavioral complexity was assessed by the behavioral ratio = ln(% of inquiry / % of advocacy), based on the coding of trained coders after the discussion. Initial interrater agreements for inquiry and advocacy of "training" conversations ranged from κ = .60 to κ = .96; differences were discussed until a general agreement was reached, and coding was then continued individually.
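As an illustration of the two log-ratio measures described above, the following minimal Python sketch computes ln(numerator / denominator) for one participant. The function name `log_ratio` and the input values are hypothetical and are not taken from the study; how zero-valued indices were handled is not stated in the text, so the sketch simply rejects them.

```python
import math


def log_ratio(numerator: float, denominator: float) -> float:
    """Natural log of a ratio: 0 means balance, > 0 means the numerator dominates."""
    # Assumption: non-positive indices are rejected; the paper does not describe this case.
    if numerator <= 0 or denominator <= 0:
        raise ValueError("both indices must be positive for the log ratio to be defined")
    return math.log(numerator / denominator)


# Hypothetical values for a single participant (not data from the study).
emotional_ratio = log_ratio(6.0, 3.0)     # positivity index vs. negativity index
behavioral_ratio = log_ratio(20.0, 60.0)  # % inquiry vs. % advocacy

print(round(emotional_ratio, 3), round(behavioral_ratio, 3))  # prints: 0.693 -1.099
```

Taking the logarithm makes the measure symmetric around zero: twice as much positivity as negativity and twice as much negativity as positivity yield values of equal size and opposite sign.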

Data analysis
To test our hypotheses, we compared the means of the two conditions with respect to differences in the discussants' experiences and actions during the conversation (i.e., cognitive, emotional, and behavioral complexity) as well as the outcomes of the discussion. However, a "simple" comparison of these variables was not possible, given the multi-level structure of our data: Individuals were nested in dyads. Also, our variables were assessed on different levels: Whereas integrative complexity, positive-to-negative emotional ratio, and inquiry-to-advocacy behavioral ratio were measured on the individual level, conflict (in)tractability was assessed on the dyadic level. As described in the Pilot Study, we had to assume nonindependence of our data, which was supported by preliminary analyses showing agreement within dyads up to ICC(2) = .59 (LeBreton & Senter, 2008). For the same reasons that were outlined in the Pilot Study (see Data analysis), we followed the suggestions by Kenny et al. (2006) and treated dyads as the unit of analysis. Thus, we first aggregated all variables that were assessed on the individual level to the level of the dyad by using the means. If the value of one of the individuals was missing, only the value of the other individual was used (missing values for integrative complexity prior to the discussion: n = 9 individuals, due to individuals not having written a statement or to statements that were "un-codable"; missing values for integrative complexity after the discussion: n = 1 individual, because no statement was written).
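The aggregation rule described above (dyad mean, with the partner's value standing in when one value is missing) can be sketched as follows. This is only an illustration in Python; the authors report conducting their calculations in R, and the names `dyad_score` and `dyads` as well as all values are hypothetical.

```python
from statistics import mean
from typing import Optional


def dyad_score(a: Optional[float], b: Optional[float]) -> Optional[float]:
    """Mean of the two partners' scores; falls back to the single available score."""
    available = [value for value in (a, b) if value is not None]
    # Assumption: a dyad with no available score stays missing (None).
    return mean(available) if available else None


# Hypothetical integrative-complexity codes; None marks a missing or un-codable statement.
dyads = {
    "dyad_01": (2.0, 3.0),
    "dyad_02": (None, 4.0),
    "dyad_03": (None, None),
}

dyad_level = {name: dyad_score(*scores) for name, scores in dyads.items()}
print(dyad_level)  # prints: {'dyad_01': 2.5, 'dyad_02': 4.0, 'dyad_03': None}
```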
Entering all variables on the dyadic level, we compared the two conditions using t-tests and performed a mediation analysis based on regressions. The calculations were conducted in R using the packages base and stats (R Core Team, 2018), car (Fox & Weisberg, 2011), psych (Revelle, 2018), and compute.es (Del Re, 2013). Mediation analyses were conducted using PROCESS (Hayes, 2012) for assessing mediations with multiple mediators.

Results and Discussion
To test our hypotheses, we compared the dyads in the high-complexity and low-complexity conditions regarding their conflict dynamics and outcomes. The results of all analyses are provided in Table 1. Our manipulation of individuals' levels of integrative complexity prior to the discussion was successful in that dyads in the high-complexity condition showed significantly higher levels of integrative complexity after the discussion than dyads in the low-complexity condition. Whereas individuals in the high-complexity condition displayed increases in their levels of integrative complexity from before to after the discussion, individuals in the low-complexity condition showed decreases. The change in integrative complexity differed significantly between the two conditions. The cognitive manipulation also had effects on participants' emotions and behaviors: Individuals in the high-complexity condition showed significantly higher positive-to-negative emotional ratios and inquiry-to-advocacy behavioral ratios. Finally, the manipulation of high versus low complexity was also found to affect the conflict outcomes. Whereas all dyads in the high-complexity condition wrote a joint position statement, only 45.5% of the dyads in the low-complexity condition were able to generate a joint statement. The quality of the statements themselves also differed significantly: Statements in the high-complexity condition were rated as more advanced in terms of political reasoning than statements in the low-complexity condition.
To test our hypotheses in an overall analysis, we conducted a mediation analysis with three simultaneous mediators. The manipulation of low versus high levels of integrative complexity prior to the discussion (i.e., independent variable) positively influenced the level of complexity in participants' experiences and behaviors during the discussion (i.e., three simultaneous mediators: integrative complexity, positive-to-negative emotional ratios, and inquiry-to-advocacy behavioral ratios). The level of complexity in participants' experiences and behaviors in turn positively influenced the tractability of the discussion outcomes (i.e., the advancement in political reasoning, dependent variable). Using PROCESS (Hayes, 2012) for assessing mediations with multiple mediators, we found a significant indirect effect: The 95% bias-corrected bootstrap CI with 5,000 replications did not include zero [0.01, 2.86].
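To make the logic of a bootstrapped indirect effect concrete, the sketch below estimates a × b for a simplified model and resamples dyads with replacement. It is not the authors' analysis: it assumes a single mediator instead of three, uses a plain percentile interval rather than the bias-corrected interval reported above, runs in Python rather than PROCESS, and operates on synthetic data generated in the script. All function and variable names are hypothetical.

```python
import random
from statistics import mean


def slope(x, y):
    """OLS slope of y on x (simple regression): the a path, condition -> mediator."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx


def partial_slope(x, m, y):
    """OLS coefficient of m in y ~ x + m: the b path, mediator -> outcome given condition."""
    mx, mm, my = mean(x), mean(m), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    smm = sum((mi - mm) ** 2 for mi in m)
    sxm = sum((xi - mx) * (mi - mm) for xi, mi in zip(x, m))
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    smy = sum((mi - mm) * (yi - my) for mi, yi in zip(m, y))
    return (smy * sxx - sxy * sxm) / (sxx * smm - sxm ** 2)


def indirect_effect(x, m, y):
    """Product-of-coefficients estimate a * b."""
    return slope(x, m) * partial_slope(x, m, y)


def bootstrap_ci(x, m, y, n_boot=2000, seed=0):
    """95% percentile bootstrap CI for a * b, resampling whole dyads."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    while len(estimates) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        try:
            estimates.append(indirect_effect(
                [x[i] for i in idx], [m[i] for i in idx], [y[i] for i in idx]))
        except ZeroDivisionError:
            continue  # degenerate resample (e.g., only one condition drawn); redraw
    estimates.sort()
    return estimates[int(0.025 * n_boot)], estimates[int(0.975 * n_boot) - 1]


# Synthetic dyad-level data (NOT the study's data): 22 dyads per condition.
data_rng = random.Random(1)
x = [0] * 22 + [1] * 22                                     # 0 = low, 1 = high complexity
m = [0.8 * xi + data_rng.gauss(0, 1) for xi in x]           # mediator
y = [0.7 * mi + 0.2 * xi + data_rng.gauss(0, 1) for mi, xi in zip(m, x)]  # outcome

point = indirect_effect(x, m, y)
lo, hi = bootstrap_ci(x, m, y)
print(f"indirect effect = {point:.2f}, 95% percentile CI = [{lo:.2f}, {hi:.2f}]")
```

An interval that excludes zero is read as evidence for an indirect effect, which is the criterion applied to the reported interval.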
In additional analyses, we explored the relationships between cognitive, emotional, and behavioral complexity. We found significant positive correlations of dyads' levels of integrative complexity with their positive-to-negative emotional ratio (r = .32, p = .034, N = 44) and with their inquiry-to-advocacy behavioral ratio (r = .41, p = .006, N = 44). Dyads' positive-to-negative emotional ratio and inquiry-to-advocacy behavioral ratio were only marginally positively correlated (r = .29, p = .058, N = 44); even though the effect size indicated a medium relationship between the variables (Cohen, 1992), its non-significance may have been due to the relatively small sample size.
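The link between the reported correlations and their significance can be checked with the standard conversion t = r·sqrt(N − 2) / sqrt(1 − r²) with N − 2 degrees of freedom. The Python sketch below applies it to the three coefficients above; the function name `t_from_r` and the labels are hypothetical, and the critical value is an approximate figure taken from a standard t table rather than from the article.

```python
import math


def t_from_r(r: float, n: int) -> float:
    """t statistic for testing a Pearson correlation against zero (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


# Assumption: approximate two-tailed .05 critical value for df = 42 from a t table.
T_CRIT_DF42 = 2.018

correlations = [
    ("integrative complexity x emotional ratio", 0.32),
    ("integrative complexity x behavioral ratio", 0.41),
    ("emotional ratio x behavioral ratio", 0.29),
]

results = {}
for label, r in correlations:
    t = t_from_r(r, 44)
    results[label] = (round(t, 2), abs(t) > T_CRIT_DF42)
    print(f"{label}: t(42) = {t:.2f}, exceeds critical value: {abs(t) > T_CRIT_DF42}")
```

With N = 44, the three coefficients correspond to t(42) values of about 2.19, 2.91, and 1.96. Only the last falls just short of the critical value, which is consistent with the reported p = .058 and with the point that a medium-sized correlation can miss significance in a sample of this size.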
In sum, our results supported our hypotheses. They showed not only that more/less complex patterns of cognitions, emotions, and behaviors were associated with more tractable/intractable conflict outcomes, but also that the level of complexity could be influenced to support more tractable outcomes from discussions over moral conflicts. The complexity dynamics were significantly influenced by basic differences in a written text (something people read about sociopolitical topics all the time, for example, in the media), which offers a starting point for fostering more constructive discussions in everyday life.

Summary and Contributions
This paper attempted to empirically address the question, under what conditions do conflicts over important moral differences go well or go poorly and result in intractable stalemates? It presented two studies from our Difficult Conversations Lab (DCL), which investigated the proposition that differences in the underlying cognitive, emotional, and behavioral complexity of the individuals involved would help account for differences in intractability. Both studies-a pilot study and a main experiment-provided support for our basic assumption: Conversations between individuals with more complex experiences and behaviors (i.e., higher levels of integrative complexity, positivity-to-negativity emotional ratios, and inquiry-to-advocacy behavioral ratios) resulted in more tractable outcomes than those with less complexity. The pilot study found these differences when comparing extreme groups with naturally occurring tractable versus intractable outcomes, while the main experiment found support by experimentally inducing different levels of integrative complexity. In the latter, higher levels of integrative complexity not only affected the outcome of the conflict, but also the complexity of the emotional experiences and behaviors of the participants during the conflict.
The current studies contribute to theory and research in two main ways. First, they offer a new methodological paradigm, the DCL paradigm, to study moral conflict in the laboratory, which allows for the investigation of conflicts over genuine moral differences. Most contemporary research on social conflict involves case studies of past events, large surveys of people's attitudes and perceptions of current events, or laboratory studies that use games or role plays to simulate conflict (Deutsch & Goldman, 2006). Here we have taken a different approach, as we were interested in studying the moment-to-moment experiences and actions of people engaged in genuine moral conflicts (Gottman has used a similar method to study marital conflict; Gottman et al., 1999). Measuring the dynamics of conflicts over time is a challenge, but the results of our studies suggest that it is promising to explore more sophisticated ways to investigate differences and changes in temporal dynamics. Second, the studies focused on differences in the complexity of underlying patterns of cognitions, emotions, and behaviors associated with more and less intractable conflicts, rather than singling out any one of these aspects. This idea has been proposed by the dynamical systems model of intractable conflict (Coleman et al., 2007; Vallacher et al., 2010, 2013), which suggests that the patterns of parties' perceptions, experiences, and behaviors leading to intractability differ from those leading to tractability with respect to their complexity. The current studies offer a test of this proposition. Typically, research on intractable conflicts focuses on investigating specific aspects of their issues, individuals, relationships, or contexts that drive more recalcitrant outcomes (Coleman, 2003; Coleman et al., 2009; Kriesberg, 2005).
The current studies contribute to theory and research by moving beyond the mere testing of discrete components of intractability and by providing a more basic and parsimonious understanding of its underlying dynamics (Coleman et al., 2007; Vallacher et al., 2010).

Limitations and Future Research
The studies had several limitations. First, we studied 20-minute conflict discussions between two strangers. Even though we found evidence of different underlying dynamics between conflicts with more tractable versus intractable outcomes, the conflicts were brief, not involving ongoing relationships. The DCL could be used in future studies for capturing longer discussions or even conflicts between people and groups in ongoing relationships (Gottman, Murray et al., 2002).
Second, we manipulated differences in levels of integrative complexity by providing information about the topic of discussion, written according to the standards of high versus low levels of integrative complexity. Even though we believed the manipulation induced a modicum of experimental realism (Berkowitz & Donnerstein, 1982), it is possible that the effects were also partly due to demand characteristics, with the manipulation shaping expectations for the conversations. Different types of manipulations should be tested in future studies. For example, different levels of integrative complexity could be induced by presenting texts about topics unrelated to the focus of discussion, by conflict-facilitation techniques such as constructive controversy (Johnson et al., 2006), or through interactive discussion structures (Brodbeck et al., 2020). Furthermore, differences in the complexity of participants' emotions or behaviors could be manipulated in addition to their cognitions, which could have implications for the duration and sustainability of the induced effects.
Third, in these studies, we coded cognitions on the basis of a paragraph written by participants after the discussion, while emotions were coded by asking participants to recall their moment-to-moment experiences during the discussion while listening to a recording of the discussion. Both the written paragraph and the recall of emotions could potentially have been influenced by the actual outcome of the discussion. Future research should seek to refine our coding methods by, for example, having cognitions and emotions coded by external coders over time, similar to our coding of inquiry and advocacy. For example, emotions could be coded with the SPAFF coding system (Coan & Gottman, 2007) or with software solutions for facial recognition (e.g., https://azure.microsoft.com/en-us/services/cognitive-services/face). Also, researchers have recently attempted to code integrative complexity from ongoing conversations (Park & DeShon, 2018) or through software solutions (Conway, Conway, Gornick, & Houck, 2014). If all variables were coded by external coders or computer software, the problem of combining coding from different sources (i.e., external coders vs. participants) that we faced in our research could be avoided. Finally, even though our measures of cognitions, emotions, and behaviors were a reflection of the ongoing dynamics of the conflict or otherwise influenced by the dynamics, we chose not to apply dynamical and non-linear methods of analysis at this stage. Rather, we tried to capture the dynamics while still applying more standard methods of analysis widely used in psychological research. Future research could focus on tracking and analyzing non-linear dynamics over time, in order to investigate the trajectories of interactions between cognitions, emotions, and behaviors within and between individuals.
For example, such analyses could focus on differences in the initial starting conditions of conflicts, which have been found to be crucial for subsequent social interactions (Liebovitch et al., 2008; Vallacher et al., 2013).

Implications for Practice
One practical implication of our research is to highlight the importance of providing disputants in conflict with information that is not black-and-white but reveals different points of view on complex issues and describes them in relation to each other. Such approaches are already in use by many dialogue facilitation groups (e.g., by the National Issues Forum, see www.nifi.org). Furthermore, the current findings could encourage media, political decision makers, conflict interveners, and others to carefully consider the level of complexity at which difficult topics are communicated. The results also suggest that conflict interventions increasing disputants' levels of complexity (integrative complexity, complexity in emotional experience, or complexity in behavior) could be beneficial for improving conflict processes and outcomes. Thus, interventions could be designed and validated that specifically target individuals' or groups' levels of cognitive, emotional, and/or behavioral complexity (cf. Brodbeck et al., 2020).

Conclusion
The research presented in this paper represents a first foray into what we hope will be a long-term program of research. Our goal is not to oversimplify such conflicts, but to better understand their essence in the context of their complexity.