Social Structure in the Explanation and Prediction of Social Discontinuities

Author(s): Goldstone, Jack A. | Abstract: Response to Lempert: A Response to Multipath Forecasting

needed to recruit officials and maintain resources; elites needed to maintain their positions and the assets and incomes that supported those positions; and ordinary people needed to find places in work, land, churches, and communities that provided them with reasonable returns for their labor and their acceptance of their status. This meant that social reproduction over time could never be taken for granted; institutions for taxation, social mobility, and the production and distribution of resources across the population always had to adapt to changes in the size, structure (age structure, urban/rural mix, ethnic mix), and beliefs of the society. Failure of those institutions to adapt over time, or radical changes that went against established habits and beliefs, could destabilize any society.
The combination of fairly rigid institutions and sustained, cumulative demographic change would therefore be likely to produce national rebellions and revolutions. This explanation of revolutions and of their distribution across time and space in Eurasia from 1500 to 1850, offered in my book Revolution and Rebellion in the Early Modern World (2016 [1991]), was thus rooted in a holistic theory of social reproduction, stability, and instability, with revolutions the result of one particular dynamic in social systems.
I find it fascinating to learn that Lempert had developed a similar approach, which he published in an analysis of stability and resilience in Mauritius (Lempert 1987). As Lempert writes, we evidently came to a similar theory, working wholly independently, at about the same time. Lempert seems concerned about priority of theory development, noting a Yale prize he received in 1980 for an early version of his work. I can observe that I first developed the idea of a global demographic theory of social order and revolution in my Harvard dissertation proposal of 1979, which was rejected as too ambitious and unprovable (as I related in my article in this journal [Goldstone 2017]). I then scaled down my plan, aiming to demonstrate the viability of the structural-demographic approach simply for explaining the English Revolution of 1640-1660. The dissertation, with the full mathematical model, was accepted at Harvard in 1981. But I didn't dare publish it until I had first cleared a path with a series of articles debunking existing theories of revolutions and their application in this case, and demonstrating one core element of the theory, regarding price movements, and another on the demographic dynamics of early modern England (Goldstone 1980, 1982, 1983, 1984, and 1986b). Only in 1985 did I feel ready both to publish the mathematical model of state breakdown in seventeenth-century England (Goldstone 1985a, 1986a) and to relate in narrative form for historians the comparative application of the theory to the Ottoman Empire and China (Goldstone 1985b). And only after another half-decade of research was I ready to publish the full comparative account in Revolution and Rebellion. It seems that throughout this period, Lempert was developing and refining his model as well.
It may be that, as with Darwin and Wallace and many other sets, an underlying truth was there waiting to be discovered, so that multiple discovery is, as Robert Merton (1961) argued, fairly common. I would guess that my colleagues in the field of cliodynamics only hope that someday the structural-demographic theory will be considered of such significance that historians of science will want to look into the circumstances of its discovery and development! Of greater import is that when I decided to focus on the English Revolution, I sought as a dissertation advisor the great sociologist and social anthropologist George Homans, who, among his other polymath endowments, was an expert on the development of medieval and early modern England (Homans 1941). Homans introduced me to the work of Malinowski, Harris, Service, Levi-Strauss, and others who evidently also shaped Lempert's views; thus, we have common roots at the base of our common approach.
So, I have some confidence that I understand Lempert's concerns and that when I say I believe we can overcome them, it is not merely hope or illusion.
Let me take on three issues: (1) the lumping of variables out of context; (2) the bridging of different levels of analysis (macro, meso, and micro); and (3) the use of variables that seem "fuzzy," based on trust, feelings, narratives, emotions, etc.

How to Compose a Model
Lempert rightly warns against using "statistical models (time series or regression analyses) that rip independent variables out of context in ways that undermine the idea of integrated modeling." As it happens, I played a major role in a modeling effort that encountered just that risk. The Political Instability Task Force (PITF) was gathered by the US government to develop predictive models of political instability and crises, including revolutions, civil wars, genocides, and democratic collapse (Esty et al. 1999). It included more than a dozen scholars, among them social scientists, natural scientists, and statistical experts, who sought a method to identify a particular set of needles in a large haystack: that is, to identify the several dozen country-years that were most likely to have been followed by the outbreak of such political crises, out of the roughly five thousand country-years we observed from 1955 to 2000.
Developing a model to identify such rare events proved challenging. There were two main approaches advocated in the Task Force. One was the method that causes Lempert anxiety: gather a vast amount of data, in the form of long lists of independent variables, and use sophisticated statistical methods to find correlations with changes in the dependent variable. We used neural networks (a simple form of AI), different kinds of econometric models, analysis of variance, and stepwise regressions, and came up with almost nothing. That is, no matter how many variables we 'threw' at the problem, after several years of gathering data and testing models we were barely able to accurately identify more than half of the critical instability onset country-years two years in advance. The coup de grâce for this method came when our sponsors invited an outside data-mining firm to comb through our data looking for correlations we had missed. The firm promised to be wholly atheoretical and thus not be misled by preconceived notions from possibly outdated social science theory. They even labelled all our independent variables, which we had sorted into 'political', 'economic', 'environmental', and 'demographic-social' categories, with blind labels so as not to be misled by content. We thus waited with some anxiety when the firm came to report its results and announced that in mining the blind-labelled data they had found a major correlation that we had missed.
To the embarrassment of the firm, when the blind label was stripped away and the actual variable of interest was identified, it turned out not to be one of our independent variables of interest at all. Rather, it was the alphabetical order of the 3-letter abbreviation that was given to each country as an identifier for the model. Other things being equal, countries with names starting earlier in the alphabet, such as Canada, Denmark, France, and Germany, had lower rates of association with political crises than countries with names starting later in the alphabet, such as Uganda, Venezuela, Zambia, and Zimbabwe. This is indeed the kind of association in the data we never would have caught. But it was also wholly irrelevant to prediction or policy, as no one thought that Zimbabwe would acquire greater stability and resilience if it simply switched its name to "Africanistan."
The other approach, advocated by several members of the Task Force, was to start with a theory of social change and instability, focus on variables that we believed were important based on that theory, and then look closely at the behavior of models employing various combinations and interactions among those variables. I personally had hoped that demographic variables would be important, given my work on population and revolutions. Yet while high rates of infant mortality, which we felt indicated low quality of governance, were important, other demographic variables, such as age structure and urbanization, did not prove robustly significant, at least in the time period for which we developed our analysis.
As it happens, the demographic-structural model is more useful for identifying growing risks over a long period than for identifying the precise year in which a political crisis will occur. As I will note below, this is much like the geophysics of earthquakes, where measuring stress along fault lines can tell you where risks of an earthquake are growing, but cannot provide precise predictions of where a quake will occur a year or more in advance. Moreover, from 1955 to 2005, when most countries around the world had growing populations, there was not great variance in some of the demographic variables. As it turns out, for the period after 2005, when countries around the world were much more varied in their progress on the demographic transition, demographic variables are more important in forecasting political crises (Bowlsby et al. 2019).
In the period examined by the PITF, the breakthrough came when we considered two variables that we believed were an important part of the problem of vulnerability to crises: factionalism among political elites and regime type. We long knew that factionalism was associated with increased risk of conflict, and that anocracies (regimes that were intermediate or transitional between full democracy and full dictatorship) were more likely to see political violence (Fearon and Laitin 2003). What we discovered only upon close inspection of various models was that the interaction between factionalism and anocracy was an enormously powerful predictor of coming crises. This in fact made sense: in a society already factionalized by ethnic or economic or religious issues, a full dictatorship could keep conflicts bottled up if it controlled effective coercive and economic levers. A full democracy could manage to reconcile such conflicts through compromise, legislation, and judicial rulings. But a partial or transitional democracy created great potential for crisis, by encouraging the open expression of such factional conflicts without having trusted and established institutions to manage them.
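To make the idea of an interaction concrete, the following minimal sketch (not the PITF model) compares odds ratios across the four combinations of factionalism and anocracy. The onset probabilities are invented purely for illustration, and the names ONSET_RATE, odds_ratio, and interaction_ratio are hypothetical. An interaction ratio above 1 means that the two conditions together raise the odds of crisis by more than the product of their separate effects; its logarithm corresponds to the interaction coefficient in a logistic model that contains both conditions and their product.

```python
import math

# Hypothetical onset probabilities per country-year, keyed by
# (factionalism present, anocracy present). These numbers are invented
# for illustration and are NOT estimates from the PITF data.
ONSET_RATE = {
    (False, False): 0.01,  # neither condition
    (True, False): 0.02,   # factionalism only
    (False, True): 0.03,   # anocracy only
    (True, True): 0.20,    # factionalized anocracy
}


def odds(p):
    """Convert a probability into odds."""
    return p / (1.0 - p)


def odds_ratio(cell, baseline=(False, False)):
    """Odds of onset in `cell` relative to the baseline cell."""
    return odds(ONSET_RATE[cell]) / odds(ONSET_RATE[baseline])


def interaction_ratio():
    """Joint odds ratio divided by the product of the separate odds ratios.

    A value of 1 means the two conditions combine multiplicatively with no
    interaction; a value above 1 means they reinforce each other.
    """
    joint = odds_ratio((True, True))
    separate = odds_ratio((True, False)) * odds_ratio((False, True))
    return joint / separate


if __name__ == "__main__":
    print("odds ratio, factionalism only:", round(odds_ratio((True, False)), 2))
    print("odds ratio, anocracy only:", round(odds_ratio((False, True)), 2))
    print("odds ratio, both:", round(odds_ratio((True, True)), 2))
    print("interaction ratio:", round(interaction_ratio(), 2))
    print("log interaction (logit scale):", round(math.log(interaction_ratio()), 2))
```

With these made-up rates, neither condition alone does much, while the combination dominates. That is the pattern the paragraph above describes, although the magnitudes here carry no empirical weight.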
When we developed a new holistic variable for regime type, drawing on several elements of the Polity data on regime characteristics (Goldstone et al. 2010, Figure 1), and combined it with a few other logical independent variables (infant mortality, discrimination, and conflicts in neighboring states), the new model was highly successful in identifying conflict onsets. To be sure, we also had to switch to a different form of modeling borrowed from epidemiology, based on repeated sampling of the non-conflict cases and matched comparisons with the conflict cases (King and Zeng 2001). Nonetheless, the key to making progress was to draw on theory and develop a holistic view of the underlying events. Only then could we develop a model that made sense of the data.
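The sampling design mentioned above can be sketched as follows. This is only an illustration of the general case-control idea, under the assumption that controls are matched to cases on calendar year alone; the actual PITF matching criteria and the statistical corrections discussed by King and Zeng are not reproduced here. The toy records and the function names sample_matched_controls and repeated_samples are hypothetical.

```python
import random
from collections import defaultdict

# Toy country-year records: ten hypothetical countries over three years,
# with two invented crisis onsets. Not real data.
TOY_RECORDS = [
    {"country": f"C{i}", "year": year, "onset": (i, year) in {(2, 2001), (7, 2002)}}
    for i in range(10)
    for year in (2000, 2001, 2002)
]


def sample_matched_controls(records, controls_per_case=3, seed=0):
    """Pair every onset case with randomly drawn non-onset records.

    Controls are drawn only from the same year as the case (the matching
    rule assumed for this sketch). Returns a list of (case, controls) pairs.
    """
    rng = random.Random(seed)
    controls_by_year = defaultdict(list)
    for rec in records:
        if not rec["onset"]:
            controls_by_year[rec["year"]].append(rec)

    matched = []
    for case in records:
        if not case["onset"]:
            continue
        pool = controls_by_year[case["year"]]
        k = min(controls_per_case, len(pool))  # never ask for more than exist
        matched.append((case, rng.sample(pool, k)))
    return matched


def repeated_samples(records, draws=5, controls_per_case=3):
    """Repeat the control draw with different seeds, so that a model can be
    re-estimated on each draw and the stability of its results checked."""
    return [
        sample_matched_controls(records, controls_per_case, seed=s)
        for s in range(draws)
    ]


if __name__ == "__main__":
    for case, controls in sample_matched_controls(TOY_RECORDS):
        names = [c["country"] for c in controls]
        print(case["country"], case["year"], "matched with", names)
```

Because every onset is retained while only a handful of non-onsets are sampled, the rare events are no longer swamped by thousands of quiet country-years. Repeating the draw gives a check on whether results depend on which controls happened to be selected.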

Bridging Levels of Analysis: Networks and Systems
To be sure, while the PITF model was more successful in its goals than competitors, it remained flawed. First, as noted above, the particular model we developed based on data in the half century from 1955-2004 did not fit as well in changed global circumstances, doing much more poorly in identifying crises in 2005-2015 (Bowlsby et al. 2019). In the later period, crises were rarer, and there were fewer violent civil wars and revolutions and more non-violent movements for political change (Chenoweth and Stephan 2012).
In addition, the model did not differentiate as well within the category of authoritarian regimes. That is, we showed that factionalized partial democracies were by far the most likely regime types to suffer crises. But after 2005, more crises came from the failure of various kinds of authoritarian regimes and, thus, this particular category needed more attention. This is part of the reason several scholars of the cliodynamics school have launched this new effort to better understand crisis onsets.
We hope to make progress by going beyond the macro-level, country-year data that was used in the PITF modeling effort. We are still interested in theory-driven, holistic approaches to identify, as Lempert notes, interacting sets of variables. But we wish to broaden the range of variables we are examining. To be sure, outstanding work has already been done to develop sub-national conflict data, based on geographic grid coordinates, that let us get away from treating countries as undifferentiated wholes (Rustad et al. 2011). It is also the case that even national-level variables are insufficient; revolutions are often even more dependent on international contexts and how they interact with national conditions than on national or subnational events (Lawson 2015). But we are also interested in other dimensions. For example, conflict onset usually is rooted in a combination of structural variables that create vulnerability plus trigger events that change perceptions of risk or shift alliances, precipitating a latent conflict (Goldstone 2014). Some of those trigger events may be identifiable as macro-level events, such as a national election or a succession crisis. But some may simply be individual-level events, such as the self-immolation of a fruit vendor, that when inserted into social networks in a nation with an at-risk regime, turn out to precipitate a crisis.
Lempert is correct when he cautions that "The suggestion of the use of 'microdynamics' and 'collective macro-level events'" combines "two different levels of analysis (group behaviors, that historically have been explained by models in anthropology, the holistic social science, and individual behaviors, that have historically been explained by psychology)", and that "As far as I know, no … model exists that would link the two levels." Yet I believe we can fruitfully combine these levels in a model of social dynamics. First, we are not trying to explain both individual-level behaviors and macro-level behaviors. Rather, we are trying to explain collective macro-level events (i.e. major political crises) by using data drawing on both individual-level and structural variables, as well as some meso-level variables that will link them together.
It is precisely to do this that we are looking to methods such as network analysis and agent-based models. Network structure is precisely a way of aggregating individual-level relationships to identify meso-level (e.g. regional or small group) and national-level social structures. In the case of Tunisia, mentioned above, the self-immolation of a fruit vendor triggered a major crisis. Why did it do so? First, because the structural conditions in Tunisia (an old and corrupt regime, high levels of police harassment, high levels of education and civil society organization, and high levels of youth unemployment) were conducive to non-violent mobilization for change (Goldstone 2011). But second, and most important, the individual who immolated himself was connected through strong clan networks in southern Tunisia that spread word of his fate and continued his protest, which was then in turn amplified by social media that linked this rural event to the civil society organizations in the cities.
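As a rough illustration of how network structure can carry an individual-level event up to a wider population, the sketch below traces which actors a piece of news can reach through a set of ties, with and without a single bridging tie between a rural cluster and urban organizations. The node names and ties are entirely hypothetical and are not a reconstruction of the Tunisian networks; the functions build_graph and reach are likewise illustrative.

```python
from collections import deque

# Hypothetical undirected ties. The bridging tie stands in for a social
# media link between a rural clan network and urban organizations.
BRIDGE = ("clan_c", "online_hub")
TIES = [
    ("vendor", "clan_a"),
    ("clan_a", "clan_b"),
    ("clan_b", "clan_c"),
    BRIDGE,
    ("online_hub", "union_city"),
    ("union_city", "lawyers_city"),
]


def build_graph(ties):
    """Aggregate individual ties into an adjacency map (node -> neighbors)."""
    graph = {}
    for a, b in ties:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def reach(graph, source):
    """Return every node that news starting at `source` can reach,
    using a breadth-first search over the ties."""
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


if __name__ == "__main__":
    with_bridge = reach(build_graph(TIES), "vendor")
    without_bridge = reach(
        build_graph([t for t in TIES if t != BRIDGE]), "vendor"
    )
    print("reached with bridging tie:", sorted(with_bridge))
    print("reached without bridging tie:", sorted(without_bridge))
```

In this toy network the same individual-level event stays local or reaches the urban organizations depending on one meso-level feature of the network. Agent-based models add behavioral rules on top of this kind of structure.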
Network structures can be a powerful tool for understanding dynamics at different levels of social organization. Here, it is interesting that Lempert and I again seem to have hit upon a relationship of some importance, independently. In a recent publication, Lempert (2016) develops a theory of the evolution of political regimes, using relational variables. His main argument is that democratic institutions cannot simply be put in place in any society, but must develop from within the proper social context. That is a context of dispersed and equal social relationships, as occurs at the early stage of social evolution in small-scale low-technology societies, but which then fades away as societies become more hierarchical and oligarchic as they become denser and richer. As societies continue to develop through large-scale industrialization, they may then become even more hierarchical (totalitarian), or, in favorable circumstances of dispersed hierarchy and greater individual autonomy, return to more democratic governance.
I have been working for nearly a decade on a very similar argument (Goldstone and Kocornik-Mina 2009). Studying the trajectory of countries from dictatorship to democracy, I noted that efforts to install democratic institutions in most countries fail, and that such countries oscillate ('bounce' or 'cycle') from dictatorship to democracy and back. I argued that countries with strongly hierarchical social structures were unable to achieve stable democracy; only where social networks were more horizontal and widely dispersed was stable democracy able to prevail. The way I put this now in conversation is that trying to install democratic institutions in predominantly hierarchical societies is like pinning wings on a caterpillar: it does not make it a butterfly! Instead, the caterpillar has to go through its own pupal stage and the internal transformation of its structure first; similarly, societies have to transform their basic network structure from patronage-dominated hierarchies to wider and more horizontal networks that provide more support and autonomy to individuals before democratic institutions can operate effectively and become stable.
Network analysis thus offers a way to bridge between local organization and macro-structural outcomes. Moreover, agent-based modeling provides a way to explore the dynamics within networks under different background structural conditions and different constraints or incentives on micro-behavior. While it is too early in our research to show how all these elements might work together, we do have examples of using event-level data on elite interactions, coded as 'cooperative' or 'conflictual', to predict the success or failure of democratic institutions in transitioning countries (Dewal, Goldstone, and Volpe 2013). We also are starting to acquire extensive data on social networks from social media that allow us to identify structures of relations through which individual-level messages and events can spread. Such data is being used, along with structural variables, to explain the size of social protests (Steinert-Threlkeld, Won, and Joo 2019).

What Are the "Right" Variables?
While we now have increasingly helpful tools to explore the relationships among variables of different kinds and different levels of analysis, and how their interactions can drive social dynamics, Lempert still is right to raise the question: do we have the right variables? Or is there a danger of being caught up in what he calls "subjective variables" involving "metaphors", "cultural narratives", "ethos", "affect", "emotional tone", "optimism", and "semi-emotional clusters"?
It is reasonable that an anthropologist would be apprehensive that biologists and other social scientists might be willing to use culture, one of the core concepts of the discipline, in a careless way, developing variables that reduce 'culture' to one or more poorly defined and badly measured factors or 'variables' in a model. Yet the desire to use variables measuring cultural contexts is in this case driven by the reverse concern: that models excluding such variables for being "too subjective" have been misleading and damaging to political science and economics.
It has been well known since the Nobel-winning work of Kahneman and Tversky (1979) that economists erred in constructing decision models that did not explicitly model the subjective variables of perceived risk and prospective value. Political scientists, driven by the behavioral revolution, similarly erred when they dismissed notions of "trust" and "legitimacy" as too subjective to measure (Fukuyama 1996; Levi 1997). It is thus a reaction against the prior neglect of such factors that leads us to include them in our range of explanatory variables. It is true that measures of these factors that are not reproducible and objectively verifiable have no scientific value. But that does not mean these factors cannot be measured. The works cited in this paragraph and many others have shown how psychological tests, surveys, and proxy measures can provide robust measures of interpersonal trust and the legitimacy enjoyed by authorities. Such measures are a valuable, even critical part of the context in which structural factors, network ties, and individual or group-level events interact to produce complex social outcomes.

Explanation and Prediction of Discontinuities: Epistemological and Moral Considerations
Of course, even having the right variables and a variety of useful methodologies at hand does not mean that the right model for prediction will be found, or is even attainable. Political crises are discontinuities in complex systems, each of which is slightly different and for which controlled experiments are neither possible nor morally permissible. Prediction that meets the standards of controlled laboratory sciences therefore may not be possible.
We may be dealing with phenomena more like tropical storms (in which broad seasonal and long-term cyclical patterns are readily found, but these do not allow for advance predictions of precisely when and where such storms will strike in the coming year), or earthquakes (in which fault lines can be identified as the likely sites of stress, and pressure buildup and knowledge of cyclic patterns can forecast the rough frequency and location of quakes, but individual quakes still cannot be predicted much in advance, as the 'trigger' that allows plates to slip cannot be readily forecast).
In such cases, explanation does not lead directly to prediction. It may be quite possible to use models to explain, after the fact, why a revolution or democratic collapse occurred, tracing the context and pattern and interaction of actors and events. And it may be possible, using that knowledge, to identify regions and timeframes of increasing risks. It may even be possible, by knowing what context and interactions make crises more or less likely, to suggest changes in policy to shift the context to lower the likelihood that a crisis will occur.
Does that create the risk that evil powers will use such knowledge to manipulate events in order to create such crises? I think not. For one thing, it is already well known by enemies of democracy that to weaken democracies, one should sow confusion and untruths, enhance factionalism and mistrust, and discredit the competence of leaders and elites. Hitler in Weimar Germany, Russia's troll factories, and demagogues throughout history had an instinctive mastery of these principles. I doubt that anything we learn would make their jobs easier (they are easy enough already). Rather, I hope that by making explicit the context and interactions that favor stability, rather than decay and crises, we can help democracies better protect themselves against enemies seeking to undermine them.
But what about the United States or other great powers seeking to meddle in the affairs of other nations to change their regimes? If one takes Iraq and Afghanistan as examples, it is clear that if the US resolves to use its military to change regimes in other countries, it is likely to be able to do so. But it is equally clear that US policy makers have had little idea of how (and whether it is possible) to create contexts following these invasions that would yield stable, resilient, and legitimate governments. Instead, ongoing and chronic crises have followed. If there is an ethical imperative here, it would be that states precipitating regime changes should have a responsibility to help create stable, safe, and legitimate governments thereafter, and better understanding the basis for such governments would be of great value.
Finally, what about the scenario where the US is accused of fomenting "color revolutions" and thus sowing disorder across the globe? Would a better understanding of the causes of political crises increase the US appetite, and propensity, for such meddling? In fact, one of the lessons of the study of revolutions is that the ability of external states to influence the stability of other states, short of military intervention, is rather small. Where color revolutions have overthrown leaders, whether Marcos in the Philippines, Mubarak in Egypt, or Yanukovych in Ukraine, it was the actions of those leaders and their interaction with their own elites and peoples that caused their downfall; the US was generally caught by surprise and only became engaged once events had already spiraled out of control (Nepstad 2011). In the one case where the US has been steadily seeking the overturning of a government from abroad, the Islamic Republic of Iran, the naïve belief that because the Islamic regime is repressive it must be unpopular and therefore will fall has distorted and undermined US policy. I believe a better understanding of the causes of state crises would have been helpful, not harmful, for framing more rational and ethical actions by US policymakers. In this case, that means awareness that a regime that maintains a united elite and strong nationalist credentials is likely to remain in power regardless of economic conditions (a finding that applies to Cuba's regime in the face of sanctions as well).
To be sure, one can never know for certain how new knowledge will be utilized. But it is my belief, based on my several decades of research on political instability and revolutions, that better understanding the contexts and causal interactions that produce such events, provided the research is done in a transparent and scientifically testable manner, based on an analysis of historical data and cases, has acceptable ethical risks.