Complex systems for the most vulnerable

In a rapidly changing world, facing an increasing number of socioeconomic, health and environmental crises, complexity science can help us to assess and quantify vulnerabilities, and to monitor and achieve the UN sustainable development goals. In this perspective, we provide three exemplary use cases where complexity science has shown its potential: poverty and socioeconomic inequalities, collective action for representative democracy, and computational epidemic modeling. We then review the challenges and limitations related to data, methods, capacity building, and, as a result, research operationalization. We finally conclude with some suggestions for future directions, urging the complex systems community to engage in applied and methodological research addressing the needs of the most vulnerable.


Introduction
Achieving the 17 goals defined by the UN 2030 Agenda for sustainable development poses a significant global challenge for humanity in the 21st century. Implicit in the definitions of the sustainable development goals (SDGs) is the overarching goal of protecting the most vulnerable, those who are most at risk of suffering the consequences of persistent economic, environmental and social crises and inequalities. As already noted by several authors, SDG outcomes are highly interdependent due to the complex couplings that characterize modern socio-technical systems and the environment [1]. Some authors have even shown that, due to this interconnectedness, policies aimed at achieving one SDG at a time can undermine progress toward the others [2,3]. This interconnectedness, and the importance of integrated approaches that address the SDGs jointly, have been repeatedly recognized by the UN and other international humanitarian and development bodies and organizations [4,5]. Interventions to achieve the SDGs can therefore greatly benefit from a design that takes into account such interconnectedness and the trade-offs and feedback between policies and outcomes.
In this perspective we argue that the inherent interdisciplinarity of complex systems science, with its holistic perspective and explanatory nature, can be key in responding to and getting ahead of the many complex challenges and crises behind the SDGs. In generic terms, complexity science deals with systems composed of many parts whose interactions lead to the emergence of novel system properties and behaviors that are hard to predict from an understanding of the parts in isolation [6]. Socioeconomic inequalities, financial crises, migrations, social unrest and conflict, vulnerability to hazards, climate change, misinformation dynamics and humanitarian emergencies are challenges (or phenomena, in complex systems lingo) with many drivers and non-linear relations in the interactions of different parts of the system. A systemic approach that allows us to navigate the web of causes and relations, and to deeply understand and explain these challenges as complex phenomena, can be an invaluable asset to eradicate and prevent them at their root causes [7]. We argue that the complex systems approach to socio-technical systems is probably one of the most powerful scientific tools we have to shed light on the apparently ungovernable nature of major societal and environmental problems, due to, e.g., feedback mechanisms, non-linearity, network dependencies, self-organization and emergent behaviors.
In recent years, the unprecedented amounts of information coming from devices, the internet and satellite imagery (the so-called big data revolution) and the increase in computational power have unlocked the potential of complexity science toward explaining techno-social systems [8]. Computational models informed by novel data streams (e.g., mobile phone data, social media data, etc.) have allowed targeting of interventions and evaluation of impact in fields like marketing, advertising and political campaigns [9][10][11]. Large-scale data have also unlocked unprecedented levels of accuracy in machine learning and AI solutions, but in many cases at the cost of losing explainability and interpretability. Despite the enormous potential of black box approaches, there is growing concern about the suitability of non-interpretable models for high-stakes decision making [12]. Scientific understanding of the basic mechanisms underlying observed systems' properties, a concept that stands at the core of complexity science, has the potential for better explainability, accountability and communication of decisions and policies [13]. However, it requires the development of theoretical frameworks that can adequately describe the interactions among the basic units of the system under study, that is, deeper domain knowledge than its AI counterparts require.
Mechanistic approaches of this type have been successfully developed in recent years to address several societal challenges. Some relevant examples are: the identification of global systemic risks through financial networks [14], the universal description of urban growth and development across nations [15,16], the prediction of collective responses during emergencies [17], the dynamics of labor markets and their resilience to automation [18], and the understanding of gender inequalities in different contexts, from academic career trajectories [19,20] to human mobility aspects [21]. Other studies have investigated the dynamical evolution of ecosystems [22,23] and the impact of globalization on their resilience [24], as well as the quantification of the environmental impact of car sharing [25] or traffic mitigation [26]. Here, we aim to provide a perspective on the value of complex systems as a tool for addressing some pressing issues for sustainable development, a non-comprehensive list of exemplary use cases of growing interest and, most importantly, an outline of the main challenges ahead to realize their potential, together with the opportunities arising from a closer interaction between the complex systems research community and the humanitarian and development sectors.

Poverty, inequalities and segregation
From eradicating poverty to achieving gender equality, the sustainable development goals aim at ending some of the most pernicious and persistent inequalities of our world. An interest in identifying and understanding the emergence of inequalities, from the 80/20 Pareto principle to models of segregation [27], has always been at the core of complex systems science. At the beginning of this century, however, as new digital datasets made it possible to observe massive networks of complex topology, the topic regained empirical interest, although with a stronger focus on the most connected: models of preferential attachment that replicate 'the rich get richer' dynamics [28], renewed attention to centrality metrics on networks [29], and efforts to predict influential users [30] or to identify rich-club phenomena [31]. Attention to the less connected and less favored took longer to emerge and still lags behind.
Nevertheless, some researchers took a first look at the mechanisms that put minority groups at a disadvantage (such as incorporating homophily into a preferential attachment model) [32], and at the social dynamics of lower-income regions, from rural-urban migration trends [33] to socioeconomic stratification [34] and segregation [35], and immigrant community integration in cities [36], while questioning the representativeness of these data sets in developing countries [37].
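To make the mechanism mentioned above more concrete, the following is a minimal sketch of preferential attachment combined with homophily. It is an illustration under stated assumptions and not the model of [32]: the function names, the single-link growth rule and the parameters `minority_fraction` and `homophily` are all hypothetical choices made for this example.

```python
import random

def grow_network(n_nodes, minority_fraction, homophily, seed=0):
    """Grow a network one node at a time and return (groups, degrees).

    Assumed rule: each new node creates one link to an existing node,
    chosen with probability proportional to degree * affinity, where
    affinity is `homophily` for a same-group target and 1 - homophily
    for a target of the other group. Group 1 is the minority.
    """
    rng = random.Random(seed)
    groups = [0, 1]      # start with one linked node of each group
    degrees = [1, 1]
    for _ in range(n_nodes - 2):
        group = 1 if rng.random() < minority_fraction else 0
        weights = [
            deg * (homophily if groups[j] == group else 1.0 - homophily)
            for j, deg in enumerate(degrees)
        ]
        target = rng.choices(range(len(degrees)), weights=weights, k=1)[0]
        degrees[target] += 1
        groups.append(group)
        degrees.append(1)
    return groups, degrees

def mean_degree_by_group(groups, degrees):
    """Average degree of the majority (0) and minority (1) groups."""
    result = {}
    for g in (0, 1):
        members = [d for grp, d in zip(groups, degrees) if grp == g]
        result[g] = sum(members) / len(members)
    return result

groups, degrees = grow_network(n_nodes=2000, minority_fraction=0.2, homophily=0.8)
print(mean_degree_by_group(groups, degrees))
```

Comparing the two averages for different values of `homophily` gives a rough feel for how group size and mixing preferences can interact with 'rich get richer' growth; any quantitative conclusion would require the actual model and data of the cited work.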
At the same time, at the intersection with machine learning [38,39] but with a stronger focus on explainability, complex systems researchers working on computational social science have triggered interesting efforts to estimate different types of poverty [40], such as economic poverty [41], food poverty [42], unemployment [43], or education [44], as well as to define novel indicators for others such as gender inequalities [21,45] or segregation [46][47][48].
Especially with the rise of black-box, AI-mediated solutions, the explanatory power of complex systems theory gains differential value for policy making, as it allows for evidence-based and interpretable frameworks to look at root causes. Two clear examples of the power of such an approach are the understanding and quantification of economic potential through an economic complexity framework [49] and recent efforts on the future of jobs and the polarization of workplace skills [50].

Collective action for representative democracy
During the last couple of decades, collective action, defined as the activity undertaken by citizens with the aim of contributing to public goods, has taken a new turn thanks to the advent of social media, with increased involvement of the young and of members of ethnic minorities [51,52]. Social media provide an accessible and decentralized way for people to produce and share content, as smartphones allow people to participate from anywhere and at any time. This huge increase in the volume and diversity of participation increases the complexity of the phenomenon, which complex systems and network science are well suited to investigate. Moreover, the use of digital platforms makes available novel and rich data compared with offline interactions, allowing researchers to better understand the complexity of social dynamics and the interplay between individual action and collective behavior [53]. At the same time, it should be noted that social media, while boosting citizen participation toward the public good, also enable misinformation spread, online trolling and hate speech, and other unintended consequences [54][55][56], whose investigation is beyond the scope of this paper.
Complexity and network scientists have been developing theoretical models of social influence and contagion for several decades, even before the advent of these platforms and data. These models rely on the assumption that individuals adopt a new idea, opinion or behavior, or spread a piece of information, as a function of their social connections ('neighbors' in the network science terminology) who have already adopted it, or, in the case of a piece of information, who already know about it. Three classes of models can be identified. The first is the Ising paradigm adopted by statistical physicists to model opinion dynamics [57]. The second class is that of threshold models, which are based on the mechanism that a person adopts a new idea or behavior whenever the number or proportion of their neighbors who have already adopted it exceeds a given threshold [58,59]. Finally, epidemic-like models are based on the idea that 'infected' individuals spread an idea, opinion, behavior or information to each susceptible neighbor with a given probability [60]. Beyond the investigation of the mechanisms of social contagion, a widely studied aspect of the problem is the identification of the most influential individuals in the networks, that is, those that play the biggest role in the contagion process [61,62]. It has been shown that the most efficient spreaders are not necessarily those with the highest number of connections or with the highest centrality in the network, but rather those located within the core of the network as identified by the k-shell decomposition analysis [63]. The advent of internet platforms and the consequent availability of new data have recently allowed researchers to test these theories on empirical data of online interactions [64][65][66][67][68][69].
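As a minimal illustration of two of the ingredients described above, the sketch below implements a fractional threshold model and a k-shell decomposition on a small toy network. The graph `adj`, the function names and the convention that a node adopts when the adopted fraction of its neighbors reaches or exceeds the threshold are assumptions made for this example; they are not taken from the cited works.

```python
def k_shell(adj):
    """Return {node: shell index} by repeatedly pruning low-degree nodes."""
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    shell = {}
    k = 0
    while remaining:
        peel = [v for v in remaining if degree[v] <= k]
        if not peel:
            k += 1           # nothing left to prune at this level
            continue
        for v in peel:
            shell[v] = k
            remaining.discard(v)
            for u in adj[v]:
                if u in remaining:
                    degree[u] -= 1
    return shell

def threshold_cascade(adj, seeds, threshold):
    """Return the final set of adopters starting from `seeds`.

    Assumed rule: a node adopts once the fraction of its neighbors
    that have adopted reaches or exceeds `threshold`.
    """
    adopted = set(seeds)
    changed = True
    while changed:
        changed = False
        for v in adj:
            if v in adopted or not adj[v]:
                continue
            fraction = sum(u in adopted for u in adj[v]) / len(adj[v])
            if fraction >= threshold:
                adopted.add(v)
                changed = True
    return adopted

# Toy undirected network: a dense core (a, b, c, d) plus a short tail (e, f).
adj = {
    "a": {"b", "c", "d", "e"},
    "b": {"a", "c", "d", "e"},
    "c": {"a", "b", "d"},
    "d": {"a", "b", "c"},
    "e": {"a", "b", "f"},
    "f": {"e"},
}

print(k_shell(adj))
print(threshold_cascade(adj, seeds={"f"}, threshold=0.5))
print(threshold_cascade(adj, seeds={"a", "b"}, threshold=0.5))
```

In this toy case, seeding the peripheral node leaves the cascade stuck, while seeding two core nodes reaches the whole network, which is consistent in spirit with the role of core nodes discussed above.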
Regarding social movements specifically, the dynamics of the 'Indignados' movement, which protested in 2011 against the Spanish political response to the financial crisis, were investigated through its Twitter activity, finding empirical confirmation of a positive association between the core centrality of the seeders and the size of the information cascades [70]. On the other hand, an analysis of the protests across 16 countries in the Middle East and North Africa during the Arab Spring showed that the periphery of the network can also generate collective action against authoritarian regimes [71]. The digital evolution of the Occupy Wall Street movement was also widely investigated [72], highlighting a high degree of information localization, key in resource mobilization [73]. In the last few years, further work has focused on the 'Me too' movement [74] and on online racial justice activism [75,76]. Finally, a complexity science approach has recently been proposed by an interdisciplinary group of scholars to better understand democratic instability dynamics, offering policy recommendations to help re-stabilize current systems of representative democracy, such as increasing diversity [77].

Computational epidemic modeling
The realistic modeling of infectious disease spread undoubtedly represents one of the most successful applications of complexity science to social phenomena. The introduction of the network paradigm, a pillar of complex systems, in the mathematical description of epidemics can be considered a fundamental milestone in our understanding of spreading processes. Since the early work of Pastor-Satorras and Vespignani, who uncovered the effect of scale-free networks on epidemics [78], complex systems scientists have contributed to elucidating several fundamental mechanisms that shape the transmission of infectious diseases in a population. Interactions among individuals and spatial movements, the key factors to understand epidemic spreading, can be effectively studied as complex networks [79]. At a population level, contact networks are the substrate on which a pathogen can propagate. At larger spatial scales, movements of individuals, represented as fluxes between geographical locations on transportation networks, can facilitate the spreading of the pathogen between regions and countries. Such a multi-scale network description represents the backbone of most modern modeling and forecasting frameworks that are used to evaluate possible scenarios, treatments, and control strategies to mitigate epidemic outbreaks [80][81][82][83][84].
The advent of the digital era and the availability of data describing socio-technical systems have spurred the development of large-scale data-driven models to simulate epidemics in increasingly realistic settings. The integration of novel data streams (from the web, mobile phones, etc.) with the mathematical formulation of disease dynamics has allowed the creation of models that can accurately describe epidemics at different spatial scales, from local outbreaks to pandemics [85]. A paradigmatic example of such models is the global epidemic and mobility model (GLEAM), a framework that combines real-world population and mobility data within a metapopulation approach to describe epidemics at a global scale [86]. GLEAM has been continuously developed over more than 10 years, adapting its framework to an increasing number of infectious diseases, from Ebola [87] to Zika [88], and integrating into it a number of data layers, including socio-economic data and agent-based features. Social mixing and behaviors, operationalized through the measurement of high-resolution contact networks, are also at the core of the decade-long SocioPatterns project [89]. This interdisciplinary initiative has been collecting longitudinal data on the physical proximity and face-to-face contacts of individuals in numerous real-world environments, covering widely varying contexts across several countries: schools, museums, hospitals, etc. [90][91][92][93]. Countless studies have used the data to study human behavior and to develop models for the transmission of infectious diseases. More recently, this approach has been used to measure household contacts in rural Africa [94,95].
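The metapopulation idea mentioned above can be sketched in a few lines: each location runs its own compartmental dynamics, and a mobility matrix couples the locations. The following is a deliberately simplified, deterministic two-patch SIR toy, not GLEAM or any published framework; the function names, the daily time step, the migration-like coupling and all parameter values are hypothetical.

```python
def step(state, beta, gamma, mobility):
    """Advance the system by one day.

    state: list of [S, I, R] counts, one entry per patch.
    mobility[i][j]: assumed fraction of the people in patch i that
    move to patch j each day (diagonal entries are ignored).
    """
    n = len(state)
    # 1) Local infection and recovery inside each patch.
    local = []
    for s, i, r in state:
        total = s + i + r
        new_infections = beta * s * i / total if total > 0 else 0.0
        new_recoveries = gamma * i
        local.append([s - new_infections,
                      i + new_infections - new_recoveries,
                      r + new_recoveries])
    # 2) Movement of people between patches.
    moved = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        leaving = sum(mobility[i][j] for j in range(n) if j != i)
        for c in range(3):
            moved[i][c] += (1.0 - leaving) * local[i][c]
            for j in range(n):
                if j != i:
                    moved[j][c] += mobility[i][j] * local[i][c]
    return moved

def simulate(state, beta, gamma, mobility, days):
    """Return the list of states from day 0 to day `days`."""
    history = [state]
    for _ in range(days):
        state = step(state, beta, gamma, mobility)
        history.append(state)
    return history

# Hypothetical setup: the outbreak starts in patch 0 only.
initial = [[9990.0, 10.0, 0.0], [5000.0, 0.0, 0.0]]
mobility = [[0.0, 0.01], [0.02, 0.0]]
history = simulate(initial, beta=0.3, gamma=0.1, mobility=mobility, days=60)
print(history[-1])
```

Even in this toy version, the second patch becomes infected only through the mobility term, which is the basic reason why real-world mobility data matter so much for models of this family.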
For the field of epidemic modeling, the COVID-19 pandemic has clearly represented a landmark event [96]. For the first time in history, mathematical and computational models have been put at the forefront of the public health fight against the disease, providing guidance to policymakers when facing tough decisions, such as the introduction of strict social distancing measures. The COVID-19 pandemic has shown how epidemic models are a fundamental tool in the arsenal against the virus, in particular when it comes to capturing the complex interplay between human behavior and disease dynamics [97].
Indeed, the complex nature of epidemics due to their social and behavioral determinants, beyond their biological component, has long been recognized as critical to mounting an effective outbreak response. For instance, the West African Ebola outbreak that killed over 11 000 people in Guinea, Sierra Leone and Liberia between 2014 and 2016 was one of the first large-scale epidemics where the role of different social and environmental factors (deforestation, land change, urbanisation, poverty and inequality, climate change driving migration patterns of host species) was studied with a complex systems-like approach [98]. It is also one of the first epidemics for which modeling efforts included the description of nation-wide social behaviors to explain disease transmission and plan interventions. In Guinea, detailed data on Guinean demography, hospitals, Ebola treatment units, contact tracing, and safe burials were included in agent-based models to assess the effectiveness of control strategies [99]. Similarly, during the COVID-19 pandemic, complex systems scientists have tackled the important issue of understanding the interplay between the disease dynamics and social behaviors. One example is how the spread of misinformation and disinformation, and the subsequent changes in behaviors, can influence the epidemic trajectory [100]. Indeed, vaccine hesitancy, misinformation spread, and trust in public health and science are all challenges that require an interdisciplinary approach to be properly addressed [101].
In this wider context, complexity science can provide tools to address the needs of vulnerable populations during health emergencies. A recent study investigated the impact of refugees' displacement on the public health system of Turkey [102]. Through a multilayer network approach, Bosetti and collaborators have shown that a higher level of integration and mixing between refugees and the host population would strongly limit the re-emergence and spread of measles in the country. Other studies have modeled the burden of the COVID-19 pandemic in refugee settlements [103], investigated the socio-economic impact of the pandemic in low-income countries [104] or highlighted their vulnerability to novel emerging diseases [105]. Modeling studies of COVID-19 have also shown how socio-economic inequalities unevenly distributed the burden of the epidemic among communities in Chile [106] and, similarly, detailed models of epidemic spread have shown that mass unemployment and the consequent evictions may lead to a dramatic increase of COVID-19 cases in the US [107].

Main challenges
Several challenges hinder the development and use of complexity science to address the issues of the most vulnerable populations. In this section we highlight the main ones.

Big data under-represents the most vulnerable
Data produced by digital means are often faster, wider in coverage, and/or more granular than traditional data; hence the wave of 21st-century applications and discoveries they have enabled, from complex systems science to AI, including some SDG-related applications such as the estimation of socioeconomic indicators [41,108]. However, the digital exhaust, or 'data in the wild', as some authors have called it, was not designed and collected to answer a specific research question, but was instead originally generated for other purposes [109]. Therefore it is not produced equally by everybody, and it mostly under-represents disadvantaged geopolitical and socioeconomic contexts, which are precisely the most important focus of the SDGs [110].
Despite the major attention and excitement that this so-called big data has brought to modeling, computational social science, network analysis and complex systems, very little scientific interest has been devoted to the biases and inequalities that these data carry, i.e. the fact that not everyone has equal access to technology and that not everybody produces the same amount of data even when they have access to it. This neglect is striking given the well-known sensitivity to initial conditions of chaotic systems such as socio-technical ones [111].

Lack of good data-sharing paradigms
Much has been said already on the challenges of data sharing, and there has been considerable progress and discussion on some of the most fundamental issues such as privacy [112], governance [113], and responsible use [114]. However, in the humanitarian and development sectors the data sharing ecosystem still remains siloed, with a great dichotomy between research and applications.
The main challenge is to preserve users' privacy through anonymization and aggregation of the raw data, while at the same time losing as little information as possible. For each case study, therefore, a trade-off should be reached to find a level of aggregation at which the data are still informative while preserving users' privacy [112]. These aggregation procedures require technical processes and skills that not all companies have already developed internally, meaning that researchers often need to support them by providing guidance and algorithms. These are all long and time-consuming processes that often hinder the development and applicability of models during emergencies, such as epidemic outbreaks, where early access to human mobility estimates would provide crucial information for forecasting models. Timely access to these kinds of data is therefore a critical bottleneck, and more coordinated efforts between researchers, governments, international organizations and industry, such as the Data Collaboratives [113], should be put in place to ensure their timely availability in future emergencies.
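The aggregation trade-off described above can be illustrated with a small sketch: individual trip records are collapsed into origin-destination counts, and cells supported by too few distinct users are suppressed. The record layout, the function name and the `min_users` parameter are hypothetical, and small-count suppression alone should not be read as a sufficient privacy guarantee.

```python
from collections import defaultdict

def aggregate_flows(trips, min_users):
    """Aggregate trips into origin-destination user counts.

    trips: iterable of (user_id, origin, destination) records.
    Returns {(origin, destination): number of distinct users},
    dropping every cell with fewer than `min_users` distinct users.
    """
    users_per_cell = defaultdict(set)
    for user_id, origin, destination in trips:
        users_per_cell[(origin, destination)].add(user_id)
    return {
        cell: len(users)
        for cell, users in users_per_cell.items()
        if len(users) >= min_users
    }

# Hypothetical records: user identifiers and area codes are made up.
trips = [
    ("u1", "A", "B"),
    ("u2", "A", "B"),
    ("u3", "A", "B"),
    ("u1", "A", "B"),   # repeated trip by the same user
    ("u4", "B", "C"),
]

print(aggregate_flows(trips, min_users=3))
print(aggregate_flows(trips, min_users=1))
```

Raising `min_users` protects rare movements but also removes exactly the low-volume flows that may matter most in sparsely covered areas, which is the information loss the trade-off refers to.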

Replicability and transferability to low income settings
Arguably driven by the unequal availability of large data sets and the concentration of scientific talent, the majority of available studies focus on big cities and rich countries. The issue with this approach is that the replicability and transferability of models trained and tested in these settings to vulnerable and data-poor contexts is not guaranteed. Let us consider a model trained and validated within a specific context, that is, using historical data for a limited geographical area and covering a defined time period. First, how accurate will the model remain over time, given that the system under study might undergo changes that are not reflected in the historical data? Second, how accurate would the model be if applied to a different geographical region, even one with similar characteristics to the one the model was originally developed for? Patterns detected in one area might in fact not hold across space and time. For instance, in the use of mobile phone metadata to produce wealth estimations, it was shown that a model trained and validated with Rwandan data performed poorly when applied to the Afghan context, and vice versa [115]. Therefore more research efforts need to focus on less explored vulnerable settings to investigate these limitations and develop models that are scalable and adaptable to different contexts.
An additional and fundamental methodological challenge is that, in the context of emergencies but also beyond them, policy and decision makers often need models that can provide insights as soon as possible. Developing 'quick and dirty' models comes, however, with a non-negligible risk of low robustness and accuracy. Therefore, a trade-off needs to be found between developing imperfect models that can provide early insights, and taking the time to develop more robust and precise models at the risk of losing momentum or missing the window for timely intervention.

Cross-cutting capacity
The third set of challenges relates to capacity building. In academia, complex systems scientists are faced with a lack of interdisciplinary venues spanning all aspects of their work lives: from the struggle to find a permanent position because their profile does not fit any specific department, to funding applications not fitting in any disciplinary category, to finding the right publication venues. On the other hand, international organizations that could benefit from incorporating such profiles are struggling to create the in-house capacity needed to turn the work reviewed in this perspective into solutions, as well as to advocate for fundamental changes to address the needs of the most vulnerable. Moreover, transdisciplinary research engaging with stakeholders, and notably with those representing minorities or underrepresented communities, is often overlooked by funding agencies. These challenges penalize those who decide to specialize in these topics and methods, posing a talent retention issue in both spaces.

Operationalization
Finally, as a result of all of these different issues, transitioning academic research efforts into operational and scalable tools that governments and international organizations can rely on for generating near real-time predictions and insights is a considerable challenge, but also the biggest opportunity for real social impact. As much as possible, the proposed models and data should be integrated in existing platforms already used by these stakeholders and, where not available, platforms should be co-designed rather than unidirectionally provided by researchers without a deep understanding of stakeholders' needs. Moreover, for models to run in near real-time, some considerations are required during model development. For example, input variables of forecasting models should be defined by taking into account the delays occurring between the actual dates of their measurement and when they become available for the model to use. Lastly, research operationalization also means that models need to be explainable and interpretable, in order for decision makers to be able to use the generated insights. In this sense, complexity science, with its mechanistic approach, usually provides models that are simple but easily interpretable. On the other hand, machine learning/AI models might be more difficult to interpret but normally allow for the inclusion of a higher number of dimensions and can provide useful insights when mechanistic models are not an option because the phenomenon under consideration is the result of so many interconnected drivers that a satisfactory mechanistic description is not at hand. Further efforts should be envisioned to complement the two approaches.
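The point about publication delays can be made concrete with a small sketch: when building the inputs of a forecast issued on a given day, only observations that had already been published by that day should be used. The function name, the fixed `delay_days` assumption and the example figures below are hypothetical and serve only to illustrate the idea.

```python
from datetime import date, timedelta

def latest_available(series, as_of, delay_days):
    """Return (measurement date, value) of the most recent usable observation.

    series: {measurement date: value}.
    Assumption: a value measured on day d is published on d + delay_days,
    so on `as_of` only measurements up to as_of - delay_days are usable.
    Returns None when nothing has been published yet.
    """
    cutoff = as_of - timedelta(days=delay_days)
    usable = [d for d in series if d <= cutoff]
    if not usable:
        return None
    latest = max(usable)
    return latest, series[latest]

# Hypothetical weekly indicator (values are made up for illustration).
weekly_cases = {
    date(2021, 3, 1): 120,
    date(2021, 3, 8): 150,
    date(2021, 3, 15): 90,
}

forecast_day = date(2021, 3, 16)
print(latest_available(weekly_cases, forecast_day, delay_days=0))
print(latest_available(weekly_cases, forecast_day, delay_days=7))
```

Training a model with the undelayed series and then running it operationally with delayed inputs would make its retrospective accuracy look better than what can be achieved in real time, which is why the delay belongs in the definition of the input variables.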

Opportunities and future directions
Beyond these limitations, the studies reviewed in this perspective indicate that applying complexity science to address the needs of the most vulnerable comes with several opportunities. First of all, a stronger societal role with an impact on policy. The United Nations already recognized about ten years ago the role that big data, machine learning and artificial intelligence techniques could play moving forward to tackle development and humanitarian issues [116,117], and they are now starting to acknowledge the added and complementary value that complexity science could also provide [118,119]. Indeed, the analysis of about 20 000 policy documents mentioning 'complex systems' highlights the increasing relevance of complexity science across all SDGs (see figure 1). It is therefore a timely opportunity to facilitate new cross-sector collaborations spanning academia, government, industry, global agencies and nonprofits. This would make it possible to unlock privately-held data in the context of well-recognized societal problems, and to go beyond multidisciplinary and interdisciplinary approaches by diving into transdisciplinary science, that is, research generating both scientific and societal value through effective collaborations among scientists, the community and policy makers [120]. In this context, we need to train a new generation of complex systems scientists and data scientists motivated by socially-relevant missions. These 'multi-lingual' scientists could become ambassadors of complexity science in all sectors, lowering the barrier to future collaborations and data sharing.
Moreover, we need to create more venues where different stakeholders can meet, such as the UN AI for Good Global Summit [121] or, at a smaller scale, the Complex Systems for the Most Vulnerable symposium that the authors of this perspective have been organizing for several years within the annual Conference on Complex Systems, bringing together academics with scientists from different UN agencies [122].
As reviewed in this article, several socially-relevant areas are already widely explored by complexity scientists. However, there is still room to further expand the horizon of applications and, more importantly, to bridge the capacity and knowledge gaps that exist between complexity science and humanitarian and development practice.
To illustrate with an example, a fundamental global challenge which should constitute the focus of future efforts is undoubtedly climate change. A vast body of research has already addressed different aspects of this issue, from climate models [123,124] and the role of humans in global warming [125], to the consequences on human health [126], on conflicts [127] or on forced migrations [128].
From a complex systems perspective, there are several promising areas of future development. One is to investigate further approaches that incorporate all the relevant interconnections to account for emergent phenomena like global warming and the human response to it, as recently argued by Holme and Rocha [129]. Another is the role that network-based approaches can play as a critical tool to forecast climate phenomena, as a complement to numerical modeling [130]. A third is the value that new complex systems advancements, such as the focus on higher-order interactions in complex systems during the last few years [131], can have in allowing for a better understanding of physical and social phenomena. To analyze whether and how all these methods could be useful in understanding climate change, we need stronger collaboration channels between climate experts and the complex systems community.
From a humanitarian and development point of view, climate change risk indexes have already been gaining attention over the last few years: from global impact estimations [132] to recent efforts to estimate specific impacts on children's risk [133]. There is, nevertheless, a pressing need to translate these advocacy efforts into operational climate change frameworks useful for operations and planning. This transition from research into action (and the influence of operational needs on research) will require stronger communication and collaboration channels between climate experts, complex systems scientists, and humanitarian and development actors, institutions and organizations.
Each of these angles presents a different aspect of a much needed, and currently scarce, transdisciplinarity and cross-pollination. This is especially important when talking about innovation and the unique nature and fragility of humanitarian and development settings. We believe it is critical to start planning and investing strongly in creating these multidisciplinary channels of exchange and collaboration across academia, the private and public sectors, and humanitarian and development organizations. Only through these holistic pathways will researchers, stakeholders (from the private and public sectors) and policy makers be able to face global challenges.
We hope that this perspective will persuade complex systems scientists that several compelling research opportunities exist in addressing the needs of the most vulnerable, and we would like to conclude by urging the whole scientific community to engage in the endeavor of identifying the most pressing societal challenges and of developing suitable tools to address them, as well as to investigate whether and how the methods they have already developed might be suitable to address these challenges.