Monte Carlo basin bifurcation analysis

Many high-dimensional complex systems exhibit an enormously complex landscape of possible asymptotic states. Here, we present a numerical approach geared towards analyzing such systems. It is situated between the classical analysis with macroscopic order parameters and a more thorough, detailed bifurcation analysis. With our machine learning method, based on random sampling and clustering methods, we are able to characterize the different asymptotic states or classes thereof and even their basins of attraction. In order to do this, suitable, easy to compute, statistics of trajectories with randomly generated initial conditions and parameters are clustered by an algorithm such as DBSCAN. Due to its modular and flexible nature, our method has a wide range of possible applications in many disciplines. While typical applications are oscillator networks, it is not limited only to ordinary differential equation systems, every complex system yielding trajectories, such as maps or agent-based models, can be analyzed, as we show by applying it the Dodds–Watts model, a generalized SIRS-model, modeling social and biological contagion. A second order Kuramoto model, used, e.g. to investigate power grid dynamics, and a Stuart–Landau oscillator network, each exhibiting a complex multistable regime, are shown as well. The method is available to use as a package for the Julia language.


I. INTRODUCTION
Multistability is a universal phenomenon of complex systems.Whether it is hysteresis effects in physics, the human brain [1,2], gene expression networks [3], in human perception [4], power grids [5] or the climate system [6][7][8][9], almost every sufficiently complex system has a multitude of stable asymptotic states and bifurcations that occur when control parameters are changed.Most traditional methods of bifurcation analysis, such as AUTO [10] rely on tracking states by continuation of the integration, and become increasingly challenging for highdimensional systems.Further, for high-dimensional systems, often one is also more broadly interested in classes of asymptotic states such as synchronized versus unsynchronized states of oscillator network or states that share a common symmetry.Here, we fill a gap between a coarse analysis with macroscopic order parameters and more thorough bifurcation analysis.
Our machine learning approach, Monte Carlo Basin Bifurcation Analysis (MCBB), based on random sampling and clustering methods, resolves different classes of asymptotic behaviour into clusters.Rather than studying the existence of states and orbits on the one hand, or only tracking changes in a single order parameter on the other, our approach learns which type of attractors are most dominant in terms of the volume of their basin of attraction, and quantifies the changing size of the basin of attraction of each of these classes as a function of a control parameter.This provides new insights into the * gelbrecht@pik-potsdam.debifurcation structure of multistable high-dimensional systems.Thus, we can regard MCBB as a way to interpolate between detailed studies of asymptotic bifurcations tracking every change in asymptotic structure on the one hand, and statistical physics using specialized order parameters to study the macroscopic behavior at the other end.
First we will introduce the method and the idea behind it in the following section.Then, the algorithm will be explained in Sec.II D. A number of paradigmatic examples, the Dodds-Watts model of social and biological contagion and networks of second order Kuramoto oscillators and Stuart-Landau oscillators will follow in Sec.III.Lastly, these results and the performance and applicability of the presented method will be discussed in Sec.IV.

II. METHOD
We aim to learn those classes of similar attractors of a high-dimensional system that collectively have the largest basin of attraction with respect to a measure of initial conditions ρ 0 .Further we intend to understand how they, and their basin volumes, change as a function of a parameter p in a range I p .A class of attractors C should denote an equivalence class of attractors, including at different p, that have similar invariant measures.
To do so we will interpret ρ 0 as a probability distribution.We can then draw initial conditions from ρ 0 and parameters from I p and simulating the system to generate trajectories.Assuming ergodicity, the tail of the trajectories then sample the invariant measures on the attractors.We then use these tail samples to estimate whether the invariant measures they were drawn from are similar in the sense of the defining equivalence of our classification.This way we identify clusters among the tail samples that are drawn from the same class.By then computing the number of samples in each cluster drawn at a particular p (or a small interval around it), we provide an estimate for the relative size of the basin of attraction of a class at p. Further we can use the samples to study how the members of the class change as p changes.
A key step here is the definition of similarity of invariant measures.Comparing all tail samples to each other is a potentially prohibitively expensive step.Further, in high dimensional systems with a large number of asymptotic states we might be interested in coarser classes of behaviour.Therefore we typically define the similarity between clusters in terms of statistics of the invariant measures that can easily be estimated using the tail samples.
To make this idea more precise we need to define how to determine that two asymptotic measures are similar.We begin by outlining the formal quantities under investigation.

A. Classes of attractors and their basin volumes
We investigate a complex system with system parameter p yielding a trajectory x(t; x 0 , p) for initial conditions x 0 .This can be an ordinary differential equations system ẋ = F (x, t; p) or a map x n+1 = F (x n , x n−1 , ...; p).If this is a sufficiently well behaved dynamical system, the measure ρ 0 will asymptotically evolve into ρ ∞ , a linear combination of invariant measures on the attractors A of the system, As we vary the parameter p, the set of attractors and invariant measures of the system will change as well.Given a notion of similarity of invariant measures we define equivalence classes of asymptotic states C. Denoting C p those elements of the equivalence class that occur for the system parameter p, we have a parameterized space of measures for each class.Assuming that there are only finitely many at each p, we write for asymptotic measures with support only in class C. We assume ρ C (p) = 0 and b C (p) = 0 if the sum is empty and c A = 1 otherwise.Under these assumptions ρ ∞ can be decomposed into classes at each p: When we sample from I p and ρ 0 m then run the system, the resulting trajectories will have probability b C (p) to asymptotically sample an invariant measure in C.

B. Similarity of asymptotic measures
The key challenge to make this idea operational is to define a notion of similarity.We will approach this challenge to define a computable pseudometric in the following.Let us first consider an extremal case: A linear response of asymptotic measures suggests to identify ρ A (p) and ρ A (p + ∆p) as belonging to the same class if they are connected by a smooth continuum of measures.That is, the difference between them vanishes smoothly in an appropriate sense as ∆p goes to zero, e.g. in the sense of [11,12].When sampling trajectories, we can build clusters of samples by requiring some discrete notion of this continuity, ensuring that it converges to the right continuum condition in the appropriate limit.
Taking classes built up in this way puts us firmly in the realm of bifurcation analysis.We would resolve every potential difference in asymptotic states.As noted above this might not be desirable when the number of asymptotic states is large, and designing a discrete similarity measure on the high dimensional space that is not prohibitively expensive to evaluate is not straightforward.
Going into the other extreme are order parameters.We could consider ρ A (p) and ρ A (p + ∆p) as similar if they lead to the same order parameter up to some finite bound.This would place us directly into the realm of statistical physics, but requires us to know already what meaningful order parameters for our system are.
Generally speaking we build the classes by making use of some pseudometric on the space of measures built from a weighted sum of differences of statistics S k (ρ) of the measures.The sampled trajectories then provide us with a way to estimate these statistics, and thus the pseudometric distance between the underlying invariant measures: Specifically we will show that for the examples considered in this paper it is sufficient to track the mean and the variance of the measures, encoding the position and size of the attractor in phase space: • The position of the attractor: • The size of the attractor: where ρ k denotes the marginal distribution on system dimension k.
Parameter FIG. 1. Schematic illustration of an example dynamic with stable asymptotic states (solid blue lines) and unstable asymptotic states (dashed blue lines) We further consider the histograms of these statistics over the dimensions of the system.This is particularly useful when the system consists of many identical elements, and it allows us to identify asymptotic states related by permutation symmetry.This is critical for the application to networked systems, for example a dynamical system on a fully connected network will have a symmetry group S n .A more detailed discussion of the technical aspects are given in the next section.
Dependent of the investigated systems, other statistics, such as higher moments or entropy measures can be used as well.Our implementation of the algorithm provides a flexible framework for this purpose (see Appendix A).

C. Clustering
Finally to construct a cluster of samples from the estimates of the distance of measures there are two options.Again following linear response theory, we can require that the observed distance is (up to a factor) a finite scaling of the linear response of the asymptotic state.For every sample with a parameter p i we continue the integration with p i ± δp where δp ≈< min j (||p (i) − p (j) ||) > i should be a typical parameter spacing, leading to samples from the measure ρ i± .Then, we compare the difference D(ρ i , ρ j ) between trial i and j with the difference to the results of the continuation of the integration δ ± i = D(ρ i , ρ i± ).If the former is much larger we assume that there is no direct continuation between the states.Two states are then in the same cluster if there is a chain of states connecting them.Instead of this computationally intensive continuation study we can also try to extract sensible values for the distance between samples directly from the data.This leads then to a constant response size parameter DB for all trials that is ideally a specific percentile Q k (p(δ ± i )) of the distribution of actual responses δ ± i .When we incorporate the parameter proximity constraint with a weight w p in the distance calculation as well, the new condition then reads i and j are connected if Such a criterion is part of density based clustering algorithms such as DBSCAN [13] which we can thus use to distinguish the different classes of asymptotic states given a certain set of suitable statistics.If a single, constant threshold like DB is used, it also allows us to vary this threshold in order to resolve different classes of asymptotic finer or coarser: if we choose a large DB many similar asymptotic states will be grouped into a single cluster that corresponds to a broad class of asymptotic states.Contrary, a smaller DB will result in more different clusters, hence resolving the asymptotic states finer.Fig. 1 schematically illustrates that: As long as this constant threshold is smaller than the minimal distance between trajectories of the two asymptotic states in question, they will be resolved into different clusters.
Crucially, all steps described here can be performed in a time that scales at most quadratic in the system dimension.This means that high dimensional systems are amenable to being studied in this way.

D. Algorithm
We now describe the algorithm that implements the ideas described above in more detail.
MCBB is a modular algorithm: most steps can be modified to suit the dynamical system in question.Algorithm 1 summarizes this procedure and in the following a detailed description of every step is given.
a. Setup We aim to distinguish different classes of asymptotic states by using clustering algorithms on sets of measures that each evaluate one of the N Monte Carlo trials.Given a dynamical system such as an ordinary differential equation system ẋ = F (x, t; p) or a map In what follows, we will will use uniform distributions for U IC and U p .While we will mostly focus on systems with one parameter dimension, it is in principal also possible to investigate systems with more than one parameter dimension.In particular setups with two varying parameters can provide useful insights into the dynamics of the investigated systems.However, results for systems with three or more parameter dimensions are harder to visualize and will need exponentially more trial runs to create sufficient density in the parameter space.In contrast, just as for basin stability, the number of necessary samples does not scale with the dimension of the space of initial conditions.b.Integration Subsequently, the system is solved for all of the N drawn configurations (x (i) 0 , p (i) ).The integration time has to be set appropriately to the system, so that the asymptotic states are reached.After discarding the transient, the system is integrated for a sufficiently long time.While in theory, this choice of a suitable integration time and transient time is highly non-trivial, in practice, one should have prior knowledge about the time scales of the system.In most situations choosing these times at reasonably large values and checking them for individual trajectories is sufficient.A more sophisticated approach will be discussed in future work.
The Julia package provided with this paper (see Appendix A) uses DifferentialEquations.jl[14] to solve ODE systems.The library automatically chooses appropriate solvers, such as Tsit5 or Verner methods [15,16].Even though these methods feature an adaptive step width during integration, we save the trajectories at a constant step width, so that the results of all N trials are saved at the same time steps.We then consider the sample provided by a set final fraction of the trajectory.
c. Evaluation of the Integration On each of the tail samples generated this way we evaluate a set of statistics, typically we consider some number N s of statistics per system dimension N d .These include per default the position and size of the attractor as the mean and standard deviation of the tail sample.Other statistics are possible as well, though.Thus, we obtain N matrices of statistics S i each (N d × N s ) sized with elements S i,kl .
d. Clustering For most clustering algorithms a distance matrix between all samples is needed.This (N ×N ) distance matrix can computed from the S i s with two different approaches.First, by calculating where each measure can be weighted with a weight w i .
The parameter values can be included in the distance metric with weight w Nm+1 to ensure that similar asymptotic states with strongly different parameter values are distinguished from each other.The other possibility is to first fit a one dimensional histogram H i,k to each statistic k across all system dimensions.This can be advantageous when symmetric configuration of asymptotic states should not be distinguished which is often the case for networks of identical units.The distance matrix then follows with a suitable histogram distance D H (H i,k , H j,k ) such as the 1-Wasserstein metric with When all H i,k for one specific statistic k share the same binning and norm, the 1-Wasserstein metric can be computed very efficiently from the empirical CDF of each histogram.The choice of the weights w depends on the statistics used and the expected asymptotic states.Generally, a good first guess is to give higher moments such as variance and non-normality measures lower weight than the mean.Given the distance matrix, a clustering algorithm such as DBSCAN [13], is used.DBSCAN classifies all points that can be reached through a common DB -area as one cluster.Estimating an appropriate DB parameter is a non-trivial task and there are different possibilities.In [13] the authors recommend to use the k-Nearest Neighbour (kNN) distance, more specifically the 4NN distance and use the value of the 4NN distance at the first visual knee in the ordered 4NN distance graph of all data points as DB .Another, yet similar possibility is to use the median of the cumulative kNN distance, where k is a certain percentage of all points, e.g.0.5%.As explained in Sec.II, the DB can also be estimated by continuing the integration and tracking the response of D.
In the examples we have studied, this yields similar values like the more empirical kNN-based methods, but is computationally more expensive.This is why the kNN-based methods are preferred for the estimation of the parameter.Fundamentally there is no "right" choice of DB , in combination with the choice of distance measures it determines how finely we want to distinguish tail samples.While the choice of statistics and weights determines what aspects we look at, DB provides us with an overall resolution that we can vary.As the clustering step at this point is very quick, it is easy to scan a variety of values.We will see an example further down where two clusters that are somewhat similar are no longer resolved as we increase DB .Density-based clustering algorithms such as DBSCAN are sensible to outliers.Input that is strongly dissimilar to all other data is classified as an outlier.For our purpose, this will typically happen when an explosion of multistability, many different, yet dissimilar, asymptotic states occur.e. Evaluation of the Clustering The clustering algorithm C thus returns the cluster assignments which map each of of the N trials to one of the N C clusters with ] being the number of this cluster for trial i.The cluster assignments C enable us to further analyse the system in question.First of all we can track the size of the basin of each class of asympotic states for changing parameters and thus quantify bifurcations and multistability within the system.This is done by computing the amount of trials within a parameter window [p min ; p max ] and sliding this window over the complete parameter range.For each cluster C i , thus our estimator for the relative basin volume at parameter p, bCi (p In order to further assess the dynamics of each class of asymptotic sets, the statistics are subdivided into the sets belonging to each of the clusters as well.This way we can track, for example, how the position or size of samples in a cluster change as a function of p. Investigating solutions of typical trajectories within each cluster can provide insights as well.In Section III examples of such analysis are shown.All in all, the two main parameters of the method are the weights w of the distance calculation and the clustering parameter, in case of DBSCAN DB .As a default for w, we take w E = 1, w V ar = 0.5, w p = 1.In the Section III we will explain in more detail for every system why we chose the weights presented.For the clustering parameter, an estimate with the kNN distance or a response analysis is made and if needed this value is increased (decreased) if one wants to resolve more (fewer) clusters.As for most Monte Carlo methods, the number of trials N should be chosen sufficiently large so that the results are independent from it.A reasonable test is therefore to run the experiment twice: if the results differ qualitatively, one has to increase N .

Algorithm 1 Monte Carlo Basin Bifurcation Analysis (MCBB)
1: Given: A system ẋ = F (x, t; p) with system dimension N d (can be an ODE system but also a map) 2: Given: A set of Ns statistics {S} on the components of trajectories R N t → R (e.g.mean and variance) 3: Given: A distribution UIC of the initial conditions and parameters Up 4: for i ← 1, N do 5: Sample N initial conditions x0 from ρ0 and N parameter values p 6: Solve system for a long trajectory x(t; p) 7: for meas ← 1, Ns do 9: compute matrix of statistics S i,dim,s = Si(x dim (t)) on the tail of the trajectories.MCBB is a method that can be applied to a wide range of dynamical systems.Both, systems with discrete and with continuous state spaces are possible to investigate, as are systems with discrete and continuous time evolution.Typical applications are networks of oscillators as will be shown in the following, but also discrete agentbased models such as the Dodds-Watts model.Every system that returns a trajectory given an initial condition and parameter can in principal be analyzed with MCBB.In the following the Dodds-Watts model, Kuramoto oscillator networks and Stuart-Landau oscillator networks will be investigated with MCBB.The source code of all these results is available as Jupyter notebooks in the supplementary material.

A. Dodds-Watts model
The Dodds-Watts model of social and biological contagion [17,18] is a generalization of contagion models such as the SIS and SIR model [19, e.g.].Given is a population of N I individuals that are connected to all other individuals.Each of the individuals is either in the susceptible (S), infected (I) or recovered (R) state and has a memory of doses they received within the last T time steps D t,i .At each time steps each individual i comes into contact with another individual j that is randomly selected from all other individuals.If j is infected, i receive a dose d with exposure probability p.The amount of the dose d is drawn from a distribution f (d).The dose adds to the dose memory D t,i of i at time step t so that D t,i = t t−T +1 d t ,i .If the dose memory of an individual exceeds the dose threshold d * i , it becomes infected.Latter dose threshold d * i is drawn from a distribution g(d * ).As soon as D t,i drops below the threshold, the individual recovers with probability r at each time step.A recovered individual becomes susceptible again with probability s.One gets the classic SIS model for example for the configuration s = 1, g(d * ) = δ(d * − 1), f (d) = δ(d − 1), T = 1 with p and r as free parameters.For more details on the model, see [18].For this N I dimensional model with discrete states s i,t ∈ [S, I, R] and discrete time t ∈ [1, 2, .., t N ] we directly evaluate the count of susceptible N (S) and infected states N (I) within the time evolution of each individual as measures for the algorithm.As shown by [18], there are several configuration which possess multistable regimes where also a mixed population with N (I) unequal 0 or N I can be stable.
In particular we are investigating the two configurations: (A) with N I = 1000, T = 12, r = 1, g(d) = δ(d − 3), s = 1 and (B) with N I = 1000 g(d) = 0.075δ(d − 1) + 0.4δ(d − 2) + 0.525δ(d − 12), T = 20, r = 1 and s = 1.The number of initially infected individuals is drawn from a uniform distributed between 0 and N I .We evolve the system for 1000 time steps from which we regard the first 800 time steps as the transient.Configuration (B) is roughly similar to the SIS model but with a dosage memory of 20 steps and a dosage threshold distribution so that roughly half of the population is quite resilient against becoming infected.For both configurations N = 5000 trajectories with random initial conditions and parameter values were computed.As both of the measures are equally important, we choose w I = w S = 1 and w p = 0, so that we do not use the parameter value in the distance calculation.The distance D was constructed using histograms of the statistics as described in Sec.II D.
Based on a visual inspection of a 4NN-distance graph, the clustering parameter d DB = 0.15 was chosen for configuration (A).Fig. 2 shows the results of the analysis.Similar to the results reported in [18], we see for such a configuration a bifurcation occur around p ≈ 0.4.For values larger than this the fully infected state becomes stable.Its basin of attraction quickly grows, but the fully healthy state remains stable as well with a very small basin of attraction for large p values.
Configuration (B) exhibits a slightly more complex structure as Fig. 3 reveals in accordance with the results in [18].Additionally, Fig. 3 features sliding histograms as well.These can be helpful to identify the dynamics of the clusters.For each sliding parameter window a histogram is fitted to all measure results within this window.These histograms are then plotted directly next to each so that we can visualize changes of the measures within each cluster for changing parameter values.In the case of the Dodds-Watts model where we measure the fraction of time an individual agent was infected and susceptible, these are predominantly either 1 or 0 as most agents are either infected or susceptible the whole time.Fig. 3A shows the behaviour of the system.For small values p only the fully healthy state is stable (see also Fig. 3B).The first bifurcation occurs around p = 0.3 when a mixed state, for which susceptible and infected individuals coexist, becomes stable.Its basin of attraction quickly grows, while the healthy state remains stable but with a very small basin of attraction.Fig. 3 shows that for growing p the amount of infected individuals rises.Eventually, around p = 0.7 a fully (or almost fully) infected state becomes stables.As Fig. 3D shows directly at the bifurcation point not all individuals of the fully infected state are infected which is the case for larger p.
Comparing the results to these reported in [18] we see that the fully infected and the mixed state are indeed two distinctive stable branches of the system and thus rightfully classified by MCBB into two separate clusters.

B. Kuramoto Networks
The Kuramoto Model is one of the fundamental examples of synchronization theory and network science.The version with inertia has been used in a variety of contexts, most importantly to model nodes in power grids [20][21][22][23][24][25].In the transition towards globally stable synchronization, the Kuramoto model with inertia exhibits an extreme form of multistability, with a large number of attractors.Studying the dominant patterns of synchronization in the transition region was one of the motivating questions for the development of MCBB.The system is given by the equations with equally many +1 and −1.For K = 0 the oscillators rotate freely with ω = ±10.As K increases synchronization starts to occur in the network.At K = 10 the system typically synchronizes completely with ω i = 0.While a large number of works have studied the stability of this synchronous state as a function of the local net- work topology [23,[25][26][27][28][29][30][31][32][33][34][35], comparatively little is known about the intermediate regime.
As the main dynamics is in the frequency, we will only consider the frequency dimensions in the analysis here.Figure 4a shows the network on which the oscillators are coupled.It is a random regular graph for which every node has degree k = 3.The statistic we will use on the asymtptotic state are the positions of the frequency of all the nodes and the distance is D computed according to Eq. 6.The results shown are for N = 25, 000 trajectories.
The basin bifurcation structure, with distances calculated from the per-dimension mean of the frequency, is given in Figure 4b.We see that for K = 0 the oscillators rotate freely, the frequencies are located at ω = ±10.This state persists, until its basin starts to shrink from K = 1 onward.In the intermediate regime most of the asymptotic states occur.These are classed together in the outlier cluster here, meaning that they occur so infrequently that not enough samples can be obtained for a statistical treatment.This shows that the basin structure isn't dominated by one transitional state but an explosion of multi-stability occurs.However, the basin bifurcation diagram also shows two states that achieve a higher basin in the transition region.Each of these clusters occurred in more than 0.5% of the total runs, and peaks at taking up more than 10% of the basin volume at some p.
If we look a bit deeper into these clusters, we find that they represent partial synchronization, in which a region of the network is synchronized, while all other oscillators still rotate at their natural frequency Figure 5.
To understand how these intermediate clusters lose stability as K increases, we can consider the size of the asymptotic states considered in Figure 6.Here we see that the size of the attractor increases as K increases.In other words, the frequency itself starts to oscilate around a stable average frequency.This suggests an interesting insight into the behaviour for the transition regime.As K increases some neighbouring oscillators couple and synchronize.As the attractor of the partially synchronized state grows, the oscillators at non-synchronized nodes spend more and more time far from their natural frequency.Eventually they would have to spend considerable time close to the frequency of a synchronized component that they couple to and get entrained.
To verify that these are the mechanisms that drive the transition, and to understand which network properties enable early partial synchronized states, is beyond the scope of MCBB and this paper.However, the basin map of the bifurcation transition that is revealed by this approach provides immediate and crucial insights into how the basin structure and the structure of the attractors themselves change in the transition.In particular it reveals that the attractors do not move, but grow until they lose stability.

C. Stuart-Landau Oscillator Networks
Another paradigmatic type of oscillator is the Stuart-Landau oscillator which can be written as ż = (λ + iω − |z| 2 )z (12) where z ∈ C, λ is the bifurcation parameter and ω is its eigenfrequency.Originally found by Lev Landau and later derived by  to describe the transition to disturbance in hydrodynamics, it is also a normal form of the Andronov-Hopf bifurcation and hence widely applicable and of great importance in many fields [39].Coupling Stuart Landau Oscillator can lead to several interesting phenomena.Most importantly oscillator quenching in the form of Amplitude Death (AD) and Oscillator Death (OD) [40, e.g].An other interesting phenomena are Chimera states [41, e.g.].These are states of systems of coupled identical oscillators that exhibit a inhomogeneous pattern in which phase-locked states coexist with drifting states.To apply MCBB for Stuart-Landau systems, we use the configuration of [42] as it prominently features a multistable regime with travelling wave (TW), oscillation death (OD) and what the authors refer to as stable amplitude chimera (SAC) dynamics.In this setup N N Stuart-Landau oscillator with identical eigenfrequency ω are coupled by attractive coupling to its P 1 nearest neighbours and repulsive coupling to its P 2 nearest neighbours with the following equations: where (x) is the real part and (x) the imaginary part of x.We can also investigate this setup with the coupling mediated on two Watts-Strogatz random graphs [43], one for the repulsive and one for the attractive coupling.With the rewiring probability p r = 0, we get the same equation as above, for p r = 0 we expect changes in the dynamic.

Parameter Configuration
We choose the same parameter configuration as in [42]: ω = 2, N N = 100, P 1 = 1 and P 2 = 22.In our experiments we vary K, r 2 = P 2 /N N and p r .We use random initial conditions with real and imaginary part uniformly distributed between −1 and 1 (in contrast to the cluster initial conditions used for some calculations in [42]) and vary K from 1.8 to 2.5.As per dimension measures we use mean and standard deviation.Since the Stuart-Landau oscillators are complex valued, all measures are applied separately to the real and imaginary part.From our a priori knowledge about Stuart-Landau Oscillators, we know that their asymptotic states will exhibit different kinds of oscillatory behaviour, thus it is a good choice to put the largest weight on the standard deviation.We choose w E = 0.25, w SD = 1, w p = 1 and run N = 15, 000 trials that are integrated from t 0 = 0 to t f = 200.The first 70% of this time span are regarded as the transient and are not used for the evaluation.The first experiment is performed with p r = 0 and r 2 = 0.22 and the distance D is calculated using histograms according to Eq. 7.

Varying the coupling
After running the experiment and calculating the distance matrix D, the associated 4-dist graph exhibits the knee point at around 0.01.We slightly decreased this value to 0.009 and 0.008 in the reported results.Figs.7 (A) and (B) show these results for the approximate relative basin volume.Similar to the results reported in [42] we see a multistable regime, in which TW dynamics are prevalent for K < 1.95 and OD dynamics are for K > 2.2.In between there are various states in which some oscillators show OD-like behaviour and others exhibit a synchronized oscillation.We thus prefer to refer to these kinds of states as partially synchronized (PS) states.Importantly, the PS states are a mixture of many similarly partially synchronized states and not just a single asymptotic state.If we choose a larger DB like in Fig. 7A, the states with full OD and the PS states with only few partially synchronized oscillators and otherwise mostly OD dynamics are merged into one cluster (OD+PS).For smaller DB they are separated into two distinct clusters (Fig. 7A).One particular structured and more common kind of partially synchronized states can be found for 1.9 < K < 2.0.As Fig. 8 shows, these states are highly regular stationary waves, interrupted by oscillators exhibiting OD, we thus refer to these states as regularly clustered stationary wave states (RCSW).Aside from these more regular dynamics, there are all kinds of different mixed states between wave-like dynamics and oscillation death.Many are so dissimilar to each other that they fall into the outlier cluster.The outlier cluster has the most members during the transitions from TW to PS via RCSW at K ≈ 2.0 and at the transition between OD and PS at K ≈ 2.2.A handful of smaller clusters with less than 60 members (or 0.4% of all trials) were neglected.They contain partially synchronized states with more similarities to each other than to those in the outlier cluster.We identified these dynamics by further analyzing the statistics within each cluster.Fig. 8 shows example plots and sliding histogram plots for two of these clusters.The RCSW states mostly oscillate and thus almost all oscillators have a mean of zero and a constant standard deviation different from zero.We see that these histograms change little for different coupling values.The cluster is very homogeneous with almost all members looking like the example shown in Fig. 8 C. The PS cluster, on the other hand, is much more inhomogenous.Its members have in common that most of the oscillators exhibit OD, thus as Fig. 8 confirms, they exhibit nonzero means, with both positive and negative values while having a vanishing standard deviation which corresponds to the typical stable fixpoints of OD dynamics.Fig. 8 D shows one example, the amount of oscillators still exhibiting a synchronized oscillators is different within the cluster, though.Additional results for the other clusters can be found in the appendix.

Varying the coupling and amount of coupled neighbours
Similarly to the additional setup in [42], we can also investigate this system with two varying parameters with MCBB.First, we choose to vary K, the coupling, and r 2 , the relative amount of neighbours the oscillators are coupled to repulsively.Fig. 9 shows similar clusters of similar asymptotic behaviour as in the one-dimensional setup.We see that TW dynamics are present only for small K and large r 2 values, while OD+PS dynamics are present even for small K values when r 2 is small.For very small r 2 there is also a desynchronized (DS) cluster.Most notably the distinctive RCSW type dynamics are only present for r 2 > 0.1 and its basin becomes larger for larger r 2 values.

Rewiring of the network
14 When we start to randomize the coupling by rewiring it according to the scheme of Watts-Strogatz random graphs, we get the results presented in Fig. 10.Here, we added the outlier cluster together with several smaller clusters that all exhibit mixed, partially synchronized, partially OD dynamics to the mixed states (MS) cluster.The range of K for which these kinds of dynamics appear gets wider when the rewiring p r increases.TW dynamics appear less for larger p r values.RSCW type dynamics do not appear when we rewire the network.
IV. DISCUSSION Given a complex system, such as a ODE system, like the Kuramoto and Stuart-Landau networks demonstrated in Sec.III B and III C, or a map like the Dodds-Watts model presented in Sec.III A, MCBB is able to analyze and quantify which classes of asymptotic states are occuring.As demonstrated with the paradigmatic example systems MCBB is a widely applicable approach.It is suitable to analyze the behaviour of every highdimensional system that returns a trajectory, be it agentbased models such as the Dodds-Watts model or Differential Equations like the Kuramoto and Stuart-Landau networks.The known bifurcations of these systems were reproduced by MCBB as shown for example with the Dodds-Watts model.Additionally, it enables us to reveal clusters of qualitatively similar asymptotic states for all these systems as the results investigated in Sec.III C show.It does successfully identify the sizes of the basins of the most important asymptotic states even in transition regimes, what a traditional bifurcation analysis can not reveal.For the Kuramoto system we see how and when the basins of the unsynchronized states shrinks and how the basins of the completely synchronized states emerges.We also get an insight into the transition between these states, as we can see how the size of the states increases before they destabilize.Hence, for the Kuramoto model it provides an intuitive way of visualizing the synchronization process.When applying MCBB to a Stuart-Landau system the different asymptotic behaviours, travelling wave states, oscillator quenching phenomena such as oscillator death and mixed stated, are classified in different clusters and interesting dynamics such as regularly clustered stationary wave states are revealed and their basins quantified.
The analysis can always be fine tuned by changing the clustering parameters to resolve the asymptotic states finer or coarser.Additionally, the weights of the distance calculation provide another mean of adjustment.The flexible nature of the method also allows for experimentation with the statistics used to evaluate the trajectories and the exact clustering algorithm.In particular various entropy-based seem promising to use.While designing the method we already used the per dimension Kullback-Leibler divergence of the time series to the Gaussian measure KLG i = D KL ρ G (E i , Var i ) ρ i as a statistic to track structural changes of investigated systems.This was especially useful for relatively low-dimensional systems.The curve entropy [44] of the complete trajectory was tested as well.Additionally, we also experimented with a distance between histograms of the covariance matrices as a statistic.This expands variance-based size measure to also take cross-correlations between the dimensions into consideration which could be useful for systems that exhibit multiple possible cross-correlations structures in the asymptotic states that otherwise behave similar, e.g.different kinds of collective oscillations.For the example systems presented here, it was however sufficient to only use the position and size of the attractors as measures.Additional measures were not necessary to resolve the different classes of asymptotic states.This should not stop experimentation with additional measures though, as some of them are already implemented in the accompanying software as well with further additional ones easy to add.
Aside from the approximate basin volume and the sliding histograms shown in this paper, it is also possible to further investigate the clusters found by the clustering algorithm, e.g. by analyzing which kind of initial conditions lead to certain class of asymptotic states or by analyzing how each dimension is changing with the control parameters separately and not in histogram form.These options are already implemented in the Julia package (see  Appendix A) and more could be envisioned in the future.It is further possible to extend the method to systems with unknown background parameters that adhere to certain distribution and additional control parameters or forcings, such as some climate models which will be fur- ther discussed in future work.

A B
While this work focused on introducing the method and testing it with fairly theoretical models, we believe that this opens the door to studying a wide variety of systems in novel ways.We expect that the method be fruitful in diverse contexts where a mix of multistability and high dimensional behaviour are important.Most notable among those would be biological networks and climate systems.
A distinct limit of the approach is that it is only able to detect and track stable solutions of the investigated systems.Unstable solutions are not accessible with MCBB.A further important avenue of investigation is to study the mathematical properties of the algorithm described here in much more detail.In particular it would be highly desirable to understand the convergence properties of the algorithm.We also suspect that there is considerable scope for improving the clustering by making use of information from the continuation, rather than reverting to a standard density based algorithm.One other avenue of investigation where we will improve the method further is to use the statistics of the tail sample we record in order to track when the integration has reached the asymptotic regime in a suitable sense.
MCBB provides an excellent way to visualize the complex behaviour of systems where a traditional bifurcation analysis is often not useful or difficult to implement.It resolves the most important classes of asymptotic states and enables the user to track the size of its basins along changing parameters.

13 :
Obtained: N (N d × Ns)-matrices Si 14: Compute (N × N )-sized distance matrix D of all Si to each other.15: Density-based clustering (e.g.DBSCAN) of D 16: Analyze cluster memberships and statistics S for each cluster dependent on the parameter values p III. RESULTS

FIG. 2 .
FIG. 2. Approximate Relative Basin Volume of the two different classes of asymptotic states, fully infected (blue) and fully healthy (red), for configuration (A) of the Dodds-Watts model.The colored areas in the plot represents the basin volume of the respective state.Computed by using a sliding parameter window over the clustering results (see Sec. II D), a window length of 0.05 and an offset of 0.01 were used.

FIG. 4 .
FIG. 4. (A) Network structure of the investigated Kuramoto system.Red nodes have a negative drive (minus sign in Eq. 11) and blue nodes a positive one.(B) The basin bifurcation diagram for the system with the synchronized, partially synchronized and synchronized regimes.

FIG. 5 .
FIG. 5. Analysis of the Clusters exhibiting no synchronization (A) and partial synchronization (B).(A) and (B) are sliding histogram plots, similar to Fig. 3 and show the means of the frequencies over all nodes as histograms depending on the coupling parameter.(C) and (D) show the mean frequency of each individual oscillators over all samples in the cluster for cluster 2 (left) and cluster 3 (right).

FIG. 7 .
FIG. 7. Cluster diagram of the Stuart-Landau Oscillator network with pr = 0 for two different values of the clustering parameter DB .For (A) DB = 0.009 and for (C)&(D) DB = 0.008.MCBB resolves the different classes of asymptotic states: travelling wave (TW), regular clustered stationary waves (RCSW), (full) oscillation death (OD) and mixed partial synchronized / oscillation death (PS) states.When increasing DB states in which most (but not all) oscillators exhibit OD, while the remaining few oscillators are synchronized (PS) and the states in which all oscillators exhibit OD (OD) are merged to one cluster (OD+PS).The window size used is 0.025 and the offset is 0.01.

FIG. 8 .
FIG. 8. Introspective analysis of the two of the clusters also shown in Fig. 7. Plots (A),(B),(E),(F) are sliding histogram plots.For each sliding window of coupling values K, the respective measures of trajectories within the said cluster are plotted as a histogram in y-Direction.(A-C) inspect the RCSW cluster.(A),(B) show the mean and the standard deviation of the RSCW cluster.(C) and (D) are example trajectories from the respective clusters.(E), (F) show the mean and standard deviation of the PS cluster.

FIG. 9 .
FIG. 9. Results from the setup with two parameters, varying the amount of coupled neighbours r2 and the coupling strength K.

FIG. 10 .
FIG. 10. Results from the setup with two parameters, varying rewiring pr of the Watts-Strogatz random graph that mediates the repulsive coupling and the coupling strength K.

FIG. 12 .
FIG. 12.Further analysis on the clusters also shown in Fig. 7. (A),(B),(E),(F) are sliding window histograms fits of the denoted measures for trials with parameters within the respective window.(C) and (D) are example trajectories of trials within these clusters.