Dynamics of sleep: Exploring critical transitions and early warning signals

Background and objectives: In standard practice, sleep is classified into distinct stages by human observers according to specific rules as for instance specified in the AASM manual. We here show proof of principle for a conceptualization of sleep stages as attractor states in a nonlinear dynamical system in order to develop new empirical criteria for sleep stages. Methods: EEG (single channel) of two healthy sleeping participants was used to demonstrate this conceptualization. Firstly, distinct EEG epochs were selected, both detected by a MLR classifier and through manual scoring. Secondly, change point analysis was used to identify abrupt changes in the EEG signal. Thirdly, these detected change points were evaluated on whether they were preceded by early warning


Introduction
Traditionally, sleep is classified into distinct stages, spanning from wakefulness to deep sleep [e.g. 1 ]. Sleep stages are typically characterised by patterns in brain activity as reflected in Electroencephalography (EEG). According to the American Academy of Sleep Medicine (AASM), wake and sleep are divided into five stages: wakefulness (W); non-rapid eye movement (NREM) sleep of in-creasing depth sub-divided into N1, N2 and N3; and rapid eye movement (REM).
The qualitative type of scoring of sleep stages has been criticised because of the low interrater reliability [2] , the fixed time duration of scoring epochs [ 3 , 4 ], and the lack of automated scoring algorithms [ 3 , 5 ] as well as the absence of defining characteristics of the stages [ 2 , 4 , 6 , 7 ]. Previous studies have investigated the possibilities of automated scoring. Some automated scoring algorithms were trained on the manually scored EEG stages such as support vector machines [8] ; decision trees [ 9 , 10 ]; neural networks [11][12][13] and weighted complex networks [14] . Alternatively, other studies have tried to use algorithms that can operate The two-process model of Borbély [18] : The circadian clock (Process C) and sleep homoeostasis (Process S). Middle: 3D cusp model shown with the accompanying numbered catastrophe flags [36] : (1) sudden jump, (2) multi/ bi-modality and (3) inaccessibility between the modes. Top right: Different hystereses are displayed. The brown dashed lines represent the normal transitions between wake and sleep: when the sleep pressure is positive and exceeds a certain threshold, the system transitions into sleep, causing the sleep pressure to drop. Once the sleep pressure becomes negative and exceeds a certain threshold, the system transitions into the wake state. Note that the thresholds for falling asleep and waking up are at different points, indicating hysteresis. The yellow dashed lines represent more atypical transitions with excessive sleepiness when the threshold to fall asleep is low and difficulty maintaining sleep when the threshold to wake up is low. The purple dashed lines show no hysteresis, since the threshold is the same for both directions, and indicates constant shifting between the sleep wake states. (Flag 5) Increasing the splitting factor causes a progressively larger polarisation of the states, known as divergence. (Flag 6) Anomalous variance is an increase in behavioural variance in the neighbourhood of the bifurcation set (i.e. area where sudden jumps are possible). (Flag 7) If a person is perturbed (e.g. through noise) the sleep pattern would show large oscillations (divergence of linear response) that take a long time to fade away (i.e. critical slowing down; Flag 8).
independent of the manually scored EEG stages, such as EEG change-point segmentation in combination with cluster analysis [15] and Hidden Markov Models [HMM ; 16 ]. HMM have also been applied for data-driven determination of fMRI resting state-defined sleep stages [17] .
Although automated scoring algorithms might combat the problem of manual scoring and interrater reliability, both the manually and automated detection approaches seem to lack a formal definition of sleep stages. In this paper, we address whether it would be possible to overcome this problem by conceptualising sleep as a nonlinear dynamical system [18][19][20][21][22][23][24][25][26][27] . An important advantage of this conceptualisation is that it would provide an empirically useful definition of sleep stages in terms of attractor states of a nonlinear dynamical system. Attractors of a dynamical system are stable equilibrium states towards which the system tends to converge. Systems that have multiple stable equilibrium points rather than one, can switch between the equilibria by perturbations that tip the system over to a different 'neighbourhood' of attraction. Applied to sleep, shifts between sleep stages can be dynamically conceptualised as so-called critical transitions, in which the system is triggered by a small force and suddenly shifts towards an alternative attractor state once a threshold (i.e., tipping point) is exceeded [28] . Such critical transitions between stable states have been described in many complex dynamical systems such as ecosystems [ 28 , 29 ], financial markets [30] and depression [31] .
In natural systems, especially those of which we have a limited mechanical understanding, it is difficult to recognise critical transitions. Moreover, without detailed knowledge of the laws that govern the system, critical transitions in dynamical systems are rather unpredictable. In the cusp catastrophe theory, a special type of nonlinear dynamical system, critical transitions are modelled as sudden shifts in a system due to gradual changes of two control factors α and β [ 32 , 33 ]. This cusp model can be used to model for example the transitions of falling asleep (W-S) and waking up (S-W; see Fig. 1 ) [ 25 , 34 , 35 ]. The two-process model of Borbély [18] can also be incorporated into the cusp model as control factors (figure top/bottom left) as follows: The circadian clock (Process C) is the internal day-night rhythm that rises, promoting wakefulness/arousal, and falls following a 24-hour cycle. The homeostatic sleep propensity (Process S) increases with the duration of time spent awake and decreases during sleep. When sleep pressure (Process S) is very high and the alerting effect of the circadian clock (Process C) is low, the urge to sleep may become irresistible. At this threshold of maximum urge, the tipping point is reached and a critical transition follows. Properties that indicate the possibility of critical transitions are so-called catastrophe flags [36] . Characterising formal sleep transitions according to these criteria requires that at least two distinct states are shown, such as 'sleep' and 'wake' (multimodality; flag 2), where an intermediate state between sleep stages is rare (inaccessibility; flag 3) and where an abrupt switch between stages occurs when a certain threshold is reached (sudden jump; flag 1). Another remarkable property is that the point at which the system moves to another state depends on the direction of change. An example of this so-called hysteresis effect (flag 4) is the transition in state of water by melting at 0 °C and freezing at −4 °C (in disturbance free conditions).
Before a tipping point is reached, the system may show only little change, making critical transitions difficult to anticipate.
Interestingly, there appear to be generic properties that occur in a wide class of systems before a tipping point is reached such as critical slowing down (flag 8). Critical slowing down is the property that as a system moves towards a tipping point, it takes longer for the system to recover from small pertubations (i.e., the rate at which the system recovers to equilibrium after small perturbations slows down to zero) [28] . Statistically, critical slowing down can be detected by two phenomena when a system approaches a tipping point: (1) increase in the variance (i.e. standard deviation) of the system, and (2) elevated temporal autocorrelation (i.e. degree of similarity of a signal with its own past) [ 29 , 37 , 38 ]. Together, these indicators form early warning signals of a critical transition that have been shown to successfully predict critical transitions in complex systems [28] . For example, in EEG data of epileptic patients, increased variance and elevated temporal autocorrelation have been shown to predict a seizure [39] . Likewise, imminent extinction in animal populations was signalled by indicators of critical slowing down. Although these early warning signals have been observed to a wide variety of systems, it has yet to be investigated in the complex, multilayered system of the sleep cycle.
Conceptualising sleep stages as attractor states of a complex dynamical system might offer a new theoretical framework to investigate stages in sleep. Specifically, focusing on identifying transitions rather than stages might offer some advantages over the current sleep scoring analyses.
In the present study, we explored whether the theoretical framework of complex dynamical systems might provide a promising starting point to model the dynamics of sleep. To this end, we first used automated and manual detection techniques to define distinct sleep stages as a reference, where both the visual and automated sleep staging coincided. Subsequently, we used change point analysis [ 40 , 41 ], a powerful tool to determine whether a sudden change has taken place, to find discontinuities in this sleep EEG data signal. We detected several change points in the EEG and some of these change points were preceded by early warning signals enabling prediction of sudden sleep state changes.

Participants
We illustrated the approach in the sleep EEG of two healthy volunteers (females with age 45 and 69) who had participated in a published study [42] . Participants were recruited through the Sleep Registry [43] , and were screened by telephone first, followed by face-to-face interviews. The volunteers reported no sleep difficulties in online questionnaires, which was confirmed during the faceto-face interview.

Protocol
The participants completed two consecutive nights of laboratory polysomnography (PSG). On recording days, people were asked to refrain from alcohol and drugs, and limit themselves to a maximum of two cups of caffeinated beverages, which were allowed only before noon. PSG was recorded with a 256-channel LTM HydroCel GSN EEG net and a Polygraphic Input Box (Electrical Geodesic Inc., Eugene, OR), connected to a Net Amps 300 amplifier (input impedance: 200 MOhm, A/D converter: 24 bits, sample frequency 10 0 0 Hz). The lights-out time was adapted to the individual habitual bedtime. The sleep EEG recordings of the two participants were scored by an expert according to the AASM criteria.
The inter-rater agreement rate for this data was κ = 0.88. Every 30 s epoch was classified as either W, N1, N2, N3 or REM. When more than one sleep stage occurred within an epoch, the whole epoch was scored according to the stage seen in the largest portion of the epoch. For subsequent analyses, only the second night of sleep recordings was used, since the first night serves as a screening and adaptation night. Instead of using 3three EEG channels, 2 EOG channels and EMG according to the standard rules of AASM, we restricted the analysis to one EEG signal, namely the frontal Fz lead referenced to the vertex channel (Cz) in order to reduce the amount of data [ 8 , 44-46 ].

Sleep EEG preprocessing
In order to reduce the size of EEG recordings we used EEGLAB [47] to downsample the signal from 10 0 0 Hz to 100 Hz after band pass filtering with a range of 0.1-40 Hz. The data were detrended with a Gaussian kernel smoothing function to cope with nonstationarity and long trends in the data [48] . From the detrended, filtered and downsampled data, the spectral bands 1-3 Hz (delta), 4-7 Hz (theta), 8-13 Hz (alpha), 16-31 Hz (beta) were extracted separately using continuous wavelet transformation. Since changes in the EEG activity are usually reflected in multiple spectral bands, we computed three relative power ratios in the different spectral bands to be used to detect EEG transitions: (1) β/ δ ratio, (2) α/ δ ratio and (3) ( α + β) /( θ + δ) ratio. These three ratios have been shown to be features with low error classification rates for a single EEG channel [45] .

Data selection and analysis
The analysis consisted of three parts: (1) selection of epochs (2) formal detection of change points in the EEG signal and (3) evaluation whether detected change points were preceded by early warning signals.
Before investigating whether there are sudden changes in the EEG signal we first selected pairs of epochs where both the manual scoring and an automated classifier agreed upon which stages the participant were in (step 1, Fig. 2 ). We used this selection of paired epochs beforehand to minimise computer calculations and because we assumed that epochs would show the most definitive features of two different sleep stages in the EEG signal, thereby maximizing the potential to detect critical transitions between sleep stages. A multinomial logistic regression (MLR) was used to classify sleep into stages automatically using the scored epochs as a reference and the three relative power ratios, averaged over 30 s, as predictors. We performed this analysis using the "glmnet" package [49] in R [50] . For developing the MLR classifier we used K-fold cross-validation on the two participants and partitioned the data into 3 equally sized folds (segments). One fold is for validation and the other k-1 folds are used to train the model (2/3 of the data), which is repeated k times. Using this trained model, new predictions were made on the test data (1/3 of the data) and compared with the manually scored data to investigate the performance of the classifier. The epoch before and after every stage transition that was identified both by the expert as well as detected by the classification was selected for analysis. Note that we used all these selected transitions to form one EEG signal for further analysis. As a result, this signal contains the data from both participants and could result in a certain sleep stage occurring twice in a row (e.g., the occurrence of N2-N1 followed by N1-Wake would cause two times N1 in a row).
First, to formally detect EEG change points within these selected epochs we used an independent multiple change-point analysis using the package "ecp" in R [CPA ; 41 ]. In this analysis, change points are computed using a divisive hierarchical estimation algorithm that sequantially detects distributional changes within multivariate time-ordered observations, in our case the EEG signal (step 2, Fig. 2 ). This method enables us to simultaneously identify the number and locations of change points. The CPA was restricted to  detect the most significant changes ( p < .005) in the multivariate signal of the four wavelet coefficients of δ, θ , α, β spectral bands.
This multivariate signal was combined with the relative power ratio that was the most important feature for classification according to the MLR model. This feature was chosen by ranking the absolute coefficients of each feature in the classification model using the "caret" R package [51] .
Second, we explored whether the identified change points are preceded by early warning signals. We investigated increased variance and temporal autocorrelation (lag = 1) as two indirect indicators of critical slowing down and early-warning signals [28] . The variance and autocorrelation are estimated with the "earlywarnings" R package [52] . These signals are computed 25 s before every detected change point (step 3, Fig. 2 ). An indication of the direction of these signals was estimated by the mean Kendall tau, a measure of rank correlation. Increases in autocorrelation and variance could indicate a critical transition.

Multinomial logistic regression (MLR) based classification
Combining one night of PSG for the two participants resulted in a total of 1881 epochs that were used for the classification study. Fig. 3 shows, for one participant, a hypnogram of the manually scored sleep stages throughout the night along with the three relative EEG power ratio features. Fig. 4. Multivariate change point analysis using the four spectral band wavelet coefficients (alpha, beta, theta, delta; upper four plots) along with the relative power ratio between these four bands (lower plot). The red lines show the significant change point locations that were found for the whole multivariate signal, but for visualisation purposes are plotted separately for the bands.

Table 1
The average performance of the multinomial logistic regression method in sleep stage classification on two healthy participants, as well as the average percentage of total sleep time (TST) the sleep stages were visually scored by experts along with the inter-rater agreement rate. The correlation between the power features and the hypnogram is for β/ δ ratio r = 0.72, p < .001; for α/ δ ratio, r = 0.62, p < .001, and for ( α + β)/( θ + δ) ratio r = 0.70, p < .001 which confirms that these features are suited to be used in the classification algorithm. According to he MLR model the ( α + β)/( θ + δ) ratio is the most important feature with an absolute coefficient of β = 4.42, followed by β/ δ ratio ( β = 2.13) and α/ δ ratio ( β = 0.86). Table 1 presents the performance of the classifier, with precision, recall and the F-1 score for each of the five sleep stages (Wake, N1, N2, N3, REM). In addition, Table 1 shows the frequency for which the stage was scored manually along with the inter-rater agreement rate between the scorers. Precision is the fraction of correct predictions for a certain sleep stage, e.g. the percentage of epochs scored as N1 that wer also predicted as N1 by the classifier. Recall is the fraction of instances of a stage that were predicted correctly, e.g. the percentage of epochs classified as N1 that were also scored as N1. F-1 is the harmonic mean of precision and recall. Following Table 1 , the low precision of N1 indicates that in only 19.4% of the cases the model predicted N1 correct, indicating many false positives. The low recall in N1 shows many false negatives since in only 42.4% of the cases where sleep was scored as N1, it was predicted as N1. The low precision and recall result in a low F-1 score for N1. This is likely the result of low occurrence and inter-rater agreement rate of N1 in the visual scoring where the classifier is trained on ( Table 1 ). The evaluation metrics per class show that the classifier is relatively good in predicting N2, N3 and REM but not as good in predicting wake and N1. The overall classification accuracy was 0.71, indicating that overall, the prediction by the model was in accordance with the visual scorer for 71% of the epochs. The manual scorer and MLR agreed in 1339 epochs (of the 1881 epochs in total), of which 26 epochs contained a transition between sleep stages.

Multiple change point analysis (CPA)
The CPA identified 11 change points, indicated by the red lines in Fig. 4 in a total of 26 scored epochs (see the coloured blocks in Fig. 4 ). Almost all change points (10 out of 11) were located in the N2 epochs: Five transitions were in the transition away from N2, and five transitions towards N2. The other identified change point was between N3 and N1.

Early warning signals
To explore whether the identified change points are critical transitions, we investigated for each change point whether early warning signals preceded the tipping point. Fig. 5 shows the trends of the standard deviation (SD) and autocorrelation (AR1) 25 s prior to the detected change points of every transition ordered by the transition type (e.g., N1 to N2). Lighter coloured lines represent transitions that occurred the second half of the night. The Kendall tau was calculated to investigate whether the trends were significant, where a significant trend in either of the signals could indicate critical slowing down. For example, in the panel N2-N1 (last column, second and fourth row) the lines for both SD and AR1 visualise the three transitions from N2 to N1. As can be seen from the figure, in two out of three transitions there was a significant trend in the standard deviation, and in two out of three transitions there was a significant trend in the autocorrelation. Overall, the results show no visible universal trend, because the direction of the trends differs between transitions. For example, there are indications of decrease in SD (N3 to N1), increase in SD (N2 to REM and N1 to N2), decrease in autocorrelation (N2 to N1) and increase in autocorrelation (REM to N2). The trends also differ within the same transition type. For example, in the transition N2 to N1 there are signs of both increase and decrease in AR1. The most strong indications of critical slowing down can be found from REM to N2, displaying an increase in AR1 in both transitions.

Discussion
In this study we explored the conceptualisation of sleep as a nonlinear dynamical system, in order to provide an empirically useful definition of sleep stages and transitions. Using the empirical criteria of a nonlinear dynamical system, we identified multiple change points in the EEG signal, representing distinct changes in EEG of healthy sleepers throughout the night. Moreover, we found that some of these change points are preceded by early warning signals, which indicates that there are critical transitions in sleep. These results suggest that a focus on transitions in EEG might reveal a new characterisation of the dynamics in sleep that could remain undetected in classic sleep analysis.
Interestingly, classic sleep analyses seem to differentiate between certain sleep stages that cannot be supported by the approach of change point analysis. Our findings show that most "scored transitions", between fixed 30 s epochs, could not be detected statistically with a distribution change, as only 11 out of 26 transitions were corroborated, mostly to or from N2. Kaplan et al. [15] showed a similar approach using EEG change point segmentation and found significant correlations between N2 and delta and theta bands. These correlations reflect the appearance of Kcomplexes (brief negative high-voltage peaks) and sleep spindles (burst of brain activity) that characterise N2 sleep. Possibly, Kcomplexes and sleep spindles result in such distinct beta, theta and delta bands, that make changes into or from N2 stage sleep more easily detectable as critical transitions.
The current study is, to our knowledge, the first to investigate early warning signals in sleep. Early warning signals are, however, a known phenomenon in other complex systems, ranging from climate change [29] to relapse to depression [31] . Our findings of elevated autocorrelation before abrupt shifts in EEG show the same generic properties as in these other complex systems, where elevated autocorrelation is defined as a robust indicator of critical slowing down [53] . Another less robust indicator is increased variance, which we find in some of the transitions (i.e., N2 to REM and N1 to N2) but also decreased variance (i.e., N2 to N1 as well as from N3 to N1) [53] . Probably, sleep research may profit from technical developments in other fields concerning the detection of these generic properties of transitions in other complex systems.
A few limitations deserve mention. Firstly, this is an exploratory study on only two cases. The suggested approach needs to be applied to data of a larger and more diverse (healthy and sleepdisordered, young and elderly) group of people. Secondly, in identifying possible change points, we focused on the sleep stage transitions that were agreed upon by the manual scoring and by the automated classifier. While this approach may have resulted in selecting the most characteristic transitions and reduced the computational load, we probably failed to detect some change points, especially for stages where the agreement was low. For example, we found no critical transitions between N1 and wake, possibly because of the rare occurrence and low inter-rater agreement rate in visual scoring for N1. This low agreement rate has been reported in other studies, and is probably due to the short duration of the N1 stage [11] . A next step would be to locate the changes in the EEG without this initial selection, since it potentially limited the number of transitions found in this study.
In general, more work is needed to investigate how robust the early warning signals in complex dynamical systems are, in particular with regard to the occurrence of false positives or false negatives [28] . False positives occur when an early warning signal is not the result of approaching a critical transition, but rather occurs due to chance or a confounding trend. On the contrary, false negatives indicate situations when a critical transition did occur but no early warning signal was detected. Specifically for this study we detrended the signal as a strategy to circumvent false positives [54] , and more research is needed to ensure early warning signals can be measured reliable.
Overall, this new approach of detecting sleep state transitions shows promising implications for future research in sleep. Our analysis of change points showed that there are few, but clear discontinuities in EEG that shape the architecture of sleep. The signals of different EEG channels during sleep are highly correlated and therefore only one EEG channel (frontal Fz channel) was examined in this study, similar to previous studies [ 8 , 44-46 ]. Next to EEG channels different sources might be investigated and combined to get a more detailed understanding of particular changes. For example, distinguishing between NREM and REM sleep benefits from EMG and EOG channels [55][56][57] . Also, cardiac activity measured from ECG or pulse [ 55 , 58 ], has been found of value to predict short periods of wakefulness; and finally, respiratory bands can be used to measure breathing patterns during sleep [ 20 , 56 ]. An advantage of these signals is that they could allow large scale population assessment by measuring them wirelessly without a costly EEG. In addition, a new line of research is the implementation of homebased sleep monitoring systems using Internet of things (IoT) platforms [ 56 , 59 , 60 ], which offers the opportunity to process and analyse information real-time from different sensors. For future research it would be interesting to investigate our proposed framework with early warning signals in such big data monitoring system, circumventing the more traditional and costly polysomnography [61][62][63] . Secondly, these systems make it feasible to apply it to sleep disorders such as bruxism, OSAS and insomnia.
The conceptualisation of sleep as a complex dynamical system and the focus on transitions could provide new avenues to investigate healthy and disrupted sleep. Importantly, this conceptualisation lays out objective definitions of sleep quality and disrupted sleep. Sleep quality relates to the stability and accessibility of an equilibrium state. From this perspective, the fragmented sleep that is characteristic of insomnia might be modelled as constantly shifting between instable sleep states. The concept of hysteresis (a delay in the transition of falling asleep and waking up) may help to understand insomnia-related disorders. Exploring these possibilites by investigating the occurrence and characteristics of abrupt changes could lead to a better understanding of individual and pathological differences in sleep quality.

Declaration of Competing Interest
The authors declare that they have no conflict of interest.