Decreased coupling among respiratory variables with effort accumulation

We applied symbolic transfer entropy (STE) for the detection of directed couplings between pulmonary variables registered during repeated progressive and maximal cardiopulmonary exercise tests (CPET). We verified the hypothesis whether effort accumulation has an impact on the decrease of the level of coupling between ventilation (VEbtps), fraction of expired oxygen (FeO2) and carbon dioxide (FeCO2). A group of 10 volunteers performed two consecutive CPET (T1 and T2) on a cycle ergometer. STE values obtained for T1 are higher than for T2, which indicates that the interaction of these variables is sensitive to effort accumulation. The difference of the STE between signals corresponds to the dominating direction of the coupling and indicates that FeO2 and FeCO2 drives the VEbtps.

Introduction. -Cardiopulmonary exercise test (CPET) is commonly used as a noninvasive tool, which allows the measurement of the physiological responses during physical effort [1,2]. Usually, time series recorded in CPET are used to estimate common physiological indicators such as: maximal oxygen uptake (VO 2 max), peak oxygen uptake (VO 2 peak) and ventilatory thresholds [3,4]. They are determined from respiratory gas exchange recordings [5][6][7]. However, these indicators offer limited information regarding the complex nature of the dynamic interactions between respiratory and cardiovascular systems, which can be especially crucial in exercise physiology [8][9][10] and specific adaptations to training [11,12]. Using quantitative methods for determining the coupling or coordination between recorded variables during CPET, it is possible to identify the structure of interactions in the human body in response to exercise-induced fatigue. Multivariate studies allow to obtain information about the direction of coupling, strength and time delays between variables [13].
Identification of interactions between components can be supported, among others, by Granger causality [14][15][16], entropy-based approaches [17,18], nonlinear prediction tools [19], symbolic dynamics [20,21], the time delay 28001-p1 stability measure [22,23], coordination [24,25] or transient interactions [26,27]. The Granger causality approach is based on the estimation of AR-models for the original time series. The causality assessment focuses on the analysis of errors obtained in the AR-model, when the second variable is introduced to the model [28]. If the time series can be better predicted, when the second variable is added to the model, then interactions between datasets are detected. Information theory provides the core of the entropy-based measures [29]. This methodology makes it possible to assess the amount of information needed to properly characterize data or information shared between variables. To quantify the information flow between bivariate system symmetric mutual information and asymmetric transfer entropy (TE) are commonly used. Especially the TE measure allows to estimate the directionality of the couplings between components in dynamical systems [30]. Another class of methods are especially used in causality of multivariate recordings: mutual information from mixed embedding (MIME) [31] and partial MIME [32]. Symbolic approaches require the transformation of the original data into symbolic form using a properly selected symbol language.
Several methods dedicated for coupling and causality detection offer various modifications depending on the data type and the potential application of the results. We refer to [13] for more detailed information and literature review.
In response to the reported growing interest of multivariate assessment of respiratory and cardiorespiratory changes as a consequence of effort accumulation and training interventions [11,14], we applied the symbolic transfer entropy (STE) [33] for the estimation of directed couplings in CPET experiments. The main goal of the analysis is to verify the occurrence of changes of couplings between signals recorded during two repeated maximal CPET. Since repeated bouts of maximal exercise increase the level of fatigue, but do not seem to affect significantly the reliability of maximal physiological variables (e.g., VO 2 max; [34]), consecutive CPET seem an appropriate procedure to test if effort accumulation impairs the respiratory couplings. We hypothesize that effort accumulation should have influence on the coupling detection and should cause the decrease of its magnitude in the second trial.
First, we described the experimental protocol and data used in the coupling estimation. In the next part of the study, we presented a detailed procedure, which supports the application of the STE method for the signals recorded during CPET. We proposed the directionality index and statistical comparison between trails in the section "Results". Finally, the discussion and summary is provided.

Data and protocol. -
Participants. Ten healthy and young adults (22.1 ± 2.9 y) participated in the study. There were four males and six females with mean body mass index 22.8±1.7 kg · m −2 . Subjects had no sport specialization, however they were involved in aerobic activities at least three times per week. The study was approved by the Clinical Research Ethics Committee of the Sports Administration of Catalonia and it was carried out according to the Helsinki Declaration.
Protocol. Participants performed two consecutive progressive and maximal CPET (test 1, T1, and test 2, T2) on a cycle ergometer. Both tests were separated by an uncompleted rest of 6 min duration. The session started at least 3 h after a light meal and participants were requested to avoid any vigorous physical activity 72 h before. Due to the short resting period between tests, T2 started with altered initial physiological conditions. In order to quantify such conditions, the VO 2 change parameter, defined as follows, was used: (1) where VO 2 mean T2/T1 is an average oxygen consumption from the initial baseline phase (20 samples averaged) in T2 or T1 test.
At the beginning of the T2 phase, VO 2 mean values are higher than in the T1 phase in the range of 6.12% to 42.53% (mean: 21.93%; standard deviation (SD) = 10.68%). A greater percentage difference was associated with a greater accumulated effort.
During the resting period participants kept sitting on the cycle ergometer. The CPET started at 0 W and the workload was increased by 25 W/min in males and 20 W/min in females. The test stopped when participants were not able to maintain the prescribed cycling frequency of 70 rpm for more than 5 consecutive seconds.
Participants breathed through a valve (Hans Rudolph, 2700, Kansas City, MO, USA) and respiratory gas exchange (oxygen and CO 2 content and air flow rate) was measured by an automated open-circuit system (Metasys, Brainware, La Valette, France). These signals were recorded breath by breath and the system was calibrated before each test.
Individual performances of participants resulted in different length of registered data. The average number of samples in the group from two trials is equal 423.95 (SD: 83.38). The average signal length measured in seconds is equal to 1119.7 s (SD: 119.08 s) Data preprocessing. -The signals recorded during CPET are nonstationary, which is shown by the occurrence of trends that can affect the final results of entropy estimators [35]. Considering the specificity of the data analyzed, we proposed the linear function in a moving window to remove trends. The method was adopted from detrended fluctuation analysis approach [36,37].
The algorithm is based on the following steps: i) selection of the sliding window width, ii) adjustment of the linear function in the analyzed window, iii) subtraction of the fitted value from the original one in the analyzed window, iv) sliding of the window by one sample forward. In the study, the trends removal has been performed for ventilation (VEbtps, body temperature pressure saturated), fraction of expired oxygen (FeO 2 ) and fraction of expired carbon dioxide (FeCO 2 ) signals. As a result, after trends reduction, the data oscillates around zero ( fig. 1). Analysis of the study group showed that the average number of breaths per minute (fR) was 23. At the beginning of the recording, when the load was low, the number of breaths per minute fluctuated in the range of 14-17 fR. Therefore, we decided to use a window width, which equals to 15 samples.
Symbolic transfer entropy. -Symbolic transfer entropy has been proposed as a method of detection of the dominant direction of information flow between time series in coupled systems [33]. It is an estimator of the TE measure [18], based on permutation entropy and refers to the symbolization of the data. STE indicates the interactions between two systems X and Y . The coupling is identified from datasets x = {x 1 , x 2 , x 3 , . . . , x N } and y = {y 1 , y 2 , y 3 , . . . , y N }, which are simultaneously recorded. The time series contain the same number N of data points. The values of original series are transformed to a new one by an appropriate symbolization.
Symbolization of time series. Symbolization involves the transformation of original time series into discrete symbol sequences. Obtained symbolized data retains much of the important temporal information [21]. The most common approach used in the symbolization process refers to the variability of the observations x, y. The segments are obtained from data range and then each segment is associated with a specified symbol. Two factors play a key role in this symbolization method: i) the number of symbols k; ii) the algorithm of determination of the segmentation points P .
A set of k symbols can be expressed as S = {0, 1, 2, . . . , k − 1}. Each symbol is mapped on the data points using segmentation points P = {P 0 , P 1 , P 2 , . . . , P k−1 }, which are thresholds in the rules: where i ∈ N ; [P 0 , P 1 ), [P 1 , P 2 ), . . . are ranges of x values, P 0 is a minimal and P k is a maximal value in the original signal. The same symbolization procedure is applied to the y time series. In this study, the selection of segmentation points P is based on the probability density function [38]. For this purpose, the empirical cumulative distribution function (ECDF) is determined separately for original signals. If 5 symbols are used, there are 6 segmentation points on the ECDF curve ( fig. 2(a)). For k = 5, the signal is transformed into set of symbol S = {0, 1, 2, 3, 4} ( fig. 2(b)). Note that usually the segments do not have the same width.
In the symbolization algorithm, the selection of the number of symbols is crucial. If the number of symbols k is too small, the original structure of information in the time series may be lost. A too large set of symbols k is associated with increased noise sensitivity [38]. In the case of the same number of symbols and measured values of the observation, the transformed signals and the originals are equal in the sense of the information that they contain [21].

Optimization of symbolization parameters selection.
In order to minimize the risk of incorrect symbolization parameters, the optimization algorithm was implemented. The optimization relies on the estimation of the specified entropy determined from the original time series and signal after symbolization.
The optimization is constructed from the observation of the entropy changes: if not enough symbols are proposed, the entropy will increase significantly after adding a new symbol (and a segmentation point) [39]. In such situation, it can be stated that the set of k symbols and associated segmentation points provides a faithful symbolic representation (alphabet) of the original signal and is optimal. Similar approaches have been proposed in several studies [38][39][40].
In our study, the selection of the optimal symbol set S was provided by determination of Shannon entropy values H as a function of k symbols [38]. The optimal k opt is found, when the entropy from the symbolized signal will be at least half of the entropy from the original observations H orig ( fig. 3) [38]. The Shannon entropy H is defined as [29] where p(x i ) is the probability of x = x i . Typically, the entropy of the transformed signal is increasing with the number of k symbols ( fig. 3). Time series decoded using the individually determined parameter m (word width) are characterized by higher Shannon entropy values in comparison with the original signal.

28001-p3
Based on the Shannon entropy condition, the optimal number of symbols for the analyzed VEbtps, FeO 2 , and FeCO 2 signals was k = 10.

Symbol sequence: embedding dimension.
The next step of the symbolization process is the determination of the m-length symbol sequences (words). This sequence reflects temporal patterns given in original data. The defined m-length word is usually moved along the symbolized time series by a specific step (here is equal to 1), causing a construction of a new sequence (word). The choice of the appropriate m also refers to embedding dimension. We suggested the Cao method [41] for m estimation and applied the R package nonlinearTseries v0.2.7 (function: estimateEmbeddingDim.r) [42] for its determination. The dimension m was found for each participant and signal separately. The values m vary from m = 7 to m = 10. According to the results from [38], the parameter m should be the same for both signals in STE estimation. Hence, a smaller dimension obtained for the pair of detrended signals is selected for the analysis.
Determination of directed information flow. The interrelations between the components of the system can be obtained from estimation of the exchange information (information flow) within its elements. TE is a nonlinear method, which was proposed for quantification of the couplings in multivariate systems. TE is constructed to effectively detect the asymmetries of information flow between driving components and targets [33]. In the paper, we focused on the bivariate transfer entropy (BTE).
The BTE is constructed from Shannon entropy (eq. (3)), which is adapted to bivariate recordings. The coupling estimation uses Shannon entropy but also the Kullback-Leibler distance, which describes the divergence between distributions p and q: The definition given by eq. (4) has similarity to eq. (3), however is represented in conditional form. Equation (4) can be rewritten as follows: It can be seen that the D KL marker shows the similarity between distributions p and q, when it is close to zero. The BTE uses such comparison between two distributions. The first distribution is constructed basing on the assumption that the current event x t in the process X depends on its past and additionally on the past events of the driving process Y : p(x t |x t−1 , y t−1 ). The second distribution is obtained (and constructed) from the opposite assumption: independence between X and Y processes: q(x t |x t−1 ). It means, that x events are defined only by their past and do NOT have a driver in the Y process [43]. Note that the presented comparison should be also interpreted as the deviation from the Markov property: p(x i |x i−1 , y i−1 ) = p(x i |x i−1 ), in which the process is without memory [43]. As a result, the D KL marker (eq. (4)) can be used for the TE estimation as where p(x i , x i−1 , y i−1 ) is a joint probability density function, and p(x i |x i−1 , y j−1 ) is a conditional probability density function. STE is an extension of TE, where instead of original data, the signals after symbolic transformation are used [14]: where x S i represents the symbol sequence with m length. While the STE can be determined for both directions Y → X and X → Y , a directionality index D S Y →X was introduced, which measures asymmetry of information transfer between Y and X: A positive value of the D S Y →X indicates that Y dominates in the information flow over X and can be treated as a 28001-p5   fig. 4(A). We detected a decrease in information flow in the T2 phase in comparison to T1. This is clearly observed especially for FeCO 2 → VEbtps (and vice versa), FeO 2 → VEbtps, FeCO 2 → FeO 2 pairs ( fig. 4(A)) and confirmed by nonparametric Wilcoxon statistical test, p < 0.05 (table 1). A clear difference in STE values is not observed between the T1 and T2 for signal pairs: VEbtps → FeO 2 and FeO 2 → FeCO 2 ( fig. 4(B)).
In the case of multiple testing approach (considering Bonferroni correction) the significant result is obtained only for one pair: FeCO 2 → VEbtps (table 1). Separate comparisons in table 1 are proposed due to bivariate coupling analysis. These results suggest the occurrence of not equal magnitudes of interactions between respiratory pairs of variables and indicate a possible dominating direction of coupling.
The directionality index D S Y →X ( fig. 4(C)), which measures asymmetry of information transfer, indicates that  We performed simplified assessment of the STE magnitudes to support the significance of the first application of STE to respiratory variables. For this purpose, we used the procedure proposed by Hamman et al. [44], which combines the data from one subject with signals from other subjects. One signal for a particular subject was combined with nine respiratory variables from other persons. As a result, maximal STE in such approach for T1 was 0.008 and for T2 was 0.023. These values are much smaller than those obtained in direct coupling assessment of respiratory variables for the same person (note that they exceed 0.8 in all cases shown in fig. 4(A), (B)) and confirm non-random interrelations between the studied variables.
Conclusions. -The assessment of the body's efficiency and adaptation processes in response to a given load is a challenge for which CPET are commonly used [2]. An increasing number of researchers report the need for new methods to undertake a comprehensive analysis of time series recorded during CPET [8,10]. As an alternative to the commonly recorded physiological markers, such as VO 2 max or ventilatory thresholds, we utilized STE for the assessment of directed coupling and dynamical behavior of the respiratory system during CPET [33]. As proposed previously, these techniques may surpass the prognostic limitations [45,46], and limited sensitivity to training and other types of interventions [12,[47][48][49] of the currently measured variables. The reduction of the information flow in the second test, reflecting the inefficacy of the feedforward and feedback mechanisms mediated by chemoreceptors to adjust ventilation [50], is not captured by mean VO 2 max values, which were similar in T1 and T2. In addition, the STE shows that the information transfer between the studied variables is asymmetric but bidirectional. This means that there is a circular causality, and not just a cause-effect relationship between FeCO 2 and FeO 2 with VEbts, as usually assumed. This circular causality, and not a specific physiological value (e.g., VEmax, VO 2 max) may be responsible for the increased impairment of the control and regulation of the cardiorespiratory function that brings to exhaustion in CPET.
The methodology that allows the analysis of the signals recorded during maximal exercise has many challenges. The main one is the non-stationarity reduction. The nonstationarity manifests by the presence of trends that have a direct impact on the nonlinear entropy measures [8,35]. Due to the above, the method of matching the linear function in a moving window was used to remove the trend [37]. The determination of couplings occurrence was additionally preceded by a comprehensive analysis of the selection of appropriate symbolization parameters, i.e., the number of symbols (using Shannon entropy) and the embedding dimension [38,41] for STE analysis.
The analysis of the obtained STE results shows that bivariate symbolic approach corresponds to the relevance assumptions in the CPET data analysis. It allows the detection of couplings between VEbtps, FeO 2 and FeCO 2 signals which provide information about response of the respiratory system to exercise-induced fatigue. STE with adapted symbolization methods can be used for short physiological signals, characteristic for CPET protocols. The developed technique indicates the dominant direction of information flow. Symbolization with the number of symbols smaller than the number of samples introduces a reduced resolution of the data, which is large enough to satisfactorily reproduce the characteristics of the original one based on its variability. STE results can be a tool to verify whether the effort accumulation has an impact on the decrease of the level of coupling. The method is sensitive to the effects of changed physiological conditions resulting from insufficient recovery after T1. The effect is observed in the reduced STE magnitudes determined for T2. We can conclude about the potential application of the STE method for the coupling estimation between respiratory variables during graded maximal CPET.