COHmax: an algorithm to maximise coherence in estimates of dynamic cerebral autoregulation

Objective: The reliability of dynamic cerebral autoregulation (dCA) parameters, obtained with transfer function analysis (TFA) of spontaneous fluctuations in arterial blood pressure (BP), requires statistically significant values of the coherence function. A new algorithm (COHmax) is proposed to increase values of coherence by means of the automated, selective removal of sub-segments of data. Approach: Healthy subjects were studied at baseline (normocapnia) and during breathing of 5% CO2 (hypercapnia). BP (Finapres), cerebral blood flow velocity (CBFV, transcranial Doppler), end-tidal CO2 (EtCO2, capnography) and heart rate (ECG) were recorded continuously for 5 min in each condition. TFA was performed with sub-segments of data of duration (SEGD) 100 s, 50 s or 25 s and the autoregulation index (ARI) was obtained from the CBFV response to a step change in BP. The area under the curve (AUC) was obtained from the receiver-operating characteristic (ROC) curve for the detection of changes in dCA resulting from hypercapnia. Main results: In 120 healthy subjects (69 male, age range 20–77 years), CO2 breathing was effective in changing mean EtCO2 and CBFV (p < 0.001). For SEGD = 100 s, ARI changed from 5.8 ± 1.4 (normocapnia) to 4.0 ± 1.7 (hypercapnia, p < 0.0001), with similar differences for SEGD = 50 s or 25 s. Depending on the value of SEGD, in normocapnia, 15.8% to 18.3% of ARI estimates were rejected due to poor coherence, with corresponding rates of 8.3% to 13.3% in hypercapnia. With increasing coherence, 36.4% to 63.2% of these could be recovered in normocapnia (p < 0.001) and 50.0% to 83.0% in hypercapnia (p < 0.005). For SEGD = 100 s, ROC AUC was not influenced by the algorithm, but it was superior to corresponding values for SEGD = 50 s or 25 s. Significance: COHmax has the potential to improve the yield of TFA estimates of dCA parameters, without introducing a bias or a deterioration of their ability to detect impairment of autoregulation.
Further studies are needed to assess the behaviour of the algorithm in patients with different cerebrovascular conditions.


Introduction
The concept of dynamic cerebral autoregulation (dCA) is based on the tendency of cerebral blood flow (CBF) to return to its original value following a transient disturbance provoked by a sudden change in arterial blood pressure (BP). dCA was initially assessed in studies where a rapid drop in BP was induced by the sudden deflation of pressurised thigh cuffs (Aaslid et al 1989). Subsequently, a number of alternative approaches have been proposed, whose merits and limitations are still being debated (Simpson and Claassen 2018, Tzeng and Panerai 2018). Chief amongst these possibilities is the use of spontaneous fluctuations in BP as the stimulus to induce corresponding changes in CBF (Tzeng and Panerai 2018). Spontaneous changes in BP can be treated as isolated transients (Panerai et al 1995, 2003) or, more commonly, as the input function in transfer function analysis (TFA), where corresponding changes in CBF, or CBF velocity (CBFV), are regarded as the output function (Giller 1990, Panerai et al 1996, Zhang et al 1998, Claassen et al 2016). On the one hand, the use of TFA in combination with spontaneous fluctuations in BP is the ideal approach in clinical studies, since the necessary noninvasive physiological measurements can be performed in critically ill patients, something that is not feasible with other methods (e.g. changes in posture), and it also minimises disturbances to ongoing physiological processes. On the other hand, its reliability has been questioned, mainly due to its poor reproducibility and susceptibility to nonstationarity (Panerai 2013, Sanders et al 2018, 2019, Simpson and Claassen 2018). Given its widespread use as a tool for clinical assessment of dCA, TFA of spontaneous fluctuations in BP deserves further attention to overcome its limitations.
A recent white paper from the International Cerebral Autoregulation Research Network (CARNet) is likely to lead to improvements resulting from greater standardisation (Claassen et al 2016). Limited BP variability has also been suggested as the cause of the poor reproducibility observed with the classical parameters extracted by TFA to characterise dCA, such as the gain, phase and the autoregulation index (ARI) (Tiecks et al 1995, Zhang et al 1998, Panerai et al 1998b, Liu et al 2005, Simpson and Claassen 2018). Recent work has demonstrated that removal of recordings with low BP variability can lead to improvements in reproducibility (Elting et al 2020).
A further possibility for improving the reliability of TFA estimates of dCA is to boost the coherence of the BP-CBF transfer function. At each frequency, coherence represents the fraction of output power (i.e. CBF/CBFV) that is linearly explained by the input power (i.e. BP). Therefore, similar to a correlation coefficient, coherence ranges between zero and one, with values approaching one in the case of a linear, univariate relationship between BP and CBF, and measurements devoid of noise. Coherence will tend towards zero when the signal-to-noise ratio (SNR) is poor, when there are multiple determinants of CBF, or when the relationship with BP is highly nonlinear (Bendat and Piersol 1986). Although initially proposed as a metric of dCA efficiency (Giller 1990), the main use of coherence in TFA studies of dCA has been as an indicator of the reliability of gain and phase estimates. At each frequency, estimates of gain and phase should only be accepted if the corresponding values of coherence are statistically significant, a criterion usually based on the 95% confidence limit of coherence for the null hypothesis (Benignus 1969, Claassen et al 2016). In clinical studies, rejecting estimates of gain and phase at the frequencies where coherence is below the 95% confidence limit is problematic, as it would lead to incomplete sets of data with different subsets of harmonics represented in different patients. An important alternative to this approach is to use all the information contained in the gain and phase frequency responses, as reflected by the ARI (Tiecks et al 1995). The ARI ranges from 0 (absence of autoregulation) to 9 (best dCA that can be observed) and can be derived from the CBF/CBFV step response to the BP input, calculated from the inverse fast Fourier transform of the gain and phase (Panerai et al 1998b).
With this approach, coherence can still be used as a marker of reliability, with values below the 95% confidence limit leading to the rejection of estimates of ARI. The challenges of performing measurements in a clinical environment usually lead to worse SNR and poorer values of coherence in patients when compared to healthy controls. To address this problem, we present a new algorithm for estimation of ARI, aimed at maximising values of coherence by automated, selective removal of sub-segments of data. We tested the hypothesis that improvements in TFA coherence will lead to greater sensitivity and specificity for detection of dCA deterioration, as assessed with ARI. Although motivated by the need to improve the reliability of dCA metrics, this new algorithm would also be applicable to other areas of physiological measurement using TFA, such as estimates of baroreceptor sensitivity (Robbe et al 1987) or heart rate (HR) variability (Saul et al 1991).

Subjects and measurements
Participants were healthy subjects, recruited in four previous studies where hypercapnia was induced by breathing 5% CO2 in air. All studies had local Ethical Committee approval and all participants provided written informed consent (Katsogridakis et al 2013, Maggio et al 2013, Llwyd et al 2017). Subjects were 18 years of age or older, without any history or symptoms of cardiovascular, neurological or respiratory disease. Volunteers avoided caffeine, alcohol and nicotine for ⩾4 h before attending a research laboratory with controlled temperature (20 °C-23 °C) and free from visual or auditory stimulation. All recordings were performed with subjects in the supine position with the head elevated at 30°. Following instrumentation and a 20 min rest, two 5 min recordings were performed in each subject. The first recording corresponded to baseline resting conditions with subjects breathing ambient air. In the second recording, after a 60 s period of breathing air, subjects were switched to breathing 5% CO2 in air, through a face mask that was tightly fitted to avoid leakage, as confirmed by visual inspection of the end-tidal CO2 (EtCO2) waveform. After 3 min of CO2 breathing, subjects were returned to ambient air and a further 60 s was recorded during return to normocapnia.
BP was recorded continuously using a Finapres/Finometer device (FMS, Finapres Measurement Systems, Arnhem, Netherlands), attached to the middle finger of the left hand. Systolic and diastolic BP were measured by classical brachial sphygmomanometry before each 5 min recording. HR was derived from a three-lead electrocardiogram (ECG). EtCO2 was recorded continuously via nasal prongs (Salter Labs) by a capnograph (Capnocheck Plus). CBFV was measured in both middle cerebral arteries (MCAs) using transcranial Doppler ultrasound (TCD, Viasys Companion III; Viasys Healthcare) with 2 MHz probes secured in place using a head-frame. The servo-correcting mechanism of the Finapres/Finometer was switched on and then off prior to measurements.
Data were simultaneously recorded onto a data acquisition system (PHYSIDAS, Department of Medical Physics, University Hospitals of Leicester) for subsequent off-line analysis using a sampling rate of 500 samples/s.

Data analysis
All signals were visually inspected to identify artefacts; noise and narrow spikes (<100 ms) were removed by linear interpolation. CBFV channels were subjected to a median filter and all signals were low-pass filtered with an 8th-order Butterworth filter with a cut-off frequency of 20 Hz. BP was calibrated at the start of each recording using systolic and diastolic values obtained with sphygmomanometry. The R-R interval was then automatically marked from the ECG and beat-to-beat HR was plotted against time. Occasional missed marks caused spikes in the HR signal; these were manually removed by re-marking the R-R intervals at the time points at which they occurred. Mean, systolic and diastolic BP and CBFV values were calculated for each cardiac cycle. The end of each expiratory phase was detected in the EtCO2 signal, linearly interpolated, and resampled with each cardiac cycle. Beat-to-beat data were spline interpolated and resampled at 5 samples/s to produce signals with a uniform time-base.
In-house software, implemented in Fortran, was used to perform TFA of the BP-CBFV relationship using Welch's method (Welch 1967) with different combinations of segment durations (SEGD) as described below.
The mean values of BP and CBFV were removed from each segment and a cosine window was applied to minimise spectral leakage. The squared coherence function, amplitude (gain) and phase frequency responses were calculated from the smoothed auto- and cross-spectra using standard procedures (Panerai et al 1998a, Claassen et al 2016). The CBFV impulse response to the BP input was estimated using the inverse fast Fourier transform of gain and phase (Bendat and Piersol 1986) and the corresponding step response was obtained by numerical integration for positive values of time. Tiecks et al (1995) proposed 10 template curves for the CBFV response to a step change in BP, each of these curves corresponding to a value of ARI, ranging from 0 to 9. For each recording, the corresponding value of ARI was estimated by comparing the CBFV step response with each of the template curves and choosing the best fit using the normalised minimum square error (NMSE). ARI values were only accepted if the mean squared coherence function for the 0.15-0.25 Hz frequency interval (see Discussion) was above its 95% confidence limit, adjusted for the corresponding degrees of freedom (DF), and the NMSE was ⩽0.30.
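The template-matching step described above can be sketched as follows. This is only an illustration under stated assumptions: the ten template curves are supplied by the caller (the Tiecks curves themselves are not reproduced here), the function names `nmse` and `select_ari` are hypothetical, and the NMSE is assumed to be normalised by the energy of the template, since the exact normalisation is not spelled out in the text. The coherence criterion would be applied separately.

```python
import numpy as np

def nmse(step_response, template):
    """Squared error between a step response and one template curve,
    normalised by the template energy (assumed normalisation)."""
    step_response = np.asarray(step_response, dtype=float)
    template = np.asarray(template, dtype=float)
    return float(np.sum((step_response - template) ** 2) / np.sum(template ** 2))

def select_ari(step_response, templates, nmse_limit=0.30):
    """Pick the best-fitting template.

    `templates` is a sequence of 10 curves; the index (0..9) is taken
    as the ARI value. Returns (ari, best_nmse, accepted), where
    `accepted` only reflects the NMSE criterion.
    """
    errors = [nmse(step_response, t) for t in templates]
    ari = int(np.argmin(errors))
    best = errors[ari]
    return ari, best, best <= nmse_limit
```

In use, `select_ari(step, templates)` would be called once per recording, with `step` sampled on the same time grid as the templates.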

COHmax algorithm
For each 5 min recording, increasing values of coherence were obtained according to the following procedure:
(i) A Reference Setting condition was initially adopted to estimate coherence using all segments available in the 5 min recording with SEGD settings of 102.4 s, 51.2 s or 25.6 s (Claassen et al 2016). With a sampling rate of 5 samples/s, these durations corresponded to NW = 512, 256 or 128 samples, respectively. In what follows, values of SEGD will be referred to as 100 s, 50 s and 25 s, respectively, for simplicity. With 50% superposition of segments, the number of segments (NSEG) used to obtain estimates of the BP and CBFV auto- and cross-spectra was 5, 11 and 23, for SEGD values of 100 s, 50 s and 25 s, respectively. For each value of SEGD, a receiver-operating characteristic (ROC) curve analysis was performed for the detection of changes in ARI due to hypercapnia, in comparison with corresponding values in normocapnia. The area under the curve (AUC) was calculated for statistical testing of differences between ROC curves resulting from increases in coherence.
(ii) For each setting of SEGD, the corresponding NSEG segments were assigned at fixed positions along the 5 min recording, that is, in sequential fashion, also taking into consideration the 50% superposition of segments.
(iii) At each step j = 1, 2, …, NSEG − 2, data segments were removed one at a time and the coherence was recalculated for all combinations of the remaining NSEG − j segments. The segment corresponding to the combination with the lowest coherence was removed from the ensemble and the number of segments was reset to NSEG − j. The coherence, ARI and AUC were re-calculated and their dependence on the number of segments was expressed as COH(NSEG), ARI(NSEG) and AUC(NSEG), respectively.
(iv) Stage (iii) above was repeated until only two segments remained.
In summary, for each setting of SEGD (100 s, 50 s or 25 s), a total of NSEG − 1 estimates of coherence, ARI and AUC were obtained, corresponding to 4, 10 and 22 values in each setting, respectively, including the Reference Setting values obtained from (i) above.
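The segment-removal procedure can be sketched in Python as follows. This is a minimal illustration, not the Fortran implementation used in the study. It assumes that per-segment auto- and cross-spectra are already available as arrays with one row per segment (the names `pxx`, `pyy`, `pxy`, `band`, `band_coherence` and `cohmax` are hypothetical), and it reads stage (iii) as a greedy search in which, at each step, the segment whose removal leaves the highest band-averaged coherence is discarded.

```python
import numpy as np

def band_coherence(pxx, pyy, pxy, keep, band):
    """Mean squared coherence over the frequency bins selected by `band`,
    using only the segments whose indices are listed in `keep`."""
    sxx = pxx[keep].mean(axis=0)
    syy = pyy[keep].mean(axis=0)
    sxy = pxy[keep].mean(axis=0)
    coh = np.abs(sxy) ** 2 / (sxx * syy)
    return float(coh[band].mean())

def cohmax(pxx, pyy, pxy, band, min_segments=2):
    """Greedy removal of segments, one per step, down to `min_segments`.

    Returns a list of (n_segments, coherence, kept_indices) tuples,
    starting with the reference setting that uses all segments.
    """
    keep = list(range(pxx.shape[0]))
    history = [(len(keep), band_coherence(pxx, pyy, pxy, keep, band), tuple(keep))]
    while len(keep) > min_segments:
        # Coherence obtained when each remaining segment is left out in turn.
        trial = [
            band_coherence(pxx, pyy, pxy, [k for k in keep if k != drop], band)
            for drop in keep
        ]
        best = int(np.argmax(trial))  # leaving this one out helps the most
        keep.pop(best)
        history.append((len(keep), trial[best], tuple(keep)))
    return history
```

Under this reading, the returned list has NSEG − 1 entries, matching the number of estimates per setting given above. ARI and AUC would then be recomputed from the segments kept at each entry.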

Statistical analysis
Data were treated as normally distributed after visual inspection of histograms and probability plots, taking into consideration the large size of the sample (n > 100) studied. Differences between parameters were assessed using Student's t-test. Multiple parameter comparisons were performed with parametric repeated-measures ANOVA. Values derived for the right and left hemispheres were averaged when no significant differences were found between them. Association between variables was tested with linear regression. For each value of NSEG, the 95% confidence limits for coherence were obtained as reported previously (Claassen et al 2016). The improvement in ROC detection due to increased values of COH(NSEG) was assessed by testing the AUC(NSEG) with the method proposed by DeLong et al (1988). A p-value of <0.05 was assumed to indicate statistical significance.
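As an illustration of the ROC summary used here, the AUC for detecting hypercapnia from a lower ARI can be computed directly as the probability that a hypercapnic value is lower than a normocapnic one, with ties counted as one half (the Mann-Whitney form of the AUC). The sketch below is an assumption-laden example: the function name `roc_auc` is hypothetical, and the DeLong test for comparing AUCs is not implemented.

```python
def roc_auc(ari_normocapnia, ari_hypercapnia):
    """AUC for detecting hypercapnia from a lower ARI.

    Equals P(ARI_hyper < ARI_normo) + 0.5 * P(ARI_hyper == ARI_normo),
    evaluated over all normocapnia/hypercapnia pairs.
    """
    wins = 0.0
    for normo in ari_normocapnia:
        for hyper in ari_hypercapnia:
            if hyper < normo:
                wins += 1.0
            elif hyper == normo:
                wins += 0.5
    return wins / (len(ari_normocapnia) * len(ari_hypercapnia))
```

An AUC of 0.5 corresponds to no discrimination between the two conditions, which is the null hypothesis against which AUC values are usually tested.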

Results
One hundred and twenty healthy subjects (66 male), aged 43.2 ± 15.1 years (range 20-77 years), provided a complete set of measurements for both baseline and hypercapnia. As shown in table 1, highly significant differences were observed for EtCO2 and CBFV, as well as for BP and HR, between normocapnia and hypercapnia. No inter-hemispherical differences were found for any of the parameters studied, which were then averaged for the right and left MCA.
According to its design, COHmax led to increases in coherence in all subjects, both during normocapnia and hypercapnia, but with different individual patterns, as illustrated in figure 1. For the population as a whole, increases in coherence were similar for normocapnia and hypercapnia (figures 2(A) and (B)), with SEGD = 25 s providing the highest mean values at NSEG = 2, followed by SEGD = 50 s and 100 s, respectively. In each case, the starting value of NSEG was the Reference Setting, corresponding to 5, 11 and 23 segments, for SEGD values of 100 s, 50 s and 25 s, respectively. Noticeably, as the number of segments was reduced, so was the inter-subject variability, as expressed by the standard errors in figure 2.
Despite marked increases in coherence resulting from COHmax, the mean ARI remained relatively stable (figures 2(C) and (D)), but showed highly significant differences due to hypercapnia and also due to SEGD, but only for the case of SEGD = 25 s (table 2). The relative stability of ARI, as the number of segments was reduced with application of COHmax, can be expressed by the distributions of intra-subject standard deviations (SDARIintra, table 2) as depicted in figure 3, showing modes of ⩽0.5 units in all cases. Correlations of SDARIintra with the Reference Setting values of coherence were significant for SEGD = 50 s (p = 0.0036) and SEGD = 25 s (p = 0.008), but only for the hypercapnia condition. Assuming that coherence changed from 0.0 to 1.0, the corresponding expected improvement in SDARIintra would be of approximately 50% in both cases. For SEGD = 100 s, there was no significant association between coherence for the Reference Setting and SDARIintra for either the normocapnic or hypercapnic conditions. The number of values of ARI that were rejected due to the joint criteria, based on the 95% confidence limit for coherence and the NMSE, was relatively small (table 3), but it still decreased significantly with the use of COHmax. Both normocapnia (p < 0.001) and hypercapnia (p < 0.005) showed significant rates of improvement, but with a larger difference in hypercapnia as compared to normocapnia (p = 0.03).
Figure 1 caption (partial): (1) Rapid rise in coherence up to eight segments, followed by a more gradual rise. (2) Initial value of coherence was below the 95% confidence limit (solid line), then reached the confidence limit for NSEG = 8 and continued to rise up to NSEG = 2. (3) Despite some gradual improvement in coherence, values remained below the 95% confidence limit curve (solid line) for all values of NSEG.
ROC analysis led to relatively stable values of AUC(NSEG) with gradual reductions in NSEG along with increases in coherence resulting from COHmax (figure 4). In other words, for each value of SEGD, coherence

Main findings
Despite the substantial increase in the computational effort required by the COHmax algorithm, there were no noticeable increases in execution time, in comparison with standard TFA analysis (Claassen et al 2016), running on a 2.7 GHz personal computer in DOS mode. The feasibility of increasing coherence values in the spectral region where a linear relationship between BP and CBFV would be expected (0.15-0.25 Hz), whilst still retaining enough BP and CBFV power to provide acceptable SNR, was well demonstrated by the steady increase in coherence shown in figure 2. Although the dataset analysed comprised high-quality recordings, with mean values of coherence well above its 95% confidence limit for the Reference Setting (figures 2(A) and (B)), a relatively small number of subjects showed coherence values below the 95% confidence limit (figure 1) that would lead to their rejection and the impossibility of extracting corresponding values of ARI (table 3). A significant number (15.8% to 83.3%) of these recordings could be recovered with COHmax, showing its potential to contribute towards improving the use of dCA assessment in personalised patient care. Pertinent to the possibility of improving coherence by the selective, but automated, removal of segments of data, the ARI remained relatively stable (figures 2(A), (B) and 3), thus showing that the algorithm did not introduce any biases in its estimation, except at the lowest values of NSEG for SEGD = 25 s (figures 2(C) and (D)). On the other hand, our main hypothesis, that increases in coherence would lead to corresponding improvements in the detection of worsening CA, as would be expected with hypercapnia (Aaslid et al 1989, Panerai et al 1999), was rejected, given that AUC remained relatively constant with increases in coherence (figure 4). Notably, the AUC for SEGD = 100 s was significantly higher than that observed for SEGD = 50 s or 25 s (figure 4).
Taken together, our findings suggest that COHmax could be a useful tool to rescue recordings with unacceptable values of coherence in the Reference Setting, but without the expectation that it would necessarily lead to improvements in diagnostic discrimination, given that ARI values and corresponding ROC curves were broadly not affected by the algorithm.

Methodological considerations
TFA based on the Fourier transform requires calculation of the auto- and cross-spectra (Bendat and Piersol 1986). With a single, long segment of data, e.g. of 5 min duration, spectral estimates will show considerable variability, following a chi-square distribution with two DF and a coefficient of variation of 1.0 (Bendat and Piersol 1986). To reduce the variance of spectral estimates and, consequently, improve the reliability of estimates of gain and phase, Welch proposed smoothing the auto- and cross-spectral estimates by averaging multiple segments of data from the original long recording, thus increasing the number of DF (Welch 1967). One additional benefit of this approach is the possibility of calculating the coherence function, something that is not possible with a single segment of data (Benignus 1969). One interesting feature of the Welch method is that the data segments used for smoothing the auto- and cross-spectra do not need to be contiguous. When breaking down a long recording into NSEG segments of duration SEGD (NW samples), these are often shifted across and superimposed by a certain amount, typically 50% (Claassen et al 2016), but in the final calculation of the smoothed spectra, the order in which segments are selected does not matter. This property of the Welch method can be used to remove bad segments of data, or to focus on specific events in a longer recording (Panerai et al 2005). In the present study, we benefitted from this property to gradually remove automatically selected segments of data in order to increase the coherence of TFA for the dynamic BP-CBFV relationship. The vast majority of reports in the literature of dCA assessment by means of TFA include estimates of coherence as a marker of the reliability of estimates of gain and phase, as well as of the ARI.
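The property discussed above, that the smoothed spectra depend only on which segments are averaged and not on their order or contiguity, can be illustrated with a short NumPy sketch. It is not the software used in the study. The function names are hypothetical, `x` and `y` stand for the resampled BP and CBFV signals, and a Hann window (`np.hanning`) is assumed as the cosine window.

```python
import numpy as np

def segment_spectra(x, y, nw, overlap=0.5):
    """Auto- and cross-spectra of each segment (one row per segment).

    Segments of `nw` samples are taken at fixed positions with the given
    fractional overlap; the mean is removed and a Hann window applied.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    step = int(nw * (1.0 - overlap))
    window = np.hanning(nw)
    pxx, pyy, pxy = [], [], []
    for start in range(0, len(x) - nw + 1, step):
        xs = x[start:start + nw]
        ys = y[start:start + nw]
        fx = np.fft.rfft((xs - xs.mean()) * window)
        fy = np.fft.rfft((ys - ys.mean()) * window)
        pxx.append(np.abs(fx) ** 2)
        pyy.append(np.abs(fy) ** 2)
        pxy.append(np.conj(fx) * fy)
    return np.array(pxx), np.array(pyy), np.array(pxy)

def tfa_from_segments(pxx, pyy, pxy, keep):
    """Gain, phase (rad) and squared coherence from the segments in `keep`."""
    sxx = pxx[keep].mean(axis=0)
    syy = pyy[keep].mean(axis=0)
    sxy = pxy[keep].mean(axis=0)
    h = sxy / sxx
    coherence = np.abs(sxy) ** 2 / (sxx * syy)
    return np.abs(h), np.angle(h), coherence
```

Calling `tfa_from_segments` with `keep=[0, 1, 2]` or `keep=[2, 0, 1]` gives the same estimates, and any subset of segments, contiguous or not, can be passed in the same way.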
Although the threshold adopted for the minimum value of coherence that should be used for acceptance of TFA estimates has been fairly variable, ranging from 0.12 (for NSEG = 15) (Claassen et al 2016) to 0.50 (for any NSEG) (Zhang et al 1998), the literature is unanimous that estimates of gain, phase and ARI (when obtained via TFA) should not be accepted below a pre-defined threshold of coherence. Ideally, this threshold should be based on the 95% confidence limit of the coherence distribution for the null hypothesis (or other value of 1 − α) (Benignus 1969). For 5 min recordings, using SEGD = 100 s and 50% superposition, the coherence threshold will be 0.34, for α = 0.05 (Claassen et al 2016). Complete curves of the 95% confidence limit as a function of NSEG have been reported for other values of SEGD.
Another common observation in the literature is the suggestion that the higher the coherence, the more reliable the estimates of gain and phase will be (Claassen et al 2009, Smirl et al 2015). This assumption is understandable, given that poor coherence can be caused by low SNR, as well as by other factors, such as non-linearity and multiple influences on the output variable (i.e. CBFV). Accordingly, it should be expected that by increasing coherence, one would obtain improved estimates of gain and phase and, by extension, of ARI calculated via TFA, leading to better diagnostic and/or prognostic accuracy. In this study, with values of coherence starting at ∼0.65 (Reference Setting) and increasing to around 0.9 (figure 2), a substantial increase in coherence did not confirm those expectations, as reflected by the stable values of AUC shown in figure 4 and the lack of association between SDARIintra and the coherence of the Reference Setting (with the exception of SEGD = 50 s and 25 s in hypercapnia). This result can be explained on theoretical grounds. Both the relative error of gain and the standard error of estimates of phase are predicted to vary as [(1 − γ²)/(2γ²NSEG)]^(1/2), where γ² is the squared coherence function as calculated in our study (Bendat and Piersol 1986). As coherence increases, the errors will tend to go down but, in our case, this is achieved with a gradual reduction in NSEG (figure 2) and, as a result, the estimation errors for gain and phase tend to remain approximately constant, which would explain similar behaviour for ARI.
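The trade-off described above can be written out explicitly. The block below restates the error expression and the condition under which two settings give the same predicted error. The numerical values are purely illustrative: pairing a squared coherence of 0.65 with five segments and 0.90 with two segments is an assumption made for the example, not a result of the study.

```latex
% Predicted random error of gain (relative) and phase (standard error)
\[
  \varepsilon(\gamma^2, N_{SEG}) = \sqrt{\frac{1-\gamma^2}{2\,\gamma^2\,N_{SEG}}}
\]
% Two settings give the same predicted error when
\[
  \frac{1-\gamma_1^2}{\gamma_1^2\,N_{SEG,1}} = \frac{1-\gamma_2^2}{\gamma_2^2\,N_{SEG,2}}
\]
% Illustrative values (assumed pairing of coherence and number of segments)
\[
  \varepsilon(0.65,\,5) = \sqrt{\frac{0.35}{6.5}} \approx 0.23, \qquad
  \varepsilon(0.90,\,2) = \sqrt{\frac{0.10}{3.6}} \approx 0.17
\]
```

In this example, the gain from higher coherence is largely offset by the smaller number of segments, so the predicted error stays of the same order rather than falling in proportion to the rise in coherence.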
Further work is needed with other datasets to replicate our findings, ideally involving recordings where the mean coherence for the Reference Setting is much lower than what we obtained. Another potential use of COHmax would be in studies of the nonstationarity of dCA (Panerai 2013). Recordings with high values of SDARIintra might reflect the presence of nonstationarity of dCA parameters, which could be caused by recording artefacts, variable levels of sensorimotor or cognitive stimulation, changes in breathing patterns, or unknown physiological processes (Panerai 2013). Nonstationarity of physiological origin is thought to be behind the poor reproducibility observed in most metrics of dCA (Elting et al 2014, Sanders et al 2019) and COHmax could be a useful tool to address this problem.

Clinical implications
The standard approach to assessment of the diagnostic or prognostic accuracy of a physiological measurement is the analysis of ROC curves as a global representation of its sensitivity and specificity for all possible thresholds to distinguish between two distinct groups of participants or different physiological conditions. ROC analysis has been applied in clinical studies of dCA, using several different metrics (Brady et al …). For an index of dCA to discriminate between patient and control groups, there is an underlying assumption that all patients have impaired autoregulation, which is something that cannot be guaranteed, except in conditions where all patients are severely ill. To avoid the fallacies of this assumption, hypercapnia has been used as a surrogate for depressed dCA (Aaslid et al 1989, Panerai et al 1999, Katsogridakis et al 2013, Maggio et al 2013), with the added benefit that each subject can act as their own control. In this study, hypercapnia led to significant depression of dCA (figures 2(C) and (D)), as well as highly significant values of AUC, when compared to the null hypothesis of AUC = 0.5 (figure 4). Nevertheless, the values of AUC we obtained were lower than corresponding values in the literature, but those involved different physiological conditions (Katsogridakis et al 2013), or more complex mathematical models (Chacon et al 2018).
The finding that COHmax can rescue recordings that would be rejected with the Reference Setting (table 3), without degrading the ability of the ARI to detect worsening of dCA, as reflected by the ROC AUC (figure 4), suggests it can be a useful tool to allow assessment of dCA in patients who otherwise would be denied this test. For patients who are sufficiently fit, improvements in coherence can be obtained with other protocols, such as the squat-stand manoeuvre (Claassen et al 2009, Smirl et al 2014, 2015, Simpson and Claassen 2018), but for critically ill patients, or those who cannot tolerate changes in posture, or even mild exercise, assessment of dCA based on spontaneous fluctuations in BP is the main alternative (Tzeng and Panerai 2018), as demonstrated by the widespread use of this approach in stroke and severe head injury (Rivera-Lara et al 2017, Intharakham et al 2019a). In critically ill patients, or those with conditions such as Parkinson's disease, good quality recordings are much more challenging to obtain than in healthy volunteers under ideal conditions. As a result, the likelihood of recordings with poor coherence in the Reference Setting condition is much greater than that found in our healthy group, leading to a much greater fraction of data rejection with the TFA approach. It is in this context that COHmax might prove of utility, but further work is needed with different populations to assess the extent of the benefit that can be derived. The study also provided additional information that could benefit clinical applications of dCA assessment. The white paper from CARNet (Claassen et al 2016) has provided a number of recommendations for improving standardisation of TFA settings, aiming to improve comparability of studies and also as an essential requirement to expand multi-centre collaborations (Beishon et al 2020).
However, many of the recommendations of the white paper were based more on preferences identified in the literature (Meel-van den Abeelen et al 2014) than on objective evidence (Claassen et al 2016). This is the case with the recommendation to standardise the duration of recordings to 5 min, with the use of SEGD = 100 s for TFA with Welch's method (Claassen et al 2016). As mentioned above, in clinical applications of dCA assessment, good quality recordings lasting 5 min might not always be feasible. This concern led to studies exploring alternative settings, such as shortening the duration of recordings (Intharakham et al 2019b), or the use of different values of SEGD. The feasibility, demonstrated in these studies, of using recordings of shorter duration and different values of SEGD supported the choice of also considering SEGD values of 50 s and 25 s in the present study. The new relevant finding, though, is that the SEGD = 100 s setting leads to significantly better values of AUC of ROC curves, in comparison with the other two alternatives (figure 4). Based on this result, it would be appropriate to strengthen the white paper's recommendation for use of SEGD = 100 s as a standard, and its use combined with COHmax in cases of poor coherence with the Reference Setting. Furthermore, in future clinical applications, it would also be relevant to assess the use of only a few segments of data, such as NSEG = 2 or 3, to confirm the feasibility of this option when no more segments of data are available with significant coherence (figure 2).

Limitations of the study
Hypercapnia has been shown to increase the diameter of the MCA, but at much higher levels of PaCO2 than observed in this study (Coverdale et al 2014, Verbree et al 2014). One advantage of ARI, as compared to TFA gain, is that this index is not affected by amplitude changes in CBFV between recordings, but distortions would certainly result if CBFV were affected by intra-recording changes in MCA diameter. Application of TFA to dynamic CA relies on the assumption that the BP-CBFV relationship is linear. As mentioned above, this assumption is not acceptable for frequencies below approximately 0.15 Hz because an active CA implies that cerebrovascular resistance is changing over time, thus representing a departure from the premise of linearity (Bendat and Piersol 1986). Although non-linear models have been proposed to address this inherent limitation of TFA (Chacon et al 2018), the benefits of using these models in clinical applications, and the key differences that would result in comparison with classical TFA, remain to be determined.
The COHmax algorithm was tested in a large representative sample of healthy subjects, collected in previous studies with homogeneous protocols by investigators trained to the same standards. We opted to test the new algorithm on the effects of hypercapnia on dCA, instead of using clinical data, to allow a more rigorous evaluation based on intra-subject differences in dCA efficacy, with corresponding repeated-measures statistics, rather than on inter-subject differences. For this reason, our results cannot be extended to other datasets, and future studies are needed to confirm our findings in different populations.
As expected, the COHmax algorithm led to increasing values of coherence with the gradual removal of segments of data. However, we cannot guarantee that the resulting values were the absolute optimum. The main reason behind this limitation was the sequential assignment of segments, as described above (COHmax algorithm, step (ii)). If, instead, segments of data were removed with random starting and ending points along the recording, combined with the use of bootstrapping, there would be the possibility of achieving even higher values of coherence than we obtained.
Unusual as this might seem, our data were of better quality than what would be desirable to provide a more stringent test of COHmax. In clinical applications, we have observed much higher rates of rejection of ARI estimates, due to poor coherence and high values of NMSE (Caldas et al 2017, Lam et al 2019), and it would have been informative if the dataset we analysed had a higher proportion of problematic recordings than was the case.
Our results were dependent on the choice of the 95% confidence limit of coherence as the threshold for acceptance of TFA parameters. The choice of a different threshold, for example the 90% or 99% confidence limit (Claassen et al 2016), would undoubtedly lead to different results. Although our confidence limits, and their dependence on the DF resulting from the TFA settings, were based on the use of broad band noise for input and output (Claassen et al 2016), we have shown previously that similar results are obtained when using surrogate pairs, based on the inter-subject swap of BP and CBFV signals.
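A noise-based confidence limit of the kind described above could be estimated along the following lines. This is a hedged Monte Carlo sketch, not the published procedure: independent Gaussian white noise is assumed for both input and output, a Hann window and 50% overlap are assumed, and the function names, number of trials and seed are arbitrary choices made for the example.

```python
import numpy as np

def band_coherence_of_pair(x, y, nw, fs, band=(0.15, 0.25)):
    """Mean squared coherence in `band` (Hz) from 50%-overlapped segments."""
    step = nw // 2
    window = np.hanning(nw)
    sxx = syy = sxy = 0.0
    for start in range(0, len(x) - nw + 1, step):
        xs = x[start:start + nw]
        ys = y[start:start + nw]
        fx = np.fft.rfft((xs - xs.mean()) * window)
        fy = np.fft.rfft((ys - ys.mean()) * window)
        sxx = sxx + np.abs(fx) ** 2
        syy = syy + np.abs(fy) ** 2
        sxy = sxy + np.conj(fx) * fy
    freqs = np.fft.rfftfreq(nw, d=1.0 / fs)
    mask = (freqs >= band[0]) & (freqs <= band[1])
    coherence = np.abs(sxy) ** 2 / (sxx * syy)
    return float(coherence[mask].mean())

def coherence_limit(nseg, nw=128, fs=5.0, level=0.95, trials=500, seed=0):
    """Quantile `level` of band coherence for pairs of unrelated noise
    signals, each long enough to hold `nseg` half-overlapped segments."""
    rng = np.random.default_rng(seed)
    n = nw + (nseg - 1) * (nw // 2)
    values = [
        band_coherence_of_pair(rng.standard_normal(n), rng.standard_normal(n), nw, fs)
        for _ in range(trials)
    ]
    return float(np.quantile(values, level))
```

Evaluating `coherence_limit` for each number of segments from NSEG down to 2 would give a threshold curve against which COH(NSEG) can be compared. The limit rises as segments are removed, which is why the threshold has to be adjusted for the DF at every step.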
Separate values of gain and phase were not presented. As mentioned in the Introduction, the ARI incorporates all the information provided by gain and phase, without the need to break these estimates down into averaged values for empirically selected frequency bands, usually termed the very-low and low frequency intervals (Claassen et al 2016). Gain has not performed as reliably in detecting alterations in dCA as phase and ARI have (Panerai 2008, Claassen et al 2016, Intharakham et al 2019a), and the latter two are closely linked by the influence of phase on the temporal pattern of the CBFV step response, which ultimately defines the value of ARI (Panerai 2008). Although the results reported herein for ARI are likely to be applicable to phase as well, this needs to be demonstrated by future studies.

Conclusions
The coherence of TFA between BP and CBFV can be increased by the selective removal of sub-segments of data, an approach that might be useful to rescue recordings that otherwise would be rejected due to values of coherence below the statistical threshold recommended for acceptance of estimates of gain, phase or ARI. Before COHmax, an algorithm that can remove sub-segments of data in automated fashion, could be recommended for routine calculation of TFA parameters, it is necessary to test its performance more widely, and also to shed light on the benefits of achieving values of coherence significantly higher than the 95% confidence level threshold usually adopted for acceptance of dynamic CA metrics derived by TFA. Further work is needed to test COHmax with different sets of data, mainly in recordings obtained in clinical studies where data quality can be jeopardised.