Comprehensive HRV estimation pipeline in Python using Neurokit2: Application to sleep physiology

NeuroKit2 is a Python Toolbox for Neurophysiological Signal Processing. The presented method is an adaptation of NeuroKit2 to simplify and automate computation of the various mathematical estimates of heart rate variability (HRV) or similar time series. By default, the present approach accepts as input electrocardiogram's R-R intervals (RRIs) or peak times, i.e., timestamp of each consecutive R peak, but the RRIs or peak times can also stem from other biosensors such as photoplethysmography (PPGs) or represent more general kinds of biological or non-biological time series oscillations. The data may be derived from a single or several sources such as conventional univariate heart rate time series or intermittently weakly coupled fetal and maternal heart rate data. The method describes preprocessing and computation of an output of 124 HRV measures including measures with a dynamic, time-series-specific optimal time delay-based complexity estimation with a user-definable time window length. I also provide an additional layer of HRV estimation looking at the temporal fluctuations of the HRV estimates themselves, an approach not yet widely used in the field, yet showing promise (doi: 10.3389/fphys.2017.01112). To demonstrate the application of the methodology, I present an approach to studying the dynamic relationships between sleep state architecture and multi-dimensional HRV metrics in 31 subjects. NeuroKit2′s documentation is extensive. Here, I attempted to simplify things summarizing all you need to produce the most extensive HRV estimation output available to date as open source and all in one place. The presented Jupyter notebooks allow the user to run HRV analyses quickly and at scale on univariate or multivariate time-series data. I gratefully acknowledge the excellent support from the NeuroKit team.• Univariate or multivariate time series input; ingestion, preprocessing, and computation of 124 HRV metrics.• Estimation of intra- and inter-individual higher order temporal fluctuations of HRV metrics.• Application to a sleep dataset recorded using Apple Watch and expert sleep labeling.


a b s t r a c t
NeuroKit2 is a Python Toolbox for Neurophysiological Signal Processing. The presented method is an adaptation of NeuroKit2 to simplify and automate computation of the various mathematical estimates of heart rate variability (HRV) or similar time series. By default, the present approach accepts as input electrocardiogram's R-R intervals (RRIs) or peak times, i.e., timestamp of each consecutive R peak, but the RRIs or peak times can also stem from other biosensors such as photoplethysmography (PPGs) or represent more general kinds of biological or nonbiological time series oscillations. The data may be derived from a single or several sources such as conventional univariate heart rate time series or intermittently weakly coupled fetal and maternal heart rate data. The method describes preprocessing and computation of an output of 124 HRV measures including measures with a dynamic, time-series-specific optimal time delay-based complexity estimation with a user-definable time window length. I also provide an additional layer of HRV estimation looking at the temporal fluctuations of the HRV estimates themselves, an approach not yet widely used in the field, yet showing promise (doi: 10.3389/fphys.2017.01112). To demonstrate the application of the methodology, I present an approach to studying the dynamic relationships between sleep state architecture and multi-dimensional HRV metrics in 31 subjects. NeuroKit2 s documentation is extensive. Here, I attempted to simplify things summarizing all you need to produce the most extensive HRV estimation output available to date as open source and all in one place. The presented Jupyter notebooks allow the user to run HRV analyses quickly and at scale on univariate or multivariate time-series data. I gratefully acknowledge the excellent support from the NeuroKit team.

Introduction
Heart rate variability (HRV) as a search term on PubMed rendered ∼55,0 0 0 publications as of June 16,2022. While first studies appeared in 1925, there has been a notable rise in scientific publishing around 1975 with some 400 papers appearing annually as of 2021. This is likely attributable to the steady increase in computational capacity and its access to it along with growing recognition of the HRV physiology and pathophysiology. For example, HRV has been recognized as a biomarker of health and stress in adult and developing organisms reflecting heart-brain interactions and resulting, among other observations, in the phenomenon of heart beat-evoked potentials, a direct reflection of bidirectional brain-heart communication [1][2][3][4][5] .
The number of HRV estimates, sometimes also referred to as metrics or biomarkers, has grown as well, now exceeding 100, albeit it is understood that some of these estimates are collinear [6][7][8] . Still, they tend to offer unique advantages depending on the computational bottlenecks, length and noisiness of data and the desire for interpretability.
With the advent of Digital Health and increased utilization of wearable or ambient sensors to capture heart rate and other biological oscillations, the awareness of caveats in HRV analysis in contrast to the traditional electrocardiogram (ECG)-based approach also needs to rise [9] . I discuss some of the sensor-driven limitations of HRV analyses due to sampling rate in the accompanying article [10] .
Several toolboxes have been built to collate the existing methodologies in a more accessible format and foster the discovery of new biomarkers of health outcomes based on HRV and other physiological time series [11][12][13] . Despite great advances in unification of the many features into a single software system, a major limitation has remained the reliance on commercially available environments to run it.
In parallel, the ecosystem of Python-based open-source packages for time series processing has also been maturing. One such package stands out in terms of methodological scope, functional depth, rich API and constant updates through a large international community of researchers: NeuroKit2. It is a Python Toolbox for Neurophysiological Signal Processing [14][15][16] .
The presented method is an adaptation of NeuroKit2 to simplify and automate computation of the various mathematical estimates of HRV or similar time series [17] . By default, the present approach accepts as input electrocardiogram's R-R intervals (RRIs) or peak times, i.e., timestamp of each consecutive R peak, but the RRIs or peak times can also stem from other biosensors such as photoplethysmography (PPGs) or represent more general kinds of biological or non-biological time series oscillations. The data may be derived from a single or several sources such as conventional univariate heart rate time series or intermittently weakly coupled fetal and maternal heart rate data.
The method describes preprocessing and computation of an output of 124 HRV measures including measures with a dynamic, time-series-specific optimal time delay-based complexity estimation with a user-definable time window length ( Table 1 ).
I also provide an additional layer of HRV estimation looking at the temporal fluctuations of the HRV estimates themselves, an approach not yet widely used in the field, yet showing promise.
Finally, I present an application of the proposed HRV estimation pipeline to an open-source dataset from PhysioNet acquired in 31 subjects during sleep using Apple Watch and enriched with expert annotation of sleep states [ 13 , 18 , 19 ].
How does this methodology add to the existing set of techniques and tools? NeuroKit2 s documentation is extensive. Here, I attempted to simplify things summarizing all the researcher needs to produce the most extensive HRV estimation output available to date as open source and all in one place. The presented Jupyter notebooks allow the user to run HRV analyses quickly and at scale on univariate or multivariate time-series data. I gratefully acknowledge the excellent support from the NeuroKit team.
The key features of the presented methodology are: (1) Univariate or multivariate time series input; ingestion, preprocessing and computation of 62 HRV metrics. (2) Standardization of RRI window lengths and RRI duration-specific computation of complexity estimates. (3) Estimation of intra-and inter-individual higher order temporal fluctuations of HRV metrics. (4) Application to a sleep dataset recorded using Apple Watch and expert sleep labeling.
The step-by-step approach is as follows. 1. Create a dedicated virtual environment You may use conda or another environment manager such as pip or Docker. The choice boils down to your preferences and constraints: for example, certain Python packages can only be installed in pip and not in conda. For the proposed approach, I am not aware of any constraints that prevent the user from using conda. Ultimately, using a virtual environment will help you down the road to ensure your Python analytical pipeline keeps on working and does not get broken by unintended package updates and disrupted interdependencies. As an alternative to this conda step, I provide a Docker container here [42] .
# call conda datanalysis environment !conda init bash !conda activate datanalysis #or use your own preferred venv 2. Load the required and recommended packages. import neurokit2 as nk import pandas as pd import matplotlib.pyplot as plt Table 1 Heart rate variability (HRV) metrics computed in the present adaptation of the NeuroKit2 Python toolbox [ 14 , 15 ].
Time domain [8] RMSSD The square root of the mean of the sum of successive differences between adjacent RR intervals. It is equivalent (although on another scale) to SD1, and therefore it is redundant to report correlations with both [20] .

MeanNN
The mean of the RR intervals.

SDNN
The standard deviation of the RR intervals. SDSD The standard deviation of the successive differences between RR intervals. CVNN The standard deviation of the RR intervals (SDNN) divided by the mean of the RR intervals (MeanNN). CVSD The root mean square of the sum of successive differences (RMSSD) divided by the mean of the RR intervals (MeanNN).

MedianNN
The median of the absolute values of the successive differences between RR intervals. MadNN The median absolute deviation of the RR intervals.

MCVNN
The median absolute deviation of the RR intervals (MadNN) divided by the median of the absolute differences of their successive differences (MedianNN).

IQRNN
The interquartile range (IQR) of the RR intervals. pNN50 The proportion of RR intervals greater than 50ms, out of the total number of RR intervals. pNN20 The proportion of RR intervals greater than 20ms, out of the total number of RR intervals. TINN A geometrical parameter of the HRV, or more specifically, the baseline width of the RR intervals distribution obtained by triangular interpolation, where the error of least squares determines the triangle. It is an approximation of the RR interval distribution. HTI The HRV triangular index, measuring the total number of RR intervals divided by the height of the RR intervals histogram. SDANN1 The standard deviation of average RR intervals extracted from n-minute segments of time series data (1, 2 and 5 by default). SDANN2 SDNNI2 The mean of the standard deviations of RR intervals extracted from n-minute segments of time series data (1, 2 and 5 by default ( continued on next page ) This is a measure of the spread of RR intervals on the Poincaré plot perpendicular to the line of identity. It is an index of short-term RR interval fluctuations, i.e., beat-to-beat variability. It is equivalent (although on another scale) to RMSSD, and therefore it is redundant to report correlations with both [20] . SD2 SD2 is a measure of the spread of RR intervals on the Poincaré plot along the line of identity. It is an index of long-term RR interval fluctuations. SD1SD2 the ratio between short and long term fluctuations of the RR intervals (SD1 divided by SD2). S Area of ellipse described by SD1 and SD2 (pi * SD1 * SD2). It is proportional to SD1SD2. CSI The Cardiac Sympathetic Index, calculated by dividing the longitudinal variability of the Poincaré plot (4 * SD2) by its transverse variability (4 * SD1) [22] . CVI The Cardiac Vagal Index, equal to the logarithm of the product of longitudinal (4 * SD2) and transverse variability (4 * SD1) [22] . CSI_Modified The modified CSI obtained by dividing the square of the longitudinal variability by its transverse variability [23] . Indices of Heart Rate Fragmentation [24] PIP Percentage of inflection points of the RR intervals series.

IALS
Inverse of the average length of the acceleration/deceleration segments.

PSS
Percentage of short segments.

PAS
Percentage of NN intervals in alternation segments. Indices of Heart Rate Asymmetry (HRA), i.e., asymmetry of the Poincaré plot [25] GI Guzik's Index, defined as the distance of points above line of identity (LI) to LI divided by the distance of all points in Poincaré plot to LI except those that are located on LI. SI Slope Index, defined as the phase angle of points above LI divided by the phase angle of all points in Poincaré plot except those that are located on LI. AI Area Index, defined as the cumulative area of the sectors corresponding to the points that are located above LI divided by the cumulative area of sectors corresponding to all points in the Poincaré plot except those that are located on LI. PI Porta's Index, defined as the number of points below LI divided by the total number of points in Poincaré plot except those that are located on LI. C1d The contributions of heart rate decelerations and accelerations to short-term HRV, respectively [26] . C1a SD1d Short-term variance of contributions of decelerations (prolongations of RR intervals) and accelerations (shortenings of RR intervals), respectively [26] . SD1a C2d The contributions of heart rate decelerations and accelerations to long-term HRV, respectively [26] . C2a SD2d SD2d and SD2a: long-term variance of contributions of decelerations (prolongations of RR intervals) and accelerations (shortenings of RR intervals), respectively [26] . SD2a Cd The total contributions of heart rate decelerations and accelerations to HRV. Ca SDNNd Total variance of contributions of decelerations (prolongations of RR intervals) and accelerations (shortenings of RR intervals), respectively [26] . SDNNa Indices of Complexity [8] ApEn The approximate entropy measure of HRV. SampEn The sample entropy measure of HRV.

MSE
The multiscale entropy measure of HR. CMSE The composite multiscale entropy measure of HRV.

RCMSE
The refined composite multiscale entropy measure of HRV. DFA The detrended fluctuation analysis of the HR signal. CorrDim The correlation dimension of the HR signal. optimal time delay This metric, in seconds, provides time delay for optimal reconstruction of the underlying dynamic process [27] . FuzzEn Fuzzy Entropy [28] .

FuzzEnRCMSE
FuzzEn version of the refined composite multiscale entropy (RCMSE). cApEn Corrected version of ApEn [29] . Cumulative residual entropy is an alternative to the Shannon differential entropy with several advantageous properties, such as non-negativity. DiffEn Differential entropy, also referred to as continuous entropy) started as a (failed) attempt by Shannon to extend Shannon entropy. However, differential entropy presents some issues too, such as that it can be negative even for simple distributions (such as the uniform distribution). FI Fisher information. The Fisher information was introduced by R. A. Fisher in 1925, as a measure of "intrinsic accuracy" in statistical estimation theory. As the Shannon entropy, it can be employed as a quality of an efficient measurement procedure, used to estimate a system's disorder [30] . Hjorth Hjorth Parameters are indicators of statistical properties used in signal processing in the time domain introduced by Hjorth (1970) [31] . The parameters are activity, mobility, and complexity. NeuroKit returns complexity directly in the output tuple, but the other parameters can be found in the dictionary. Hurst The hurst exponent is a measure of the "long-term memory" of a time series. It can be used to determine whether the time series is more, less, or equally likely to increase if it has increased in previous steps [32] . KFD The Katz's Fractal Dimension is based on euclidean distances between successive points in the signal which are summed and averaged, and the maximum distance between the starting and any other point in the sample [33] .
Here ( continued on next page ) It first transforms the time series into the frequency domain and breaks down the signal into sine and cosine waves of a particular amplitude that together "add-up" to represent the original signal. If there is a systematic relationship between the frequencies in the signal and the power of those frequencies, this will reveal itself in log-log coordinates as a linear relationship. The slope of the best fitting line is taken as an estimate of the fractal scaling exponent and can be converted to an estimate of the fractal dimension. A slope of 0 is consistent with white noise, and a slope of less than 0 but greater than -1, is consistent with pink noise, i.e., 1/f noise. Spectral slopes as steep as −2 indicate fractional Brownian motion, the epitome of random walk processes. RR Relative Roughness is a ratio of local variance (autocovariance at lag-1) to global variance (autocovariance at lag-0) that can be used to classify different 'noises' [ 36 , 37 ]. SDA Standardized Dispersion Analysis [38] . The main shortcoming of traditional PE is that no information besides the order structure is retained when extracting the ordinal patterns, which leads to several possible issues [40] . The Weighted PE was developed to address these limitations by incorporating significant information from the time series when retrieving the ordinal patterns. ShanEn Entropy is a measure of unpredictability of the state, or equivalently, of its average information content. Shannon entropy is one of the first and most basic measure of entropy and a foundational concept of information theory. Shannon's entropy quantifies the amount of information in a variable.

HFD
The Higuchi's Fractal Dimension of the HR signal Detrended Fluctuation Analysis (DFA) and Multifractal DFA [41] DFA_alpha1 The monofractal detrended fluctuation analysis of the HR signal corresponding to short-term correlations MFDFA_alpha1_Width The multifractal detrended fluctuation analysis of the HR signal corresponding to short-term correlations; the range of singularity exponents, corresponding to the width of the singularity spectrum. MFDFA_alpha1_Peak MFDFA_alpha1_Mean Multifractal DFA; the mean of singularity exponents. MFDFA_alpha1_Max MFDFA_alpha1_Delta MFDFA_alpha1_Asymmetry MFDFA_alpha1_Fluctuation MFDFA_alpha1_Increment ( continued on next page ) The monofractal detrended fluctuation analysis of the HR signal corresponding to long-term correlations MFDFA_alpha2_Width The multifractal detrended fluctuation analysis of the HR signal corresponding to long-term correlations the range of singularity exponents, corresponding to the width of the singularity spectrum. import numpy as np import os import scipy.io from pathlib import Path from scipy.stats import variation from hmmlearn import hmm # load Matlab data files from scipy.io import loadmat 3. Import raw peaks. The source may be Matlab files or whatever input data format you may need. In the present example, we load a duo of files: corresponding maternal and fetal peak data. Your use case may differ. For example, you may load just one set of peak data, three or more sets of peak data derived from different ECG channels, an ECG-derived peak times channel and a PPG-derived peak times channel, etc. Simply amend the code accordingly by editing and adding the additional lines for each step as required.
In this work, because the focus is on peak times or heart rate (or, conversely, the RRI) time series, an important step is skipped deliberately: the derivation of the peak data from the raw signal. This can be ECG, PPG or otherwise recorded blood pressure fluctuations (pulse). The HRV Task Force recommends checking for the presence of ectopic heartbeats, e.g., premature ventricular contractions (PVCs) [ 6 , 43 ]. NeuroKit provides an API for detecting R peaks and for artifact correction. I refer the interested reader to their documentation.
# Import raw peaks from the mat files; adjust to fit your input data format f_filepath_peaks = Path.cwd()/"raw_peaks/f" #fetal raw peaks mat files; m_filepath_peaks = Path.cwd()/"raw_peaks/m" #maternal raw peaks mat files;  6. Artifact correction. a. This is a key step that will influence everything downstream. It is often not reported clearly in the studies. b. Adjust the sampling rate and threshold settings as appropriate for your data. c. Note that we save the logs of artifact correction for audit purposes. Sometimes, you need to know why a certain dataset behaved in the way it did and this documentation can come in handy.
# Artifact correction # Integrated into the above UDF red_mat_file, but you may find this useful to adopt elsewhere in your code https://neurokit2.readthedocs.io/en/latest/functions.html#neurokit2. signal.signal_fixpeaks # Artifact removal on peaks using Kubios: write into UDF taking trimmed_peaks input # caution: nk.signal_fixpeaks takes peaks, not RRI! # nk.signal_fixpeaks saves the corrected peak locations to the [1] index of the output data sturcture # accessible like so: clean_peaks [1] # Review the settings for fetal versus maternal RRI inputs! Adjust to match your RRI physiology # interval_min --minimum interval btw peaks | interval_max --maximum interval btw peaks. f_clean_peaks = nk.signal_fixpeaks(f_peaks_trimmed, sampling_rate = 1000, iterative = False, show = False,interval_min = 0.1,interval_max = 0.25, method = "kubios") m_clean_peaks = nk.signal_fixpeaks(m_peaks_trimmed, sampling_rate = 1000, iterative = False, show = False,interval_min = 0.1,interval_max = 0.25, method = "kubios") # Convert trimmed and cleaned peaks to RRI (using _trimmmed_ raw peaks as input!) rri_clean = peaks_to_rri(clean_peaks_peaks [1], sampling_rate = 1000, interpolate = False) 7. Compute all HRV metrics segment-wise a. Rather than computing on the entire time series at once and trading the reproducibility as a result (HRV metrics have variable dependence on the duration of time series on which they are computed, among other things), we i. set the segment duration explicitly a priori and ii. Take advantage of the segment-wise estimate of HRV (or variability in general, as your case may be) to investigate the higher-order structure of the HRV metrics themselves. b. For complexity estimates, note that we use segment duration-specific estimation of optimal time delay rather than using default settings. This allows us to compute FuzzEn, FuzzEnMSE, FuzzEnRCMSE, cApEn specifically for the optimal time delay. Why select these complexity estimates? It is heuristic. I have found Fuzzy Entropy estimates to be understudied and robust, especially with RRI time series. This is hence worthy of additional attention in future studies deploying complexity estimates. Other time-delay-dependent complexity estimates can be plugged in here, all made available via NeuroKit2 API.  [44] a. First, basic variability statistics are defined. b. Next, a hidden markov model (HMM) is implemented. c. Finally, everything is put together and saved. d. Note that I left here a number of commented lines of code for future development. I welcome improvements on those! For example, HMM code requires a certain duration of HRV metrics time series to compute. Since in the original dataset where this method was developed, the number of datapoints was limited to 6-8, I commented it out and made it available here for future reference and use on larger datasets. def compute_basic_stats(ts_data, SubjectID): # compute mean and variation # assuming "ts_data" is where my HRV metric values list is per subject mean = np.mean(ts_data.values.tolist()) coeff_variation = variation(ts_data.values.tolist()) # this function works similar to variation() but works purely with numpy # cv = lambda x: np.std(x) / np.mean(x) # First quartile (Q1) Q1 = np.percentile(ts_data, 25, interpolation = 'midpoint') # Third quartile (Q3) Q3 = np.percentile(ts_data, 75, interpolation = 'midpoint') # Interquaritle range (IQR) IQR = Q3 -Q1 midhinge = (Q3 + Q1)/2 quartile_coefficient_dispersion = (IQR/2)/midhinge # adding entropy estimate; this is experimental! # ts_entropy = nk.entropy_sample(ts_data) # yielding error "could not broadcast input array from shape (7,1) into shape (7)" | the following syntax fixes that and is more elegant in that it estimates optimal delay # optimal_complexity_parameters = nk.complexity_delay(ts_data.to_numpy, delay_max = 6, method = 'fraser1986 , show = False) # ts_entropy = nk.entropy_fuzzy(ts_data.to_numpy, delay = optimal_ complexity_parameters) # still yielding len error ts_entropy = nk.  11. Method validation: Demonstrating the performance of the proposed HRV pipeline in a retrospective analysis of a polysomnography dataset recorded with Apple Watch.
As validation dataset, the data by Walch et al. was used which is available from PhysioNet. The team acquired heart rate data in 31 subjects during sleep using Apple Watch and enriched the data with expert annotation of sleep states [ 13 , 18 , 19 ]. The labeled sleep state architecture was recorded from polysomnography and saved in the format '[subject-id-number]_labeled_sleep.txt'. Each line in this file has the format: date (in seconds since polysomnography start) stage (0-5, wake = 0, N1 = 1, This dataset is appealing for several reasons for the intended objective of HRV pipeline validation: (1) The heart rate data, extracted from the Apple watch, is publicly available. The data were recorded from 31 subjects during sleep, averages 7.3 h, and comes expertly annotated with sleep state labels. (2) The authors provided a script on GitHub for how to enable such data extraction in the future.
This should make such a demonstration particularly relevant for future studies [45] . Informatics [10] where I discuss the impact of sampling rate on HRV estimation. The Apple Watch data are a good example of the potential of wearables to provide physiological insights which are fundamentally limited by low sampling rates of the underlying signal used to derive HRV (PPG in this case).
Interestingly, the presented HRV pipeline yields insights into sleep state dynamics reflected in HRV which I discuss below. The underlying code, based on the presented HRV estimation pipeline, and all generated data can be found on FigShare and DockerHub [46] .  As a representative metric of higher-order variability, the temporal variability, gauged as coefficient of variation (CV), of these two HRV metrics is also considered [ 44 , 47 ].
To analyze this dataset, I expanded the presented HRV pipeline further and deployed it in several ways that can be used as a basis for future studies as follows.
• The number of HRV metrics was increased from 63 (as per above Jupyter notebook) to 124 HRV metrics, computed on this entire PhysioNet dataset (cf. Table 1 ). • Since this is intended as an example only, for ease of computing, I used the entire dataset to compute HRV (i.e., divider = 1 rather than performing the sliding window computations); I also set optimization for complexity parameters to default settings for the same reasons. The code is available for those who wish to dive deeper and have the resources to do so. • Sample Entropy (SampEn) is reported as an example complexity metric of HRV over time as it changes during sleep along with the traditional linear time-domain metric RMSSD; these are plotted along with the heart rate and sleep state architecture (using the supplied labels). This approach can contribute to studying these relationships systematically and develop open source algorithms to reliably detect sleep states from PPG-derived HRV data.
• The code is presented to determine the total duration of each sleep state per recording using the labeled files [46] . • The saved continuous and averaged SampEn and RMSSD data are provided for the entire cohort, for future analyses [46] . • The code and the visualizations of each subject's time course are provided for heart rate, SampEn, and sleep state architecture [46] . These data reveal a certain covariance between the HRV complexity and sleep state dynamics ( Fig. 1 ). • Next, as an exploratory step, Spearman correlations were computed across all subjects between HRV metrics SampEn and RMSSD on one hand and the NREM sleep state N3 (the deepest sleep state). The findings show again, this time quantitatively across the cohort, a degree of correlation between HRV complexity fluctuations and duration of deep sleep ( Fig. 2 ). Note the correlation between N3 NREM duration and CV SampEn ( R = 0.39, p = 0.03) or CV RMSSD ( R = 0.55 and p = 0.001), respectively ( Fig. 3 ). • Next, I expanded the scope by assessing and showing the correlations systematically for all subjects, all 124 HRV metrics, and all sleep states (all code and results provided) ( Fig. 4 ) [46] .
I suggest the following implications for future work. First, the dataset and the presented findings can be studied further using machine learning tools to derive an optimal HRV metric-based predictor  Table 1 for HRV metrics legend. The resulting dynamic visualization of correlations between HRV metrics and sleep states with Plotly can be viewed here: https://plotly.com/ ∼ mfrasch/5/import-pandas-as-pd-import-plotlyexpres/ . of the NREM states or REM states' duration. Second, the richness of the temporal fluctuations can be further harnessed for classification and prediction using the code for hidden Markov mechanisms (HMM). I consider this to be out of scope for the present manuscript but provide the necessary code. 12

Final remarks
The presented HRV computation pipeline in Python using the API of NeuroKit2 package is shown based on the use of the maternal-fetal trans-abdominally derived non-invasive ECG signal followed by maternal and fetal ECG extraction using SAVER algorithm [48] . Therefore, two RRI inputs are coded throughout the pipeline. However, the number of inputs can vary depending on your use case from univariate RRI time series to multivariate RRI time series. As such, this approach is easily scalable to a given scenario. As an example, the approach to a univariate heart rate analysis in relation to sleep state architecture is also presented.
Recent advances in the in silico modeling of physiological systems open avenues for discovery of novel and deeper understanding of the existing HRV metrics [54][55][56] .
The literature indicates a high potential of HRV biomarkers to serve as predictors of important health outcomes such as cardiac or mental health as well as in critical care and disorders of consciousness [ 1 , 2 , 6 , 49-53 ].

Declaration of Competing Interest
The author declares the following financial interests/personal relationships which may be considered as potential competing interests: MGF holds patents on EEG and ECG processing. MGF is founder of and consults for Digital Health companies commercializing predictive potential of physiological time series for human health.

Data Availability
Comprehensive heart rate variability estimation in relation to sleep state architecture: a retrospective observational cohort study on Apple Watch heart rate data (Original data) (Figshare).