Combined Impact of Heart Rate Sensor Placements with Respiratory Rate and Minute Ventilation on Oxygen Uptake Prediction

Abstract Oxygen uptake (V˙O2) is an essential metric for evaluating cardiopulmonary health and athletic performance, but it is difficult to measure directly. Heart rate (HR) is a prominent physiological indicator correlated with V˙O2 and is often used for indirect V˙O2 prediction. This study investigates the impact of HR sensor placement on V˙O2 prediction accuracy by analyzing HR data from three anatomical locations (the chest, arm, and wrist) combined with the respiratory rate (RESP) and minute ventilation (V˙E). Twenty-eight healthy adults participated in incremental and constant workload cycling tests at various intensities. Data on V˙O2, RESP, V˙E, and HR were collected and used to develop a neural network model for V˙O2 prediction. The influence of HR position on prediction accuracy was assessed via Bland–Altman plots, and model performance was evaluated by mean absolute error (MAE), coefficient of determination (R2), and mean absolute percentage error (MAPE). Our findings indicate that HR combined with RESP and V˙E (V˙O2HR+RESP+V˙E) produces the most accurate V˙O2 predictions (MAE: 165 mL/min, R2: 0.87, MAPE: 15.91%). Notably, the accuracy of V˙O2 prediction decreases as exercise intensity increases, particularly during high-intensity exercise. Substituting HR measured at different anatomical sites significantly affects V˙O2 prediction accuracy, with wrist placement having a larger effect than arm placement. In conclusion, this study underscores the importance of considering HR sensor placement in V˙O2 prediction models, with RESP and V˙E serving as effective compensatory factors. These findings contribute to refining indirect V˙O2 estimation methods, enhancing their predictive capabilities across different exercise intensities and anatomical placements.


Introduction
Oxygen uptake (V˙O2), the quantity of oxygen consumed by the body, serves as a vital indicator of both energy expenditure [1] and exercise intensity [2]. It also assesses the body's capacity to ingest and utilize oxygen, which is directly associated with key health metrics such as fitness performance [3][4][5][6] and cardiorespiratory health [7][8][9]. Additionally, V˙O2 [26], collecting these signals together with HR from the same source can lead to homologous data errors, potentially undermining the reliability of the results.
In parallel with the exploration of additional physiological signals, this research investigates alternative predictive models to elucidate further information embedded within HR or its combination with other physiological parameters. Historically, linear regression has been commonly employed to predict oxygen uptake [1,27,28]. However, this method has significant limitations in predicting V˙O2 across varying intensities. Currently, machine learning has been recognized as a means to enhance the accuracy of V˙O2 predictions [29]. For instance, models such as random forest and long short-term memory (LSTM) have utilized readily available inputs like HR, RESP, and V˙E to predict V˙O2 fluctuations during low-intensity exercises or daily activities [30][31][32][33][34], highlighting the potential application of these algorithms in wearable devices and fitness equipment. However, it is noteworthy that most of the exercise data involved predominantly pertain to low- and moderate-intensity steady-state exercises, neglecting variations in intensity and type of exercise [23], which could lead to underestimations or overestimations of V˙O2. Moreover, these studies utilize ECG for HR acquisition, overlooking potential discrepancies that could arise from different HR signal sources, potentially resulting in deviations in the outcomes.
With the increasing use of PPG in wearable devices for daily health monitoring, establishing its accuracy in reflecting

Twenty-eight healthy, well-trained participants (16 males and 12 females) with no known cardiovascular, musculoskeletal, respiratory, or metabolic diseases were enrolled in this study. Participants with any of these diseases or related symptoms were excluded. All participants were thoroughly informed about the experimental procedures and potential risks. Each exercise test was conducted at least two hours postprandial, with no alcohol or caffeine intake and no vigorous exercise in the preceding 24 h. Female participants were advised to avoid completing the test during their menstrual period. All exercise tests were spaced at least 48 h apart. Participants wore lightweight sportswear during the tests. All procedures were approved by the Research Ethics Committee of Beijing Sport University (reference number 2024016H).

Procedure
In this study, each participant completed six cycling exercise tests, including one incremental exercise test and five constant workload tests of varying intensities. Anthropometric measurements, including height, weight, and body composition, were recorded prior to the initial exercise test. For the first test, the cycle ergometer's handlebar angle and seat height were adjusted to ensure an upright riding position. Specifically, the seat height was adjusted so that the participants' knees were slightly bent at the lowest pedal position. These settings were recorded and consistently maintained for all subsequent tests.
During each exercise test, participants wore a gas metabolism analyzer and three HR monitoring devices. After donning the devices, participants remained seated for five minutes to record resting breath gas and HR data. This was followed by a structured warm-up session, during which participants cycled at 80 W with a self-selected pedal speed for five minutes. During the exercise tests, participants maintained a pedal speed of 60 ± 5 rpm, receiving feedback from the ergometer's rpm display and a metronome. Upon completing the exercise, participants rested for ten minutes to facilitate the collection of excess post-exercise oxygen consumption (EPOC) data.
The incremental test began with a load of 50 W for four minutes, followed by an increase of 30 W every minute until exhaustion. Exhaustion was confirmed when any three of the following criteria were met: (1) a V˙O2 increment of ≤2.1 mL/(kg·min); (2) post-exercise blood lactate ≥ 8.0 mmol/L; (3) respiratory exchange ratio ≥ 1.1; and (4) maximum HR ≥ 100% HRmax (208 − 0.7 × age) [2]. The loads for the subsequent low-, medium-, and high-intensity tests were determined based on the peak power output (PPO) in the incremental test. The order of the constant load tests was randomized, and participants were required to continue for ten minutes or until exhaustion. The laboratory environment was controlled, with a temperature of approximately 20–25 °C and relative humidity of 50–60%.
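The exhaustion rule described above (any three of four criteria, with HRmax estimated as 208 − 0.7 × age) can be sketched as a small check. This is only an illustration: the function and parameter names are hypothetical, and the V˙O2 increment is assumed to be supplied in mL/(kg·min).

```python
def predicted_hr_max(age_years):
    """Age-predicted maximum heart rate in beats/min: 208 - 0.7 * age."""
    return 208 - 0.7 * age_years


def exhaustion_confirmed(vo2_increment, lactate, rer, hr_peak, age_years,
                         required=3):
    """Return True when at least `required` of the four criteria are met.

    vo2_increment: change in VO2 between the last stages, mL/(kg*min)
    lactate:       post-exercise blood lactate, mmol/L
    rer:           respiratory exchange ratio
    hr_peak:       highest HR reached, beats/min
    """
    criteria = [
        vo2_increment <= 2.1,                    # VO2 plateau
        lactate >= 8.0,                          # blood lactate threshold
        rer >= 1.1,                              # respiratory exchange ratio
        hr_peak >= predicted_hr_max(age_years),  # >= 100% of predicted HRmax
    ]
    return sum(criteria) >= required
```

For example, a 20-year-old has a predicted HRmax of about 194 beats/min, so a peak HR of 180 alone would not satisfy criterion (4), but the test would still count as exhaustive if the other three criteria were met.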

Data Collection
The body composition of the subjects was determined using a digital dual-energy X-ray absorptiometry (DXA) device (iDXA, General Electric Company, Fairfield, CT, USA). Respiratory gases during the incremental and constant workload exercise tests were analyzed using a spiroergometry system (MetaMax 3B, Cortex Biophysic, Leipzig, Germany) for breath-by-breath measurements, including V˙O2, RESP, and V˙E. Calibration of air pressure, gas (standard gas concentrations of O2 15.00%, CO2 5.00%), and volume flow (using a 3 L syringe) was conducted according to the manufacturer's instructions before each exercise test. Fingertip blood samples were collected immediately after the incremental exercise test for blood lactate analysis. All exercise tests were performed on a cycle ergometer (839E, Monark Exercise AB, Vansbro, Sweden).
The HR measurement devices used in this study were all from Polar (Polar Electro Oy, Kempele, Finland): the H10 chest strap (using ECG), the Vantage V wrist device, and the Verity Sense armband (both using PPG). Each device provides HR data every second using proprietary algorithms to filter and process the detected heartbeats.

1. The Polar H10 uses a 1000 Hz sample rate to gather data for internal sensor calculations and algorithms. Subsequently, the sensor outputs the processed data at a 130 Hz sampling rate. The H10 transmits HR data once per second;
2. The Polar Vantage V utilizes PPG and provides HR samples at a rate of 1 Hz. According to Polar, the internal sampling rate of the Vantage V is considerably higher, and the 1 Hz HR data is derived from this higher-rate sampling;
3. The Polar Verity Sense utilizes PPG, with a sampling rate of 135 Hz and a resolution of 22 bits.
The Polar H10 was designated as the standard HR monitor in this study. The chest strap electrodes were moistened before being secured to the participants' chests at the level of the xiphoid process. The Vantage V was worn on the subject's non-dominant wrist, at least a finger's width from the wrist bone. The Verity Sense was worn over the biceps muscle of the upper arm on the subject's non-dominant side, with the sensor placed firmly against the skin. The three devices were placed on the participants' bodies as shown in Figure 1. During this experiment, care was taken to ensure that the devices were worn securely and comfortably. Following the manufacturer's recommendations, all data were transferred and synchronized via Bluetooth. The raw data from the exercise sessions, including timestamps and HR data from the three devices, were then exported through Polar Flow.
Sensors 2024, 24, x FOR PEER REVIEW 4 of 18

Data Standardization
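In this study, the feature data used as inputs to the neural network were taken from the Polar H10 chest strap (i.e., HR, RESP, and V˙E). Considering the dimensional differences between various features, especially the impact of these differences on the training convergence speed during gradient descent in neural network training, this study uses the Z-score normalization method to standardize the input and output feature data to have a mean of 0 and a variance of 1. The formula for the Z-score normalization method is as follows:

$$X' = \frac{X - \mu}{\sigma}$$

where $X$ denotes the original data values and $X'$ the standardized values; $\mu$ and $\sigma$ represent the mean and standard deviation of the data, respectively.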
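As a minimal sketch of the Z-score standardization just described, the helpers below fit the mean and standard deviation, apply the transform, and invert it so that predictions can be returned to mL/min. The function names are hypothetical, and the use of the population standard deviation is an assumption; the source does not state which estimator was used or whether the statistics were fitted on the training set only.

```python
from statistics import mean, pstdev


def zscore_fit(values):
    """Return (mu, sigma) for one feature column."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population SD
    if sigma == 0:
        raise ValueError("constant feature cannot be standardized")
    return mu, sigma


def zscore_apply(values, mu, sigma):
    """Standardize: X' = (X - mu) / sigma."""
    return [(v - mu) / sigma for v in values]


def zscore_invert(z_values, mu, sigma):
    """Map standardized values back to the original units."""
    return [z * sigma + mu for z in z_values]
```

Keeping `mu` and `sigma` from the fit step matters in practice, because the same pair must be reused for unseen data and for de-standardizing the predicted output.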

Building of Backpropagation Neural Network (BPNN)
Backpropagation Neural Network (BPNN) is one of the fundamental methods in deep learning, with a wide range of applications. From natural language processing and computer vision to speech recognition and bioinformatics, BPNN has become an essential tool for addressing various complex problems. A BPNN is a type of feedforward neural network that trains weights and biases using the backpropagation algorithm to minimize prediction error. In this study, a BPNN with one input layer, three hidden layers, and one output layer is constructed to capture the nonlinear mapping between physiological signals and V˙O2, as shown in Figure 2.
To investigate whether there is a strong mapping relationship between different physiological signals and V˙O2, four types of inputs were considered when constructing the BPNN in this study [35], ranging from heart rate (HR) only to HR combined with RESP and V˙E (named V˙O2HR+RESP+V˙E). It should be noted that for the four different types of input features, the model was built using the same architecture, only changing the dimension of the input layer features. As shown in Figure 2, the BPNN achieves model training through two processes: forward propagation and backpropagation, with the principles and methodology described as follows:

(1) Forward Propagation

First, the normalized physiological signal data are fed into the input layer of the BPNN as the input feature vector $x = [x_1, x_2, \ldots, x_i]^T$ ($i$ = 1, 2, or 3). Then, the input feature vector $x$ is passed to the first hidden layer, yielding the linear combination vector $z^{(1)}$ and the activation vector $a^{(1)}$ of the first hidden layer as follows:

$$z^{(1)} = W^{(1)} x + b^{(1)}$$

$$a^{(1)} = f\left(z^{(1)}\right)$$

where $W^{(1)}$ and $b^{(1)}$ are the weight matrix and bias vector for the first hidden layer, respectively, and $f(\cdot)$ is the ReLU activation function, defined as $f(x) = \max(0, x)$.
Between hidden layers, similarly, for the $l$-th hidden layer ($l$ = 2, 3), based on the activation vector $a^{(l-1)}$ from the previous hidden layer, the linear combination vector $z^{(l)}$ and the activation vector $a^{(l)}$ are computed as follows:

$$z^{(l)} = W^{(l)} a^{(l-1)} + b^{(l)}$$

$$a^{(l)} = f\left(z^{(l)}\right)$$

where $W^{(l)}$ and $b^{(l)}$ are the weight matrix and bias vector for the $l$-th layer, respectively. Finally, the activation vector $a^{(3)}$ from the last hidden layer is input into the output layer to obtain the linear combination vector $z^{(4)}$ and the final predicted value $\hat{y}$, computed as follows:

$$z^{(4)} = W^{(4)} a^{(3)} + b^{(4)}, \qquad \hat{y} = z^{(4)}$$

where $W^{(4)}$ and $b^{(4)}$ are the weight matrix and bias vector for the output layer, respectively. After obtaining the prediction result from the output layer, this study uses the Mean Squared Error (MSE) to measure the difference between the predicted values and the true values. The definition of MSE is as follows:

$$L = \mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left(\hat{y}_i - y_i\right)^2$$

where $n$ is the total number of samples, $\hat{y}_i$ is the predicted value of the $i$-th sample, and $y_i$ is the true value of the $i$-th sample.

(2) Backward Propagation

Forward propagation and loss calculation yield the predicted values of the model and the error between the predicted and actual values. The backpropagation algorithm then calculates the gradient of the loss function with respect to each parameter (weights and biases) in the network, which represents the rate of change in the loss function with respect to that parameter. Using this gradient information, the algorithm adjusts the weights and biases to reduce the value of the loss function, which improves the model's prediction accuracy.

The error term $\delta^{(4)}$ for the output layer is calculated for each sample as follows, with the parameter gradients summed over the $n$ samples:

$$\delta^{(4)} = \frac{\partial L}{\partial z^{(4)}} = \frac{2}{n}\left(\hat{y} - y\right)$$

The corresponding gradients of the weights and biases are as follows:

$$\frac{\partial L}{\partial W^{(4)}} = \delta^{(4)} \left(a^{(3)}\right)^T, \qquad \frac{\partial L}{\partial b^{(4)}} = \delta^{(4)}$$

For the $l$-th hidden layer ($l$ = 3, 2, 1), the error term $\delta^{(l)}$ is calculated as follows:

$$\delta^{(l)} = \left(\left(W^{(l+1)}\right)^T \delta^{(l+1)}\right) \odot f'\left(z^{(l)}\right)$$

where $\odot$ denotes the Hadamard product, and $f'$ is the derivative of the ReLU activation function.

The corresponding gradients of the weights and biases are as follows, with $a^{(0)} = x$:

$$\frac{\partial L}{\partial W^{(l)}} = \delta^{(l)} \left(a^{(l-1)}\right)^T, \qquad \frac{\partial L}{\partial b^{(l)}} = \delta^{(l)}$$

Finally, the weights and biases of both the output layer and the hidden layers are updated using gradient descent as follows, where $\eta$ is the learning rate:

$$W^{(l)} \leftarrow W^{(l)} - \eta \frac{\partial L}{\partial W^{(l)}}, \qquad b^{(l)} \leftarrow b^{(l)} - \eta \frac{\partial L}{\partial b^{(l)}}$$
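The forward and backward passes described above can be sketched in plain Python for a network with three ReLU hidden layers and a single output. This is an illustrative sketch, not the authors' implementation: the hidden-layer widths, the uniform weight initialization, the linear output layer, the toy data, and all function names are assumptions, and the update shown is plain gradient descent rather than the Adam optimizer reported for training.

```python
import random


def relu(v):
    return [max(0.0, t) for t in v]


def relu_grad(v):
    return [1.0 if t > 0 else 0.0 for t in v]


def init_params(sizes, seed=0):
    """sizes such as [3, 4, 4, 4, 1]; returns one (W, b) pair per layer."""
    rng = random.Random(seed)
    return [([[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)],
             [rng.uniform(-0.5, 0.5) for _ in range(n_out)])
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]


def forward(params, x):
    """Return (zs, acts): pre-activations per layer and activations, acts[0] = x."""
    zs, acts = [], [list(x)]
    for k, (W, b) in enumerate(params):
        z = [sum(w * a for w, a in zip(row, acts[-1])) + bi for row, bi in zip(W, b)]
        zs.append(z)
        is_output = k == len(params) - 1
        acts.append(z if is_output else relu(z))  # assumed linear output, ReLU hidden
    return zs, acts


def mse(params, X, y):
    preds = [forward(params, x)[1][-1][0] for x in X]
    return sum((p - t) ** 2 for p, t in zip(preds, y)) / len(y)


def gradients(params, X, y):
    """Backpropagate the MSE loss; returns one (dW, db) pair per layer."""
    n = len(y)
    grads = [([[0.0] * len(W[0]) for _ in W], [0.0] * len(b)) for W, b in params]
    for x, t in zip(X, y):
        zs, acts = forward(params, x)
        delta = [2.0 * (acts[-1][0] - t) / n]        # output-layer error term
        for k in reversed(range(len(params))):
            dW, db = grads[k]
            for i, d in enumerate(delta):
                db[i] += d
                for j, a in enumerate(acts[k]):
                    dW[i][j] += d * a                # delta times previous activation
            if k > 0:
                W = params[k][0]
                back = [sum(W[i][j] * delta[i] for i in range(len(delta)))
                        for j in range(len(W[0]))]   # W^T delta
                delta = [g * s for g, s in zip(back, relu_grad(zs[k - 1]))]
    return grads


def gradient_step(params, grads, lr):
    """Return updated parameters: theta <- theta - lr * dL/dtheta."""
    return [([[w - lr * g for w, g in zip(rw, rg)] for rw, rg in zip(W, dW)],
             [bi - lr * g for bi, g in zip(b, db)])
            for (W, b), (dW, db) in zip(params, grads)]


# Made-up standardized samples [HR, RESP, VE] -> VO2, for illustration only.
X_TOY = [[-1.0, -0.8, -0.9], [0.0, 0.1, -0.1], [1.0, 0.7, 1.0], [0.5, 0.9, 0.4]]
Y_TOY = [-1.1, 0.0, 1.2, 0.5]
```

Switching between the four input types only changes the first entry of `sizes` (1, 2, or 3), which mirrors the statement that the architecture is shared and only the input dimension varies.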

Training Parameter Setting
In this study, all models were implemented in Python (version 3.6, Python Software Foundation, Beaverton, OR, USA). The data were allocated as follows: 80% to the training set, 10% to the test set, and 10% to the validation set. The model was compiled using the Adam optimizer and the MSE loss function and trained for 100 epochs with a batch size of 32. Optimal parameters and node combinations were identified through trial and error.
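The 80/10/10 allocation and the batch size of 32 can be illustrated with the sketch below. It assumes a random shuffle of individual samples with a fixed seed; the source does not say whether the split was made per sample, per trial, or per participant, so the helper names and the splitting unit are hypothetical.

```python
import random


def split_indices(n_samples, train=0.8, test=0.1, seed=42):
    """Shuffle sample indices and split them into train/test/validation lists."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    n_train = int(n_samples * train)
    n_test = int(n_samples * test)
    # Whatever remains after train and test goes to validation (about 10%).
    return idx[:n_train], idx[n_train:n_train + n_test], idx[n_train + n_test:]


def minibatches(indices, batch_size=32):
    """Yield consecutive batches of indices; the last batch may be smaller."""
    for start in range(0, len(indices), batch_size):
        yield indices[start:start + batch_size]
```

If generalization to unseen people is the goal, splitting by participant rather than by sample is a common alternative, since adjacent 1 Hz samples from the same trial are highly correlated.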

Statistical Analysis
The results are represented as means ± standard deviation (SD). The concordance correlation coefficient (CCC) was calculated to evaluate the agreement between the devices, with 95% confidence intervals (CI) provided to assess the precision of the estimates. Deviations between measured and predicted values were evaluated using linear regression. The correlation between measured and predicted values was determined using the Pearson correlation coefficient. Additionally, a Bland–Altman plot was used to assess the agreement between measured and predicted data. The predictive accuracy of the models for the entire dataset and each exercise intensity was assessed using mean absolute error (MAE), mean absolute percentage error (MAPE), and percentage error (% error). The maximum acceptable error limit was set at 200 mL/min, which represents the typical noise during exercise [31]. All statistical analyses were performed with GraphPad Prism 10 (GraphPad Software, La Jolla, CA, version 10.1.2) and Python (version 3.6, Python Software Foundation).
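The agreement and accuracy measures named above can be written out as a short sketch. The function names are hypothetical, and two choices are assumptions rather than details from the source: the Bland–Altman limits use bias ± 1.96 × the sample SD of the differences, and the CCC uses Lin's formula with population moments.

```python
from statistics import mean, stdev


def mae(y_true, y_pred):
    """Mean absolute error, in the units of the inputs (e.g., mL/min)."""
    return mean(abs(p - t) for t, p in zip(y_true, y_pred))


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (y_true must be non-zero)."""
    return 100.0 * mean(abs(p - t) / abs(t) for t, p in zip(y_true, y_pred))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    m = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - m) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def bland_altman(y_true, y_pred):
    """Return (bias, lower, upper): mean difference and 95% limits of agreement."""
    diffs = [p - t for t, p in zip(y_true, y_pred)]
    bias, sd = mean(diffs), stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def ccc(x, y):
    """Lin's concordance correlation coefficient (population moments)."""
    mx, my, n = mean(x), mean(y), len(x)
    var_x = sum((a - mx) ** 2 for a in x) / n
    var_y = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return 2.0 * cov / (var_x + var_y + (mx - my) ** 2)
```

With these helpers, the 200 mL/min acceptance limit reduces to a comparison such as `mae(measured, predicted) <= 200`.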

Participants' Characteristics and Exercise Responses
The characteristics of the participants are detailed in Table 1. Each participant underwent an incremental load test to establish baseline exercise intensity levels, followed by five experimental trials (140 trials in total). This study aimed to investigate the effect of HR sensor position on V˙O2 prediction and to determine whether differences existed across exercise intensities. Based on participants' responses during the incremental test, five exercise intensity levels were selected for each participant, each defined relative to the participant's individual V˙O2max. At the higher intensities, steady-state exercise could not be achieved; therefore, the duration was set to exhaustion. After excluding anomalous data due to poor sensor contact or connection issues, a total of 160,429 valid data points were retained. Additionally, the recording duration of each test in the final dataset was standardized across all trials.

Accuracy of Different Sensor Positions: Vantage vs. Verity Sense
HR data were collected from the H10, Vantage, and Verity Sense devices, resulting in a total of 160,429 paired data points sampled at a frequency of 1 Hz. Missing data points were excluded from the analysis. The data were collected across five different exercise intensities. As exercise intensity increased, both the mean and the standard error of HR rose. However, at 90% and 110% V˙O2max, the proportion of HR readings during the EPOC phase was higher due to the shorter exercise duration. The overall characteristics of the collected data are illustrated in Figure 3. The HR accuracy of the Vantage and Verity Sense devices across varied exercise intensities is shown in Table 2.

As shown in Table 4, adding information to the HR-only model significantly increased the accuracy of the estimation. Two parameter combinations had the highest prediction accuracy, both within 200 mL/min, of which V˙O2HR+RESP+V˙E generated the best predictions with the highest R2. V˙E and RESP improved prediction accuracy and compensated for errors caused by the different HR acquisition positions of the Vantage and Verity Sense. However, the overall model underperformed the intensity-specific models, possibly because differences in exercise duration led to unequal proportions of data from each intensity. For instance, a steady state is rarely reached in high-intensity exercise, which mostly lasts for a shorter duration, causing differences in data weighting and prediction errors.
When using PPG signals to obtain heart rate data, the measurement primarily reflects pulse rate rather than heart rate [50]. While both the PPG pulse wave and the ECG R-wave capture the cyclical activity of the heart and are often considered equivalent, this equivalence does not universally apply across all measurement contexts [51]. For example, contemporary studies examining the validity of pulse rate variability (PRV) as a proxy for heart rate variability (HRV) frequently report conflicting results [52]. These inconsistencies arise from differences in the delay and frequency characteristics of the two signals, which can be influenced by factors such as neural activity, respiration, blood pressure, and other physiological variables. These differences also explain the divergence observed between HR data derived from PPG signals and ECG signals during high-intensity exercise in this study. Consequently, in specific scenarios, the pulse rate may not accurately reflect true heart rate variations, potentially compromising the precision of V˙O2 predictions. This issue is particularly pertinent when pulse rate, rather than heart rate, is used as an input parameter. Therefore, in the actual prediction and application processes, it is essential to consider these potential variabilities and adjust predictive algorithms according to different exercise intensities to ensure accuracy.
It is important to acknowledge the limitations of this study. Firstly, the participants were all healthy adults with exercise experience, so the conclusions cannot be extrapolated to other populations, such as people with diseases, older adults, or inactive individuals. Further studies could include a wider range of subjects to determine whether the results differ, which may broaden the applicability of the algorithm. Secondly, because the type of exercise chosen was cycling, the results cannot be transferred to other physical activities with a greater range of arm and wrist motion, where the errors that occur might lead to diametrically opposite results.

Conclusions
Overall, our findings indicate that the accuracy of HR monitors varies from light to severe exercise tests, depending on the sensor wearing positions. For V˙O2 estimation, the BPNN can reduce the error differences between HR monitors by incorporating information on RESP and V˙E. Currently, ECG is the primary source of signals for developing algorithms that include HR as an input, whereas HR detection in wearable devices is based on PPG technology. Therefore, when developing algorithms that include HR, it is essential to consider and validate PPG technology to adjust the algorithm to real-world use and enhance measurement accuracy. This approach will provide valuable insights for future health monitoring, facilitating more straightforward and precise


Figure 2. The structure of the backpropagation neural network.

Figure 3. A plot of HR measurements for H10, Vantage, and Verity Sense. Different colors represent each exercise intensity level. The black line in the middle of the box represents the median HR; the edges of the box represent the 25th and 75th percentiles of HR, and the whiskers show the range of HR. A split-half violin plot with density plots overlaid in the background illustrates the distribution density of the HR.

Figure 4. Bland–Altman analysis for the measured V˙O2 and predicted V˙O2 in different models. The solid line represents the prediction bias, and the dashed line represents 95% limits of agreement. In A and B, a different color represents different exercise intensities. The HR data of the left panel were from H10; the middle panel was from Vantage, and the right panel was from Verity Sense.

Table 1. Characteristics of participants.

Table 2. HR accuracy data for the different exercise intensities.