Towards a portable-noninvasive blood pressure monitoring system utilizing the photoplethysmogram signal

Blood pressure (BP) responds instantly to the body's conditions, such as movements, diseases or infections, and sudden excitation. Therefore, BP monitoring is a standard clinical measurement and is considered one of the fundamental health signs that assist in predicting and diagnosing several cardiovascular diseases. The traditional BP techniques (i.e. the cuff-based methods) only provide intermittent measurements over a certain period. Additionally, they cause turbulence in the blood flow, impeding continuous BP monitoring, especially in emergency cases. In this study, an instrumentation system is designed to estimate BP noninvasively by measuring the photoplethysmogram (PPG) signal using an optical technique. PPG signals were measured and processed for ≈ 450 cases with different clinical conditions, irrespective of their health condition. A total of 13 features of the PPG signal were used to estimate the systolic and diastolic blood pressure (SBP and DBP), utilizing several machine learning techniques. The experimental results showed that the designed system is able to describe the complex embedded relationship between the features of the PPG signal and BP (SBP and DBP) with high accuracy. The mean absolute error (MAE) ± standard deviation (SD) was 4.82 ± 3.49 mmHg for the SBP and 1.37 ± 1.65 mmHg for the DBP, with a mean error (ME) of ≈ 0 mmHg. The estimation results are consistent with the standard of the Association for the Advancement of Medical Instrumentation (AAMI) and achieved Grade A in the British Hypertension Society (BHS) standards for the DBP and Grade B for the SBP. Such a study contributes to the scientific efforts toward a practical, portable, noninvasive instrumentation system for BP monitoring.
Once the BP is determined with sufficient accuracy, it can be utilized further in the early prediction and classification of various cardiovascular conditions such as hypertension, tachycardia, bradycardia, and atrial fibrillation, for which early detection can be critical.


Introduction
In developing countries, cardiovascular diseases (e.g. coronary disease, heart attack, and cardiomyopathy) are one of the main causes of death. For example, in Jordan, more than one-third of deaths are related to cardiovascular diseases [1]. Blood pressure (BP) is one of the vital signs considered in diagnosing cardiovascular diseases and in health monitoring [2]. It is strongly affected by changes in heart activity due to physical activity, medication, and stress, which manifest as blood pressure variations (e.g. hypertension or hypotension). As part of clinical procedures, BP provides information about the physiology of many organs and systems, such as the lungs and the heart [3].
BP is affected by the degree of contraction of the muscles in the blood vessels' walls and by the speed and strength of the heartbeat. Typically, normal BP is 110-120 mmHg for the systolic BP (SBP) and 75-80 mmHg for the diastolic BP (DBP). Each heartbeat comprises two phases: systole and diastole. Systole is the contraction of the left ventricle of the heart, while diastole is the relaxation of the heart muscle. Thereby, the BP is maximum during systole and minimum during diastole.
BP measurement is an important part of any clinical procedure and medical examination, as it reflects the physical condition of the body. BP measurements can be carried out using two techniques: direct and indirect. The direct technique, where the measurand is directly accessible, is implemented using an intra-arterial catheter inserted into the blood flow [4]. The indirect (noninvasive) technique is achieved using an external cuff with a sphygmomanometer [5], or can be automated using an oscillometric device or the impedance plethysmography technique [6,7]. Recently, the noninvasive technique has developed remarkably, making it possible to monitor BP outside the operating room with good accuracy and high repeatability.

Literature review
One of the techniques considered for indirect BP estimation is the utilization of features of biosignals or bioelectrical signals (e.g. the photoplethysmogram (PPG) signal and the electrocardiogram (ECG) signal). The PPG technique is used to measure the blood volume changes in the arteries non-invasively [8]. In the literature, several studies have demonstrated the utilization of the PPG signal in BP measurement [8][9][10][11]. These studies can be categorized as based on either a pure PPG signal (single or two sensors) or a PPG signal combined with an ECG signal [10,12,13]. The main objective of these studies is to develop a cuff-less BP monitoring system able to measure BP continuously with high accuracy and low complexity for use in clinical applications. Therefore, various feature extraction and signal processing techniques have been utilized to perform BP estimation. These incorporate the morphological features of the PPG and/or ECG signals [14,15], including amplitude, phase, and time derivatives, together with advances in signal processing techniques such as regression techniques [16], artificial neural networks (ANN) [17], and support vector machines (SVM) [18]. Even though this might improve the BP estimation, determining the significant feature(s) and the appropriate data-representing model to reduce the estimation error and improve the accuracy remains a challenge.
Wang and Lin [19] developed a wearable system for BP measurement utilizing the piezoelectric sensing principle. The estimation results showed a mean absolute error (MAE) of 1.52 mmHg with an SD of 0.3 mmHg for SBP and 1.83 mmHg with an SD of 0.5 mmHg for DBP. Slapnicar et al. proposed a PPG-based system for BP estimation through developing wearable devices [20]. The best results were obtained with the MIMIC hospital database, with a mean absolute error of 4.47 mmHg and 2.02 mmHg for the SBP and DBP, respectively. Additionally, a smartphone-based wearable cuffless noninvasive BP estimation system was proposed by Atef et al. [21]. The system employed integrated PPG and ECG sensors, which were connected to a smartphone via a wireless communication protocol. The proposed system exhibited an error of around 1.36 ± 7.51 mmHg for estimating SBP and around 2.44 ± 3.49 mmHg for estimating DBP.
Deep learning (DL) and regression techniques were also used to estimate BP, utilizing various features of the PPG signal [20]. Ripoll and Vellido proposed a model (Boltzmann Machine Artificial Neural Networks (BMANN)) of titration and pulse transit time (PTT) to estimate BP [22]. Their study showed a slight improvement through PTT regression models, although the performance of the regression model decreased when the measurement deviated slightly from the usual measurement window. Mishra and Thakkar proposed a method for cuffless BP monitoring using ECG and PPG signals [23]. The method was mainly based on calculating the PTT and Pulse Wave Velocity (PWV) features. The results showed that PWV was more robust than PTT for estimating the BP, with estimation errors ranging from -12 mmHg to +11 mmHg for SBP and from -14 mmHg to +10 mmHg for DBP. Esmaili et al. proposed a new method for BP assessment based on PTT and Pulse Arrival Time (PAT) [24]. This method was used for evaluating both SBP and DBP. To extract the PTT and PAT features, both the Phonocardiogram (PCG) and ECG signals were recorded as a reference. The results showed that the correlation coefficients for SBP and DBP using the PTT index were 0.89 and 0.84, respectively. For the PAT index, the correlation coefficients were 0.95 and 0.84 for SBP and DBP, respectively. The Relief algorithm was utilized to reduce the dimension of the PPG features and to determine the significant features. Khalid et al. compared various machine learning algorithms for BP estimation using the PPG signal [25]. The online database was first preprocessed, and the most powerful pulse features (pulse area, pulse rising time, and width 25%) with their BP reference values were used in three types of machine learning algorithms (regression tree, multiple linear regression (MLR), and SVM).
The results show that the regression tree provided the best overall accuracy for SBP (mean and SD of difference: −0.1 ± 6.5 mmHg) and DBP (mean and SD of difference: −0.6 ± 5.2 mmHg). The MLR and SVM achieved an overall mean difference of less than 5 mmHg for both SBP and DBP, but their SD of difference was >8 mmHg. Mousavi et al. proposed a new technique (called whole-based) for estimating the BP using only the PPG signal irrespective of its shape [26]. This method is based on treating the raw PPG signal at predefined time intervals rather than utilizing the time and frequency domain features. It involves performing pre-processing steps for noise removal followed by extracting the features of the PPG signal and finally applying nonlinear regression techniques for BP estimation. The MIMIC-II (Multiparameter Intelligent Monitoring in Intensive Care) database was used as the PPG data source. The results showed a mean error (ME) of 0.187 mmHg for systolic BP (SBP) with a standard deviation (SD) of 4.173 mmHg, while for the diastolic BP (DBP) it showed ME of -0.05 mmHg with SD of 8.9 mmHg. Chen et al. studied the estimation of BP utilizing the pulse transit time and machine learning techniques [13]. The impact of the extracted features on the model performance was considered using the mean impact value method and genetic algorithm. The results showed an error of 3.27 mmHg with an SD of 5.52 mmHg for SBP, while for the DBP it showed an error of 1.16 mmHg with 1.97 mmHg SD.
Due to the advantages of ANNs in handling large amounts of input data and dealing with non-linear correlations between the input features and the output, several recent studies have utilized this technique for BP estimation. The input layer of the neural network consists of a data vector representing the significant features extracted from the PPG signal or other biosignals, while the output layer consists of the estimated parameters (i.e. SBP and DBP) [20]. Park et al. presented a system for estimating BP based on artificial intelligence using a single earlobe PPG during cardiopulmonary resuscitation (CPR) [27]. BP was estimated based on a long short-term memory (LSTM) model. Statistical analyses were performed to compare the proposed method with the gold standard of BP measurement. The results showed that the proposed method has higher BP correlations with the measured BP, and the Root Mean Square Errors (RMSE) of the calculated SBP and DBP were 2.24 ± 1.37 mmHg and 1.90 ± 1.20 mmHg, respectively. Many diseases of the circulatory system change the morphological contours of both the PPG and ECG signals, which makes feature extraction difficult. Therefore, Tanveer and Hasan proposed a method to estimate the BP from ECG and PPG using a waveform-based hierarchical Artificial Neural Network-Long Short Term Memory (ANN-LSTM) [28]. The results showed that the MAE and RMSE were 1.10 and 1.56 mmHg for SBP estimation and 0.58 and 0.85 mmHg for DBP estimation, respectively. Kurylyak and Grimaldi utilized an ANN for BP estimation using the PPG signal [17]. The input of the ANN consists of the heartbeats and 21 features extracted from the PPG signal. The BP estimation results showed better performance when compared with the linear regression technique. The system estimated the SBP with an MAE of 3.80 mmHg and an SD of 3.46 mmHg, and the DBP with an MAE of 2.21 mmHg and an SD of 2.09 mmHg.
Gaurav et al. proposed a smartphone-based system utilizing 46 features of the PPG signal together with the heart-rate variability for BP estimation [29]. These features were fed into an ANN system. The results showed that the DBP and SBP were estimated with an MAE of 3.21 mmHg (SD of 6.85 mmHg) and 4.47 mmHg (SD of 4.72 mmHg), respectively. Xing and Sun presented an optical system for estimating BP based on the PPG signal [30]. The Fast Fourier Transform (FFT) technique was used to extract the significant features after proper pre-processing and normalization steps. An ANN was trained and then tested to provide an estimate of BP. The testing results (over 90 subjects) showed a good estimation accuracy compared with reference measurements, with a difference of -1.67 ± 2.46 mmHg for SBP and -1.29 ± 1.71 mmHg for DBP. However, the distribution of BP in practice is different from that of the samples used in the training stage.
With the continuous increase in the risk of cardiovascular diseases, blood pressure monitoring has become essential for treating these diseases. Although the conventional techniques are simple and painless, they are less comfortable and do not provide continuous monitoring. Additionally, some of these methods (e.g. cuff-based methods) cause turbulence in the bloodstream, which affects the accuracy of the BP measurements and makes them ill-suited for monitoring purposes. In contrast, the invasive techniques used in hospitals are limited in use (i.e. to the intensive care units (ICU)) and need complex operating procedures. Therefore, developing a more comfortable, accurate, and noninvasive monitoring system for BP estimation becomes valuable and effective. In this paper, a non-invasive portable system is developed for accurate BP monitoring based on the PPG signal. The proposed scheme utilizes the optical technique to measure the PPG signal. Several features extracted from the PPG signal, along with signal processing techniques, were utilized as inputs for the BP estimation process. The advantage here is that all the PPG signals were measured with the same designed instrumentation system. Such a system has the potential to estimate BP accurately while benefiting from the non-invasive nature of the optical acquisition technique. This study contributes to the efforts toward cuffless BP monitoring and will assist in providing proper medical treatment and diagnosing cardiovascular diseases.

PPG signal
The PPG signal has recently been considered in the literature as one of the biosignals utilized in blood pressure prediction. It is also used for classifying various heart diseases. The PPG signal is generated optically by monitoring the changes in the light intensity transmitted through the skin. The signal is acquired by detecting the amount of light absorbed by the blood cells, which reflects the blood volume changes in the arteries. The output voltage generated from the PPG sensor is proportional to the amount of blood flowing through the arteries due to the volume changes caused by the pressure pulse.

System design
The BP instrumentation system is designed based on the transmission-mode optical technique for sensing the PPG signal, operated at a specific wavelength (i.e. 880 nm for better penetration depth), and supported by adequate signal processing techniques to perform BP estimation. Figure 1 shows a block diagram of the PPG instrumentation system. It consists of a photo-transmitter (light emitting diode (LED)) and an optical detector (photodiode (PD)) followed by a conditioning circuit. The transmitter emits low-intensity infrared (IR) light and includes a driving circuit that controls the LED operation. At the receiver side, the PD was selected with the highest responsivity at the LED's optical range (880 nm) for a better signal-to-noise ratio (SNR). The current generated by the PD is proportional to the light intensity and is converted into a voltage by a trans-impedance amplifier. This is followed by a low-noise instrumentation amplifier and a non-inverting amplifier, which were optimized for better SNR. Additionally, low DC offset and very high input impedance were considered in the design of the amplifiers. The PPG signal is further processed through band-pass filtration with corner frequencies of 0.05 Hz and 30 Hz. The system is interfaced to a computer using a data acquisition (DAQ) card, with a sampling rate of 1 kS/s, for further signal processing in the LabVIEW and MATLAB environments. Figure 2 shows an image of the PPG instrumentation system. The structure of the sensor was designed utilizing the package of a commercial sensor (only the tongs of the pulse oximeter probe), which was modified to suit the current application (i.e. the light source, the driving circuit, and the detection circuit).
The sensor is adapted to the fingertip shape, ensuring a minimum light interference and a minimum applied pressure to the skin. This guarantees minimum disturbances to the blood flow to achieve a stable and good PPG signal quality.

Data collection
The measurements were performed on subjects with different health conditions (i.e. normal and abnormal), with a total of ≈ 450 subjects. For the abnormal subjects, the data collection was performed in the heart clinic and ICUs at hospitals. The required permissions and Institutional Review Board (IRB) approvals were obtained before the data collection. The consent form of this study was described to the subjects before performing the measurements, and the subjects involved in the study signed it as an agreement to include their anonymous data. The data collection was based on a predesigned protocol with different elements such as age, height, body mass index, gender, heart disease history, and smoking habits. The measurement protocol included all the possible factors and parameters that may affect the BP results and cause a variation in the BP values, in order to generalize the estimation process. The heart rate, DBP, and SBP were measured using a mercury sphygmomanometer (Yamasu-model 600) as a reference immediately before running the PPG signal measurements and on the same hand to which the PPG probe was attached. The measurement procedure was as follows. The subject was asked to lie back and relax with his/her hands forward, which represents a standard way of seating the patients during the measurement. The mercury sphygmomanometer, as the available gold-standard device, was used to perform the reference BP measurements with the same operator for all subjects. Next, the PPG sensor was attached to the index finger of the patient. The hardware was connected to the PC via a DAQ card for data recording and signal processing using the LabVIEW environment. Then, the power supply was turned on and the signal acquisition was started.
The LabVIEW environment was utilized to provide signal acquisition and digitization. In contrast to the literature, where the data were retrieved from an international database [20,31], this study was entirely based on experimental data collected from healthy and unhealthy subjects using our designed system. Figure 3 shows a demonstration of the measurement setup and Fig. 4 shows a block diagram of the LabVIEW interfacing used for PPG signal acquisition and processing. In BP monitoring, providing a comprehensive estimator can be a challenge due to the limited number of datasets (i.e. with wide BP ranges and high accuracy). Certainly, the recurrence of similar signals improves the estimation process, whereas if the variation between subjects is high, the data correlation and its occurrence in the model will be minimal.
However, generalizing the estimation process is required, and it is highly dependent on the spread of the data collected from subjects with different BP ranges and various health conditions. Therefore, the selected datasets should satisfy both conditions to provide a comprehensive BP estimator, which was taken into consideration in this study. Figure 5 shows the distribution of the BP (SBP and DBP) values within the experimental data collected in this study. The signals of all subjects were classified into normal, low, and high blood pressure. The SBP and DBP ranged from 80 to 185 mmHg and from 49 to 123 mmHg, respectively. The signals' features of each class were studied independently for each subject. This process was repeated for all features. A summary of the subjects' demographic data is presented in Table 1.

Signal preprocessing
Figure 6 shows an example of a raw PPG signal waveform. The PPG signal was measured and the raw signal was processed utilizing various preprocessing techniques applied in the digital domain. The PPG signal should be preprocessed carefully to provide acceptable BP estimation accuracy. Figure 7 shows the process performed for BP estimation utilizing the PPG signal. Digital signal processing (DSP) steps, including filtration, smoothing, normalization, and segmentation, were then applied to the measured signals to improve their quality and remove possible artifacts. These steps are critical for the subsequent feature extraction, which is highly sensitive to the characteristics of the signal and is thereby essential in improving the BP estimation process. A digital band-pass filter was used with 0.05 Hz and 30 Hz corner frequencies. The segmentation was used to separate the systolic and diastolic waves of the PPG signal for further processing and feature extraction.
The determination of the dicrotic notch point in each segment of the PPG signal is critical, as various features of the PPG signal rely on it (e.g. the determination of the area under the curve). Differentiating the PPG signal (dPPG/dt) shows the dicrotic notch point as a peak in the resulting signal. In each PPG segment, the index of the absolute peak of the derivative corresponds to the notch position. Figure 7 illustrates the block diagram of the preprocessing procedure applied to the PPG signal.
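The filtering and normalization steps described above can be sketched as follows. This is only a minimal illustration: the corner frequencies (0.05 Hz and 30 Hz) and the 1 kS/s sampling rate come from the text, whereas the filter type and order (a cascade of first-order sections), the min-max normalization, and the function names `bandpass_first_order` and `normalize_min_max` are assumptions made for the example, since the filter design used in the study is not specified.

```python
import math


def bandpass_first_order(samples, fs_hz=1000.0, f_low_hz=0.05, f_high_hz=30.0):
    """Cascade a first-order high-pass and a first-order low-pass filter.

    Corner frequencies and sampling rate follow the text; the filter type
    and order are assumptions made only for this illustration.
    """
    if not samples:
        return []
    dt = 1.0 / fs_hz
    rc_hp = 1.0 / (2.0 * math.pi * f_low_hz)   # time constant of the high-pass
    rc_lp = 1.0 / (2.0 * math.pi * f_high_hz)  # time constant of the low-pass
    beta = rc_hp / (rc_hp + dt)                # high-pass coefficient
    alpha = dt / (rc_lp + dt)                  # low-pass coefficient

    out = []
    prev_x = samples[0]
    hp = 0.0
    lp = 0.0
    for x in samples:
        hp = beta * (hp + x - prev_x)  # removes the baseline / DC offset
        prev_x = x
        lp = lp + alpha * (hp - lp)    # attenuates high-frequency noise
        out.append(lp)
    return out


def normalize_min_max(samples):
    """Scale a segment to the range [0, 1] (one common normalization choice)."""
    lo, hi = min(samples), max(samples)
    if hi == lo:
        return [0.0 for _ in samples]
    return [(s - lo) / (hi - lo) for s in samples]
```

A first-order cascade has a gentle roll-off; a higher-order design would reject out-of-band noise more sharply at the cost of a longer transient.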

Features' extraction
The PPG signal features represent the characteristics of the signal and encode the hidden and complex relationship between the PPG signal and the BP (characterized by the SBP and DBP). These features describe the PPG signal's shape, or the relative relation between the shape's characteristics, together with its statistical representation. Statistical parameters such as the mean and the standard deviation were determined for all of the analyzed features. In this study, the following features were used for BP estimation:
- Systolic amplitude (S.a) and diastolic amplitude (D.a): each segment of the PPG signal is divided into two regions, i.e. systolic and diastolic, defined utilizing the dicrotic notch. In each segment, the S.a represents the difference between the systolic peak and the diastolic peak, while the D.a is the difference between the diastolic peak and the dicrotic notch.
- Systolic area (SA) and diastolic area (DA): the area under the curve of each segment is determined for both the systolic and diastolic regions. Trapezoidal numerical integration is used to extract the systolic area.
- Pulse interval (PI): the time interval between two consecutive diastolic peaks.
- Peak-to-peak interval (PPI): the time interval between two consecutive systolic peaks.
- Time to full-width at half-maximum (FWHM): the time difference between the two points at which the signal equals half of the systolic peak.
- Augmentation index (AI): the ratio of the S.a to the D.a, as a representation of the wave reflection in the arteries.
- Stiffness index (SI): the ratio of the S.a to the DT.
- The ratio of the systolic time to the diastolic time.
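Several of the features listed above can be sketched in a few lines. The example below is a hedged illustration under stated assumptions: each beat is assumed to be already segmented and baseline-corrected, the dicrotic notch index is assumed to be known, and the systolic and diastolic times are taken as the durations of the two regions. The helper names (`trapezoid_area`, `find_peaks`, `peak_to_peak_intervals`, `beat_features`) are hypothetical and do not come from the study.

```python
def trapezoid_area(values, dt):
    """Area under a sampled curve using the trapezoidal rule."""
    return sum((a + b) * 0.5 * dt for a, b in zip(values, values[1:]))


def find_peaks(signal, min_height):
    """Indices of simple local maxima that reach at least min_height."""
    return [
        i for i in range(1, len(signal) - 1)
        if signal[i - 1] < signal[i] >= signal[i + 1] and signal[i] >= min_height
    ]


def peak_to_peak_intervals(peak_indices, fs_hz):
    """Time in seconds between consecutive systolic peaks (PPI)."""
    return [(b - a) / fs_hz for a, b in zip(peak_indices, peak_indices[1:])]


def beat_features(segment, notch_index, fs_hz):
    """Features of one beat, split into two regions at the dicrotic notch."""
    dt = 1.0 / fs_hz
    systolic = segment[:notch_index + 1]
    diastolic = segment[notch_index:]
    sys_peak = max(systolic)
    dia_peak = max(diastolic)
    notch = segment[notch_index]
    s_a = sys_peak - dia_peak  # S.a as defined in the text
    d_a = dia_peak - notch     # D.a as defined in the text
    sys_time = (len(systolic) - 1) * dt   # assumed: duration of systolic region
    dia_time = (len(diastolic) - 1) * dt  # assumed: duration of diastolic region
    return {
        "S.a": s_a,
        "D.a": d_a,
        "SA": trapezoid_area(systolic, dt),
        "DA": trapezoid_area(diastolic, dt),
        "AI": s_a / d_a if d_a else float("nan"),
        "time_ratio": sys_time / dia_time if dia_time else float("nan"),
    }
```

In practice the peak detector would need a refractory period or prominence check to be robust against noise; the simple threshold here is only enough for clean, illustrative data.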

Data analysis
In the estimation process, several features have been extracted from the measured PPG signal. To check the system performance when reducing the computational complexity and eliminating possible redundancies, feature reduction techniques can be applied. The selection of the appropriate features may affect the effectiveness and accuracy of the estimation results. Several techniques can be utilized in the feature selection process for parameter estimation purposes, such as Principal Component Analysis (PCA) and the Genetic Algorithm (GA) [32,33].
The GA technique. The GA was proposed as a tool in feature selection to eliminate the redundancy of data and to retain the most significant features in the estimation process [33]. The GA was employed to find an optimal binary vector, where each bit corresponds to a feature. The inclusion or exclusion of any feature depends on the status of its corresponding bit in the binary vector (1: included, 0: excluded). The resulting optimal vectors are evaluated based on the estimation accuracy obtained on a set of testing data utilizing the K-Nearest Neighbor (KNN) technique [34].
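The binary-vector search described for the GA can be illustrated with a small sketch. It is not the configuration used in the study: the population size, mutation rate, single-point crossover, elitism, the use of KNN as a regressor that averages neighbor targets, and all function names (`knn_predict`, `fitness`, `ga_select`) are assumptions chosen to keep the example short. It expects at least two features.

```python
import random


def knn_predict(train_x, train_y, query, k=3):
    """Predict by averaging the targets of the k nearest training rows."""
    nearest = sorted(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], query)),
    )[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


def fitness(mask, train_x, train_y, test_x, test_y, k=3):
    """Negative MAE of KNN on the test rows, using only features with bit 1."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return float("-inf")  # an empty feature set is never acceptable
    pick = lambda row: [row[j] for j in cols]
    reduced_train = [pick(r) for r in train_x]
    errors = [abs(knn_predict(reduced_train, train_y, pick(r), k) - y)
              for r, y in zip(test_x, test_y)]
    return -sum(errors) / len(errors)


def ga_select(train_x, train_y, test_x, test_y, generations=20,
              pop_size=12, mutation_rate=0.1, seed=0):
    """Return the best binary feature mask found and its fitness."""
    rng = random.Random(seed)
    n = len(train_x[0])
    score = lambda m: fitness(m, train_x, train_y, test_x, test_y)
    # Start from the all-features mask plus random masks.
    population = [[1] * n] + [[rng.randint(0, 1) for _ in range(n)]
                              for _ in range(pop_size - 1)]
    for _ in range(generations):
        ranked = sorted(population, key=score, reverse=True)
        parents = ranked[:pop_size // 2]
        children = [ranked[0][:]]  # elitism: keep the best mask unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, n - 1)  # single-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]   # bit-flip mutation
            children.append(child)
        population = children
    best = max(population, key=score)
    return best, score(best)
```

Because the best mask is carried over unchanged in every generation and the initial population contains the all-features mask, the returned fitness can never be worse than that of using all features.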
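The PCA technique. PCA is a mathematical technique employed to reduce large-dimensional data sets to small-dimensional data sets while maintaining most characteristics of the original data set. Any m × n matrix B can be written as the product of three matrices, i.e. its singular value decomposition [35,36]:
$$B = U \Sigma V^{T}$$
where $U$ is an m × m orthogonal matrix, $\Sigma$ is an m × n diagonal matrix holding the singular values, and $V$ is an n × n orthogonal matrix.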

Evaluation criteria
Two different criteria were used to evaluate the performance of the BP estimation process, namely:
- Mean error (ME) with its SD. The mean of all errors is:
$$\mathrm{ME} = \bar{e} = \frac{1}{n}\sum_{i=1}^{n} e_i, \qquad e_i = x'_i - x_i$$
and
$$\mathrm{SD}_{\mathrm{ME}} = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(e_i - \bar{e}\right)^2}$$
where $x_i$ is the actual (reference) value of BP, $x'_i$ is the corresponding $i$th estimated value of BP, $\bar{e}$ is the average of the errors, and $n$ is the number of performed measurements (i.e. sample size).
- MAE with its SD as:
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|e_i\right|$$
and
$$\mathrm{SD}_{\mathrm{MAE}} = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(\left|e_i\right| - \mathrm{MAE}\right)^2}$$
Additionally, Pearson's correlation and Bland-Altman assessments were performed between the estimated and reference BP values. Finally, international standards were utilized to assess the performance of the proposed system, namely those of the Association for the Advancement of Medical Instrumentation (AAMI) and the British Hypertension Society (BHS) [37,39].
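As a hedged illustration of these criteria, the sketch below computes the ME, the MAE, and their SDs, and checks the commonly cited AAMI limits (mean error within ± 5 mmHg and SD no greater than 8 mmHg) and BHS grades (cumulative percentage of absolute errors within 5, 10, and 15 mmHg). The sign convention (estimated minus reference), the use of the sample SD, and the function names `error_metrics`, `meets_aami`, and `bhs_grade` are assumptions of this example.

```python
import statistics


def error_metrics(reference, estimated):
    """ME, MAE and their SDs; error = estimated - reference (an assumption)."""
    errors = [e - r for r, e in zip(reference, estimated)]
    abs_errors = [abs(e) for e in errors]
    return {
        "ME": statistics.mean(errors),
        "SD_ME": statistics.stdev(errors),        # sample SD (n - 1)
        "MAE": statistics.mean(abs_errors),
        "SD_MAE": statistics.stdev(abs_errors),   # SD of the absolute errors
    }


def meets_aami(me, sd):
    """AAMI criterion: |ME| <= 5 mmHg and SD <= 8 mmHg."""
    return abs(me) <= 5.0 and sd <= 8.0


def bhs_grade(reference, estimated):
    """BHS grade from the cumulative % of absolute errors within 5/10/15 mmHg."""
    abs_errors = [abs(e - r) for r, e in zip(reference, estimated)]
    n = len(abs_errors)
    pct = [100.0 * sum(a <= limit for a in abs_errors) / n
           for limit in (5, 10, 15)]
    thresholds = {"A": (60, 85, 95), "B": (50, 75, 90), "C": (40, 65, 85)}
    for grade, minimums in thresholds.items():
        if all(p >= m for p, m in zip(pct, minimums)):
            return grade
    return "D"
```

Note that the AAMI standard also places requirements on the number of subjects, which this sketch does not check.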

Results and discussions
In the BP estimation approach, the raw data with various conditions and specifications can be converted into features representing the complex patterns and embedded information utilizing prediction models. The target is developing a prediction model able to determine specific parameters with certain accuracy (represented by the degree of certainty), which can be used further for classification and decision-making applications.
The BP estimation process has two steps: training and testing. The system is trained using the input features such that its main parameters and design are optimized to achieve a minimum estimation error. The trained estimation system is then applied to predict the BP of unseen cases.
In this work, four machine learning techniques were implemented to perform the estimation process, namely SVM, Ensemble Tree Regression (ETR), Gaussian Process Regression (GPR), and Artificial Neural Network (ANN). The features extracted from the PPG signal were utilized as the input vector of the machine learning models to predict the SBP and DBP values for the experimentally performed measurements. The total number of cases was ≈ 450, and the data were split into two sets: 60% of the cases were devoted to the training set and 40% to the testing set. The performance evaluation was based on the ME and MAE of the estimated BP together with the SD of these error values. For the optimized SVM, the hyperparameters were a Gaussian kernel with a scale of 32, an epsilon of 0.0076, and the SMO solver. The GPR utilized the squared exponential kernel with a linear basis function. The ANN regression was designed with 10 hidden layers, whereas the ensemble decision tree used the bagging method. The data were standardized before applying the regression procedure. Table 2 shows the results of the BP estimation process when considering all the extracted features as the input vector (i.e. 13 features). The results showed that the estimation accuracy was better for the DBP than for the SBP. Among all the presented ML techniques, the SVM (for the SBP estimation) and the GPR (for the DBP estimation) provided the best estimation performance when using all 13 extracted features. The advantage of the GPR over the other regression algorithms can be related to its nonparametric Bayesian regression approach, which makes it more convenient when dealing with limited datasets. When considering the ME criterion, the estimation results were as good as the results reported in the literature [26,40].
However, we believe that the ME criterion does not reflect the actual error. The ME involves the mean of the errors together with their directions, which can be positive or negative. Thus, in the algebraic summation, positive and negative errors compensate for each other fully or partially. In contrast, the MAE reveals the actual error magnitudes, irrespective of their direction relative to the reference values. Therefore, the MAE was used to evaluate the performance of the estimation process, as it reflects the actual deviation between the estimated values and the reference values.
The results showed that the best estimation error of the SBP was 4.69 ± 6.03 mmHg (MAE ± SD) utilizing the SVM technique, while for the DBP the best performance among all the estimation techniques was 1.53 ± 1.8 mmHg (MAE ± SD) utilizing the GPR technique. Even though these results were obtained from experimental data collected in the hospital using our designed system, they are comparable with, and to our knowledge better than, many other results reported recently in the literature [26,41]. It should be highlighted that many studies have used online datasets with a huge number of records to perform BP estimation; achieving this performance with a limited number of experimental data is therefore considered an advantage of our system.

With data reduction
Feature reduction techniques can be utilized to improve the accuracy of the estimation process and reduce the estimation complexity. This is attained by removing the data redundancy and considering only the significant features of the PPG signal. In this study, the 13 features were used as the dataset applied to two data reduction techniques (namely the PCA and GA techniques) for selecting the significant features. Table 3 shows the SBP and DBP estimation results utilizing several estimation techniques while applying the GA data reduction technique. For both SBP and DBP, the optimal number of significant features was found to be 9. The results showed that the estimation performance and accuracy did not improve significantly when applying data reduction techniques. With respect to the number of significant features, the GA results cannot be further improved, as they represent the optimal solution. To find the number of significant features with the best BP estimation, the PCA technique was utilized in the data reduction process with different numbers of components. Figure 8 shows the performance of the different estimation techniques while varying the number of features utilizing the PCA technique. The results showed that with 2 PCA components the GPR technique had the best performance (MAE ± SD), with 4.82 ± 3.49 mmHg for the SBP and 1.37 ± 1.65 mmHg for the DBP. Table 4 shows the performance of the BP estimation with different numbers of PCA components utilizing the GPR estimation technique. When compared with the results of BP estimation (both SBP and DBP) without data reduction, the PCA technique with 2 components showed comparable performance. The DBP estimation did not show any performance improvement when increasing the number of PCA components above 2. In this study, the results indicate that several features of the PPG signal embed a hidden relationship with the BP, with different weights.
Specifically, the area under the curve and the heart rate significantly influenced the BP estimation. This is indicated by the BP estimation results, in that utilizing two features (namely the area under the curve and the heart rate) gave performance comparable to that obtained using all 13 features (see Table 4 and Fig. 8). Generally, for each pulse with fixed vascular impedance, the pressure change is associated with the rate of change in blood volume (i.e. the blood flow and thereby the cardiac output) [42]. Based on the similarity between the shape of the PPG signal and the pressure wave [43], the area under the curve can be related to the cardiac output. Thereby, the variation in the area under the curve could serve as a surrogate for the blood pressure. Figure 9 shows the relationship between the estimated and actual values of the SBP and DBP for both the training data and the test data. To evaluate the agreement between the estimated BP values (i.e. SBP and DBP) and the corresponding reference values, a linear fit was utilized together with the Pearson correlation coefficient [44]. The results showed that, utilizing the GPR technique, the Pearson correlation coefficient was ≈ 0.99 for both the SBP and DBP, with a regression coefficient of 0.77 for the SBP, indicating a highly linear relationship and thereby an extremely strong correlation between the actual and the predicted values. For the test data and training data of the DBP, the Pearson correlation coefficient was found to be ≈ 0.99, with a regression coefficient of 0.95. This indicates that the model was able to estimate the DBP better than the SBP. However, regression graphs alone are insufficient to evaluate the performance of the estimation process in terms of the closeness of the predicted BP to the reference values. Figure 10 shows the distribution of the MAE of the BP values for the GPR estimation technique with 2 PCA components.
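As a minimal sketch of the regression analysis above, the following Python snippet computes the Pearson correlation coefficient and the slope of a least-squares line fit. It assumes that the "regression coefficient" refers to this slope, which the text does not state explicitly; the function name and the sample readings are hypothetical and are not data from the study.

```python
import math


def pearson_and_slope(actual, predicted):
    """Return (Pearson r, least-squares slope) of predicted vs. actual values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    # Sums of centred products and squares
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    r = cov / math.sqrt(var_a * var_p)
    slope = cov / var_a  # slope of: predicted = slope * actual + intercept
    return r, slope


# Hypothetical SBP readings in mmHg (illustrative only)
actual = [110.0, 120.0, 130.0, 140.0, 150.0]
predicted = [114.0, 122.0, 130.0, 138.0, 146.0]
r, slope = pearson_and_slope(actual, predicted)
print(f"r = {r:.2f}, slope = {slope:.2f}")
```

With these illustrative values the correlation is perfect while the slope is 0.8, which shows that a correlation coefficient close to 1 and a regression coefficient below 1 can coexist.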
Fig. 9. The performance of the BP estimation model utilizing the regression graphs using the correlation between the estimated BP and its corresponding reference value for both the SBP and DBP.
The results show that the MAE is normally distributed around the mean values of the SBP and DBP. The MAE increases as the BP values deviate from the nominal mean BP value (i.e. ≈ 130 mmHg for the SBP and 80 mmHg for the DBP; see Table 1). Compared with the SBP, the spread of the MAE for the DBP is smaller, indicating a better estimation accuracy. Compared with the data presented in Fig. 4, the estimation process deteriorates where the number of measurements is small. This is also shown in Fig. 11, which represents the relative error of the BP estimation for both the SBP and DBP and indicates the goodness of the estimation relative to the actual BP value. The SBP estimation was comparable to the DBP estimation, in that 95.6% of the SBP values and 98.5% of the DBP values achieved a relative error of ≤ 10%, with a minimum relative error around the mean values, while 70% of the SBP values and 94% of the DBP values achieved relative errors of ≤ 5%. Therefore, increasing the number of measurements should improve the estimation results, especially for the SBP. To assess the correspondence between the estimated BP values (i.e. SBP and DBP) and the reference (actual) values, the Bland-Altman method was utilized [44]. The Bland-Altman method is a statistical method utilized to determine the agreement between two different quantitative measurements. The aim is to estimate a range in which most of the scattered data are located within ± 2σ of the mean value. If 95% of the data lie within this confidence interval, then the two measurements agree with each other (in this case, the estimated and the actual BP values have a strong agreement).
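The Bland-Altman procedure described above can be sketched in a few lines of Python. This is an illustrative sketch rather than the analysis code used in the study: the function name, the returned keys, and the sample readings are hypothetical, and the limits use the ± 2σ convention quoted in the text (1.96σ is also common).

```python
import statistics


def bland_altman(actual, estimated, k=2.0):
    """Return the bias, limits of agreement, and fraction of points inside them."""
    diffs = [e - a for a, e in zip(actual, estimated)]        # y-axis of the plot
    means = [(a + e) / 2.0 for a, e in zip(actual, estimated)]  # x-axis of the plot
    bias = statistics.mean(diffs)   # mean difference (ME)
    sd = statistics.stdev(diffs)    # sample SD of the differences
    lower, upper = bias - k * sd, bias + k * sd
    inside = sum(lower <= d <= upper for d in diffs) / len(diffs)
    return {"means": means, "diffs": diffs, "bias": bias,
            "lower": lower, "upper": upper, "inside": inside}


# Hypothetical SBP readings in mmHg (illustrative only)
result = bland_altman([120.0, 130.0, 140.0, 150.0], [122.0, 128.0, 141.0, 149.0])
print(result["bias"], result["lower"], result["upper"], result["inside"])
```

Plotting `diffs` against `means` with horizontal lines at `bias`, `lower`, and `upper` gives the usual Bland-Altman diagram.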
Figure 12 shows the Bland-Altman scatter plot of the differences against the averages of the actual and estimated SBP and DBP. Three horizontal lines are also shown in the graph, representing the mean difference and the limits of agreement (confidence interval). The graph shows that most of the scattered data are located around the mean value and within the 95% confidence interval. For the SBP, the ME was ≈ 0 mmHg with ± 11.67 mmHg as the upper and lower interval limits. For the DBP, the ME was 0.08 mmHg with ± 3.48 mmHg as the upper and lower interval limits. This indicates a strong agreement between the actual and the predicted values and supports the certainty of the BP estimation results. According to the international standards for the evaluation of BP measurements, specific criteria have to be met for the results to be considered satisfactory. Examples of these international standards are the AAMI and the BHS [37-39]. The AAMI requires BP measurements with an MAE lower than 5 mmHg and an SD lower than 8 mmHg [39]. The results show that our designed system, together with the processing techniques utilized in this study, is able to estimate both the SBP and DBP with good accuracy and to comply with the AAMI standards for both the accuracy of the measurements and the number of subjects. The MAE was 4.82 mmHg for the SBP with an SD of ≈ 3.5 mmHg, while for the DBP it was 1.37 mmHg with an SD of 1.65 mmHg, with a total of ≈ 450 cases. This fulfills the AAMI standards concerning the number of subjects and the mean and SD values. Table 5 shows an evaluation of the obtained results in accordance with the AAMI standards.
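The AAMI check quoted above reduces to two threshold comparisons. The sketch below, with a hypothetical helper name, applies the limits as stated in this section (MAE below 5 mmHg, SD below 8 mmHg) to the MAE ± SD values reported for the GPR estimator with 2 PCA components. It does not cover the requirement on the number of subjects.

```python
def meets_aami(mae_mmhg, sd_mmhg, mae_limit=5.0, sd_limit=8.0):
    """Check the accuracy limits as quoted in the text (not the full standard)."""
    return mae_mmhg < mae_limit and sd_mmhg < sd_limit


# (MAE, SD) in mmHg as reported for GPR with 2 PCA components
reported = {"SBP": (4.82, 3.49), "DBP": (1.37, 1.65)}
results = {name: meets_aami(mae, sd) for name, (mae, sd) in reported.items()}
print(results)
```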
The accuracy of the measurements has also been assessed according to the BHS standards. Based on the BHS criteria, the accuracy of the measurements is graded (as Grade A, B, or C) according to the percentages of the BP measurement errors bounded within 5 mmHg, 10 mmHg, and 15 mmHg. Table 6 shows an assessment of the obtained results according to the BHS criteria. In this study, and according to the BHS standards, the DBP measurements achieved Grade A and the SBP measurements achieved Grade B. In total, 98% of the DBP measurements were within 5 mmHg, while for the SBP it was 58% of the measurements. Although the obtained SBP is categorized as Grade B, its percentage within 5 mmHg falls only slightly below the acceptable limit of Grade A (≈ 60%). Additionally, for the other categories (i.e. < 10 mmHg and < 15 mmHg) the SBP falls within the Grade A category with 90.9% and 98.8%, respectively. Generally, the BP is strongly affected by the condition of the arteries and the properties of the blood as a fluid. However, the procedure of this study involves developing a model to represent the BP with various PPG signal features. This model was verified on various subjects with different health conditions, and therefore different blood pressure ranges, utilizing the same measurement procedure. The goodness, robustness, and comprehensiveness of the model are strongly affected by the training data (i.e. the properties and variations in the health conditions of the subjects). Therefore, standardizing the measurement protocol and procedure reduces imperfections due to variations between subjects or even within the same subject.
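The BHS grading can be sketched as a lookup against cumulative percentage limits. The limits below are the commonly cited BHS protocol thresholds (for example 60%, 85%, and 95% within 5, 10, and 15 mmHg for Grade A); the text itself only quotes the ≈ 60% limit, so the remaining values and the function names should be treated as assumptions of this sketch.

```python
# Minimum cumulative percentages of absolute errors within 5, 10, and 15 mmHg
# (assumed from the commonly cited BHS protocol, not taken from this paper)
BHS_LIMITS = {
    "A": (60.0, 85.0, 95.0),
    "B": (50.0, 75.0, 90.0),
    "C": (40.0, 65.0, 85.0),
}


def cumulative_percentages(abs_errors):
    """Percentage of absolute errors within 5, 10, and 15 mmHg."""
    n = len(abs_errors)
    return tuple(100.0 * sum(e <= t for e in abs_errors) / n for t in (5, 10, 15))


def bhs_grade(pct_5, pct_10, pct_15):
    """Return the best grade whose three limits are all satisfied, else 'D'."""
    for grade, (lim5, lim10, lim15) in BHS_LIMITS.items():
        if pct_5 >= lim5 and pct_10 >= lim10 and pct_15 >= lim15:
            return grade
    return "D"


# SBP percentages reported in the text: 58%, 90.9%, and 98.8%
sbp_grade = bhs_grade(58.0, 90.9, 98.8)
print(sbp_grade)
```

Because all three limits of a grade must hold at once, the SBP is assigned Grade B here even though its 10 mmHg and 15 mmHg percentages satisfy the Grade A limits.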
Due to the large variations in the assessment methods, types of datasets, and numbers of subjects, it was a challenge to compare the absolute results of the works proposed in the literature. Although a tremendous number of studies discuss BP estimation, very few utilized only the PPG signal. Therefore, the performance of the designed system was compared with the corresponding studies in the literature based on different criteria, including the type of dataset, the biosignals utilized in the BP estimation, statistical estimation parameters such as the ME and MAE together with their SD, and the consistency of the results with the BHS and AAMI international gold standards for BP. Moreover, a large number of studies claimed high estimation accuracy based on low ME values and used these values to compare their findings with the international gold standards. We believe that the use of the ME for evaluation is misleading, because the ME can be small irrespective of the individual error values (i.e. the error values on both sides of the mean compensate for each other).
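The argument about the ME can be illustrated with a short Python example. The error values below are hypothetical and chosen only to show how positive and negative errors cancel in the ME but not in the MAE.

```python
def mean_error(errors):
    """Signed mean error (ME): positive and negative errors cancel."""
    return sum(errors) / len(errors)


def mean_absolute_error(errors):
    """Mean absolute error (MAE): every error counts with its magnitude."""
    return sum(abs(e) for e in errors) / len(errors)


# Hypothetical estimation errors in mmHg (estimated minus actual)
errors = [-12.0, 12.0, -8.0, 8.0]
me = mean_error(errors)
mae = mean_absolute_error(errors)
print(f"ME = {me:.1f} mmHg, MAE = {mae:.1f} mmHg")
```

Here the ME is 0 mmHg although every individual estimate is off by at least 8 mmHg, while the MAE of 10 mmHg reflects the actual error magnitude.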
A common source of data in the related work is an online subset of the MIMIC database, with an unspecified or varying number of subjects and a varying dataset size. However, the details of the selected subsets were usually not specified, even though the selection can affect the estimation process and the coherence of the dataset influences the reported model performance. For example, the subjects may vary in their health conditions and thereby in the shape of the PPG signal and the ranges of the BP values. This might impede the estimation process, as its performance deteriorates with wide-ranging, uncorrelated BP values, especially for estimators that depend mainly on shape features extracted from the PPG signal. The fact that most of our measurements were collected in the heart clinic or the ICU at hospitals is a challenge, because most of the subjects were under medication. This may cause BP variations and degrade the estimation process. We believe that the BP instrumentation system designed and implemented in this work, supported by the BP estimation results, can be a suitable tool for BP monitoring. Overall, and compared with other studies in the literature, the results obtained here are promising in terms of accuracy, taking into consideration the type of dataset, the number of features, and the number of utilized biosignals.
As noted above, most of the data were collected in the heart clinic and ICU sections of hospitals, so including more healthy cases would improve the results. However, this is not the target of the study, as the purpose is to design a comfortable, portable, noninvasive BP monitoring system for unhealthy subjects. Additionally, only a limited number of systems exist that can be considered comprehensive instrumentation systems able to run the PPG measurements, apply signal processing, and afterward perform BP estimation. All the PPG signals were measured using the same optical design and the same instrumentation system. This helps to remove variations in the system response caused by the use of different hardware, which may be present in database-based studies but not in this study. Table 7 presents a comparison between the results obtained in this work and the results of other works under various conditions.

Conclusions
In this paper, we proposed a complete and comprehensive instrumentation system to estimate blood pressure (BP) noninvasively. The target is to develop a portable BP monitoring system able to estimate the BP continuously and noninvasively utilizing the PPG signal, for clinical and home-use applications. Utilizing clinical-based protocols, ≈ 450 cases with various health conditions contributed to this study. The signals were processed and the features representing the PPG signal were extracted. The impact of the number of features on the estimation model was analyzed utilizing several data reduction techniques. The focus was to provide an accurate estimation of the BP (i.e. systolic (SBP) and diastolic (DBP) blood pressure) using machine learning techniques. The results showed a good estimation accuracy for both the SBP and DBP, which was evaluated using various statistical techniques and international standards such as the BHS and AAMI standards. Such a study can be considered a further step towards developing a stand-alone wearable system that is able to monitor the BP and classify various heart arrhythmias. A larger number of subjects and the selection of optimal features are needed to improve the estimation accuracy, reliability, robustness, and long-term characteristics of the system; this will be our focus in future studies.