ENHANCED HYPERTENSION CLASSIFIER BASED ON PHOTOPLETHYSMOGRAM SIGNAL USING STATISTICAL ANALYSIS AND EXTREME LEARNING MACHINE METHOD

Hypertension prevalence is known to increase with urbanization and an ageing population, and the combination of the two can have a compounding effect. As populations age in urban areas, the risk of developing hypertension rises because of both lifestyle factors and physiological changes. This has significant public health implications, as hypertension is a major risk factor for cardiovascular disease, stroke, and kidney disease. The aim of this study is to establish an operator-independent screening technique with reliable accuracy for classifying hypertensive subjects using the finger photoplethysmogram signal. In achieving the targeted


INTRODUCTION
Blood pressure (BP) is one of the most commonly measured clinical parameters and a key determinant in the cardiovascular circulatory system. Hypertension is the leading cause of cardiovascular disease (CVD) and premature death worldwide. Based on data compiled by the WHO [1], 26.4% of the world's population has hypertension, and 60% of those affected live in developing countries, including Indonesia. More than 75% of countries experienced an increase in hypertension-related CVD, from 48.2% during 2000-2010 to 76.2% in 2010-2019 [2]. According to a national survey conducted by the Indonesian Ministry of Health in 2018, the prevalence of hypertension among adults aged 18 years and above in Indonesia was 34.1% [3]. This means that more than one-third of Indonesian adults have high blood pressure, which is a significant public health concern. The prevalence of hypertension varies by age group, with higher rates among older adults [4]. Among adults aged 60 years and above, the prevalence was 67.8%, while among those aged 18-29 years it was 10.4% [3]. Several factors contribute to the high prevalence of hypertension among Indonesians, including a diet high in salt and saturated fat together with high tobacco and alcohol consumption [4]. With rapid urbanization, aging, and increasing life expectancy, the Indonesian population is at higher risk of developing hypertension, and the prevalence is likely to continue to rise. Compounding all of the above factors are a lack of awareness of blood pressure monitoring and limited access to healthcare services for early detection and management of hypertension.
Increased awareness and screening for hypertension can help reduce the prevalence of hypertension in Indonesia and improve overall health outcomes.
Hypertension is a condition of elevated blood pressure in which the systolic and diastolic pressures exceed normal limits, that is, more than 130 mmHg and more than 80 mmHg, respectively [5,6]. This makes accurate blood pressure measurement essential for the diagnosis and management of hypertension [7]. Blood pressure arises from the thrust of blood pumped by the heart against the resistance of the microcirculation system. High blood pressure forces the heart to work harder to pump blood. As a result, the burden on the heart increases, and if this persists in the long term it raises the risk of cardiovascular disease [8].
This makes blood pressure level one of the indicators used in diagnosing various cardiovascular conditions, such as hypertension, stroke, and heart failure [9].
In general, blood pressure is measured by brachial artery auscultation, using a stethoscope and sphygmomanometer to detect the appearance and disappearance of Korotkoff sounds. This conventional method is prone to measurement errors caused by patient movement, device calibration, and procedural errors [7]. It also relies on a mercury or digital tensiometer, which uses materials that are not environmentally friendly, and it causes discomfort for patients, so measurements cannot be made continuously [10]. Therefore, a blood pressure measurement method is needed that can be carried out continuously and is more comfortable for patients.
In recent years, photoplethysmography (PPG) has been offered as an alternative. It is a cuffless cardiac monitoring technique that can be performed continuously [11]. PPG is an optical vascular measurement technique that detects changes in blood volume in the microvascular layer of the target tissue [12]. The technique uses infrared light and is noninvasive and painless for patients [13]. PPG also carries sufficient physiological information about cardiovascular blood circulation to make it an effective technique for diagnosing several CVDs. In addition, PPG can be combined with recent technologies such as the Internet of Things and biosensors [14]. This makes PPG an alternative for blood pressure measurement in the early detection of hypertension.
Various studies using PPG have been conducted. Liang et al. performed PPG-based blood pressure classification using a deep learning method that combines the Continuous Wavelet Transform (CWT) and a CNN [15]; the F-scores for the three classifications were 80.52%, 92.55%, and 82.95%. Zhang et al. predicted blood pressure levels with the Gradient Boosting Decision Tree method, which yielded an accuracy above 70% for systolic pressure and above 64% for diastolic pressure [16]. Tjahjadi et al. used KNN to classify blood pressure into three classes and obtained 83.34%, 94.84%, and 88.49%, but the determination process took a long time [17]. It has also been reported that the ELM method has high specificity, along with a shorter learning time and fewer features than other methods [19]. This result is supported by Chy et al., who compared the performance of KNN, SVM, and ELM in classifying an object, with ELM producing the best accuracy, 87.73%, and an F-score of 91.30% [20].
Extreme Learning Machine (ELM) is a method based on single hidden layer feedforward networks (SLFNs), created to overcome the slow learning speed of feedforward artificial neural networks. This makes the ELM method easy to operate and helps prevent overfitting. ELM also learns quickly and generalizes better than KNN and SVM [20]. Considering these capabilities of ELM, this study classifies blood pressure levels using PPG signals, with the aim of providing a reliable, accurate, and effective tool for the early detection of hypertension.

MATERIALS AND METHOD
The study used photoplethysmography (PPG) data, which is secondary data obtained from the Ethics Committee of Universiti Kebangsaan Malaysia Medical Center. The PPG data cover 57 subjects: 30 healthy subjects (7 males and 23 females) and 27 hypertensive subjects (3 males and 24 females). None of the subjects had a history of diabetes mellitus or chronic inflammatory disease, which is intended to minimize interference from the initial lesion. Table 1 shows the parameters for all the subjects used in this study. The denoised PPG data went through several stages of processing before finally being classified into hypertension and non-hypertension categories. The main classification method used in this research is Extreme Learning Machine (ELM). Figure 1 shows the stages of the research conducted in this study.
The next feature is peak analysis, which determines the locations of the peaks that indicate the systolic and diastolic phases, the widths of the systolic and diastolic peaks, and the prominences of the systolic and diastolic peaks. These values are later used to determine the systolic and diastolic peak values, and the prominence is used to find the RPTT value, as illustrated in Figure 3.
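The peak analysis step can be illustrated with a short sketch. This is not the study's implementation. It assumes SciPy's `find_peaks` as the peak detector, a synthetic two-wave pulse in place of a real PPG beat, and a hypothetical prominence threshold of 0.1. Treating the taller first peak as systolic and the smaller second peak as diastolic is also an assumption of the sketch.

```python
import numpy as np
from scipy.signal import find_peaks


def synthetic_pulse(n_samples=1000):
    """One hypothetical PPG-like beat: a tall first wave plus a smaller second wave."""
    t = np.linspace(0.0, 1.0, n_samples, endpoint=False)
    first = np.exp(-((t - 0.25) / 0.05) ** 2)          # stands in for the systolic wave
    second = 0.5 * np.exp(-((t - 0.55) / 0.07) ** 2)   # stands in for the diastolic wave
    return t, first + second


def peak_features(signal, min_prominence=0.1):
    """Return location, prominence, and width of each sufficiently prominent peak."""
    # Passing `prominence` and `width` makes find_peaks report both properties.
    # Widths are measured in samples at half of each peak's prominence.
    peaks, props = find_peaks(signal, prominence=min_prominence, width=1)
    return {
        "locations": peaks,
        "prominences": props["prominences"],
        "widths": props["widths"],
    }


t, pulse = synthetic_pulse()
features = peak_features(pulse)
for loc, prom, width in zip(features["locations"],
                            features["prominences"],
                            features["widths"]):
    print(f"peak at sample {loc}: prominence={prom:.3f}, width={width:.1f} samples")
```

On a real recording, the sample indices would be converted to time using the sampling rate, and the threshold would need tuning to the signal's amplitude and noise level.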

c. Classification using Extreme Learning Machine Method
The features obtained from feature extraction are used as input to the classification process. Unlike other neural networks, ELM does not train its input weights or biases. Instead, the input weights and biases are selected randomly, and the network uses nodes that provide the maximum output value, resulting in fast learning speed and good generalization performance [21]. If the activation function is infinitely differentiable, the hidden layer output matrix can be determined and provides an approximation of the target values that is as good as desired. The structure of the ELM is shown in Figure 4. Based on this structure, the ELM can be modeled mathematically with $\tilde{N}$ as the number of hidden layer nodes and $g(x)$ as the activation function [23]. With $w_i$ the vector of weights connecting the input nodes to the $i$-th hidden node, $\beta_i$ the vector of weights connecting the $i$-th hidden node to the output nodes, $b_i$ the threshold of the $i$-th hidden node, and $w_i \cdot x_j$ the inner product of $w_i$ and $x_j$, the mathematical model is given by equation 2, where $o_j$ is the network output for input $x_j$ and $N$ is the number of samples:

$$\sum_{i=1}^{\tilde{N}} \beta_i \, g(w_i \cdot x_j + b_i) = o_j, \quad j = 1, \dots, N \qquad (2)$$
An SLFN with $\tilde{N}$ hidden nodes and an activation function $g(x)$ can approximate the $N$ samples with zero error, in the sense that $\sum_{j=1}^{N} \| o_j - t_j \| = 0$, where $o_j$ is the network output and $t_j$ is the target for input $x_j$. That is, there are assumed to exist $\beta_i$, $w_i$, and $b_i$ such that equation 3 holds:

$$H \beta = T \qquad (3)$$

Here $H$ is the hidden layer output matrix, whose entry $g(w_i \cdot x_j + b_i)$ is the output of the $i$-th hidden neuron for input $x_j$, $\beta$ is the matrix of output weights, and $T$ is the matrix of targets. $H^{+}$ is the Moore-Penrose pseudo-inverse of $H$, which is used because the dimensions of the input matrix differ from those of the hidden layer, so $H$ is in general not square. Equation (4) can be written as follows:

$$\beta = H^{+} T \qquad (4)$$

This gives the output weights ($\beta$) between the hidden layer and the output layer. The prediction result ($Y$) is obtained by multiplying the hidden layer output matrix by the output weights, using equation (5):

$$Y = H \beta \qquad (5)$$
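The ELM procedure described above (random input weights and biases, a hidden layer output matrix H, output weights from the Moore-Penrose pseudo-inverse, and prediction as the product of H and β) can be sketched in a few lines of NumPy. This is a minimal illustration, not the study's code. The sigmoid activation, the uniform(-1, 1) initialization range, the 0.5 decision threshold, and the two-feature toy data are all assumptions made for the example.

```python
import numpy as np


def sigmoid(z):
    """Logistic activation g(z); it is infinitely differentiable."""
    return 1.0 / (1.0 + np.exp(-z))


def elm_train(X, T, n_hidden, rng):
    """Fit an ELM: random hidden layer, least-squares output weights."""
    n_features = X.shape[1]
    # Input weights and biases are drawn once and never trained.
    # The uniform(-1, 1) range is an assumption of this sketch.
    W = rng.uniform(-1.0, 1.0, size=(n_features, n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    H = sigmoid(X @ W + b)          # hidden layer output matrix, shape (N, n_hidden)
    beta = np.linalg.pinv(H) @ T    # beta = H^+ T
    return W, b, beta


def elm_predict(X, W, b, beta):
    """Compute Y = H beta for new inputs."""
    return sigmoid(X @ W + b) @ beta


def make_toy_data(n_per_class, rng):
    """Hypothetical data: class 0 clustered near (-2, -2), class 1 near (2, 2)."""
    x0 = rng.normal(-2.0, 0.5, size=(n_per_class, 2))
    x1 = rng.normal(2.0, 0.5, size=(n_per_class, 2))
    X = np.vstack([x0, x1])
    T = np.concatenate([np.zeros(n_per_class), np.ones(n_per_class)])
    return X, T


rng = np.random.default_rng(0)
X_train, T_train = make_toy_data(20, rng)
X_test, T_test = make_toy_data(10, rng)

W, b, beta = elm_train(X_train, T_train, n_hidden=20, rng=rng)

# Threshold the continuous output at 0.5 to obtain a class label.
train_acc = np.mean((elm_predict(X_train, W, b, beta) > 0.5) == (T_train > 0.5))
test_acc = np.mean((elm_predict(X_test, W, b, beta) > 0.5) == (T_test > 0.5))
print(f"train accuracy: {train_acc:.2f}, test accuracy: {test_acc:.2f}")
```

Because the only fitted quantity is β, training reduces to a single pseudo-inverse, which is where the fast learning speed comes from. The cost of that pseudo-inverse grows with the number of hidden nodes.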
d. Data analysis
Data analysis was carried out to identify the features that give the highest accuracy when used with the ELM method and that are therefore most effective in classifying hypertension. Apart from accuracy, the results were also evaluated for sensitivity and specificity based on equations 6 and 7.
An appropriate pre-processing approach, especially segmentation, is an important step in PPG signal analysis and is used in various applications, such as heart rate monitoring, blood pressure estimation, and sleep analysis, as illustrated in Figure 8.
The training process is performed by learning iteratively until the optimal parameter values are reached; these values are then used as inputs to the testing process. The saved file contains information about the input properties, the output layer neurons, the hidden layer neurons, and the weights and biases from the ELM training process. The testing process is then performed on the test data using the parameters obtained from training. This process is repeated for each feature input: skewness, peak analysis, and the combination of skewness and peak analysis.
The ELM training results shown in Figures 9 and 10 indicate that a larger number of hidden neurons leads to higher accuracy. This is because more hidden neurons increase the choice of suitable weights and biases, which ultimately affects the level of accuracy. With 1500 hidden neurons, the combination of skewness and peak analysis as input is the most accurate, reaching up to 91.46%, compared with peak analysis or skewness alone. With 10 hidden neurons, on the other hand, the highest accuracy is 81.18%, again for the combination of skewness and peak analysis. However, when training and test accuracy are considered together, the highest value (89.66%) is obtained with 1000 hidden neurons and combined skewness and peak analysis as input.
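As a concrete reading of the accuracy, sensitivity, and specificity measures used in the data analysis, the sketch below computes them from predicted and true labels. It assumes the usual confusion-matrix definitions, sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP), with the hypertensive class coded as 1. The label vectors are made up for illustration and are not the study's data.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            tp += 1
        elif truth != positive and pred != positive:
            tn += 1
        elif truth != positive and pred == positive:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def screening_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity, and specificity from label sequences."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        # Sensitivity: share of hypertensive subjects correctly flagged.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Specificity: share of healthy subjects correctly cleared.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Hypothetical labels: 1 = hypertensive, 0 = healthy.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
metrics = screening_metrics(y_true, y_pred)
print(metrics)
```

For a screening tool, sensitivity is usually the figure to watch, since a missed hypertensive subject is costlier than a false alarm that a follow-up cuff measurement can rule out.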
In all three configurations, the accuracy of the training and testing phases differed noticeably, which is likely due to the limited amount of data used in the two phases. Most machine learning models become more accurate as more data is used for training.
In addition, the accuracy values obtained by combining skewness and peak analysis are consistently better than those of either single feature. The more features are used, the higher the accuracy, mainly because more features increase the options available to the ELM process.
The results also show that the higher the number of hidden neurons, the longer the process takes. This is because more hidden neurons mean more weights and biases to choose from. Overall, the classification performed with ELM shows considerably better performance than some methods previously used to classify hypertension using PPG data, as shown in Table 2.

CONCLUSIONS
This study uses ELM with three feature inputs: skewness, peak analysis, and combined skewness and peak analysis. With 1500 hidden neurons, the ELM training results reach an accuracy of up to 91.46% when skewness and peak analysis are combined as inputs, compared with peak analysis or skewness alone. A larger training data set should be considered, as there is still a difference in accuracy between training and testing.
Moreover, the accuracy values obtained by combining skewness and peak analysis are consistently better than those of the other feature inputs, showing that a larger number of features improves ELM accuracy. Overall, the classification performed using ELM shows an improvement over methods previously used to classify hypertension from PPG data. A significant advantage of the proposed model is its ability to produce higher accuracy with a smaller data set, which is a meaningful contribution for underdeveloped and developing countries that have yet to build and establish their healthcare repositories.

CONFLICT OF INTERESTS
The author(s) declare that there is no conflict of interests.