Hypertension Diagnosis Index for Discrimination of High-Risk Hypertension ECG Signals Using Optimal Orthogonal Wavelet Filter Bank

Hypertension (HT) is an abnormal rise in blood pressure that can lead to stroke, kidney disease, and heart attack. HT shows no symptoms at the early stage but can lead to various cardiovascular diseases; hence, it is essential to identify it early. Visual analysis of electrocardiogram (ECG) signals is tedious because of their low amplitude and small bandwidth. To avoid possible human errors in the diagnosis of HT patients, an automated ECG-based system is developed. This paper proposes the automated discrimination of low-risk hypertension (LRHT) and high-risk hypertension (HRHT) using ECG signals with an optimal orthogonal wavelet filter bank (OWFB). The HRHT class comprises ECG signals of patients with myocardial infarction, stroke, and syncope. The ECG data are acquired from PhysioNet's Smart Health for Assessing the Risk of Events via ECG (SHAREE) database, which contains recordings of 139 subjects. First, ECG signals are segmented into epochs of 5 min. The segmented epochs are then decomposed into six wavelet sub-bands (WSBs) using the OWFB. We extract the signal fractal dimension (SFD) and log-energy (LOGE) features from all six WSBs. Using Student's t-test ranking, we choose the highest-ranked WSBs of the LOGE and SFD features. We develop a novel hypertension diagnosis index (HDI) from the two feature types (SFD and LOGE) to discriminate the LRHT and HRHT classes using a single numeric value. The performance of the developed system is encouraging, and we believe that it can be employed in intensive care units to monitor abrupt rises in blood pressure while screening ECG signals, provided it is first tested on an extensive independent database.


Introduction
High blood pressure, or hypertension (HT), is a severe disease that produces no symptoms in its early stages. Low awareness and a lack of proper treatment make it more harmful for hypertensive patients and increase the likelihood of cardiovascular diseases. The number of deaths due to hypertension has increased [1]. As per the 2005 global data, 20.6% of males and 20.9% of females in India were suffering from hypertension. These figures are expected to rise to 22.9% (male) and 23.6% (female). The current survey shows the prevalence of hypertension in rural and urban India to be 25% and 10%, respectively. Only 25% of hypertension patients have their blood pressure (BP) under control after treatment [2]. BP is the pressure exerted by the blood against the walls of the arteries. It depends on the work done by the heart and the resistance of the blood vessels [2]. The possible causes of hypertension include low physical activity, lifestyle, smoking, stress, family history, and kidney disease [1]. Hence, it is crucial to improve awareness, medical care, and treatment for hypertension. Clinically, hypertension can be classified into mild, moderate, and severe states [3], and it is important to identify its severity. The ranges of normal and hypertensive blood pressure are given in Table 1.
Table 1. Typical blood pressure ranges [3].

Blood Pressure Category | Systolic (mmHg) | Diastolic (mmHg)
Normal BP | less than 120 | less than 80

The electrocardiogram (ECG) is a valuable tool used to measure the electrical activity of the heart [4][5][6][7][8][9]. Currently, various wearable and non-intrusive devices are used to monitor hypertension using ECG [10][11][12]. In a clinical environment, hypertension is diagnosed by blood pressure measurement, which is the gold standard. Depending on the range of blood pressure values (Table 1), patients are classified as low-risk hypertension (LRHT) or high-risk hypertension (HRHT) patients.
Various techniques, algorithms, applications, and devices have been developed to detect and monitor hypertensive patients. Voss et al. [13] used high-resolution ECG, heart rate variability (HRV), blood pressure variability (BPV), and baroreflex sensitivity (BRS) signals. They found a difference in HRV signals of a normal pregnant female and a hypertensive pregnant female.
Poddar et al. [14] performed automated classification of hypertension and coronary artery disease patients using probabilistic neural network (PNN), k-nearest neighbor (KNN), and support vector machine (SVM) classifiers with HRV analysis. They obtained the highest classification accuracy of 96.67%. Natrajan et al. [15] observed a significant reduction in the high-frequency components and an increase in the low-frequency components of the HRV signals of hypertensive patients.
Melillo and Izzo [16] used HRV signals along with various machine learning algorithms (SVM, decision tree (DT), and convolutional neural network (CNN)) to identify HRHT patients, and obtained the highest accuracy of 87.8%.
Recently, Ni and Wang used fine-grained HRV-based methods and obtained an accuracy of 95% [3]. Song et al. [17] classified normal, hypertensive, and coronary heart disease (CHD) patients using HRV signals and the naive Bayes classifier. They obtained 92.3% classification accuracy.
Yue et al. [18] used machine learning algorithms to implement automatic risk indication for masked hypertension using HRV analysis. They found that the HRV parameters of patients with essential hypertension (EH) and masked hypertension (MH) were significantly decreased.
Ni et al. [10] studied hypertension patients and normal subjects using a three-dimensional feature method with continuous HRV monitoring. They obtained the highest classification accuracy of 93.33% [10]. Mussalo et al. [19] analyzed different HRV features for various stages of hypertension patients. They observed significant changes in the HRV parameters of hypertension patients.
Thus, all the above-mentioned studies used HRV signals derived from ECG. The novelty of the proposed work is that we use optimal wavelet-based features extracted directly from ECG signals instead of HRV. The optimal orthogonal wavelets used in this study were designed by optimizing spectral localization (SL) [20]. Wavelets are widely regarded as effective tools for the analysis of non-stationary signals, including ECG [21][22][23][24]. Hence, we employed wavelet-based ECG features to develop an automated system for the identification of LRHT and HRHT. We applied the SL-optimized wavelet filter for the following reasons [25]: (i) Most conventional designs optimize the stop-band and pass-band energies, which requires the edge frequencies to be defined accurately [26]. These frequencies may not be known a priori in each application. Here, we used an orthogonal wavelet filter [27][28][29] designed by minimizing its frequency spread. (ii) By minimizing the frequency spread of a filter, both the ripples and the transition band can be controlled. (iii) The SL-optimized filter gives fewer ripples and a sharper roll-off [30].
In our study, we used the SL-optimal OWFB. We optimized the filter coefficients using a semidefinite programming (SDP) technique [31,32], and an interior-point algorithm provided the optimized solutions [33][34][35]. We then tested the optimized OWFBs for analyzing ECG signals in order to separate LRHT from HRHT patients. The OWFB provided various sub-bands (SBs) of the ECG signal, and from these SBs, we extracted the log energy (LOGE) and signal fractal dimension (SFD) features [36]. Student's t-test ranking was applied to all extracted features to select the most significant SBs of the SFD and LOGE features.
The main contribution of this study is the development of the hypertension diagnosis index (HDI) using OWFB-based SFD and LOGE features. HDI provides the discrimination of LRHT and HRHT by a single numeric value. In a clinical environment, HDI is simpler and easier to use for the diagnosis of disease.
The remainder of the paper is arranged as follows. We discuss the details of the ECG dataset in Section 2. The methodology and the optimally designed OWFB are described in Section 3. Section 4 presents the results obtained, and Section 5 discusses them. Finally, the concluding remarks are outlined in Section 6.

Dataset
The dataset for this study was taken from PhysioNet's Smart Health for Assessing the Risk of Events via ECG (SHAREE) database, https://archive.physionet.org/pn6/shareedb/. ECG recordings of a total of 139 subjects were used, each approximately 2 h:10 min:12 s long. Each ECG recording contained three channels (leads III, V3, and V5), and each channel had approximately 1 million samples. In our study, leads III, V3, and V5 were named CH1 (Channel 1), CH2, and CH3, respectively. The ECG sampling frequency was 128 Hz with an 8-bit resolution, which corresponds to a sampling interval of 0.0078125 s. The cohort included 49 female and 90 male patients; the average age was 71.4 years for LRHT and 74.5 years for HRHT. To detect major cardiovascular and cerebrovascular events, patients were followed up for a year. Seventeen patients (three syncope, three stroke, and eleven myocardial infarction) were identified as HRHT subjects, and the remaining 122 patients as LRHT subjects. The dataset was authorized by the Federico II University Hospital Trust's Ethics Committee. All subjects involved in data collection provided signed consent for the experimental use of the data. Table 2 gives the details and statistics of the patients included in the SHAREE database. We segmented each ECG signal into 5-min epochs. After segmentation, we obtained 3172 ECG epochs for LRHT subjects and 442 epochs for HRHT subjects for each channel. Figures 1-4 show 5-min LRHT and HRHT ECG signals.

Methodology
To separate LRHT and HRHT automatically, the optimally designed OWFB was used. Wavelet decomposition produced a total of six sub-bands (SBs) [26]: five detail SBs and one approximation SB. After the decomposition, the log energy (LOGE) and signal fractal dimension (SFD) features were extracted from all six SBs. Thus, a total of 12 features were obtained from each ECG epoch: six LOGE and six SFD. The novelty of this work is the development of the HDI, which can be used to discriminate LRHT and HRHT ECG signals. The detailed outline of the proposed automated high-risk hypertension detection system is shown in Figure 5.

ECG Segmentation
To speed up computation, we pre-processed the ECG signals. The long-duration (2 h:10 min:12 s) ECG recordings were segmented into 5-min epochs, and each epoch was then normalized using the Z-score before being applied to the wavelet filter bank.
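The segmentation and normalization step can be sketched as follows. This is a minimal illustration, not the authors' MATLAB code: the function name `segment_and_normalize`, the use of a synthetic signal in place of a real SHAREE channel, and the choice to drop trailing samples that do not fill a whole epoch are all assumptions.

```python
import numpy as np

FS_HZ = 128                              # sampling frequency stated for the recordings
EPOCH_SECONDS = 5 * 60                   # 5-min epochs
EPOCH_SAMPLES = FS_HZ * EPOCH_SECONDS    # 38,400 samples per epoch


def segment_and_normalize(signal):
    """Split a 1-D signal into non-overlapping 5-min epochs and z-score each epoch.

    Trailing samples that do not fill a whole epoch are dropped (an assumption).
    """
    signal = np.asarray(signal, dtype=float)
    n_epochs = len(signal) // EPOCH_SAMPLES
    epochs = signal[: n_epochs * EPOCH_SAMPLES].reshape(n_epochs, EPOCH_SAMPLES)
    mean = epochs.mean(axis=1, keepdims=True)
    std = epochs.std(axis=1, keepdims=True)
    return (epochs - mean) / std


# Synthetic stand-in for one channel of roughly one million samples.
rng = np.random.default_rng(0)
fake_ecg = rng.normal(loc=0.3, scale=2.0, size=1_000_000)
epochs = segment_and_normalize(fake_ecg)
```

Under these assumptions each recording yields 26 whole epochs, which agrees with the reported totals: 26 x 122 = 3172 LRHT epochs and 26 x 17 = 442 HRHT epochs per channel.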

Design of Filter Bank
The OWFB contained two filter banks (FLBs) (Figure 6): an analysis FLB and a synthesis FLB [25]. Both FLBs contained low-pass (LP) and high-pass (HP) filters. In the analysis FLB, the outputs of the LP and HP filters were downsampled by a factor of 2; in the synthesis FLB, the inputs to the LP and HP filters were upsampled by a factor of 2 before filtering. Let $A_0(z)$ and $A_1(z)$ be the LP and HP filters, respectively, of the analysis FLB, and let $B_0(z)$ and $B_1(z)$ be the LP and HP filters of the synthesis FLB, as shown in Figure 6. The analysis and synthesis LP filters were time-reversed copies of each other, which is an important characteristic of the OWFB [20]. Using the quadrature conjugation technique [37], the HP filters $A_1(z)$ and $B_1(z)$ can be derived from the LP filters $A_0(z)$ and $B_0(z)$. Hence, the remaining three filters can be obtained directly from the LP analysis filter $A_0(z)$. The optimal OWFB must obey the perfect reconstruction (PR) and zero-moment (ZM) constraints [28] for the output $O(z)$ to be a delayed replica of the input $I(z)$ [29]. For perfect reconstruction, the filter must fulfill the orthogonality condition (1) [25]. Let $P(z) = A_0(z)B_0(z)$ be called the product filter [34]; condition (1) can be rewritten in terms of the product filter as (2). Let $P(e^{jf})$ denote the frequency response of the product filter, given by (3) [34]; the perfect reconstruction condition (2) can then be represented in the frequency domain as (4). For a real impulse response, $P(e^{jf}) = |A_0(e^{jf})|^2 \geq 0$ [38]. Hence, to design a real-coefficient orthogonal wavelet filter bank, $P(e^{jf})$ must be non-negative for $f \in [0, \pi]$. The number of zeros of the filter at $z = -1$ is defined as its number of zero moments (ZMs) [29,39]. To design an LP filter with $M$ ZMs, the product filter must have $2M$ zeros at $z = -1$.
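The relationships described above can be checked numerically on a small example. The sketch below uses the well-known Daubechies length-4 filter purely as a stand-in (it is not the SL-optimized filter designed in this paper) and assumes a unit-energy normalization, so that the half-band condition reads $p(0) = 1$ and $p(2k) = 0$ for $k \neq 0$. The alternating-flip rule used for the HP filter is one common form of quadrature conjugation; the paper's exact sign and delay convention is not stated here.

```python
import numpy as np

# Stand-in analysis low-pass filter: Daubechies length-4 coefficients (an assumption,
# not the SL-optimized filter designed in the paper).
s3 = np.sqrt(3.0)
a0 = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4 * np.sqrt(2.0))

# Synthesis low-pass filter: time-reversed copy of the analysis low-pass filter.
b0 = a0[::-1]

# Analysis high-pass filter from the low-pass filter by the alternating-flip rule.
taps = len(a0)
a1 = np.array([(-1) ** n * a0[taps - 1 - n] for n in range(taps)])

# Product filter p(n) = autocorrelation of a0; index taps-1 corresponds to lag 0.
p = np.convolve(a0, b0)

# Half-band (orthogonality) check: for an even-length filter the even lags sit at
# odd array indices, here lags -2, 0, +2.
even_lags = p[1::2]

# Zero moments: sum_n (-1)^n n^k a0(n) = 0 for k = 0, ..., M-1 (M = 2 for this filter).
n = np.arange(taps)
moments = np.array([np.sum((-1.0) ** n * n ** k * a0) for k in range(2)])

# Low-pass and high-pass filters are orthogonal to each other.
cross = float(np.dot(a0, a1))
```

Here `even_lags` comes out as (0, 1, 0) and both alternating moments vanish, which is what the PR and ZM constraints require of any admissible design.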

Constraint in the Time-Domain and Objective Function
To design the optimal OWFB, let $a(n)$ be the unit impulse response of the finite impulse response (FIR) analysis LP filter $A_0(z)$ and $b(n)$ be the unit impulse response of the synthesis LP filter $B_0(z)$ of order $N-1$. The optimality criterion for the orthogonal filter design is to minimize the mean squared spectral localization (MSSL). The MSSL is the same for the analysis and synthesis filters because the former is the time-reversed replica of the latter [24,32,44]. The MSSL, $\sigma_f^2$, of the filter $A_0(z)$ is defined in (5) [45], where $E$ represents the squared norm (energy) of the filter. Imposing $M$ zero-moment (regularity) and orthogonality constraints, we design an optimal OWFB with the objective of minimum MSSL; the resulting optimization problem for the filter design is stated in (6).
Hence, a constrained optimization problem was formulated to minimize the MSSL (6a) subject to the orthogonality (6b) and regularity (6c) constraints. To develop a convex formulation, we need to express the constraints and the objective function in terms of $P(z)$. As specified above, the sequence $p(n)$ (the impulse response of $P(z)$) is an autocorrelation sequence whose spectrum satisfies the non-negativity condition $P(e^{jf}) \geq 0$. The objective function (5) can be written in terms of the product filter as (7), and the optimization problem (6) can then be represented in terms of the product filter as (8). The optimization problem (6) is non-convex in the variable $a(n)$ because the orthogonality constraint is quadratic in $a(n)$, whereas problem (8) in the variable $p(n)$ is not yet a finite convex program because of the constraint (8e). We therefore convert it into a finite convex problem to obtain an optimal solution. The half-band condition (8b) is linear in the variable $p(n)$. Equations (8c) and (8d) represent the regularity conditions, which are also linear constraints. Only the non-negativity condition (8e) is a semi-infinite constraint (one constraint for every $f \in [0, \pi]$), which needs to be converted into a finite constraint to formulate the convex optimization problem. Sharma and Moulin et al. [46,47] used the discretization method to transform the semi-infinite constraint into a finite set of constraints. However, discretization enforces the constraint only at the sampled frequencies and can yield an inaccurate solution, so it is not advisable.
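To illustrate why the objective becomes linear once it is written in terms of $p(n)$, the sketch below evaluates the frequency spread of a filter in two ways. It assumes the usual definition $\sigma_f^2 = \frac{1}{2\pi E}\int_{-\pi}^{\pi} f^2 |A_0(e^{jf})|^2\,df$, which may differ from the paper's (5) by a normalization. Substituting $|A_0(e^{jf})|^2 = p(0) + 2\sum_{n \geq 1} p(n)\cos(nf)$ and integrating term by term gives $\sigma_f^2 = \frac{1}{p(0)}\left[\frac{\pi^2}{3}p(0) + 4\sum_{n \geq 1}\frac{(-1)^n}{n^2}p(n)\right]$, which is linear in $p(n)$ when $p(0)$ is fixed by the half-band condition. The Haar filter and both function names are illustrative choices only.

```python
import numpy as np


def mssl_numeric(a, n_grid=20_001):
    """Frequency spread (1 / (2*pi*E)) * integral of f^2 |A(e^{jf})|^2 over [-pi, pi]."""
    a = np.asarray(a, dtype=float)
    f = np.linspace(-np.pi, np.pi, n_grid)
    n = np.arange(len(a))
    response = np.exp(-1j * np.outer(f, n)) @ a        # A(e^{jf}) sampled on the grid
    integrand = f ** 2 * np.abs(response) ** 2
    df = f[1] - f[0]
    integral = df * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))  # trapezoid rule
    return integral / (2 * np.pi * np.sum(a ** 2))


def mssl_from_autocorrelation(p_nonneg):
    """The same quantity as a weighted sum of p(0), p(1), ..., p(N-1)."""
    p = np.asarray(p_nonneg, dtype=float)
    n = np.arange(1, len(p))
    weighted = p[0] * np.pi ** 2 / 3 + 4 * np.sum(p[1:] * (-1.0) ** n / n ** 2)
    return weighted / p[0]


# Haar low-pass filter as a small stand-in example (not the paper's filter).
a = np.array([1.0, 1.0]) / np.sqrt(2.0)
p_full = np.convolve(a, a[::-1])          # p(-1), p(0), p(1)
p_nonneg = p_full[len(a) - 1:]            # p(0), p(1)

spread_numeric = mssl_numeric(a)
spread_linear = mssl_from_autocorrelation(p_nonneg)
```

For the Haar filter both routes give $\pi^2/3 - 2 \approx 1.29$.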
Hence, we used the Kalman-Yakubovich-Popov (KYP) lemma [48] to formulate a semidefinite program (SDP). By the KYP lemma, (8e) holds if and only if there exists a symmetric positive semidefinite matrix $Z \in \mathbb{R}^{N \times N}$ satisfying (9). The objective function (8a) can be expressed in terms of the sequence $p(n)$ as (10), which is a linear function of $p(n)$. Furthermore, all constraints can be expressed as linear functions of $p(n)$. Hence, the optimization problem (8) can be written as the following convex optimization problem [25]: minimize the objective (10) subject to (8b), (8c), (8d), (9), and $Z \succeq 0$.
The problem is now convex because the objective function and all the constraints are convex. To find the global solution, we can use interior-point solvers such as SeDuMi and SDPT3 [49], which solve the optimization problem accurately and efficiently. After finding the optimal $p(n)$, the next step is to obtain the desired low-pass filter $A_0(z)$ by spectral factorization of $P(z)$.
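The final spectral factorization step can be sketched with polynomial root finding. This is one simple way to factor a product filter, not necessarily the procedure used by the authors: the helper name `spectral_factor` is hypothetical, the minimum-phase root selection is an assumption, and the sketch assumes the product filter has no unit-circle zeros other than those at $z = -1$. The $2M$ known zeros at $z = -1$ are divided out first, because repeated roots are numerically ill-conditioned.

```python
import numpy as np


def spectral_factor(p_full, n_zero_moments):
    """Recover a minimum-phase low-pass filter a(n) from a symmetric product filter.

    p_full holds p(-(N-1)), ..., p(N-1). The 2M zeros at z = -1 are removed by
    polynomial division before root finding.
    """
    binomial = np.poly(-np.ones(2 * n_zero_moments))     # coefficients of (z + 1)^(2M)
    quotient, _ = np.polydiv(p_full, binomial)
    roots = np.roots(quotient)
    inside = roots[np.abs(roots) < 1.0]                   # one root of each (r, 1/r) pair
    a = np.real(np.poly(np.concatenate([-np.ones(n_zero_moments), inside])))
    return a * np.sqrt(2.0) / a.sum()                     # scale so that A_0(1) = sqrt(2)


# Stand-in product filter built from the Daubechies length-4 filter (M = 2),
# used here only because its factor is known in closed form.
s3 = np.sqrt(3.0)
a_known = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4 * np.sqrt(2.0))
p_full = np.convolve(a_known, a_known[::-1])

a_recovered = spectral_factor(p_full, n_zero_moments=2)
```

Choosing the roots outside the unit circle instead would give the time-reversed (maximum-phase) filter, which has the same product filter and the same MSSL.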

Wavelet Decomposition
We designed optimal WFBs for sub-band decomposition of the ECG signals [50,51]. We employed five levels of wavelet decomposition, which extracted the frequency content of each ECG epoch into six SBs: five detail SBs (SB1-SB5) and one approximation SB (SB6).
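A five-level decomposition of one 5-min epoch can be sketched as below. Several things here are assumptions: the Daubechies length-4 filter stands in for the paper's SL-optimized filter, periodic signal extension is used at the boundaries, and SB1 is taken to be the finest detail band (the text does not state the ordering of SB1-SB5). The band edges returned by `nominal_bands` are the ideal dyadic ranges for a 128 Hz sampling rate, not measured filter responses.

```python
import numpy as np

FS_HZ = 128  # sampling frequency of the recordings

# Stand-in low-pass filter (Daubechies length-4), NOT the paper's SL-optimized filter.
s3 = np.sqrt(3.0)
LO = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4 * np.sqrt(2.0))


def analysis_stage(x, lo, hi):
    """One two-channel analysis stage: periodic filtering, then downsampling by 2."""
    n = len(x)
    idx = (np.arange(0, n, 2)[:, None] + np.arange(len(lo))[None, :]) % n
    frames = x[idx]                      # row k holds x[2k], x[2k+1], ... (wrapped)
    return frames @ lo, frames @ hi


def decompose(x, lo, levels=5):
    """Return [SB1, ..., SB5, SB6]: five detail bands (finest first), then the approximation."""
    taps = len(lo)
    hi = np.array([(-1) ** k * lo[taps - 1 - k] for k in range(taps)])
    approx = np.asarray(x, dtype=float)
    details = []
    for _ in range(levels):
        approx, detail = analysis_stage(approx, lo, hi)
        details.append(detail)
    return details + [approx]


def nominal_bands(fs, levels=5):
    """Ideal (brick-wall) band edges in Hz for each sub-band."""
    bands = [(fs / 2 ** (k + 1), fs / 2 ** k) for k in range(1, levels + 1)]
    return bands + [(0.0, fs / 2 ** (levels + 1))]


rng = np.random.default_rng(0)
epoch = rng.standard_normal(38_400)      # synthetic 5-min epoch at 128 Hz
sub_bands = decompose(epoch, LO)
bands = nominal_bands(FS_HZ)
```

With this ordering the nominal ranges are 32-64, 16-32, 8-16, 4-8, and 2-4 Hz for the detail bands and 0-2 Hz for the approximation. Because the filter bank is orthogonal, the total energy of the six sub-bands equals the energy of the input epoch.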

Features Used
The selection of essential features was an important part of this work. Using this method, we were able to classify LRHT and HRHT ECG signals. The log energy (LOGE) and signal fractal dimension (SFD) features were computed from all six SBs.
Log energy (LOGE): The LOGE of each SB of the ECG signal is the logarithm of the energy of that SB. The general formula of the log energy is [25]:
$LOGE_m = \log \sum_n r_m(n)^2$ (12)
where $LOGE_m$ is the log energy of the $m$th sub-band and $r_m(n)$ is the amplitude of the $n$th sample of the $m$th sub-band.
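As a minimal sketch of (12), the function below computes the log energy of one sub-band. The natural logarithm is an assumption, since the base is not stated, and the function name `log_energy` is illustrative.

```python
import numpy as np


def log_energy(sub_band):
    """Log of the sub-band energy, log(sum_n r_m(n)^2); natural log is an assumption."""
    r = np.asarray(sub_band, dtype=float)
    return float(np.log(np.sum(r ** 2)))


example = np.array([1.0, -2.0, 2.0])     # energy = 1 + 4 + 4 = 9
loge = log_energy(example)               # log(9)
```

A different log base only rescales every LOGE value by the same constant, so it would not change a ranking based on t-values.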
Signal fractal dimension (SFD): Fractals are geometric figures or curves that are subsets of a Euclidean space and whose Hausdorff dimension strictly exceeds their topological dimension. The fractal dimension provides a statistical measure of complexity, describing how the detail in a pattern changes with the scale at which it is measured [25].
The SFD is given by (13), where $P_m$ is the number of self-similar patterns used to fill the original pattern and $m$ is the ratio used to decompose the original pattern into $P_m$ self-similar patterns.

Hypertension Diagnosis Index
The extracted highly-significant features in Table 3 were used to develop the mathematical model (14) to discriminate the two classes [52][53][54].
We propose a hypertension diagnosis index (HDI) to discriminate between LRHT and HRHT by a single numeric value. We used the sub-bands of the two feature sets (SFD and LOGE) with the highest t-values (lowest p-values) to compute the HDI [55]. The index is formulated as (14), where $LOGE_{SB2}$ and $LOGE_{SB3}$ represent the LOGE features extracted from SB2 and SB3, while $SFD_{SB6}$, $SFD_{SB2}$, $SFD_{SB3}$, and $SFD_{SB4}$ represent the SFD features extracted from SB6, SB2, SB3, and SB4.
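The t-value ranking used to pick sub-bands for the index can be sketched as follows. The pooled-variance (equal-variance) form of the two-sample t statistic is assumed, since the text names Student's t-test without giving details. The feature matrices are synthetic random numbers with the same class sizes as the dataset; they are not the paper's features, and one column is shifted artificially so that the ranking has an obvious winner.

```python
import numpy as np


def student_t(x, y):
    """Two-sample Student's t statistic with pooled variance (equal-variance form)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    nx, ny = len(x), len(y)
    pooled = ((nx - 1) * x.var(ddof=1) + (ny - 1) * y.var(ddof=1)) / (nx + ny - 2)
    return (x.mean() - y.mean()) / np.sqrt(pooled * (1 / nx + 1 / ny))


def rank_features(low_risk, high_risk, names):
    """Sort features by decreasing |t| between the two classes."""
    t = np.array([student_t(low_risk[:, j], high_risk[:, j]) for j in range(len(names))])
    order = np.argsort(-np.abs(t))
    return [(names[j], float(t[j])) for j in order]


names = [f"LOGE_SB{k}" for k in range(1, 7)] + [f"SFD_SB{k}" for k in range(1, 7)]

# Synthetic feature matrices (epochs x 12 features), NOT data from the paper.
rng = np.random.default_rng(1)
low = rng.normal(0.0, 1.0, size=(3172, 12))
high = rng.normal(0.0, 1.0, size=(442, 12))
high[:, 1] += 1.0            # make LOGE_SB2 separate strongly in this toy example

ranking = rank_features(low, high, names)
```

In this toy setup `ranking[0]` is the shifted feature, `LOGE_SB2`, with a large negative t-value; with real features the top-ranked entries would be the candidates for combination in the index.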

Results
The 5-min segmentation of the ECG dataset yielded 3172 epochs of low-risk hypertension and 442 epochs of high-risk hypertension. All experiments were performed using MATLAB Version 9.1 on an Intel Xeon 3.5 GHz processor with 16 GB of random access memory (RAM). Tables 3-5 present the statistical differences between LRHT and HRHT for the CH3, CH1, and CH2 ECG signals.
The results of Student's t-test for each SB are shown in Table 6, with t-values and p-values for both features (SFD and LOGE). Table 7 presents the range of the calculated HDI and shows a significant difference between LRHT and HRHT. Figure 7 shows the discrimination between LRHT and HRHT by the HDI value. The segregation of both classes by HDI is simple and easy to use in a clinical environment.

Discussion
The aim of this study was to evaluate the performance of the features (LOGE and SFD) extracted using the novel optimal OWFB. Using spectrum-localized OWFB-based non-linear features, we could identify HRHT patients with a single numeric value. The employed optimal OWFB-based features yielded 100% discrimination of LRHT and HRHT patients using HDI. The salient features of our developed automated system are given below:

• From Table 3, the LOGE values of SB2-SB6 showed significant changes between LRHT and HRHT patients.
• The SFD of SB2 had the highest mean value for both LRHT and HRHT, while SB1 showed the lowest mean value. LOGE of SB1 yielded the highest mean value for LRHT, and SB2 for HRHT patients yielded the lowest mean value.
• The novelty of the proposed work was the development of HDI to discriminate between the two classes using a single value.
• Table 7 presents the range of HDI and shows a significant difference between LRHT and HRHT by a numeric value.
• We did not need classifiers, which involve training and testing. The method was fast and involved only the extraction of two feature sets.
• The spectral localization technique was used to analyze non-stationary characteristics of the ECG signal. As we used spectral localized OWFB, our proposed work was unique as compared to other research works [3,16].
• For better and faster computation, we used fewer features. The length of each ECG epoch was 5 min. Hence, the method was not computationally intensive and was quick in diagnosis.
• Using the same database, Melillo and Izzo used various machine learning algorithms (SVM, decision tree (DT), and convolutional neural network (CNN)) and obtained the highest accuracy of 87.8% with HRV signals [16]. Recently, Ni and Wang [3] obtained an accuracy of 95% using heart rate variability (HRV) signals.
• Many studies have used HRV-based techniques to detect hypertension; we used wavelet-based features directly extracted from ECG. Our method was different from HRV-based methods and easy to use in the clinical environment [3].
• The performance of the system was found to be promising, and we expect that it can be employed in intensive care units to monitor the abrupt rise in blood pressure while screening the ECG signals, provided it is tested with an extensive independent database.
• The present research work was conducted using 139 ECG recordings segmented into 3614 epochs of 5 min each (3172 LRHT and 442 HRHT, the latter comprising 78 stroke, 78 syncope, and 286 myocardial infarction epochs) for each of CH1, CH2, and CH3. The ECG dataset was obtained from https://archive.physionet.org/pn6/shareedb/. Our whole experimental work was performed using MATLAB.
Table 7 shows the results of the automated detection of LRHT and HRHT classes. In Table 8, we compare our proposed work with other methods. Using HDI, we can discriminate between the two classes by a single numeric value with 100% accuracy.
The dataset consisted of 3614 ECG epochs, of which approximately 88% were LRHT and 12% were HRHT ECG signals. This imbalanced dataset is one of the limitations of our work. In general, LRHT data are more plentiful than HRHT data. To reduce this imbalance problem, synthetic balancing data are needed. The other limitation of our research work is the selection of the optimal number of ZMs and the length of the filter: the filter order and number of ZMs that give the most accurate identification of HRHT cannot be predicted a priori.
In recent studies, deep learning methods have been widely used for classification problems [56][57][58][59]. Deep learning methods such as convolutional neural networks (CNN) [60] could also be used here. Deep learning-based techniques do not require handcrafted features to be extracted, selected, and classified. However, due to extensive data processing, their computational complexity is enormous. Hence, they require fast processing workstations and graphics processing units (GPU).

Conclusions
In this study, we used optimal OWFB-based non-linear features to discriminate LRHT and HRHT ECG signals automatically using an index (HDI). The five-level wavelet decomposition of ECG signals using the optimal OWFB produced six SBs. The LOGE and SFD features were extracted from all six SBs. Our proposed OWFB-based method was adequate to discriminate the HT ECG signals accurately, using the LOGE and SFD features combined into a single numeric value. The HDI was developed to evaluate the performance of the optimal wavelet filter bank, and it separated the LRHT and HRHT groups. Our results show that the developed model performed better than the other existing systems and is ready to be tested using a large database. In the future, we plan to test the performance of our technique in detecting the severity of hypertension using machine learning-based techniques with the same database. We also intend to use deep learning-based methods for the classification of LRHT and HRHT ECG signals.

Conflicts of Interest:
The authors declare no conflict of interest.