Comparative Feature Extraction Techniques for Electrocardiogram Image Regression

In this study, comparative feature extraction techniques are developed for the regression of ECG images. Two regression methods are used: linear and nonlinear regression. Nonnegative matrix factorization is used to extract features from the ECG images, and the results are compared with those of other techniques: principal component analysis, kernel principal component analysis and independent component analysis. The extracted features are used for image regression with the two regression techniques, which are then compared with each other. Performance is evaluated by the error rate, measured as the root mean square error between the actual data and the data predicted by the regression. The results show that the principal component analysis technique outperforms the other techniques.


INTRODUCTION
Introductory statistics and regression texts often focus on how regression can represent relationships between variables, rather than on comparisons of average outcomes. Regression can be used to predict an outcome given a function of the predictors, and regression coefficients can be thought of as comparisons across predicted values or as comparisons among averages in the data, as developed by Gelman and Hill (2007). The use of the Nonnegative Matrix Factorization (NMF) algorithm for feature extraction and identification in text mining and spectral data analysis, along with the evolution and convergence properties of hybrid methods based on both sparsity and smoothness constraints for the resulting nonnegative matrix factors, is discussed by Takeda et al. (2007). Nonnegative matrix factorization is a very useful tool for reducing the dimension of data and a means of extracting feature information from images. This information is used as a signature for the image, and similar images should have similar signatures; such extracted features are used in machine learning and data mining by Mejía-Roa et al. (2015) and for face recognition based on linear regression by Fang et al. (2012).
Nonnegative Matrix Factorization (NMF) is a dimension-reduction technique based on a low-rank approximation of the feature space. Besides reducing the number of features, NMF has gained great interest in the bioinformatics community because it can extract interpretable parts from high-dimensional datasets, and it has proved effective in many clustering and classification tasks (Mejía-Roa et al., 2015; Li et al., 2014; Berry et al., 2007). NMF allows only additive relationships. This constraint has proved closer to the way humans perceive and understand data; based on this methodology, low-level image features can be mapped to an additive combination of latent semantics for clustering (Mejía-Roa et al., 2015; Li et al., 2014). The evolution and convergence properties of hybrid methods based on both sparsity and smoothness constraints for the resulting nonnegative matrix factors are discussed by Berry et al. (2007). NMF obtains a representation of the data under non-negativity constraints; these constraints lead to a part-based representation because they allow only additive, not subtractive, combinations of the original data.
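The low-rank factorization described above can be sketched with the classical Lee-Seung multiplicative updates. This is a minimal illustration in plain Python (the study itself uses MATLAB); the matrices, rank and iteration count are illustrative assumptions, not values from the paper.

```python
# Minimal NMF sketch: factor a nonnegative matrix V (m x n) into
# W (m x r) @ H (r x n) with only additive (nonnegative) parts,
# using Lee-Seung multiplicative updates.
import random

def matmul(A, B):
    # Multiply two matrices stored as lists of rows.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def nmf(V, r, iters=500, eps=1e-9):
    """Return nonnegative factors W, H with V approximately W @ H."""
    random.seed(0)
    m, n = len(V), len(V[0])
    W = [[random.random() for _ in range(r)] for _ in range(m)]
    H = [[random.random() for _ in range(n)] for _ in range(r)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), elementwise
        WtV = matmul(transpose(W), V)
        WtWH = matmul(transpose(W), matmul(W, H))
        H = [[H[i][j] * WtV[i][j] / (WtWH[i][j] + eps) for j in range(n)]
             for i in range(r)]
        # W <- W * (V H^T) / (W H H^T), elementwise
        VHt = matmul(V, transpose(H))
        WHHt = matmul(W, matmul(H, transpose(H)))
        W = [[W[i][j] * VHt[i][j] / (WHHt[i][j] + eps) for j in range(r)]
             for i in range(m)]
    return W, H
```

For image data, each column of V would hold one vectorized image, and the columns of W act as the part-based features referred to in the text.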
Regression is a statistical approach to forecasting change in a dependent variable (sales revenue, for example) on the basis of change in one or more independent variables (population and income, for example). It is also known as curve fitting or line fitting, because a regression equation can be used to fit a curve or line to the data points so that the distances of the data points from the curve or line are minimized. Relationships depicted in a regression analysis are, however, only associative, and any cause-and-effect (causal) inference is purely subjective; the approach is also called the regression method or regression technique (Huang, 2014). The general goal of regression analysis is to estimate the association between one or more explanatory variables and a single outcome variable, which is presumed to depend on, or be systematically predicted by, the explanatory variables.
This study focuses on a comparison between four feature extraction techniques and two types of regression, linear and nonlinear.

MATERIALS AND METHODS
This study is based on a master's thesis completed in 2016 at the Department of Information Technology, Institute of Graduate Studies and Researches, Alexandria University, Egypt.
The two regression methods, linear and nonlinear regression, are described in subsections A and B, respectively.
A. Linear regression: Linear regression models the relationship between a continuous dependent variable and an independent variable by fitting the data to a linear equation. The dependent variable is usually referred to as the outcome, or Y. The independent variable, referred to as the predictor, or X, is the one suspected of contributing to the outcome (Hoffmann and Shafer, 2005).
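The linear fit described above reduces, for a single predictor, to the ordinary least squares formulas for slope and intercept. A minimal Python sketch (the study itself uses MATLAB's fitlm; the data here are illustrative):

```python
# Simple linear regression Y = b0 + b1 * X by ordinary least squares:
# the slope is the covariance of X and Y divided by the variance of X.
def linear_fit(xs, ys):
    """Return (intercept, slope) minimizing the sum of squared residuals."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return my - slope * mx, slope
```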

B. Nonlinear regression:
The goal of nonlinear regression is the same as that of linear regression, namely to relate a response Y to a vector of predictor variables, but in nonlinear regression the prediction equation depends nonlinearly on one or more unknown parameters. Linear regression is often used to build a purely empirical model, whereas nonlinear regression usually arises when there are physical reasons for believing that the relationship between the response and the predictors follows a particular functional form (Smyth, 2002).
The proposed method performs image regression on ECG images by preprocessing the images, obtaining the features from the factor matrices of nonnegative matrix factorization, applying linear and nonlinear regression to these features, and comparing the results with the features obtained from Principal Component Analysis (PCA), Kernel Principal Component Analysis (KPCA) and Independent Component Analysis (ICA).
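For contrast with NMF, the PCA baseline projects the data onto directions of maximal variance. A minimal Python sketch of extracting the leading component by power iteration on the covariance matrix (library implementations use SVD; the data and iteration count are illustrative assumptions):

```python
# PCA sketch: find the dominant principal component of a small data
# matrix (rows = samples) by power iteration on its covariance matrix.
def top_component(rows, iters=100):
    """Return the unit-length dominant eigenvector of the covariance."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    X = [[r[j] - means[j] for j in range(d)] for r in rows]   # center
    C = [[sum(X[k][i] * X[k][j] for k in range(n)) / (n - 1)
          for j in range(d)] for i in range(d)]               # covariance
    v = [1.0] * d
    for _ in range(iters):
        w = [sum(C[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v
```

Projecting each centered sample onto the leading components yields the PCA features that the regressions consume.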

RESULTS AND DISCUSSION
First, the images are entered into the system. There are twenty ECG images; preprocessing resizes each image to 256×256 pixels, converts it from RGB to grayscale and represents it as a two-dimensional matrix.
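The grayscale conversion step can be sketched as a per-pixel weighted sum of the color channels. The luminance weights below are the common Rec. 601 convention, an assumption, since the paper does not state which weighting its conversion used:

```python
# RGB-to-grayscale sketch: each pixel (r, g, b) becomes one intensity,
# turning the image into the two-dimensional matrix the pipeline needs.
def to_gray(rgb_image):
    """Convert rows of (r, g, b) tuples into a 2-D matrix of floats."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_image]
```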
Second, features are extracted from the images using the NMF algorithm. Third, the two regression methods, linear and nonlinear, are applied with different numbers of features, and the root mean square error is calculated for each number of features: the error is obtained by subtracting the actual data from the predicted data and taking its root mean square, and is then compared with that of the other feature extraction methods PCA, KPCA and ICA. As an example, twenty features are taken and the predicted data are obtained with the two regression methods using the PCA, KPCA, ICA and NMF feature extraction methods.
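The error measure just described, subtract predicted from actual, then take the root mean square, can be written directly; a minimal Python sketch:

```python
# Root mean square error between actual and predicted values,
# the comparison metric used throughout the study.
def rmse(actual, predicted):
    n = len(actual)
    return (sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n) ** 0.5
```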
The results using linear regression are shown in subsection A and those using nonlinear regression in subsection B.
A. Linear regression: Table 1 shows the root mean square error when using the ECG images with 1, 2, 3, ..., 20 extracted features and linear regression (the MATLAB function fitlm). We note that PCA gives the minimum error relative to the other techniques NMF, ICA and KPCA.
B. Nonlinear regression: Table 2 shows the root mean square error when using the ECG images with 1, 2, 3, ..., 20 extracted features and nonlinear regression (the MATLAB function fitnlm). We note again that PCA gives the minimum error relative to the other techniques NMF, ICA and KPCA. It is concluded that PCA is the best of the four techniques used (PCA, NMF, KPCA and ICA), providing the lowest error and therefore the highest performance among these techniques on the ECG images used.

Fig. 1 :
Fig. 1: The general steps for the proposed system

Fig. 2 :
Fig. 2: Linear regression using PCA. Figures 2 to 5 show the actual and the predicted data for twenty features using the different feature extraction methods PCA, KPCA, ICA and NMF.

Table 1 :
Linear regression error

Table 2 :
Nonlinear regression error