Lying position classification based on ECG waveform and random forest during sleep in healthy people

Background Several lying positions occur when healthy people sleep: lying on the left side, supine, lying on the right side, and prone. This article explores the influence of lying position on the shape of the ECG (electrocardiograph) waveform during sleep, and then classifies lying position based on ECG waveform features and random forest. Methods After de-noising the overnight sleep ECG data from the ISRUC website dataset and detecting the waveform character points, we calculated a total of 30 ECG waveform features, including 2 newly proposed features, S/R and ∠QSR. The means of these features and the significance levels of their differences between lying positions were calculated. Then 12 features were selected for three kinds of classification schemes. Results The lying positions had comparatively little effect on the time-limit features. When lying on the left side, the QT interval and RR interval were significantly shorter than in supine (P ≤ 0.01). Significant differences appeared in most of the amplitude and double-direction features. When lying on the left side, the heights of the P wave and T wave, the QRS area and T area, the QR potential difference and ∠QSR were significantly lower than in supine (P ≤ 0.01).
However, S/R was significantly greater on the left side than in supine (P ≤ 0.01) and on the right side (P ≤ 0.05). The height of the T wave and the area under the T wave were significantly higher in supine than on the right side (P ≤ 0.01). For the subject specific classifier, a mean accuracy of 97.17% with a Cohen's kappa statistic κ of 0.91 and AUC > 0.97 were achieved. For the subject independent classifier, the accuracy and κ dropped to 63.87% and 0.32, respectively, with AUC > 0.66. Conclusions When subjects lay on the left side during sleep, gravity changed the position of the heart (for example, it turned and rotated), which changed the vectorcardiogram in the frontal and horizontal planes and hence the ECG. When lying on the right side, the heart was held up by the mediastinum, so its freedom of movement was limited and the ECG waveform was almost unchanged. The proposed method could be used as a convenient technique for lying position classification.

Background Sleep is an essential process in human life. It plays a necessary role in self-repair and recovery of the body, as well as in the integration and consolidation of memory, and it is an indispensable part of human health. About one-third of a person's lifetime is spent asleep. Good sleep can eliminate fatigue, restore strength and energy, and keep the body functioning well. During overnight sleep, healthy subjects adopt different lying positions, such as lying on the left side, supine (lying on the back), lying on the right side, and prone (lying on the stomach). A change of position may squeeze or stretch the skin, so that the distance between the electrodes shortens or lengthens. In addition, the heart is squeezed slightly and the chest is pressed, which influences breathing. All these body changes result in changes of the ECG (electrocardiograph) waveform.
As early as 1997, in the course of clinical myocardial ischemia monitoring, Adams et al. found that the side lying position frequently caused obvious ECG changes [1]. Shinar et al. found that the R-wave durations were significantly different in three lying positions, and successfully identified 90% of body position changes during sleep by calculating the R-wave duration of leads I, II, and III simultaneously [2]. Shinar further used these three leads to classify four positions, finding that lead II worked best and achieved 80% accuracy [3]. When comparing the standing and supine positions of healthy subjects, Batchvarov et al. found that the RR interval of the 12-lead ECG was significantly shorter in standing than in supine [4]. Smit et al. investigated the changes of the QRS complex after normal exhalation, maximum inspiration, and maximum exhalation, and concluded that these three breath-holding conditions had little effect on the QRS complex and that individual differences were large [5].
Existing studies have shown that body position and chest changes can alter ECG waveforms, but no study has systematically explored the consistent pattern of these changes. Researchers need to take the impact of lying position changes into account when studying ECG waveform changes in different sleep stages. Furthermore, these waveform changes can be applied to automatic, low-intrusion lying position monitoring. Consequently, in this article we present a method for exploring the influence of lying position on the shape of ECG waveforms during overnight sleep in healthy subjects, and then apply lying position classification based on these findings and random forest.

Methods
The study presented in this article can be divided into 3 parts. Data processing mainly includes ECG signal preprocessing, character points detection, data epoch segmentation, and extraction of three kinds of waveform features. Then the significant differences of the waveform features between lying positions are calculated. Finally, lying position classification based on ECG waveform features and random forest during sleep is achieved. The workflow is shown in Fig. 1.

Pan et al. BioMed Eng OnLine (2018) 17:116

Dataset
The data used in this article came from the ISRUC web sleep database, which provides a variety of physiological data from 10 healthy subjects [6]. The overnight sleep data in this database were recorded by polysomnography (PSG) and lasted for about 8 h. The recordings were made at the Sleep Medicine Center of the University of Coimbra. For each subject, the database provides a total of 19 physiological signals, such as the electrocardiogram (ECG) and lying position. The ECG sampling rate was 200 Hz. Because the R wave of subject No. 5 had a double-peak morphology, the horizontal and vertical coordinates of the R-wave peak could not be determined reliably, so this record was not included in this study. Of the remaining 9 participants, only a small number adopted the prone position during overnight sleep. Therefore, this article studied the ECG waveform changes between the left, supine and right-side lying positions during overnight sleep for 9 healthy subjects.

Signal preprocessing
The ECG signal in the ISRUC database mainly contained two kinds of noise: myoelectric interference caused by muscle electrical activity, with a frequency of 2 Hz to 2 kHz, and baseline drift caused by respiratory coupling. In this study, a mean filter was first applied to remove the AC (alternating current) interference from the ECG signal. Secondly, a three-layer lifting wavelet decomposition was used to remove the high-frequency myoelectric interference. Finally, the baseline drift was eliminated by the function fitting method. Since this article explores the changes of ECG waveform features, the point locations of the P wave, QRS complex, and T wave had to be acquired with high accuracy. The multi-character points detection algorithm for ECG signals based on the wavelet transform, proposed by Yang et al., was used to decompose and de-noise the original signal and to obtain the position of the QRS complex [7]. Then the area increment method proposed by Song et al. was applied to locate the P wave end on the right side of the P wave peak and the T wave origin on the left side of the T wave peak [8]. Finally, the overnight ECG character points and waveforms of all subjects were manually checked. The results after signal preprocessing and character points detection are shown in Fig. 2.

Data segmentation and ECG waveform features
The ISRUC database divides each subject's overnight sleep data into 30 s epochs, and for each epoch the sleep stage was determined and the lying position was recorded. In this study, we excluded the time segments in which a lying position lasted no longer than 1 min (two epochs), as well as those in which the ECG waveform was disturbed during a body position change so that character points detection could not be performed.
The ECG waveform morphology features and their meanings are shown in Table 1. In this study, these features are divided into three classes according to their orientation in the ECG chart: time-limit features (horizontal direction features), amplitude features (vertical direction features) and double-direction features (features reflecting both time and amplitude simultaneously). The time-limit features reflect the time intervals between the ECG waveform character points on the time axis. The amplitude features reflect the height of the ECG waveforms and the potential difference of points in the amplitude direction. The double-direction features mainly include area features, slope features and the angle feature.

Fig. 2 The results after signal preprocessing and character points detection. From left to right: P wave origin, P wave peak, P wave end, Q wave peak, R wave peak, S wave peak, T wave origin, T wave peak, T wave end. This part of the ECG signal was from subject No. 1 and spanned from 5 h 40 min 11 s 505 ms to 5 h 40 min 13 s 355 ms
The calculation methods for several special waveform features are described as follows.

a. Waveform height features
The height of the waveform reflects the amplitude of the electrical signal. In an actual ECG signal, the amplitude of the reference equipotential is not zero, and it fluctuates within a certain range. Therefore, the heights of the P wave, R wave, S wave, and T wave cannot be directly represented by the vertical coordinates of the waveform points. It is necessary to calculate the reference equipotential amplitude and the amplitude of each waveform with respect to the reference equipotential line. In the TP segment all myocardial cells are at rest, so there is no potential difference between them and almost no electrical activity appears. The TP segment is longer and more stable than the PR segment, so the TP segment was selected in this study to calculate the baseline equipotential line. Firstly, a mean filter with width 5 was used to smooth the TP segment. Then we selected 5 points (TP(i), i = 1, 2, 3, 4, 5) at equal intervals in the TP segment. The average amplitude of these 5 points was recorded as a stable point, which was used to represent the baseline equipotential of the corresponding ECG waveform before this TP segment. Finally, the potential difference between each of the P wave, R wave, S wave and T wave peaks and the stable point was calculated as the height of the corresponding waveform. Taking the R wave height as an example, it equals the amplitude of the R wave peak minus the average amplitude of the 5 TP points.
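The baseline and height calculation described above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the mean filter is assumed to shrink its window at the segment edges, and the 5 equally spaced points are assumed to include both ends of the TP segment.

```python
def mean_filter(samples, width=5):
    """Moving-average smoothing; the window shrinks at the edges (assumption)."""
    half = width // 2
    smoothed = []
    for i in range(len(samples)):
        window = samples[max(0, i - half): i + half + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


def baseline_from_tp(tp_segment, n_points=5):
    """Average of n_points equally spaced samples of the smoothed TP segment.

    Assumes the segment has at least n_points samples and that the first
    and last samples are among the selected points.
    """
    smoothed = mean_filter(tp_segment, width=5)
    step = (len(smoothed) - 1) / (n_points - 1)
    picks = [smoothed[round(i * step)] for i in range(n_points)]
    return sum(picks) / n_points


def wave_height(peak_amplitude, tp_segment):
    """Peak amplitude relative to the TP baseline (same unit as the input)."""
    return peak_amplitude - baseline_from_tp(tp_segment)


# Example: a flat TP segment at 0.1 mV and an R peak at 1.1 mV.
r_height = wave_height(1.1, [0.1] * 21)
```

The same `wave_height` call applies to the P, S and T peaks, since all heights are measured against the same stable point.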

b. Slope features
Slope features reflect changes in time and amplitude at the same time. The absolute value of a slope feature increases as the waveform amplitude increases and decreases as the time interval increases. Taking the RT slope as an example, the absolute value of the slope of the line connecting the R wave peak and the T wave peak is the absolute amplitude difference between the two peaks divided by the time interval between them.
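As a small sketch of the RT slope described above, assuming the peak coordinates are already available as time and amplitude values (the function name and the choice of seconds and millivolts are illustrative assumptions):

```python
def rt_slope_abs(r_time_s, r_amp_mv, t_time_s, t_amp_mv):
    """Absolute slope (mV/s) of the line joining the R peak and the T peak."""
    dt = t_time_s - r_time_s
    if dt == 0:
        raise ValueError("R and T peaks must occur at different times")
    return abs((t_amp_mv - r_amp_mv) / dt)


# Example: R peak at 0 s and 1.2 mV, T peak 0.25 s later at 0.2 mV.
slope = rt_slope_abs(0.0, 1.2, 0.25, 0.2)
```

A taller R wave raises this value and a longer R-to-T interval lowers it, matching the behaviour stated in the text.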

c. Area features
In order to reduce the influence of lying position changes on the depth of the Q wave and S wave, we calculated the QRS complex area as a triangular-like area. The origin of the T wave might be affected by both baseline drift and ST segment change, resulting in different heights at the T wave start point and end point. Therefore, the same method was used to calculate the area under the T wave. As shown in Fig. 3, the area of the QRS complex and the area under the T wave are obtained by summing the vertical ordinates of the ECG waveform and then subtracting the area of the triangle, which corrects the two area calculations. In formula (1), Q represents the horizontal ordinate of the Q-wave peak, S the horizontal ordinate of the S-wave peak, Ts the horizontal ordinate of the beginning of the T wave, and Te the horizontal ordinate of the end of the T wave. The meaning of the other segments in the formula is shown in Fig. 3.
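One possible reading of this correction is sketched below: the ordinates between the two end points are summed, and the area under the straight line joining the end points (the triangle that arises when the two ends sit at different heights) is subtracted. The function name and the discrete rectangle-rule summation are assumptions for illustration; the exact segments of Fig. 3 are not reproduced here.

```python
def chord_corrected_area(samples, start, end, fs_hz=200.0):
    """Area between the waveform and the line joining samples[start] and samples[end].

    samples: amplitudes in mV; start and end: sample indices (e.g. the Q and S
    peaks, or the T wave start and end); fs_hz: sampling rate (200 Hz in the dataset).
    Returns the area in mV*s.
    """
    n = end - start
    if n <= 0:
        raise ValueError("end must be greater than start")
    y0, y1 = samples[start], samples[end]
    area = 0.0
    for k in range(n + 1):
        chord = y0 + (y1 - y0) * k / n  # height of the connecting line at sample k
        area += samples[start + k] - chord
    return area / fs_hz


# The same wave shape on a level and on a tilted baseline gives the same area.
flat = chord_corrected_area([0.0, 1.0, 2.0, 1.0, 0.0], 0, 4)
tilted = chord_corrected_area([0.0, 1.5, 3.0, 2.5, 2.0], 0, 4)
```

Subtracting the connecting line is what makes the result insensitive to a height difference between the two end points.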

d. Corrected features
QTc (corrected QT interval) is the heart-rate-corrected QT interval, which reflects the entire process of cardiac depolarization and repolarization. It is calculated with Bazett's formula, in which HRn is the standardized heart rate.
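For orientation, the textbook form of Bazett's correction divides the QT interval by the square root of the RR interval expressed in seconds. The sketch below uses that standard form with a hypothetical function name; it does not reproduce the HRn-based standardization mentioned above, whose definition is not given here.

```python
import math


def qtc_bazett(qt_ms, rr_ms):
    """Bazett-corrected QT in ms: QT divided by sqrt(RR in seconds)."""
    if rr_ms <= 0:
        raise ValueError("RR interval must be positive")
    return qt_ms / math.sqrt(rr_ms / 1000.0)


# At 60 beats per minute (RR = 1000 ms) the correction leaves QT unchanged.
qtc = qtc_bazett(400.0, 1000.0)
```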

e. Newly proposed features
As shown in Fig. 4, further observation of the ECG waveforms in the three lying positions revealed that the S wave was lower when lying on the left side than in supine or when lying on the right side, and that the R wave amplitudes differed obviously between lying positions. Therefore, this study proposed two new features, S/R and the angle ∠QSR. S/R is the ratio of the S wave depth to the R wave height, which reflects the relative depth of the S wave. ∠QSR is the inner angle at vertex S of the triangle QRS. Firstly, the lengths of QR, RS and QS are calculated. Then ∠QSR is obtained from the cosine theorem, with QS and RS as the adjacent sides and QR as the opposite side. In this article, ∠QSR is expressed in degrees.

Classifier: random forest
RF (random forest) is a classification method proposed by Breiman in 2001 [9]. It is a classifier that is built randomly and contains a large number of decision trees. The classification result is acquired by voting: the output is the mode of the outputs of the individual trees. The randomness is mainly embodied in two aspects. On the one hand, each decision tree is trained on a dataset of size N, the same size as the full training dataset, selected by the bootstrapping procedure. On the other hand, a subset of all features is randomly selected at each internal node. Consequently, RF can handle high-dimensional datasets (involving many features) without feature selection, and it is better at solving multi-class problems than SVM (support vector machine). The decision trees are independent of each other during training, so parallel computing can be applied, which leads to fast calculation compared with ANN (artificial neural network). Besides, the structure of RF is simple and easy to build, and it has a strong ability to avoid over-fitting.
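As a side note on feature extraction, the two newly proposed features (S/R and ∠QSR) can be sketched in Python as follows. The function names are hypothetical, and each point is assumed to be a (time, amplitude) pair. The angle depends on how the time and amplitude axes are scaled relative to each other, which is not specified in the text, so the scaling used by a caller is an assumption.

```python
import math


def s_over_r(s_depth_mv, r_height_mv):
    """Ratio of S wave depth to R wave height (both relative to the baseline)."""
    return s_depth_mv / r_height_mv


def angle_qsr_deg(q, r, s):
    """Interior angle at vertex S of triangle Q-R-S, in degrees.

    q, r, s are (time, amplitude) pairs in caller-chosen units (assumption).
    """
    qr = math.dist(q, r)
    rs = math.dist(r, s)
    qs = math.dist(q, s)
    # Law of cosines with QS and RS adjacent to the angle and QR opposite it.
    cos_s = (qs ** 2 + rs ** 2 - qr ** 2) / (2 * qs * rs)
    cos_s = max(-1.0, min(1.0, cos_s))  # guard against rounding just outside [-1, 1]
    return math.degrees(math.acos(cos_s))


# Example with an isosceles triangle: Q=(0, 0), R=(1, 1), S=(2, 0).
angle = angle_qsr_deg((0.0, 0.0), (1.0, 1.0), (2.0, 0.0))
```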
Random forest was chosen in this study because of its fast calculation, high precision, strong anti-noise ability and resistance to over-fitting compared with other well-performing classification methods. The number of trees was set to 500. After significance analysis, 12 features, including QT, RR, TP, ∠QSR, S/R, QR, P peak, R peak, T peak, T area, QRS area, T area/QRS area, were selected for classification.
When establishing each decision tree, two random processes help to avoid over-fitting. First, the input data of each tree are drawn randomly by the bootstrapping procedure, that is, by sampling with replacement, so the input data may contain duplicate samples. If the training dataset contains N samples, the input data of each tree also contain N samples. As a result, no tree is trained on the full dataset, which makes over-fitting relatively easy to avoid.
Second, m features (m ≪ M) are randomly selected from the M features. The decision tree is then grown by complete splitting, so that each leaf node either cannot be split further or contains only samples of the same class. Because of these two random processes, over-fitting is unlikely even without pruning. Every tree obtained by this algorithm is weak on its own, but the trees are powerful when combined as a random forest.
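The two random processes and the final vote can be illustrated with a few lines of Python. This is only a sketch of the sampling and voting steps with hypothetical helper names; it does not grow decision trees and is not the internals of any particular random forest library.

```python
import random
from collections import Counter


def bootstrap_sample(n_samples, rng):
    """Draw n_samples indices with replacement, so duplicates are expected."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]


def random_feature_subset(n_features, m, rng):
    """Pick m distinct feature indices out of n_features for one split."""
    return sorted(rng.sample(range(n_features), m))


def majority_vote(tree_predictions):
    """Return the most common class label among the trees' predictions."""
    return Counter(tree_predictions).most_common(1)[0][0]


rng = random.Random(0)                      # fixed seed for repeatability
rows = bootstrap_sample(10, rng)            # N = 10 indices, some repeated
features = random_feature_subset(12, 3, rng)  # m = 3 of M = 12 features
label = majority_vote(["left", "supine", "left", "right", "left"])
```

Because each tree sees a different bootstrap sample and different feature subsets, the trees disagree on individual epochs, and the vote averages out their individual errors.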
Each decision tree is like an expert proficient in a narrow field (because each tree learns from m of the M features), so a random forest includes many experts who are proficient in different fields. When solving a new problem (new input data), they view it from different perspectives, and in the end the experts vote to get the result. In this study, we separated the data into training data and testing data and built the RF classifier with TreeBagger in MATLAB. We randomly selected 1-99% of the data in the database as training data and used the rest as testing data. Then the learning curves of accuracy and Cohen's κ were plotted to verify the absence of over-fitting. When the proportion of training data was more than 30%, the accuracy and Cohen's κ did not increase any more. When the proportion was more than 50%, the accuracy was stable but Cohen's κ started to decrease, which indicated over-fitting. As can be seen in Fig. 7 in the "Results" section, when the proportion was 20%, the accuracy reached a high level of 97.17% and Cohen's κ reached an acceptable level of 0.91. Besides, less training data leads to faster calculation. Consequently, we selected 20% of the data as training data to obtain high accuracy and Cohen's κ while avoiding over-fitting.

Performance evaluation
The performance of the classifier was evaluated by accuracy, Cohen's kappa statistic κ, ROC-AUC (receiver operating characteristic curve-area under curve), Sensitivity, Specificity and F1-scores. Accuracy is the percentage of correctly classified epochs in the whole dataset. Statistic κ is a more effective evaluator because it takes the prior probability into account. It can be calculated as κ = (P_A − P_C) / (P_prio − P_C), where P_A is the observed proportion of correctly classified samples, P_C is the proportion expected by chance, and P_prio is equal to 1. With m classes (m = 3 in this study), P_A is the sum over the classes of the proportion of samples whose output class equals their target class, and P_C is the sum over the classes of the product of the target proportion and the output proportion of each class, where each proportion P is taken relative to the entire sample. κ ≤ 0 means that the observed result is no better than chance, and κ = 1 means that all samples are classified into the correct class. A higher value of κ indicates better agreement between our classifier and the expected results. The ROC curve is a graphical plot that presents the ability of a binary classifier system. It is created by plotting the TPR (true positive rate) against the FPR (false positive rate) at various thresholds. Because the classifiers in this study are ternary, the ROC curve, AUC, Sensitivity, Specificity and F1-score of one lying position are obtained after classification by combining the other two lying positions. For example, before drawing the ROC curve and calculating these indexes for lying on the left, the epochs of supine and lying on the right are combined as not-left, and then the 2 × 2 confusion matrix is built.
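The κ statistic and the collapse of a three-class result into a 2 × 2 matrix can be sketched as follows. The function names are hypothetical, the confusion matrix is assumed to have target positions as rows and output positions as columns, and the example counts are made up for illustration.

```python
def cohen_kappa(confusion):
    """Cohen's kappa from a square confusion matrix (rows: target, columns: output)."""
    total = sum(sum(row) for row in confusion)
    m = len(confusion)
    # Observed agreement: proportion of samples on the diagonal.
    p_a = sum(confusion[i][i] for i in range(m)) / total
    # Chance agreement: sum over classes of (target share) * (output share).
    p_c = sum(
        (sum(confusion[i]) / total) * (sum(row[i] for row in confusion) / total)
        for i in range(m)
    )
    return (p_a - p_c) / (1.0 - p_c)  # undefined when p_c equals 1


def one_vs_rest(confusion, k):
    """Collapse an m x m matrix to 2 x 2 for class k: [[TP, FN], [FP, TN]]."""
    total = sum(sum(row) for row in confusion)
    tp = confusion[k][k]
    fn = sum(confusion[k]) - tp
    fp = sum(row[k] for row in confusion) - tp
    tn = total - tp - fn - fp
    return [[tp, fn], [fp, tn]]


# Illustrative counts only (left, supine, right); not data from the study.
example = [[8, 1, 1], [2, 6, 2], [0, 3, 7]]
kappa = cohen_kappa(example)
left_vs_rest = one_vs_rest(example, 0)
sensitivity = left_vs_rest[0][0] / sum(left_vs_rest[0])
specificity = left_vs_rest[1][1] / sum(left_vs_rest[1])
```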
Generally speaking, a good classifier should be associated with high values of accuracy, statistic κ and AUC.

Classification scheme
In this study, we developed three classification schemes for different cases: a subject specific scheme, a subject independent scheme without feature normalization and a subject independent scheme with feature normalization. The results of the ECG waveform feature significance analysis between different lying positions are presented in the "Results" section. After significance analysis, 12 features that showed strongly significant differences between lying positions, including QT, RR, TP, ∠QSR, S/R, QR, P peak, R peak, T peak, T area, QRS area, T area/QRS area, were selected for classification.
A total of 5114 epochs of the overnight sleep data from 9 subjects were included in this study. Due to the fact that most subjects did not have prone position, or only had several prone epochs in overnight sleep, the prone epochs were manually removed. Consequently, there are only three classes in classification including lying on the left, supine, and lying on the right. The details and workflow are shown in Fig. 5.

a. Subject specific scheme
For each subject, 20% of the epochs of the three lying positions were randomly selected for training the classifier, and the remaining 80% were used as testing data. There were two reasons for using 20% for training and 80% for testing. On the one hand, the waveforms were obviously different in the 3 lying positions, and strongly significant differences of the waveform features appear in the "Results" section. On the other hand, we were trying to train the classifier with limited data, so that in application a small database could be built for each patient by recording ECG signals for only half an hour to train the classifier, after which clinical automatic classification with high accuracy could be achieved. In order to avoid errors caused by random sample selection, the training and classification processes were repeated 10 times with different training data. Finally, the average value and standard deviation of the accuracy and κ statistic were calculated.

b. Subject independent scheme without feature normalization
For each subject to be analyzed, all the records from the other 8 subjects were pooled to form the training dataset. This process was repeated 9 times, once for each subject. Finally, as in the subject specific scheme, the average value and standard deviation of the accuracy and κ statistic were calculated.
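The pooling described above amounts to a leave-one-subject-out split. A minimal sketch, assuming each epoch carries a subject identifier (the function name and the identifiers are hypothetical):

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out, train_indices, test_indices) once per distinct subject."""
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test


# Five epochs from three subjects give three train/test splits.
epoch_subjects = [1, 1, 2, 3, 3]
folds = list(leave_one_subject_out(epoch_subjects))
```

With 9 subjects this yields 9 folds, and the accuracy and κ of each fold can then be averaged.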

c. Subject independent scheme with feature normalization
Because of the individual differences, all features need normalization before classifier training. One of the most widely used normalization methods is to transform the scale of each feature to a new range, such as [0,1]. However, when outliers appear in the data, the transformed data scale becomes unsymmetrical. To solve this problem, we developed a normalization method based on quantiles. The 5% and 95% quantiles of the data were selected first, and the scale between these two values was linearly transformed to [0,1], so that the 5% quantile maps to 0 and the 95% quantile maps to 1.
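A minimal sketch of this quantile-based normalization is given below. The function names and the linearly interpolated quantile estimate are assumptions, and values beyond the two quantiles are left outside [0, 1] rather than clipped, since the text does not say how they are handled.

```python
def quantile(sorted_values, q):
    """Linearly interpolated quantile of an ascending list, with 0 <= q <= 1."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def quantile_normalize(values, low_q=0.05, high_q=0.95):
    """Map the 5% quantile to 0 and the 95% quantile to 1 (no clipping, by assumption)."""
    ordered = sorted(values)
    q_lo = quantile(ordered, low_q)
    q_hi = quantile(ordered, high_q)
    if q_hi == q_lo:
        raise ValueError("quantiles coincide; the feature is constant")
    return [(v - q_lo) / (q_hi - q_lo) for v in values]


# One feature of one subject, e.g. 101 epochs with values 0..100.
normalized = quantile_normalize(list(range(101)))
```

Applying this per subject and per feature puts the features of different subjects on a comparable scale while keeping extreme epochs from stretching the range.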

Results
A total of 5114 epochs of the overnight sleep data from 9 subjects were included in this study. Table 2 shows the frequency distribution of sleep stages and lying positions for these epochs. The results part mainly includes significance analysis of features and classification performance.

Significance analysis of features
This study calculated the 30 waveform features of the overnight sleep ECG data from the 9 healthy subjects in the database, and calculated their means and standard deviations for the four lying positions. The calculation results and the significant differences between lying positions are shown in Tables 3 and 4, respectively. Because most subjects had no prone epochs, or only several, during overnight sleep, the standard deviations of the features in prone are not shown in Table 3. For the same reason, the significance levels of the waveform features were calculated for only three comparisons: left-supine, left-right and right-supine. The P values of the significance levels of the ECG waveform features between lying positions are shown in Table 4.

Classification performance
After significance analysis, 12 features that showed strongly significant differences between lying positions, including QT, RR, TP, ∠QSR, S/R, QR, P peak, R peak, T peak, T area, QRS area, T area/QRS area, were selected for classification. Table 5 gives the confusion matrices of all individuals for the subject specific scheme and the subject independent scheme without or with feature quantile normalization. The numbers in Table 5 refer to the number of epochs of the target position classified as the output position. Table 6 shows the classification performance based on 12 features for the subject specific scheme and the subject independent scheme without or with feature normalization. The process was repeated 10 times, and the means and standard deviations are listed in Table 6. Figure 6 shows the classifier performance of the three schemes: (a-c) show the ROC curves of the 3 lying positions, respectively, and (d-f) show the AUC, Sensitivity, Specificity and F1-scores of the classification results. The AUC values of the three lying positions reached 0.9886 ± 0.0043, 0.9725 ± 0.0106 and 0.9925 ± 0.0019 in the subject specific scheme, 0.6859 ± 0.0050, 0.3570 ± 0.0035 and 0.6321 ± 0.0055 in the subject independent scheme without feature normalization, and 0.7708 ± 0.0017, 0.6646 ± 0.0047 and 0.7132 ± 0.0040 in the subject independent scheme with feature normalization. Because the results of the subject specific scheme presented in Table 6 and Fig. 6 were high (overall accuracy of 97.17% ± 2.74%, κ 0.9121 ± 0.1010 and AUC > 0.97 in three lying position classification), we tried to decrease the proportion of training data. The results are shown in Table 7. In order to verify the absence of over-fitting, the learning curves are shown in Fig. 7. The comparison of the classification performance between RF, SVM and ANN is shown in Fig. 8. RF and ANN perform better than SVM, and the accuracies of RF and ANN are close. The Cohen's κ of ANN is slightly higher than that of RF. However, according to Table 8, the calculation of RF is much faster. Consequently, RF performs best in general.

Table 3 Means and standard deviations of 30 ECG waveform features in 4 lying positions
In this table, the time-limit features are given in milliseconds (ms), the amplitude features in millivolts (mV), and the angle indicator in degrees. Because most subjects had no prone epochs, or only several, during overnight sleep, the standard deviations of the features in prone are not shown in this table

Discussions of results
We developed three schemes for the following reason. Firstly, we tried to establish a database which could be used for many subjects. However, because of individual differences, the results were not acceptable. Consequently, we applied the normalization method to transform the scales of all features to a new range. The results of the subject independent scheme with feature normalization were much better, but the accuracy was still not sufficient for clinical application. Finally, we developed the subject specific scheme, which amounts to building a database with the ECG data of a specific subject and then classifying the lying positions of this subject based on that database. This is why its results were acceptable, and this method could be applied in clinical monitoring. As can be seen from Table 4, lying position has less influence on the time-limit features, because most of them show no significant differences between lying positions. Compared with supine, only the QT interval, RR interval, and TP segment are significantly shorter when lying on the left side. The reason needs further exploration.
It can be seen that the influence of lying position on ECG waveforms is mainly reflected in the amplitude features and double-direction features. The amplitude features include the heights of the P wave, R wave, and T wave. The relative height features include the QR potential difference, RS potential difference, R peak T peak potential difference, and RT slope. The area features include the QRS complex area and T wave area. These three types of features were significantly smaller when lying on the left side than in supine or on the right side, or smaller than in both other lying positions. Only a few features show significant differences between supine and lying on the right side. However, the S-wave-related waveform features behave differently. When lying on the left side, the depth of the S wave is significantly greater than in supine, and S/R is significantly greater than in both supine and right. This feature reflects the decrease of the R wave and the deepening of the S wave in left-side lying. ∠QSR is significantly smaller in left than in supine and right. This feature reflects the difference between the relative depths of the Q wave and S wave.

Table 5 Confusion matrices based on 12 features
Confusion matrices based on 12 features for (a) subject specific scheme, (b) subject independent scheme without feature normalization, (c) subject independent scheme with feature normalization
The influence of lying positions on ECG waveforms is mainly reflected in the amplitude features. Since the ECG waveform directly reflects the potential difference of the leads, and the signal is picked up by electrodes on the body surface, body position changes alter the relative position between the electrodes and the heart, and thus the ECG waveform morphology changes. This change is embodied in two aspects. On the one hand, when the chest is under pressure, the distribution of body fluids changes, so that the impedance of the chest changes; the heart is also squeezed and deformed. On the other hand, the heart is affected by gravity when lying on the side. Different parts of the heart have different degrees of freedom, which results in rotation and swing of the heart. The significant differences of ECG waveform features in the 3 lying positions could be utilized for automatic lying position classification during sleep. Among the three schemes, the subject specific scheme reached an overall classification accuracy of 97.17%, a κ statistic of 0.91 and AUC > 0.97, which was almost perfect. This can be used for clinical lying position monitoring after setting up a subject specific dataset. Further study in Table 7 showed that such a dataset did not need to be large for the performance to be acceptable. The results of the subject independent scheme without and with feature normalization were accuracy 44.73% and 63.87%, and κ statistic 0.09 and 0.32, respectively.

Fig. 8 The comparison of the classification performance between RF, SVM and ANN. a The comparison of accuracy. b The comparison of Cohen's κ. RF and ANN perform better than SVM, and the accuracies of RF and ANN are close. The Cohen's κ of ANN is slightly higher than that of RF. However, the calculation of RF is much faster. Consequently, RF performs best in general
The classification accuracy of three lying positions in subject independent scheme was much better with feature normalization when compared with the results without feature normalization. On the other hand, the classification accuracy of lying on the left side was higher than those in supine and right. This can be applied for avoiding left lying in some patients with specific diseases, clinically.
The accuracy of the classification results may be influenced by ECG quality. Firstly, in order to resolve the horizontal features (several time features were < 0.1 s), we chose a dataset with a sampling rate of 200 Hz, which ensured a time resolution of 0.005 s. Secondly, when the subjects were turning over during sleep, the signal was severely disturbed and we had to discard the affected epoch; when the subjects were not changing their lying position, the signal was stable. Thirdly, we applied signal preprocessing based on the wavelet transform, and it worked well. Lastly, ECG signal acquisition technology has become mature in recent years. For these reasons, the ECG signal quality was good enough for this study, as reflected in the accuracy of characteristic point detection.
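The time resolution quoted above follows directly from the sampling rate. The short sketch below reproduces that arithmetic and shows how many sampling intervals cover a 0.1 s feature; the function names are assumptions introduced for illustration.

```python
def sampling_period_s(fs_hz):
    """Time between consecutive samples, in seconds."""
    return 1.0 / fs_hz


def samples_spanned(duration_s, fs_hz):
    """Number of sampling intervals covering an interval of the given duration."""
    return round(duration_s * fs_hz)


fs = 200  # Hz, the sampling rate stated in the text
print(sampling_period_s(fs))     # 0.005
print(samples_spanned(0.1, fs))  # 20
```

A feature shorter than 0.1 s is therefore described by fewer than about 20 samples at this rate, which is why a lower sampling rate would make the horizontal features harder to distinguish.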

The structure of heart and vectorcardiogram
Anatomically, the bottom of the heart mainly consists of the left atrium and a small part of the right atrium, where the aorta and pulmonary artery cross [10]. Because of this structure, the bottom of the heart is comparatively fixed in the thorax, while the ventricles and the apex of the heart are comparatively free. When the lying position changes or the diaphragm contracts, the heart apex swings to a limited extent. This changes the direction of the electrocardial vector, and therefore its projection, the ECG, changes.
In a complete cardiac cycle, the action potential begins at the sinoatrial node and then passes through the anterior, middle and posterior inter-nodal tracts to the atrioventricular node. During this process the electrocardial vector always points from the upper right to the lower left. The process of forming the P loop is shown in Fig. 9a. Then the action potential passes through the bundle of His to the ventricles, first from the left bundle branch to the inter-ventricular septum, and then from the left and right bundle branches to the left and right ventricular walls, respectively. Because the left ventricular wall is much thicker than the right, the resultant of the two vectors points to the lower left. The formation of the QRS loop is shown in Fig. 9b, c. After the action potential arrives at the apex, it travels upward along the Purkinje fibers. In this process, the direction of the electrocardial vector is still to the left. Finally, after a period of time, ions flow back across the cell membrane, and the formation of the T loop reflects the repolarization of the ventricles. A complete ECG cycle then ends.

The causes of this phenomenon
VCG intuitively reflects the direction and magnitude of the action potential vector in the heart, and the ECG is actually the projection of this vector onto the different leads. The relationships between the frontal VCG and the limb leads, and between the transverse VCG and the chest leads, are shown in Fig. 10a, b, respectively. The influence of lying positions on the heart can be reflected in the VCG. Compared with the upright position, the heart lies relatively horizontally when the subject is supine. As the heart rotates along its long axis (viewed in the direction from the apex to the bottom of the heart, the rotation is clockwise), the right atrium and right ventricle move left and slightly forward, and the left atrium and left ventricle shift correspondingly backward. The ventricular septum becomes almost parallel to the frontal plane instead of the side plane. Viewed from the frontal plane, the apex moves to the upper left and back, and the heart rotates anticlockwise along the long axis, so the electric axis tends to lean left. When subjects are lying on the left side, because the bottom of the heart is fixed, the apex swings to the left, and the VCG in the frontal plane rotates anticlockwise. The projection lengths of the P loop and T loop in the lead II direction are therefore reduced; that is, the heights of the P wave and T wave in the ECG waveform decrease. In terms of the waveform features, P peak and T peak were significantly reduced. In addition, the projection length of the major part of the QRS loop decreases while that of the minor part increases, so the R wave of the ECG waveform becomes lower and the S wave becomes deeper. In terms of the waveform features, S/R increased while ∠QSR decreased.
The space available to the heart in the chest is larger when the subject is lying on the left side, because the left lung is smaller than the right lung and the heart lies on the left side of the chest. Therefore, the swing amplitude of the heart is relatively larger. When the subject is lying on the right side, the apex of the heart moves towards the mediastinum, and the heart rotates clockwise along the long axis. The electric axis shows a right-leaning tendency (note: the left discussed here is the subject's left, not the observer's). However, because the heart is upheld by the mediastinum, its range of motion is limited, so there is no obvious swing and rotation like that seen when lying on the left side. This may explain why waveform features rarely show significant differences between supine and lying on the right side.
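The projection argument for the P and T loops can be illustrated numerically. The sketch below makes several assumptions that are not from this study: a single mean vector stands in for a whole loop, the standard hexaxial reference system is used (lead I at 0°, lead II at +60°, angles increasing clockwise as seen from the front), and the angles 60° and 30° are hypothetical.

```python
import math

LEAD_II_ANGLE_DEG = 60.0  # lead II axis in the hexaxial reference system


def lead_projection(magnitude, axis_deg, lead_deg=LEAD_II_ANGLE_DEG):
    """Signed projection of a frontal-plane vector onto a lead axis."""
    return magnitude * math.cos(math.radians(axis_deg - lead_deg))


# Hypothetical unit-magnitude mean vector.
# In this convention a smaller angle means an anticlockwise rotation
# as seen from the front, i.e. a left-leaning electric axis.
supine = lead_projection(1.0, axis_deg=60.0)     # aligned with lead II
left_side = lead_projection(1.0, axis_deg=30.0)  # rotated 30 degrees anticlockwise
print(f"supine: {supine:.3f}, left side: {left_side:.3f}")
```

With these illustrative angles the vector keeps the same magnitude but its lead II projection shrinks, which matches the direction of the reduction in P peak and T peak described above.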

Discussions of other studies
Fig. 10 The relation between VCG and ECG. a Frontal plane VCG and limb lead. b Cross surface VCG and chest lead
Changes in the position and shape of the heart in the chest have drawn researchers' attention. Mincholé et al. modeled the changes in the Karhunen-Loève transform coefficients of the QRS complex and the ST-T waveform. They found that changes in body position are reflected in gradual changes of the two coefficient series, and on this basis they detected lying position changes of healthy people from the ECG. The resulting probability of detection reached 94%, and the probability of false alarm was 0%. However, the false alarm rate in an ischemia database was once per hour [11]. Since myocardial ischemia is widely judged by the ST-T segment, the accuracy of lying position detection will decrease sharply, and the misjudgment as well as missed judgment of myocardial ischemia may be more severe, if the influence of lying position on S wave morphology is not taken into consideration. Li et al. compared the heart morphology in supine and standing upright. When the subject was supine, the heart rotated clockwise along the long axis and the heart apex moved to the left and back, but it moved in the opposite direction when the subject stood upright. When the subjects were standing upright, the diaphragm moved down, and the heart remained vertical. At this time the electrical axis shifted to the right and SNS (sympathetic nervous system) activity increased. But PNS (parasympathetic nervous system) activity increased in supine