QRStree: A prefix tree-based model to fetal QRS complexes detection

Non-invasive fetal electrocardiography (NI-FECG) plays an important role in fetal heart rate (FHR) measurement during pregnancy. However, despite the large number of methods that have been proposed for adult ECG signal processing, the analysis of NI-FECG remains challenging and largely unexplored. In this study, we propose a prefix tree-based framework, called QRStree, for FHR measurement directly from the abdominal ECG (AECG). The procedure is composed of three stages. Firstly, a preprocessing stage is employed for noise elimination. Secondly, the proposed prefix tree-based method is used for detection of fetal QRS complexes (FQRS). Finally, a correction stage is applied for false positive and false negative correction. The novelty of the framework lies in using the range of FHR to establish the connections between the FQRS. Consecutive FQRS can be considered as strings composed of letters of an alphabet, so a prefix tree can be used to store them. Each vertex of the tree contains one letter, so each path of the tree gives a string. By storing the connections of the FQRS in the prefix tree structure, the problem of FQRS detection is converted into a problem of optimal path selection: after the optimal path of the tree is selected, the nodes on that path are collected as the detected FQRS. Since the prefix tree can cover every possible combination of the FQRS candidates, it has the potential to reduce the occurrence of missed detections. Results on two different databases show that the proposed method is effective in FHR measurement from single-channel AECG. The focus on single-channel FHR measurement facilitates long-term monitoring for healthcare at home.


Introduction
Non-invasive fetal electrocardiography (NI-FECG) can be used for fetal heart rate (FHR) measurement throughout the pregnancy [1][2][3][4]. However, extracting the FHR information from the abdominal ECG (AECG) remains a challenging task [5][6][7]. On the one hand, the AECG collected from the abdomen is inevitably affected by a variety of noise interferences [8]. On the other hand, the maternal ECG (MECG) component of the AECG is the predominant interference source, which has a much larger amplitude than the fetal ECG (FECG) (see Fig 1) [9][10][11]. This paper addresses the issue of detecting the location of fetal QRS complexes (FQRS) for FHR measurement.
In order to obtain a reliable FHR measurement, the location of FQRS is the primary feature that any approach must extract from the AECG [1,12,13]. However, despite the significant advances in the field of adult QRS detection, FQRS detection remains largely unexplored [14][15][16]. Unlike the adult QRS, which can be detected directly from the AECG, the FQRS is usually detected after a procedure of MECG elimination [17,18]. For this purpose, a considerable amount of literature has been published on removing the MECG from the AECG for FHR monitoring [1,19], including blind source separation (BSS) methods [17,20,21], adaptive noise cancelling (ANC) methods [22][23][24], and template subtraction (TS) methods [25][26][27]. Although techniques based on the separation or cancellation of MECG make FHR measurement possible, the FHR outcome depends heavily on the performance of MECG elimination: a reliable FHR is hard to obtain when the MECG is not completely removed. Moreover, the FECG signal is significantly distorted after the MECG is suppressed [23,28]. As discussed in [29], the morphology of cardiac electrical signals contains a lot of information related to cardiac defects. Recently, the work of [30] presented a segmentation-based method to detect the FQRS from single-channel AECG, which uses a convolutional neural network to determine whether a segment of the AECG contains an FQRS. In this work, after analyzing the graphical representation of the FQRS, we propose a prefix tree-based framework, called QRStree, for FHR measurement without separation of the FECG.
As shown in Fig 2, the heart beats of the fetus are inherently sequential, and the fetal R peaks lie among the local maxima. We can use the range of the FHR to obtain the range of the distance between fetal R peaks (defined in Eq (2)), so that consecutive FQRS can be connected by this distance range. For the purpose of illustration, an example of connected FQRS is shown in Fig 2. Here, we use a prefix tree structure to store these sequential connections. The prefix tree, also called a trie, is a useful data structure for storing dynamic sets such as strings and sequences [31][32][33]. It is widely used for analyzing data characteristics or gaining information needed for decision-making [34][35][36]. Consecutive FQRS can be considered as strings composed of letters of an alphabet, so we can use the prefix tree to store them. Each vertex of the tree contains one letter, so each path of the tree gives a string, and each string is represented as the path from a representative vertex to the root. After the sequential connections are stored in the prefix tree structure, the problem of FQRS detection is converted into a problem of optimal path selection. Specifically, by analyzing the graphical representation of the paths, the optimal path of the tree can be selected, and the nodes on the optimal path are collected as the detected FQRS.
In this work, the proposed method is compared with five single-channel methods. These methods include four TS methods [25][26][27][37] and one segmentation-based method [30]. The experimental results show that the proposed method is effective in FQRS detection. The prefix tree-based framework has the following advantages:
• Firstly, since the prefix tree can cover every possible combination of the FQRS candidates, it has the potential to reduce the occurrence of missed detections.
• Secondly, since the tree structure is built on the range of FHR, it is not affected by the large amplitude of the MECG.
• Finally, although the FECG and MECG overlap in both the time and frequency domains [23], only the local maxima that satisfy the range of FHR are collected to construct the tree, and the model can skip the maternal R peaks that do not satisfy this range. In this sense, the FQRS and MQRS are 'heart rate separable'.
The details of the proposed method are described in the following sections.

Basic structure
Unlike the TS methods, which are based on the elimination of MECG, we propose a novel method to detect the location of FQRS directly from the AECG. As shown in Fig 3, the procedure of the proposed method mainly consists of three stages. Stage 1: Noise elimination; Stage 2: The prefix tree-based method for FQRS detection; Stage 3: False positive (FP) and false negative (FN) correction. The details of each stage are described in the following subsections.
Stage 2 consists of three steps. In step 1, the prefix tree is constructed. In step 2, the features of the nodes (single FQRS candidate) and the features of the paths (consecutive FQRS candidates) are extracted. In step 3, the optimal path of the tree is selected, and the location of the FQRS is then detected. In this study, each FQRS candidate is considered as a letter of an alphabet, so consecutive FQRS are considered as strings composed of such letters. The prefix tree is a tree-shaped data structure widely used for storing strings. Three characteristics of the prefix tree are listed here:
• Each vertex of the tree holds one letter, so each path of the tree gives a string.
• Each string is represented as the path from a representative vertex to the root.
• Every string in the tree is unique.
These characteristics are used in the construction of the prefix tree.
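The three characteristics listed above can be illustrated with a minimal, generic prefix tree. The sketch below is not the implementation used in this work; the class name `TrieNode` and the labels `p1`, `p8`, `p14`, `p15` are chosen here for illustration only.

```python
class TrieNode:
    """One vertex of a prefix tree; it holds a single symbol."""

    def __init__(self, symbol=None, parent=None):
        self.symbol = symbol      # None marks the root (null value)
        self.parent = parent
        self.children = {}        # symbol -> TrieNode

    def insert(self, string):
        """Store `string` below this node, sharing common prefixes."""
        node = self
        for symbol in string:
            if symbol not in node.children:
                node.children[symbol] = TrieNode(symbol, node)
            node = node.children[symbol]
        return node               # representative vertex of the string

    def string(self):
        """Recover the stored string by walking back to the root."""
        symbols = []
        node = self
        while node.parent is not None:
            symbols.append(node.symbol)
            node = node.parent
        return symbols[::-1]


root = TrieNode()
leaf_a = root.insert(["p1", "p8", "p14"])
leaf_b = root.insert(["p1", "p8", "p15"])   # shares the prefix p1, p8
leaf_c = root.insert(["p1", "p8", "p14"])   # duplicate: no new vertices
```

Inserting the same string twice returns the same vertex, which is how uniqueness of the stored strings is maintained in this sketch.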
Step 1: Tree construction. Fig 4 shows the procedure of tree construction. As shown in Fig 4A, the real fetal R peaks are scattered among the local maxima. The goal of the proposed method is to identify the real fetal R peaks among the candidate peaks. In this step, the prefix tree is built using the limits on the timed distance between consecutive FQRS.
The timed distance between two consecutive detected R peaks, RR, is defined as
$$RR_i = R_{i+1} - R_i \tag{1}$$
where $R_i$ is the timed position of the i-th detected R peak. Given the range of FHR $f \in (f_L, f_H)$ bpm, the range of the distance between two fetal R peaks is
$$RR = \frac{60 \cdot f_s}{f}, \qquad \frac{60 \cdot f_s}{f_H} < RR < \frac{60 \cdot f_s}{f_L} \tag{2}$$
where $f$ is the FHR and $f_s$ is the sampling frequency. The prefix tree is constructed layer by layer. As shown in Fig 4B, the prefix tree starts with the root at layer 0, which has a null value. In order to construct layer 1 of the tree (see Fig 4A and 4B), the local maxima at the beginning of the AECG are evaluated by
$$0 < p_j < \frac{60 \cdot f_s}{f_L} \tag{3}$$
where $p_j$ is the timed position of the j-th local maximum. The local maxima that satisfy Eq (3) are then collected as the nodes of layer 1. As shown in Fig 4B, $p_1, p_2, \ldots, p_{10}$ are used as the nodes of layer 1. Every node contains one FQRS candidate. In order to construct layer $k$ ($k \geq 2$) of the tree (see Fig 4A, 4C and 4D), the limit of the RR distance based on the FHR is employed. Specifically, the local maxima are evaluated by
$$\frac{60 \cdot f_s}{f_H} < p_j - p^{(k-1)} < \frac{60 \cdot f_s}{f_L} \tag{4}$$
where $p^{(k-1)}$ is the timed position of the parent node in layer $k-1$. The local maxima that satisfy Eq (4) are saved as the fetal R peak candidates for layer $k$ ($k \geq 2$). As shown in Fig 4C, $p_8, p_9, \ldots, p_{11}$ are used as the nodes in layer 2 under $p_1$. Likewise, $p_{14}, p_{15}, p_{16}$ are used as the nodes in layer 3 under $p_8$ (see Fig 4D). By using these strategies, a multilayer tree can be built.
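The layer-by-layer construction can be sketched as a path enumeration. This is a minimal illustration rather than the authors' implementation: the function names `rr_range` and `build_paths` are hypothetical, positions are assumed to be in samples measured from the start of the analysed segment, and the layer 1 and layer k conditions are written in the form implied by the FHR range (children lie one admissible RR distance after their parent).

```python
def rr_range(f_low, f_high, fs):
    """Admissible RR distance (in samples) for an FHR range given in bpm."""
    return 60.0 * fs / f_high, 60.0 * fs / f_low


def build_paths(maxima, f_low, f_high, fs, depth):
    """Enumerate all root-to-leaf paths of a `depth`-layer candidate tree.

    `maxima` is a sorted list of local-maximum positions in samples
    (assumption: measured from the start of the analysed segment).
    Paths that cannot be extended to the full depth are dropped.
    """
    d_min, d_max = rr_range(f_low, f_high, fs)
    # Layer 1: maxima within one maximal RR distance of the segment start.
    paths = [[p] for p in maxima if 0 < p < d_max]
    for _ in range(depth - 1):
        extended = []
        for path in paths:
            for p in maxima:
                # Layer k >= 2: admissible distance from the parent node.
                if d_min < p - path[-1] < d_max:
                    extended.append(path + [p])
        paths = extended
    return paths


example_maxima = [100, 250, 480, 560, 900, 950]
example_paths = build_paths(example_maxima, 110, 180, 1000, depth=3)
```

Because shared prefixes such as (100, 480) appear in several enumerated paths, storing the paths in a prefix tree avoids duplicating them; the flat list is used here only to keep the sketch short.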
Step 2: Feature extraction. In this step, the features of the nodes (single FQRS candidate) and the features of the paths (consecutive FQRS candidates) are extracted. In a β-layer tree (built in step 1 of stage 2), each path includes β fetal R peak candidates. As shown in Fig 4D, besides the root, the path (1, 8, 14) contains three fetal R peak candidates ($p_1$, $p_8$, $p_{14}$).
A fetal R peak candidate is composed of a local minimum (Q-peak candidate), a local maximum (R-peak candidate) and a local minimum (S-peak candidate). As shown in Fig 5, three features of the single FQRS candidate are collected, namely the ratio of RS amplitude to RS distance, $R_{RS} = A_{RS}/D_{RS}$; the amplitude of RS, $A_{RS}$; and the ratio of QR amplitude to QR distance, $R_{QR} = A_{QR}/D_{QR}$.
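A minimal sketch of the three node features follows. The function name `node_features` is hypothetical, and two details are assumptions made for illustration: amplitudes are taken as the R value minus the Q or S value, and distances are measured in samples.

```python
def node_features(signal, q, r, s):
    """Morphological features of one FQRS candidate.

    `q`, `r`, `s` are sample indices of the Q-peak candidate (local
    minimum before R), the R-peak candidate (local maximum) and the
    S-peak candidate (local minimum after R), with q < r < s.
    """
    a_qr = signal[r] - signal[q]      # QR amplitude (assumed: R minus Q)
    a_rs = signal[r] - signal[s]      # RS amplitude (assumed: R minus S)
    d_qr = r - q                      # QR distance in samples
    d_rs = s - r                      # RS distance in samples
    return {"R_RS": a_rs / d_rs, "A_RS": a_rs, "R_QR": a_qr / d_qr}


toy_signal = [0.0, -1.0, 3.0, 1.0, -2.0, 0.0]
toy_features = node_features(toy_signal, q=1, r=2, s=4)
```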
By analyzing the path-wise representation of FQRS candidates in the tree, three regular patterns of the inter-QRS correlation can be noted.
• In this work, we use the variances of $R_{QR}$ and $R_{RS}$ to represent the graphical similarities between the local maxima in a path. It is noted that the graphical similarities between the fetal R peaks are higher than those between noise peaks.
• In some cases, the values of a single noise peak are larger than those of a real fetal R peak in terms of $A_{RS}$, $R_{QR}$ and $R_{RS}$. However, the overall values (e.g., the median) of the real fetal R peaks in a path are larger than those of the noise in these terms.
• The RR distances of the real fetal R peaks are relatively stable in a short time.
Therefore, six features of the paths (consecutive FQRS candidates) are retained to represent the regular patterns.
• $F_{QR}$: the variance of $R_{QR}$.
• $F_{RS}$: the variance of $R_{RS}$.
• $F_{Am}$: the median of $A_{RS}$.
• $F_{QRm}$: the median of $R_{QR}$.
• $F_{RSm}$: the median of $R_{RS}$.
• $F_{RR}$: the variance of RR distances.
The variance is given by
$$\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2 \tag{5}$$
where $x_i$ is the feature value of the i-th candidate, $N$ corresponds to the number of fetal R peak candidates in a path, and $\mu$ corresponds to the mean value of the feature. $F_{QR}$ and $F_{RS}$ represent the graphical similarity between the fetal R peak candidates in a path: a lower variance indicates a higher graphical similarity. A greater median value indicates a greater probability that the path consists of real fetal R peaks. In addition, $F_{RR}$ represents the stability of the RR distances in a path.
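The six path features can be sketched with the Python standard library. The function name `path_features` and the dictionary layout are hypothetical. The population variance (`statistics.pvariance`, which divides by N) is used here as an assumption; a sample variance would only rescale the three variance features.

```python
from statistics import median, pvariance


def path_features(nodes, positions):
    """Six features of one path.

    `nodes` is a list of per-candidate dicts with keys R_QR, R_RS, A_RS;
    `positions` holds the R-peak candidate positions of the same path.
    """
    rr = [b - a for a, b in zip(positions, positions[1:])]  # RR distances
    r_qr = [n["R_QR"] for n in nodes]
    r_rs = [n["R_RS"] for n in nodes]
    a_rs = [n["A_RS"] for n in nodes]
    return {
        "F_QR": pvariance(r_qr),   # similarity of QR slopes (lower is better)
        "F_RS": pvariance(r_rs),   # similarity of RS slopes (lower is better)
        "F_Am": median(a_rs),      # typical RS amplitude (higher is better)
        "F_QRm": median(r_qr),     # typical QR slope (higher is better)
        "F_RSm": median(r_rs),     # typical RS slope (higher is better)
        "F_RR": pvariance(rr),     # stability of RR distances (lower is better)
    }


toy_nodes = [
    {"R_QR": 1.0, "R_RS": 2.0, "A_RS": 4.0},
    {"R_QR": 3.0, "R_RS": 2.0, "A_RS": 6.0},
    {"R_QR": 2.0, "R_RS": 5.0, "A_RS": 5.0},
]
toy_path = path_features(toy_nodes, positions=[100, 500, 900])
```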
Step 3: Optimal path selection. After the six path features ($F_{QR}$, $F_{RS}$, $F_{Am}$, $F_{QRm}$, $F_{RSm}$ and $F_{RR}$) are extracted, an effective technique is needed to select the optimal path for fetal QRS detection.
In this step, a robust ranking approach is employed for the optimal path selection. Firstly, the paths are ranked on each of the six features separately. Secondly, a total ranking $S$ is given by
$$S = S_{QR} + S_{RS} + S_{Am} + S_{QRm} + S_{RSm} + S_{RR} \tag{6}$$
where $S_X$ is the ranking of the path on the corresponding feature $F_X$. The total ranking $S$ fuses the performance of the paths on all features. Thirdly, the optimal path of the tree is selected by
$$I = \arg\min S \tag{7}$$
where $I$ is the index of the optimal path. In the optimal path with β FQRS candidates (β-layer tree), the first λ ($0 < \lambda \leq \beta$) FQRS candidates are saved as detected fetal R peaks. The same procedure of stage 2 is then applied to the subsequent signal (after the last detected fetal R peak). As shown in Algorithm 1, the locations of the FQRS are obtained in each iteration until the entire AECG has been processed.
Algorithm 1. Procedure of the prefix tree-based method.
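The rank-fusion step can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: rank 1 is taken as the best on every feature (lowest variance, highest median), ties are broken by order of appearance, and the path with the smallest rank sum is selected. The names `rank` and `select_optimal_path` are hypothetical.

```python
LOWER_IS_BETTER = ("F_QR", "F_RS", "F_RR")       # variance features
HIGHER_IS_BETTER = ("F_Am", "F_QRm", "F_RSm")    # median features


def rank(values, reverse=False):
    """Return ranks (1 = best); ties are broken by order of appearance."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=reverse)
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def select_optimal_path(features):
    """`features` holds one feature dict per path; returns (index, totals)."""
    totals = [0] * len(features)
    for name in LOWER_IS_BETTER + HIGHER_IS_BETTER:
        column = [f[name] for f in features]
        ranks = rank(column, reverse=name in HIGHER_IS_BETTER)
        for i, r in enumerate(ranks):
            totals[i] += r                      # sum of per-feature ranks
    best = min(range(len(totals)), key=totals.__getitem__)
    return best, totals


toy = [
    {"F_QR": 0.1, "F_RS": 0.2, "F_RR": 5, "F_Am": 10, "F_QRm": 3, "F_RSm": 4},
    {"F_QR": 0.5, "F_RS": 0.1, "F_RR": 50, "F_Am": 8, "F_QRm": 2, "F_RSm": 5},
    {"F_QR": 0.9, "F_RS": 0.9, "F_RR": 500, "F_Am": 2, "F_QRm": 1, "F_RSm": 1},
]
best_index, rank_sums = select_optimal_path(toy)
```

Summing ranks rather than raw feature values keeps features with very different scales (for example, amplitudes and RR variances) from dominating one another.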

Stage 3: FP and FN correction
In this work, the construction of the tree ensures that the range of RR distances (see Eq (8)) is always satisfied within a tree. However, the AECG recording is longer than the span covered by a single tree, so multiple trees must be constructed to cover the entire recording. In such a situation, the RR distances between the detections of consecutive trees are not guaranteed to fall within the range. Therefore, after the stage of FQRS detection, a procedure based on the RR distances is implemented for false positive (FP) and false negative (FN) correction. An FP corresponds to a wrongly detected FQRS, and an FN corresponds to a missed FQRS.
Specifically, the detected fetal R peaks are evaluated on the RR distances as
$$\frac{60 \cdot f_s}{f_H} < RR_i < \frac{60 \cdot f_s}{f_L} \tag{8}$$
The detected FQRS that satisfy Eq (8) can be used to calculate the FHR directly. When Eq (8) is not satisfied, a wrong detection has occurred within two consecutive trees. In that case, the detected FQRS of the two consecutive trees are removed (see Fig 6), and m replicas of the previously detected FQRS are inserted into the considered interval at equal spacing. The number of replicas m is defined by Eq (9), where $D_{int}$ is the length of the considered interval and $\lfloor \cdot \rfloor$ is the round-down (floor) operation.
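The two operations of this stage, checking an RR distance against the admissible range and filling an interval with equally spaced beats, can be sketched as below. The function names are hypothetical. The number of replicas `m` is passed in by the caller instead of being computed, since its defining expression is not reproduced in this sketch.

```python
def violates_rr_range(prev_peak, next_peak, f_low, f_high, fs):
    """True if the distance between two detections is outside the FHR range."""
    rr = next_peak - prev_peak
    return not (60.0 * fs / f_high < rr < 60.0 * fs / f_low)


def insert_replicas(start, end, m):
    """Place m equally spaced beat positions strictly between start and end."""
    step = (end - start) / (m + 1)
    return [round(start + k * step) for k in range(1, m + 1)]


# A 1200-sample gap at fs = 1 kHz is too long for an FHR of 110 to 180 bpm,
# so the interval is filled with m = 2 equally spaced beats.
gap_is_invalid = violates_rr_range(1000, 2200, 110, 180, 1000)
filled = insert_replicas(1000, 2200, m=2)
```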

Parameter settings
In this paper, a new method based on the prefix tree structure is introduced for FHR measurement. The parameters are described as follows:
• $f_L$ and $f_H$: $f_L$ and $f_H$ indicate the range of FHR. The FHR range should cover the FHR of interest. As indicated in [38], the normal range of FHR is 120 to 160 bpm. In order to cover the normal range of FHR, $f_L$ and $f_H$ are set at 110 and 180 bpm, respectively.
• β and λ: β is the number of layers in the tree. After the optimal path is selected, only the first λ FQRS candidates are saved as detected FQRS. A deeper tree needs more computational resources to build. In consideration of the limited computational resources, β is set at 6; we have not found a significant performance improvement when using deeper trees. For the parameter λ, a large value is not recommended. When the fetal R peaks are wrongly detected, it is likely that all of the detected FQRS in the selected path are wrong. In this situation, setting λ to a large value (e.g., six) would increase the number of wrong detections. In this work, after optimization by grid search within the range of 1 to 6, λ is set at 2. A six-layer tree is built in 0.09 s in Matlab 2017 on a PC with 8 GB RAM and an Intel 2.20 GHz CPU.
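As a worked example of these settings, and assuming the 1 kHz sampling frequency of the databases used in this study, the FHR range of 110 to 180 bpm corresponds to the following admissible RR distances. The values are in samples, which equal milliseconds at this sampling rate; the inequality is written in the form implied by the FHR range.

```latex
\frac{60 \cdot 1000}{180} \approx 333.3 \;<\; RR \;<\; \frac{60 \cdot 1000}{110} \approx 545.5
```

A local maximum therefore qualifies as a child node only if it lies roughly 333 to 545 ms after its parent.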

NI-FECG databases
The real AECG records from two public databases are used to illustrate the efficiency of the proposed method. These databases include the abdominal and direct fetal electrocardiogram database (ADFECGDB) [39] and Set A of the 2013 PhysioNet/Computing in Cardiology Challenge database (PCDB) [40]. Both databases are available on PhysioNet (https://physionet.org/physiobank/database/adfecgdb/, https://physionet.org/physiobank/database/challenge/2013/). The details of the databases are summarized as follows:
• The abdominal and direct fetal electrocardiogram database (ADFECGDB) contains five records collected from five pregnant women. Each record has four abdominal channels and one scalp ECG. The signal lasts for 5 min and is sampled at $f_s$ = 1 kHz. The reference FQRS annotation derived from the scalp ECG is available [39].
• Set A of the 2013 PhysioNet/Computing in Cardiology Challenge database (PCDB) consists of 75 one-minute abdominal records. Data is sampled at $f_s$ = 1 kHz. Each record contains four abdominal channels and the FQRS reference is available. To date, this database is the largest publicly available dataset [40].

Evaluation metrics
In the previous methods (e.g., the TS methods), the FQRS is detected on the residual signal after the MECG is removed. The performance of FQRS detection is usually assessed by comparing the detected beats with the reference beats. When a detected FQRS is within 50 ms of the reference annotation, it is considered a true positive (correctly detected fetal QRS) [9]. Specifically, the sensitivity (SE), positive predictive value (PPV) and $F_1$ measure are the evaluation metrics typically used to assess FQRS detection [1]. The three metrics are defined as
$$SE = \frac{TP}{TP + FN}$$
$$PPV = \frac{TP}{TP + FP}$$
$$F_1 = 2 \cdot \frac{SE \cdot PPV}{SE + PPV} = \frac{2 \cdot TP}{2 \cdot TP + FP + FN}$$
where TP is the number of true positives. As mentioned earlier, FP and FN are the numbers of false positives (falsely detected non-FQRS peaks) and false negatives (missed FQRS detections), respectively.
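The beat-matching rule and the three metrics can be sketched as follows. The greedy one-to-one matching of sorted beat lists is an assumption made for this illustration; evaluation toolboxes may match beats differently. The names `match_beats` and `scores` are hypothetical.

```python
def match_beats(detected, reference, fs, tol_ms=50.0):
    """Count TP, FP, FN for sorted beat positions given in samples."""
    tol = tol_ms * fs / 1000.0          # tolerance converted to samples
    tp = 0
    i = j = 0
    while i < len(detected) and j < len(reference):
        diff = detected[i] - reference[j]
        if abs(diff) <= tol:
            tp += 1                     # detection matches this reference beat
            i += 1
            j += 1
        elif diff < 0:
            i += 1                      # detection too early: counts as FP
        else:
            j += 1                      # reference beat missed: counts as FN
    return tp, len(detected) - tp, len(reference) - tp


def scores(tp, fp, fn):
    """Sensitivity, positive predictive value and F1 from the counts."""
    se = tp / (tp + fn) if tp + fn else 0.0
    ppv = tp / (tp + fp) if tp + fp else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    return se, ppv, f1


counts = match_beats([410, 790, 1000, 1610], [400, 800, 1200, 1600], fs=1000)
metrics = scores(*counts)
```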

Results
In this study, the real AECG records from the PCDB and ADFECGDB are used to evaluate the efficiency of the proposed method. Results of the proposed method on PCDB are shown in Tables 1 and 2. Results of the proposed method on ADFECGDB are shown in Table 3. In order to show the overall distribution of the metrics, the results on these two databases are visually summarized in Fig 7.
The proposed method is compared with four TS methods [25][26][27][37] in Figs 8 and 9. These methods include the Cerutti method (Cerutti), the Vullings method (Vullings), the $TS_{PCA}$ and the EKF [25][26][27][37]. The work of [9] has shown that the TS methods are able to extract the location of FQRS. Within the TS category, the $TS_{PCA}$ performs the best [9]. The compared methods are implemented with the FECGSYN toolbox [9]. As shown in Figs 8 and 9, among the four compared methods, the $TS_{PCA}$ outperforms the others. The result of QRStree is comparable with the state-of-the-art results reported in the field.
In this work, we also compare the proposed method with the segmentation-based method [30] (see Fig 10). As indicated in [30], seven records (a01, a02, a03, a04, a05, a06 and a07) from the PCDB are used as the test data, and the result of the best channel is used as the result of the corresponding record. Results show that the QRStree achieves better performance with 85.68 ± 17.48 SE (%), 85.57 ± 16.71 PPV (%) and 85.61 ± 17.05 F 1 (%).

Discussions
In this study, a new prefix tree-based method is proposed for FHR measurement from single-channel AECG recordings. The tree structure is well suited to describing the FQRS candidates, each of which is stored in one node of the tree. In the field of NI-FECG signal processing, the frequency and temporal overlap of the MECG and FECG signals makes FHR measurement challenging. However, since the tree structure is built on the range of FHR, only the local maxima that satisfy the range are collected; that is, the method can skip the maternal R peaks that do not satisfy the range of FHR. As a result, the FQRS and MQRS become 'heart rate separable'. This means that the proposed method is not affected by the large amplitude of the MECG, which is the predominant interference source.
The prefix tree in this work is a complete prefix tree, which includes all possible combinations of the FQRS candidates. It is a highly compact tree structure that enables efficient mining of the inter-QRS correlation. Specifically, all the real fetal R peaks are contained in the tree structure, from which the path consisting of real fetal R peaks can be selected. It therefore has the potential to reduce the occurrence of missed detections. In addition, the proposed method requires only single-channel AECG. Compared with algorithms that require multiple channels, this is a considerable advantage from the standpoint of pregnant women. It is worth noting that focusing on one-lead FQRS detection techniques facilitates the production of low-cost and easy-to-use devices for intrapartum and antepartum monitoring.
Results show that the proposed method achieves a good performance. Key to the good performance is that the proposed approach can effectively extract the inter-QRS correlation between the FQRS. This work shows an interesting way for FHR measurement without degrading the FECG signal. The location of FQRS is directly extracted from the AECG signal, such that the proposed approach has the potential to provide information to extract the original waveform of FQRS from the AECG.
By cancelling the MECG, numerous techniques make FHR extraction possible. However, the TS methods require the accurate location of the MQRS for MECG elimination.
The influence of the FHR range on the detection performance is also examined (see Fig 12). Firstly, the FHR range is set at 130-140 bpm. Then we broaden the FHR range in steps of 20 bpm. The FQRS cannot be detected with a narrow FHR range (e.g., 130-140 bpm), since the range does not cover the FHR of interest. As the FHR range becomes broader, the performance improves accordingly (e.g., 100-170 bpm). However, after the FHR of interest is covered, the performance degrades slowly as the FHR range broadens further (e.g., 70-200 bpm), since the number of paths increases, which raises the risk of misdetections.
In this work, the main objective is to provide a novel method for FQRS detection directly from the AECG without cancelling the MECG. The results show that the proposed method is effective in the task of FQRS detection. However, one limitation of the proposed method is that its performance is reduced when the AECG signal is affected by severe noise. In this situation, the methods based on MECG elimination (e.g., the TS methods) can obtain better performance, presumably because some noise components are removed in the process of MECG elimination. In the future, it would be worthwhile to investigate the behaviour of the proposed method at different noise levels using synthetic data.

Conclusions
In this study, a novel framework is presented for FQRS detection from single-channel AECG. Unlike previous studies based on the elimination or separation of the MECG, the proposed QRStree detects the location of FQRS directly from the AECG. Specifically, the FQRS are connected using the range of FHR, and these connections are organized and stored in a simple yet powerful tree structure. The results show that the proposed method performs well on the task of FQRS detection. This work provides a new perspective for the development of FHR measurement.