Device-Free Localization via an Extreme Learning Machine with Parameterized Geometrical Feature Extraction

Device-free localization (DFL) is an emerging technology in the wireless localization field, because the target to be localized does not need to carry any electronic device. In a radio-frequency (RF) DFL system, radio transmitters (RTs) and radio receivers (RXs) sense the target collaboratively, and the location of the target is estimated by fusing the changes of the received signal strength (RSS) measurements associated with the wireless links. In this paper, we propose an extreme learning machine (ELM) approach for DFL to improve the efficiency and accuracy of the localization algorithm. Unlike conventional machine learning approaches for wireless localization, in which the differential RSS measurements are used as the only input features, we introduce a parameterized geometrical representation for an affected link, which consists of its geometrical intercepts and differential RSS measurement. Parameterized geometrical feature extraction (PGFE) is performed for the affected links, and the extracted features are used as the inputs of ELM. The proposed PGFE-ELM for DFL is trained in the offline phase and performs real-time localization in the online phase, where the estimated location of the target is obtained through the trained ELM. PGFE-ELM has the advantage that the affected links used by ELM in the online phase can differ from those used for training in the offline phase, which makes it more robust to the uncertain combination of the detectable wireless links. Experimental results show that the proposed PGFE-ELM improves the localization accuracy and learning speed significantly compared with a number of existing machine learning and DFL approaches, including the weighted K-nearest neighbor (WKNN), support vector machine (SVM), back propagation neural network (BPNN), and the well-known radio tomographic imaging (RTI) DFL approach.


Introduction
Although some navigation systems, such as GPS and BeiDou, are popular for outdoor localization and navigation, they cannot function well in indoor environments, as signals from the satellites are often blocked by tall buildings [1]. Current indoor localization systems usually require that the entity to be localized be equipped with some kind of electronic device [2][3][4]. Device-free localization (DFL) was recently introduced as a new radio-frequency (RF)-based localization approach, where the target does not need to have any attached electronic device [5].

Related Work
In the past decade, different approaches have been proposed for DFL, mainly including the RTI approach, the compressive sensing (CS) approach, the fingerprinting approach, and the geometric approach. RTI was proposed by Wilson et al. [8] for DFL by imaging the attenuation caused by the target. In RTI, the monitoring area is divided into voxels, and each voxel contributes differently to the signal attenuation of each link. Thus, DFL is formulated as a linear ill-posed inverse problem and solved by a regularization approach, such as Tikhonov regularization [8,9]. Patwari et al. [28] applied measurement-based models to analyze and verify both the advantages and drawbacks of correlated link shadowing in DFL. Wilson et al. [10] studied the RSS measurements in wireless networks to estimate the locations of both moving and stationary people, and showed the possibility of tracking more than one person. As the RTI assumption that the voxels along one link have the same weights is not practical in real situations, Lei et al. [11] proposed a geometry-based elliptical model with variable voxel weights and adopted an orthogonal matching pursuit algorithm to improve the localization accuracy. Banerjee et al. [29] used variance-based RTI (VRTI) in target tracking and localization, and showed that receiver attacks can be detected and the source of the unlawful activity can be identified with good precision. Kaltiokallio et al. [30] proposed an online recalibration approach through a finite-state machine, which allowed the system to adapt to changes in the radio environment, and proposed a novel spatial weight model for RTI [31]. Zhao et al. [32] proposed a least square variance-based radio tomography approach for DFL to reduce the impact of environmental noise.
Although RTI can achieve relatively good performance, some of its regularization parameters are determined only by experience and lack theoretical derivation and proof. In addition, it does not consider the negative impact of the links that cannot detect the target, which may degrade the performance significantly.
As the weights of the voxels in RTI are sparse in the space domain, the CS approach was applied to deal with the linear ill-posed inverse problem [33][34][35][36][37][38]. The time-of-flight measurements of the shadowed links were considered as the observation information, and a CS-based particle filter was proposed by making use of the space-domain sparsity and the time-domain sparsity [35]. Multiple frequencies and multiple transmission power levels were used to enrich the link measurement information, and the location information of the target was reconstructed by a recursive CS approach [38].
The fingerprinting approach for DFL involves two phases, i.e., the offline phase and the online phase [39][40][41]. An accurate radio map is the key to achieving good localization accuracy. It is created in the offline phase by recording the differential RSS measurements of the links when the target is located at reference points (RPs) with known positions. In the online phase, the location of the target is estimated by matching against the created radio map. Aly et al. [42] leveraged an automation tool for fingerprint construction to study modified scenarios for WiFi-based DFL. The performance of the fingerprinting approach is usually acceptable, but tedious calibration is a necessary step, so the challenge is how to build an accurate radio map efficiently.
The geometric approach estimates the location of the target through the geometric relationship between the target location and the links. It is therefore a model-based approach and does not need the tedious offline calibration that is a must in the fingerprinting approach. From the perspective of the geometric approach, Zhang et al. [43,44] proposed a signal dynamic model to determine the properties of the RSS change behavior, together with three tracking approaches based on the midpoint, the intersection, and the best-cover geometric calculation, respectively. The best-cover approach can achieve higher localization accuracy than the other two approaches, but tedious calibration is still required. Talapmpas et al. [45] proposed the multichannel geometric filter for DFL, which uses channel diversity to diminish the effects of multipath fading of the links and improves the localization accuracy for DFL in cluttered environments.
A Bayesian approach was introduced for DFL by using the probabilistic observation information of the shadowed links, the constraint information of the non-shadowed links, and the prior probabilistic estimation information, to strengthen the robustness of the model and avoid the overfitting problem [46]. Savazzi et al. [47] studied the diffraction principle to deal with the average path loss and the fluctuations of the RSS measurements induced by the moving target, and proposed a modified stochastic Bayesian approach for real-time target localization.
As a machine learning approach, Chiang et al. [14] proposed a modified fuzzy SVM and applied it to DFL. The underlying SVM involves a quadratic programming problem, which is computationally expensive. Thus, SVM-based DFL is time-consuming and has difficulty handling the large amount of data in DFL. Dong et al. proposed a joint learning approach for intrusion detection. However, it needs to identify problematic wireless devices that report wrong signal readings [48], and the computational burden is high. Although both the fingerprinting and machine learning approaches proposed for DFL are based on pattern matching, the machine learning approach needs to determine the relevant parameters to build a model for DFL that can be used in the online phase, while the fingerprinting approach is implemented by matching against the radio map established in the offline phase.

Extreme Learning Machine
In this section, the ELM theory is briefly introduced to facilitate the understanding of the proposed approach for DFL. ELM, which adopts a three-layer structure including the input layer, the hidden layer, and the output layer (see Figure 1), was originally proposed for single-hidden-layer feedforward networks (SLFNs) and was extended to generalized SLFNs, where the hidden layer need not be neuron-like [49]. Different from other traditional machine learning approaches, in ELM the hidden layer parameters are pre-assigned randomly, and the output weights are calculated analytically through the least-square approach. It has been proven that ELM theoretically has the universal approximation capability with any non-constant piecewise continuous activation function [50].
Suppose the hidden layer has $L$ hidden nodes, and the output of hidden node $i$ is $g(x; a_i, b_i)$, where $a_i$ and $b_i$ are the hidden parameters of this node, $i = 1, \ldots, L$. Consider a given dataset $\{(x_i, t_i)\}_{i=1}^{N}$, where $x_i$ is an $n$-dimensional input vector and $t_i$ is the corresponding $m$-dimensional observation vector. The mapped feature vector of an input $x$ is:
$$h(x) = [g(x; a_1, b_1), \ldots, g(x; a_L, b_L)]$$
The output function of ELM for generalized SLFNs with $L$ hidden nodes can be represented by:
$$f_L(x) = \sum_{i=1}^{L} \beta_i \, g(x; a_i, b_i) = h(x)\beta$$
where $\beta = [\beta_1, \beta_2, \ldots, \beta_L]^T$ is the vector of the output weights connecting the hidden layer and the output. Different from the traditional machine learning approaches, ELM aims to reach the smallest training error and the smallest norm of the output weights:
$$\text{Minimize: } \|\beta\|_p^{\sigma_1} + C\,\|H\beta - T\|_q^{\sigma_2}$$
where $\sigma_1 > 0$, $\sigma_2 > 0$, $p, q = 0, \tfrac{1}{2}, 1, 2, \ldots, +\infty$, $C$ is a user-specified parameter that provides a tradeoff between the empirical risk minimization and the structural risk minimization, and $H$ is the hidden layer output matrix (randomized matrix):
$$H = \begin{bmatrix} h(x_1) \\ \vdots \\ h(x_N) \end{bmatrix} = \begin{bmatrix} g(x_1; a_1, b_1) & \cdots & g(x_1; a_L, b_L) \\ \vdots & \ddots & \vdots \\ g(x_N; a_1, b_1) & \cdots & g(x_N; a_L, b_L) \end{bmatrix}$$
$T$ is the training data target matrix:
$$T = \begin{bmatrix} t_1^T \\ \vdots \\ t_N^T \end{bmatrix}$$
The three-step learning process of ELM can be summarized as follows:
(1) Randomly assign the hidden layer parameters, i.e., $a_i$ and $b_i$.
(2) Calculate the hidden layer output matrix $H$.
(3) Obtain the output weight vector:
$$\beta = H^{\dagger} T \tag{6}$$
where $H^{\dagger}$ is the Moore-Penrose generalized inverse of the matrix $H$.
Usually, the resultant solution is equivalent to the ELM optimization solution with $\sigma_1 = \sigma_2 = p = q = 2$, which can be mathematically written as:
$$\min_{\beta,\,\xi} \; \frac{1}{2}\|\beta\|^2 + \frac{C}{2}\sum_{i=1}^{N}\|\xi_i\|^2 \quad \text{s.t.} \quad h(x_i)\beta = t_i^T - \xi_i^T, \quad i = 1, \ldots, N$$
where $\xi_i = [\xi_{i,1}, \ldots, \xi_{i,m}]^T$ is the training error vector of the $m$ output nodes with respect to the training sample $x_i$. Thus, the optimization problem can derive a stable solution with better generalization performance. Then, based on the Karush-Kuhn-Tucker (KKT) theorem, we have:
$$\beta = H^T\left(\frac{I}{C} + HH^T\right)^{-1} T$$
where $I$ is the unit matrix.
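As a rough illustration of the three training steps with a regularization factor C, the following NumPy sketch trains an ELM with sigmoid hidden nodes. The function names `elm_train` and `elm_predict`, the uniform initialization range, and the default values are assumptions made for this example and are not taken from the paper. The sketch solves the regularized least-squares problem in the form (I/C + HᵀH)⁻¹HᵀT, which is algebraically equivalent to Hᵀ(I/C + HHᵀ)⁻¹T and is cheaper when L is smaller than the number of samples.

```python
import numpy as np


def _sigmoid_hidden(X, A, b):
    """Hidden layer output matrix H (N x L) for sigmoid nodes."""
    return 1.0 / (1.0 + np.exp(-(X @ A + b)))


def elm_train(X, T, L=10, C=100.0, seed=0):
    """Train an ELM on inputs X (N x n) and targets T (N x m).

    Returns the random hidden parameters (A, b) and the output weights beta.
    """
    rng = np.random.default_rng(seed)
    # Step 1: randomly assign hidden parameters.
    # The uniform range [-1, 1] is an assumption of this sketch.
    A = rng.uniform(-1.0, 1.0, size=(X.shape[1], L))
    b = rng.uniform(-1.0, 1.0, size=L)
    # Step 2: compute the hidden layer output matrix H.
    H = _sigmoid_hidden(X, A, b)
    # Step 3: regularized least squares for the output weights.
    beta = np.linalg.solve(np.eye(L) / C + H.T @ H, H.T @ T)
    return A, b, beta


def elm_predict(X, A, b, beta):
    """Map inputs through the fixed hidden layer and the learned weights."""
    return _sigmoid_hidden(X, A, b) @ beta
```

Only `beta` is learned; `A` and `b` stay fixed after the random draw, which is why training reduces to a single linear solve.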

Proposed Approach
In this section, we will introduce the details of the proposed parameterized geometrical feature extraction based ELM (PGFE-ELM) for DFL.

Geometrical Representation of the Affected Link
We first introduce the meaning of the affected link. Let $RSS_f^0$ denote the RSS measurement of link $f$ when no entity is in the monitoring area, and let $RSS_f$ denote the RSS measurement of link $f$ when the target enters the monitoring area. The differential RSS measurement of link $f$, denoted $\Delta RSS_f$, is the change of $RSS_f$ relative to $RSS_f^0$. We call the links with distinct differential RSS measurements due to the presence of the target the affected links. If we build a coordinate system for the monitoring area of the DFL, all the fixed nodes have their own coordinates. As shown in Figure 2, for a straight line passing through two points $P_1(x_1, y_1)$ and $P_2(x_2, y_2)$ with $x_1 \neq x_2$ and $y_1 \neq y_2$, the x-axis intercept $I_x$ and the y-axis intercept $I_y$ of the straight line can be calculated by:
$$I_x = \frac{x_1 y_2 - x_2 y_1}{y_2 - y_1}$$
$$I_y = \frac{x_2 y_1 - x_1 y_2}{x_2 - x_1}$$
In particular, when the straight line is parallel to the y-axis or the x-axis (see Figure 3a,b), we have $I_y = \infty$ and $I_x = \infty$, respectively.
In the monitoring area, each affected link from one node to another node can be expressed using a straight line described by its geometrical representation, which consists of its x-axis intercept and y-axis intercept, as well as its differential RSS measurement.
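The intercept computation, including the two parallel cases, can be sketched in a few lines of Python. The function name `link_intercepts` and the `inf_value` parameter are hypothetical names introduced for this illustration; the formulas follow directly from the two-point form of a straight line.

```python
import math


def link_intercepts(p1, p2, inf_value=math.inf):
    """Return (I_x, I_y) for the straight line through nodes p1 and p2.

    inf_value stands in for an infinite intercept when the link is
    parallel to an axis.
    """
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and y1 == y2:
        raise ValueError("the two nodes of a link must be distinct")
    # A link parallel to the x-axis never crosses it, so I_x is infinite.
    i_x = inf_value if y1 == y2 else (x1 * y2 - x2 * y1) / (y2 - y1)
    # A link parallel to the y-axis never crosses it, so I_y is infinite.
    i_y = inf_value if x1 == x2 else (x2 * y1 - x1 * y2) / (x2 - x1)
    return i_x, i_y
```

For example, `link_intercepts((0, 6), (3, 3))` describes the line x + y = 6 and yields both intercepts equal to 6. Passing a large finite `inf_value` keeps the features usable as numeric inputs; the experiments reported later use 9999 for this purpose.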

PGFE-ELM Based DFL
The framework of the PGFE-ELM based DFL is shown in Figure 4. It involves two phases, the offline training phase and the online localization phase. During the offline training phase, reference points (RPs) are used for data collection and ELM training as the samples. Each RP is associated with a number of affected links, and the corresponding affected links of all the RPs can be used for providing the ELM training dataset.
For each RP, we firstly perform differential RSS computation. In order to reduce the computational overhead of ELM and avoid the overfitting problem, the affected link selection is necessary, where a number of affected links with relatively larger differential RSS measurements are selected.
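A minimal sketch of the differential RSS computation and affected link selection is given below. It assumes that the magnitude of the RSS change is what ranks the links (the absolute value is an assumption of this example) and that links are identified by arbitrary hashable ids; `select_affected_links` is a hypothetical helper name.

```python
def select_affected_links(rss_empty, rss_target, n):
    """Pick the n links with the largest differential RSS magnitude.

    rss_empty:  dict mapping link id -> RSS with no entity in the area
    rss_target: dict mapping link id -> RSS with the target present
    Returns a list of (link id, differential RSS) pairs, largest first.
    """
    # Assumption: the absolute change is used as the differential RSS.
    delta = {f: abs(rss_target[f] - rss_empty[f]) for f in rss_empty}
    ranked = sorted(delta, key=delta.get, reverse=True)[:n]
    return [(f, delta[f]) for f in ranked]
```

Because the selection is a ranking rather than a comparison against a fixed threshold, it always returns exactly n links, which keeps the ELM input dimension constant.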
Feature extraction is very important to ELM and can affect its performance significantly. Traditionally, the differential RSS measurements of the affected links are the only input features, which is not enough to create an accurate model. Thus, we build the parameterized geometrical representations of the selected affected links and perform the corresponding feature extraction using the proposed PGFE.
For the selected affected links, we extract their x-axis intercepts and y-axis intercepts, together with their corresponding differential RSS measurements, as the input features of ELM. Assume $n$ affected links are selected; then the input features for ELM can be represented as:
$$[I_x^{1}, I_y^{1}, \Delta RSS^{1}, \ldots, I_x^{n}, I_y^{n}, \Delta RSS^{n}]$$
As shown in Figure 5, in PGFE-ELM the input features of ELM are extended from only the ∆RSS values of the affected links to the x-axis intercepts, y-axis intercepts and ∆RSS values of the affected links, and the outputs are the location estimation of the target. A more accurate ELM can therefore be trained with the extended input features.
Assuming that there are $l$ RPs, and each RP corresponds to $n$ selected affected links, the input matrix of ELM can be written as:
$$\mathrm{Input} = \begin{bmatrix} I_x^{1,1} & I_y^{1,1} & \Delta RSS^{1,1} & \cdots & I_x^{1,n} & I_y^{1,n} & \Delta RSS^{1,n} \\ \vdots & \vdots & \vdots & & \vdots & \vdots & \vdots \\ I_x^{l,1} & I_y^{l,1} & \Delta RSS^{l,1} & \cdots & I_x^{l,n} & I_y^{l,n} & \Delta RSS^{l,n} \end{bmatrix}$$
where the $k$th row holds the features of the $k$th RP. The input weights with $L$ hidden nodes form a $3n \times L$ matrix whose $i$th column is $a_i$, and the output is the $l \times 2$ matrix whose $k$th row holds the known coordinates of the $k$th RP. According to (6), the output weight of ELM is then obtained from the hidden layer output matrix of the input matrix and this output matrix.
During the online localization phase, the differential RSS computation and affected link selection are performed to select the appropriate number of affected links for the target. After that, all the selected affected links are represented geometrically, and the x-axis intercepts, y-axis intercepts and the corresponding differential RSS measurements together constitute the testing dataset. Finally, the trained ELM model outputs the location of the target given the extracted features of the affected links.
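To make the feature layout concrete, the sketch below assembles one ELM input row from a list of selected affected links. The per-link ordering (x-axis intercept, y-axis intercept, differential RSS), the helper names, and the finite stand-in for infinite intercepts are assumptions of this example rather than details confirmed by the paper.

```python
def _intercepts(p1, p2, inf_value):
    """Axis intercepts of the line through p1 and p2."""
    (x1, y1), (x2, y2) = p1, p2
    i_x = inf_value if y1 == y2 else (x1 * y2 - x2 * y1) / (y2 - y1)
    i_y = inf_value if x1 == x2 else (x2 * y1 - x1 * y2) / (x2 - x1)
    return i_x, i_y


def pgfe_features(links, inf_value=9999.0):
    """Flatten n selected links into a feature row of length 3n.

    links: iterable of (node_a_xy, node_b_xy, delta_rss) tuples, already
           restricted to the selected affected links.
    """
    features = []
    for p1, p2, delta_rss in links:
        i_x, i_y = _intercepts(p1, p2, inf_value)
        features.extend([i_x, i_y, delta_rss])
    return features
```

Since each link carries its own geometry in the row, the same function serves both phases: stacking one row per RP gives the training input matrix, and a single row built from the links observed online is what the trained model receives.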
PGFE-ELM has the following two advantages: (1) during the offline training phase, the combination of the n affected links associated with each RP can differ from one training sample to another; (2) the affected links used by ELM in the online localization phase can differ from those used for training in the offline training phase. As a result, PGFE-ELM is more robust to the uncertain combination of the detectable wireless links, both during the offline training phase and the online localization phase.

Experimental Settings
We build the wireless network using CC2530 ZigBee nodes, which are based on the IEEE 802.15.4 standard. Two different experiments are performed to evaluate the proposed approach, one in an outdoor environment and one in an indoor environment.
In the experiments, we choose the weighted K-nearest neighbor (WKNN), SVM, and BPNN for comparison to verify the performance of the proposed PGFE-ELM. As mentioned in Section 4.1, we set $I_y = 9999$ and $I_x = 9999$ in the following experiments for the two situations in Figure 3a,b. After data collection, the localization algorithms are run in the Matlab 2012a environment on an Intel i5 3.2 GHz CPU with 4 GB RAM. In addition, we use the following localization accuracy (17) as the evaluation criterion:
$$\mathrm{Accuracy} = \frac{1}{z}\sum_{i=1}^{z}\sqrt{(x_i - x_{i0})^2 + (y_i - y_{i0})^2} \tag{17}$$
where $(x_i, y_i)$ are the predicted coordinates and $(x_{i0}, y_{i0})$ are the real coordinates of the $i$th testing point (TP), and $z$ is the number of TPs.
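Assuming the evaluation criterion is the mean Euclidean distance between the predicted and real TP coordinates, which matches the accuracies being reported in meters, it can be computed as in the following sketch. The function name `mean_localization_error` is a hypothetical choice for this example.

```python
import math


def mean_localization_error(predicted, actual):
    """Average Euclidean distance between predicted and real TP positions.

    predicted, actual: equal-length sequences of (x, y) coordinates.
    """
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("need the same non-zero number of points")
    total = sum(
        math.hypot(px - ax, py - ay)
        for (px, py), (ax, ay) in zip(predicted, actual)
    )
    return total / len(predicted)
```

A smaller value means better localization; one TP missed by 5 m and one located exactly give an average of 2.5 m.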
In the performance evaluation, the data from RPs is used for training and the data from TPs is used for testing. Each RP provides one training sample, and each TP provides one testing sample. It should be noted that the same person is used to collect the training data and the testing data. The RSS is sampled at 5-min intervals on working days; the RSS values at every RP and TP are collected 100 times, and the averaged values constitute the training dataset and the testing dataset, respectively.

Experimental Performance Evaluation for Outdoor Environment
The outdoor environment experiment was set up on the campus of the University of Science and Technology Beijing. As shown in Figure 6, the monitoring area is a 6 m × 6 m square, with 16 nodes placed along its boundary at a distance of 1.5 m between adjacent nodes. Two RP distribution scenarios are examined, one with 0.3 m spacing between adjacent RPs (i.e., the 0.3 m RP spaced scenario) and another with 0.6 m spacing between adjacent RPs (i.e., the 0.6 m RP spaced scenario). In the first scenario there are 289 RPs in total, whereas in the second scenario there are 81 RPs in total. In order to implement the proposed PGFE-ELM, we build a coordinate system for the monitoring area, in which the x-axis and the y-axis overlap two edges of the monitoring area (see Figure 7).
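The two RP counts are consistent with a square grid. As a sanity check, the sketch below generates such a grid under the assumption that the RPs span 0.6 m to 5.4 m along each axis; these bounds are not stated in the text and are chosen here only because they reproduce 289 and 81. The function name `rp_grid` is hypothetical.

```python
def rp_grid(spacing, lo=0.6, hi=5.4):
    """Square grid of RP coordinates with the given spacing in meters.

    lo and hi are assumed bounds of the RP region along both axes.
    """
    steps = round((hi - lo) / spacing)
    # Round each coordinate to suppress floating-point drift.
    coords = [round(lo + k * spacing, 10) for k in range(steps + 1)]
    return [(x, y) for x in coords for y in coords]
```

With these bounds, a 0.3 m spacing gives 17 positions per axis (17 × 17 = 289) and a 0.6 m spacing gives 9 positions per axis (9 × 9 = 81), so doubling the spacing cuts the calibration effort to less than a third.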

There are two tuning parameters in ELM (the sigmoid function is used as the activation function), i.e., the regularization factor C and the number of hidden nodes L. Figure 8 illustrates the average error curve using all the links in the 0.3 m RP spaced scenario with respect to C, given L = 10. The average error curve presents a parabolic shape as C increases and reaches the smallest average error when $C = 10^2$. Therefore, we choose $C = 10^2$ in the following experiments.
Figure 9 illustrates how the average localization accuracy of PGFE-ELM changes as the number of hidden nodes increases from 5 to 100, for the 0.3 m RP spaced scenario with the number of affected links from 2 to 10 and with all the affected links. According to Figure 9, the best average localization accuracies are achieved when L = 10, no matter how many affected links are selected. The case of L = 10 with 3 selected affected links obtains the best average localization accuracy, which equals 1.40 m. Figure 10 illustrates the average localization accuracy of PGFE-ELM in the 0.6 m RP spaced scenario. Similar to Figure 9, the best average localization accuracies are obtained when L = 10, and the case of L = 10 with 3 affected links obtains the best average localization accuracy, which equals 1.55 m. Comparing Figures 9 and 10, the average localization accuracies in most cases of the 0.3 m RP spaced scenario are better than in the corresponding cases of the 0.6 m RP spaced scenario. The 0.6 m RP spaced scenario gives worse average localization accuracy but requires less calibration overhead.
Normally, the affected link is identified by comparing the differential RSS measurement with a predefined threshold. However, due to the uncertainties of the experimental environments, such a threshold may introduce false detections. In contrast, PGFE-ELM compares the differential RSS measurements of all the links and selects the given number of affected links from the most significant ones. As it is difficult to know the optimal number of affected links in PGFE-ELM, we examine this by experiments. Figure 11 compares the PGFE-ELM accuracy versus the number of affected links with the number of hidden nodes set to 10, which showed the best performance in Figures 9 and 10. PGFE-ELM achieves the best localization accuracy in both RP spaced scenarios when the number of affected links is set to 3. The main reasons are as follows: if the assigned number of affected links is 2, the links may not be enough for estimating the target location; whereas if the assigned number of affected links is too large, some links may be incorrectly identified as affected links and subsequently degrade the localization performance.
There is only one tuning parameter in WKNN, i.e., the number of neighbors K. Figure 12 illustrates how the average localization accuracy of WKNN in the 0.3 m RP spaced scenario changes when K increases from 2 to 10, with the number of affected links from 2 to 10 and with all the affected links. According to Figure 12, all the curves trend downward before K = 8 and most of them rise after K = 8. The case of K = 8 with 7 affected links obtains the best average localization accuracy, which equals 2.06 m. Figure 13 illustrates the results of WKNN in the 0.6 m RP spaced scenario when K increases from 2 to 10. Similar to Figure 12, all the curves trend downward before K = 7 and most of them rise after K = 7. The case of K = 8 with 7 affected links obtains the smallest error, which equals 2.37 m. Similar to ELM, WKNN achieves better average localization accuracy in the 0.3 m RP spaced scenario than in the 0.6 m RP spaced scenario.
Figure 12. Results of WKNN at 0.3 m RP spaced scenario.
Figure 14 illustrates the results of SVM with a Gaussian kernel in the 0.3 m and 0.6 m RP spaced scenarios, with the number of affected links changing from 2 to 10 and with all the affected links. All the parameters in SVM are obtained through cross validation. According to Figure 14, the curve of the 0.6 m RP spaced scenario is smoother than that of the 0.3 m RP spaced scenario, with only tiny fluctuations, and the case with 9 affected links in the 0.3 m RP spaced scenario obtains the best average localization accuracy, which equals 1.67 m.
Figure 15 illustrates the results of BPNN with one hidden layer and 10 hidden nodes in the 0.3 m and 0.6 m RP spaced scenarios, with the number of affected links changing from 2 to 10 and with all the affected links. In Figure 15, the case with 5 affected links in the 0.6 m RP spaced scenario obtains the best average localization accuracy, which equals 2.23 m.
Tables 1 and 2 show the time consumption of ELM, WKNN, SVM and BPNN with 3, 5, 7, 9 and all links in the 0.3 m RP spaced scenario and the 0.6 m RP spaced scenario, respectively. ELM is clearly much faster than the other machine learning approaches, especially in training time. For example, ELM is thousands of times faster than SVM and hundreds of times faster than BPNN in the 0.3 m RP spaced scenario. Furthermore, the testing time of ELM equals or is close to 0 s in most cases. In terms of time consumption, the advantage of ELM will be more significant as the data quantity increases. The best and worst localization accuracies among the TPs in different cases are listed in Tables A1-A5 in Appendix A. For ELM and BPNN, the number of hidden nodes L is set to 10; for WKNN, K is set to 8 in the 0.3 m scenario and 7 in the 0.6 m scenario. In some cases, the best localization accuracies of ELM and SVM are even under 0.1 m. In addition, the worst localization accuracy of ELM across the different cases is about 3 m, while that of the other approaches can be as high as 5 m. These results indicate that ELM is more robust than the other machine learning approaches.
According to the above figures and tables, the results of all four machine learning approaches in the 0.3 m RP spaced scenario are better than the corresponding results in the 0.6 m RP spaced scenario in most situations, which indicates that a spacing of 0.3 m between adjacent RPs is more suitable than a spacing of 0.6 m. Furthermore, none of the approaches obtains its best average localization accuracy when all the affected links are used, which indicates that choosing an appropriate number of affected links is important: it reduces the computational burden of the models and improves the localization accuracy. Finally, the localization accuracy of PGFE-ELM is much better than that of WKNN, SVM, and BPNN in all the situations considered, which shows its excellent generalization performance. It should be noted that the best average localization accuracy among these approaches is 1.40 m, which is achieved by PGFE-ELM in the 0.3 m RP spaced scenario.
In order to verify the validity of the proposed PGFE, we repeat the experiment using the original dataset with only the differential RSS values as inputs. Table 3 lists the comparison results of the four machine learning approaches with and without PGFE for 3 affected links in the 0.3 m RP spaced scenario. It can be found that the results with PGFE are much better than the corresponding ones without PGFE; in particular, WKNN and SVM achieve about 1 m of improvement when PGFE is used.
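To make the difference between the two input types concrete, the sketch below builds a PGFE-style feature vector in which each affected link contributes its two geometrical intercepts and its differential RSS value, as opposed to the differential RSS value alone. It is only an illustration under stated assumptions: the function names are hypothetical, the affected links are assumed to be the k links with the largest absolute differential RSS, the intercepts are taken as the x and y axis intercepts of the line through the two nodes, and a finite placeholder is used for links parallel to an axis. None of these choices is specified in this section.

```python
def link_intercepts(p, q, placeholder=1e3):
    """Return (x_intercept, y_intercept) of the line through nodes p and q.

    A link parallel to an axis never crosses that axis; the finite
    placeholder returned in that case is an assumption of this sketch.
    """
    (x1, y1), (x2, y2) = p, q
    dx, dy = x2 - x1, y2 - y1
    # dy == 0: parallel to the x axis, so no x intercept (and vice versa).
    x_int = placeholder if dy == 0 else x1 - y1 * dx / dy
    y_int = placeholder if dx == 0 else y1 - x1 * dy / dx
    return x_int, y_int


def pgfe_features(links, k):
    """Build a flat feature vector from the k most affected links.

    links is a list of (node_p, node_q, delta_rss) tuples, where the node
    positions are (x, y) pairs in metres. Ranking by |delta_rss| is an
    assumed selection rule.
    """
    ranked = sorted(links, key=lambda link: abs(link[2]), reverse=True)[:k]
    features = []
    for p, q, delta_rss in ranked:
        x_int, y_int = link_intercepts(p, q)
        features.extend([x_int, y_int, delta_rss])
    return features
```

Because every selected link carries its own geometry, the feature vector has length 3k regardless of which links happen to be affected, so the links seen online need not coincide with those seen during training.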

Experimental Performance Evaluation for Indoor Environment
The indoor experiment was set up in a staff activity room. As shown in Figure 16, the monitoring area is a 6 m × 6 m square, with 16 nodes placed along its boundary at a spacing of 1.5 m between adjacent nodes. A table tennis table inside the area acts as an obstacle. Similar to the outdoor experiment, two RP distribution scenarios are examined: one with 0.3 m between adjacent RPs (i.e., the 0.3 m RP spaced scenario) and one with 0.6 m between adjacent RPs (i.e., the 0.6 m RP spaced scenario).
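The node layout can be checked with a few lines of code: 16 nodes at 1.5 m spacing cover exactly the 24 m perimeter of the 6 m × 6 m square. The sketch below generates such a layout. The function name is hypothetical, and starting at a corner placed at the origin and walking counter-clockwise are assumptions; the actual node positions are those shown in Figure 16.

```python
def perimeter_nodes(side=6.0, spacing=1.5):
    """Place nodes along the boundary of a square, starting at (0, 0)."""
    count = round(4 * side / spacing)
    nodes = []
    for i in range(count):
        # Distance walked along the perimeter, split into edge and offset.
        edge, offset = divmod(i * spacing, side)
        if edge == 0:
            nodes.append((offset, 0.0))          # bottom edge
        elif edge == 1:
            nodes.append((side, offset))         # right edge
        elif edge == 2:
            nodes.append((side - offset, side))  # top edge
        else:
            nodes.append((0.0, side - offset))   # left edge
    return nodes
```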

The number of hidden nodes of ELM and BPNN is set as 10, K of WKNN is set as 8, and the parameters of SVM are determined through cross validation. Figures 17 and 18 illustrate the comparison results of the four machine learning approaches for the 0.3 m and the 0.6 m RP spaced scenarios, respectively, with the corresponding best and worst localization accuracy among TPs listed in Tables A6-A8 in Appendix A. Tables 4 and 5 show the time consumption of the approaches. We can find that the average localization accuracy of BPNN is the worst, ELM obtains the best performance in most situations, and the performance of SVM is sometimes similar to that of ELM. In terms of time consumption, ELM still shows an outstanding advantage, and its testing time equals 0 s in many situations (see Tables 4 and 5). The testing time of SVM is roughly the same as that of ELM, but its training takes hundreds of times longer.

Discussion
Although the proposed PGFE-ELM is more robust than other machine learning-based DFL approaches, its localization accuracy depends on the quality of the original data. The average localization error in the outdoor experiment is more than 1 m, whereas the best average localization error in the indoor experiment is about 0.6 m. We believe the main reason is that the outdoor data were collected in cold and windy winter weather, which may have a great impact on the data quality, as the weather can seriously affect the device performance and the data transmission characteristics.
According to the above experiments, PGFE-ELM obtains the best performance in both outdoor and indoor environments, especially in terms of learning speed, which is hundreds and even thousands of times faster than that of the other approaches. Thus, compared with the other machine learning approaches, PGFE-ELM is more suitable for DFL. In order to verify the proposed PGFE-ELM further, we compared PGFE-ELM with RTI. The experimental data were obtained from the SPAN Lab of the University of Utah [51]. There are 35 location points in the dataset. We randomly chose 30 location points for training PGFE-ELM and five location points for testing. According to the comparison result, the average localization accuracy of PGFE-ELM and RTI is 0.7183 feet and 0.8244 feet, respectively, which indicates that PGFE-ELM can achieve better performance than RTI.
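The evaluation protocol for this comparison can be sketched as follows. This is only an illustration: the function names and the fixed seed are hypothetical, and the average localization accuracy is assumed to be the mean Euclidean distance between the estimated and the true locations.

```python
import math
import random


def split_points(points, n_train, seed=0):
    """Randomly split location points into training and testing sets."""
    shuffled = list(points)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


def mean_localization_error(estimates, truths):
    """Mean Euclidean distance between estimated and true locations."""
    return sum(math.dist(e, t) for e, t in zip(estimates, truths)) / len(truths)
```

For the dataset described above, `split_points(points, 30)` would return 30 training points and the remaining five testing points, and `mean_localization_error` would then be evaluated on the five test estimates of each approach.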
Compared with PGFE-ELM, RTI is easy to use, as it does not need an offline phase for data collection and training, but its parameters depend heavily on the locations and the number of the nodes and must be reset whenever these change. In contrast, when such changes happen, the trained PGFE-ELM can still be used directly without retraining, so PGFE-ELM is more practical than RTI.

Conclusions
In this paper, the parameterized geometrical representation of the affected link is introduced, and a novel ELM approach is proposed to implement fast and accurate DFL based on the parameterized geometrical feature extraction from the affected links. The experimental results show that the proposed PGFE-ELM can achieve much better localization accuracy and faster learning speed than three existing machine learning approaches (WKNN, SVM and BPNN), and that it can also achieve better localization accuracy than the existing RTI approach. The proposed PGFE-ELM is developed based on the original ELM; further study is required to develop an online ELM approach for DFL to deal with dynamic communication environments. In addition, this paper only considers the localization problem for a single target; future work can address the multi-target localization problem.

Author Contributions:
Jie Zhang proposed the PGFE-ELM for DFL. Wendong Xiao supervised the work. Sen Zhang suggested the basic idea of the geometrical representation for the affected link and revised the paper. Shoudong Huang was involved in the technical discussion of the paper and provided many helpful suggestions.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A
In this appendix, the localization accuracy among TPs of the four machine learning approaches in the 0.3 m and 0.6 m RP spaced scenarios, in both the outdoor and indoor environments, is listed in Tables A1-A8.