Fingerprinting Indoor Positioning Method Based on Kernel Ridge Regression with Feature Reduction

An important goal of indoor positioning systems is to improve positioning accuracy as well as reduce power consumption. In this paper, we propose an indoor positioning method based on the received signal strength (RSS) fingerprint. The proposed method used a certain criterion to select fixed access points (FPs) in an offline phase instead of an online phase for location estimation. Principal component analysis (PCA) was applied to reduce the features of the RSS measurements but retain the most information possible for establishing the positioning model. Then, a kernel-based ridge regression method was used to obtain the nonlinear relationship between the principal components of the RSS measures and the position of the target. We thoroughly investigated the performance of the proposed method in realistic wireless local area network (WLAN) and wireless sensor network (WSN) indoor environments and made comparisons with recently developed methods. The experimental results indicated that the proposed method was less dependent on the density of the reference points and had higher positioning accuracy than the commonly used positioning methods, and it adapts to different application environments.


Introduction
Recent advances in information science have made it practical and accessible to provide indoor positioning services (IPS), for applications such as indoor personal navigation, healthcare, and environmental monitoring [1][2][3]. The key task for such systems is to determine the position of the user or portable device. Accurate positioning in complicated indoor environments is quite challenging and has received wide attention in the past decade [4][5][6].
A variety of wireless technologies have been used for indoor positioning. Some of them achieve high accuracy in the order of tens of centimeters, which usually requires additional specialized hardware [7][8][9][10]. Techniques based on radio frequency (RF) signals are considered as the most common cost-effective solutions for indoor positioning. Among them, the fingerprinting technique, which makes use of only received signal strength (RSS) values, has gained significant interest [11][12][13][14]. Generally, a predeployment site survey process or offline phase is required during which a radio map is constructed by collecting RSS samples (fingerprint) from location fixed access points (FP), which are access points in a WLAN or anchor nodes in a WSN at different reference points (RP) over the whole positioning area. Then, the target is located during the online phase by matching the online RSS values with the prestored fingerprints. In general, such methods require neither the location of the FPs nor a wireless channel propagation model. The fingerprinting method was first proposed in [15], where the Euclidean distance between online and offline RSS values was used to select the matched RPs and then weights the positions of the RPs to obtain the estimated target position; this technique is also known as weighted K-nearest neighbor (WKNN) algorithm. The concept of the fingerprinting method is based on the fact that the RSS value at any given point is determined primarily by the surrounding environment and the location of the FPs and is, thus, unique. However, some shortcomings of RSS measurement, such as its vulnerability to environmental changes and random fluctuations, prevent the positioning accuracy from reaching a satisfactory level [16].
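The WKNN baseline described above can be sketched as follows. This is a minimal illustration, not the implementation from [15]: the function name `wknn_locate`, the inverse-distance weighting, and the small constant `eps` are assumptions made here, since the text only states that the positions of the matched RPs are weighted.

```python
import numpy as np

def wknn_locate(radio_map, rp_coords, online_rss, k=3, eps=1e-6):
    """Estimate a position as a weighted mean of the k nearest RPs.

    radio_map : (N, K) offline fingerprints, one row per RP
    rp_coords : (N, 2) coordinates of the RPs
    online_rss: (K,) RSS vector measured online
    """
    radio_map = np.asarray(radio_map, dtype=float)
    rp_coords = np.asarray(rp_coords, dtype=float)
    online_rss = np.asarray(online_rss, dtype=float)

    # Euclidean distance in signal space between the online RSS and each fingerprint.
    dists = np.linalg.norm(radio_map - online_rss, axis=1)

    # Indices of the k best-matching RPs.
    nearest = np.argsort(dists)[:k]

    # Assumed weighting: inverse distance, normalized to sum to one.
    weights = 1.0 / (dists[nearest] + eps)
    weights /= weights.sum()

    return weights @ rp_coords[nearest]
```

With `k=1` the estimate collapses to the coordinates of the single best-matching RP, which is the plain nearest-neighbor variant.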
Various pattern recognition algorithms have been applied for better positioning performance [17,18]. One aim of these algorithms is to improve the positioning accuracy even if there are anomalous or missing RSS measurements. Classification algorithms such as support vector machine (SVM) [1,19,20], decision tree [21], or even neural networks classifier [22] have been used for indoor localization. In general, the number of classifier models increases with the number of RPs when indoor localization is considered as a multiclass classification problem, which incurs high time and memory consumption. Some models use regression algorithms with features extracted from the RSS values to build a robust and adaptive relationship between the positions and the RSS measures. The sparse recovery algorithm, the least absolute shrinkage and selection operator (LASSO) algorithm, kernel ridge regression (KRR), and Elastic-Net algorithm have been used to model linear or nonlinear relations for better performance [23][24][25][26].
Feature extraction is a commonly used method for data reduction and noise elimination [27,28]. The method we propose applies feature reduction before a relationship is established between the locations and the RSS values. It uses principal component analysis (PCA) to find a new representation of the data in the least-squares sense [29,30]. Thus, a new set of variables called principal components (PCs) is obtained in place of the RSS values. The advantages of using PCA to reduce features are as follows. First, PCs are compact descriptions of the measured RSS values: choosing a proper number of PCs eliminates redundancy but retains maximal information, which is more conducive to establishing the positioning model. Second, PCA reduces the computational complexity. In general, the dimension of the RSS features, that is, the number of FPs, is an important factor in determining computational complexity during the entire positioning process. Instead of choosing a subset of FPs as reported in previous works [23,24,31], PCA selects a subset of PCs that retains information from all the FPs at low dimensionality. Then, a kernel-based model is used to find a proper linear combination of kernels on the selected PCs to represent the position. In the experiments, the localization system is evaluated thoroughly using RSS data collected from two actual indoor environments: a corridor with WLAN signals and a hall with WSN signals.
The rest of this paper is organized as follows: the details of the proposed positioning algorithm are described in Section 2. The experimental setup and a discussion of the results are introduced in Section 3. Finally, the conclusion is presented in Section 4.

Proposed Algorithm Based on PCA and KRR
This section describes the proposed fingerprinting method shown in Figure 1, which consists of two phases: the offline phase and the online phase. In the offline phase, a site survey is carried out to collect the RPs' information and the corresponding RSS values; a spatial filter based on FP coverage and an FP selection step based on a certain criterion are then used to select the proper set of FPs to build a radio map; and the model is then trained with the proposed PCA and KRR algorithm. In the online phase, localization is performed using the PCA reduction and KRR to obtain the target position.

Radio Map Construction with FP Selection.
Consider an indoor environment where RF signals from FPs can be received throughout the area. A number of RPs, denoted as $[P_1, P_2, \cdots, P_N]$, are set, and their coordinates $p(P_i) = [x_i, y_i]$ are recorded. The RF signals are scanned at each RP, and RSS values collected from the same FP are averaged and then stored as a fingerprint in the radio map as a vector $F_i = [f_{i1}, \cdots, f_{ij}, \cdots, f_{iK}]$, where $f_{ij}$, $i = 1, \cdots, N$, $j = 1, \cdots, K$, holds the time-averaged RSS value from $FP_j$ at location $P_i$. Let $\chi = \{FP_1, \cdots, FP_K\}$ be the set of FPs. Because not all FPs detected at the site are available at each RP, a certain value, -100 dBm for WSN-based anchors and 0 dBm for WLAN-based APs in our experiments, is experimentally set to imply an FP's unavailability.
It is also worth mentioning that in an indoor environment, especially in a WLAN-based positioning scenario, up to dozens of APs can usually be detected throughout the site. For example, in our experiments, 198 APs in total were detectable on one floor, which is far more than the number of FPs required for positioning. The signal from each FP could provide some information about the location, while too many FPs used for positioning increases the cost of storage and the computation of the algorithm. In addition, FPs having a large variance of RSS values are not suitable for positioning and could contribute to large errors. Thus, a certain criterion should be used to select a particular set of FPs that present the characteristics of the signal distribution effectively.
In our work, we considered three FP selection methods to find a balance between the number of FPs and the positioning accuracy they can achieve. Because the FP selection is applied in the offline phase in our method, the radio map constructed with the selected FPs is used for the subsequent positioning model setup. It is therefore necessary that the FPs detected in the online phase be as consistent as possible with those selected in the offline phase. To this end, a spatial filter based on coverage is first applied to find the FPs with the largest coverage. We define the FP coverage as

$$C_j = \sum_{i=1}^{N} c_{ij},$$

where $c_{ij} = 1$ if the RSS value from $FP_j$ is available at $RP_i$ and 0 otherwise. Each $FP_j$ with $C_j \ge C_0$ is selected as a basis for the subsequent FP selection, where $C_0$ is a threshold that varies with the number of RPs in different scenarios. This value is set to 9 in one of our implementations, which means that at least 9 RPs can receive RSS data from each filtered FP. After the spatial filter, a group of $|H|$ FPs, $H \subset \chi$, is retained. FP selection is then carried out within this group. Three FP selection criteria, denoted as Strongest FPs, Least-variance FPs, and Combined criterion, are evaluated in our work.
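The coverage-based spatial filter can be illustrated with a short sketch. It assumes the radio map is stored as an N x K array in which an unavailable FP is marked by a fixed placeholder value (for example, -100 dBm for the WSN anchors); the function name `coverage_filter` and its arguments are hypothetical.

```python
import numpy as np

def coverage_filter(rss, unavailable, c0):
    """Keep the FPs that are received at no fewer than c0 RPs.

    rss        : (N, K) time-averaged RSS values, one row per RP
    unavailable: placeholder value that marks a missing FP
    c0         : coverage threshold C_0
    """
    rss = np.asarray(rss, dtype=float)

    # c_ij = 1 where FP j is available at RP i, 0 otherwise.
    available = rss != unavailable

    # C_j = number of RPs that receive FP j.
    coverage = available.sum(axis=0)

    # Column indices of the FPs that pass the filter (the set H).
    keep = np.flatnonzero(coverage >= c0)
    return keep, coverage
```

The returned indices can be used to slice the radio map, for example `rss[:, keep]`, before any of the three selection criteria is applied.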
(1) Strongest FPs. This criterion assigns a score to an FP according to the strength of its signal throughout the site, which is defined as:
(2) Least-variance FPs. In this criterion, scores are also assigned to FPs, but with a focus on the variance of the RSS signals throughout the collection time. The score of $FP_j$ is calculated as follows:
where $\Delta_{ij}$ is the unbiased estimated variance of the RSS readings from $FP_j$ at location $P_i$. An FP with a low variance provides stable signals over time, which is conducive to a correct match between the online RSS and the fingerprints. It is important to note that some RPs cannot receive the signal from $FP_j$, as mentioned earlier; the variance at these RPs needs to be set to a large value (e.g., 100 in our implementation). The FPs are sorted in increasing order according to their scores, and then a subset of $|S_2|$ FPs, $S_2 \subseteq H$, with the lowest scores is selected.
(3) Combined criterion selected FPs. The criterion accounts for both the spatial distribution of RSS across all RPs and the RSS variance over the collection time. According to this criterion, each FP is assigned a score calculated as follows:
A higher score means better stability of RSS over time as well as greater discriminability of FPs across the site. A subset of $|S_3|$ FPs, $S_3 \subseteq H$, with the highest $\xi_3$ is selected.
Now assume that $K'$ FPs are retained after selection to build the radio map $\Psi$, which is denoted by:
The total fingerprint $F = [F_1, F_2, \cdots, F_N]^T$ can be considered prior knowledge of the signal characteristics of the surveyed site. In the online phase, the target obtains a new measurement $F_t$ to determine its coordinate $\hat{p}_t$, which is the key task of indoor positioning. The radio map $\Psi$ is therefore crucially important for the proper training of the regression model.
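As an illustration of the Least-variance criterion, the sketch below ranks FPs by the variance of their raw RSS readings. It is a sketch under stated assumptions rather than the paper's scoring formula: the per-FP score is taken here as the mean of the per-RP unbiased variances, and the names `least_variance_select`, `samples`, and `available` are hypothetical. The penalty of 100 for unavailable FPs follows the value quoted in the text.

```python
import numpy as np

def least_variance_select(samples, available, n_select, penalty=100.0):
    """Pick the n_select FPs with the lowest variance-based score.

    samples  : (N, K, T) raw RSS readings, T readings per RP/FP pair
    available: (N, K) boolean mask, True where FP j is received at RP i
    """
    samples = np.asarray(samples, dtype=float)
    available = np.asarray(available, dtype=bool)

    # Unbiased variance of the readings over the collection time.
    variance = np.var(samples, axis=2, ddof=1)

    # RPs that cannot receive an FP get a large penalty variance.
    variance = np.where(available, variance, penalty)

    # Assumed aggregation: average the per-RP variances for each FP.
    scores = variance.mean(axis=0)

    # Lowest scores first.
    selected = np.argsort(scores)[:n_select]
    return selected, scores
```

A different aggregation (for example, a sum over RPs) would give the same ranking whenever every FP is scored over the same number of RPs.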

Feature Reduction with PCA.
PCA is a fast and widely used technique for dimensionality reduction [32]. Instead of directly using all the selected FPs, our approach replaces the RSS values with a subset of PCs, which is obtained by a PCA-based transformation. Consider a transformation between the vectors $F$ and $\psi$ given by $\psi = FW$, where $F \in \mathbb{R}^{1 \times K'}$, $\psi \in \mathbb{R}^{1 \times k}$, and $k < K'$. The vector $F$ is reconstructed from $\psi$ as $\hat{F} = \psi W^T$. PCA seeks to solve the problem

$$\min_{W} \sum_{i=1}^{N} \lVert F_i - \hat{F}_i \rVert^2,$$

where $W$ is the transformation matrix and $N$ is the number of training samples. This optimization problem is well studied, and $W$ can be obtained by the following steps. First, calculate the covariance matrix of all RSS samples, defined by

$$C_\sigma = \frac{1}{N} \sum_{i=1}^{N} (F_i - \bar{F})^T (F_i - \bar{F}),$$

where $\bar{F} = \frac{1}{N} \sum_{i=1}^{N} F_i$ is the mean of the samples. $C_\sigma$ is a positive-semidefinite symmetric matrix, and its eigenvalues are easy to compute and sort in descending order as $\{\lambda_1, \lambda_2, \cdots, \lambda_{K'}\}$. The corresponding normalized eigenvectors $\{w_1, w_2, \cdots, w_{K'}\}$ are geometrically orthonormal, and the components they define are statistically uncorrelated. Then, the $K' \times k$ transformation matrix $W$ has the form

$$W = [w_1, w_2, \cdots, w_k].$$

Here $k$ is the number of PCs selected for the best transformation from RSS values to PCs, with reduced dimension and maximal extraction of signal features. The proper value of $k$ is determined by cross-validation with the sampled fingerprints in the radio map during the offline phase. When the matrix $W$ is available, the new PC-based fingerprint $\Phi$ transformed from the RSS-based $F$ is given by

$$\Phi = FW. \tag{10}$$

A regression model can then be built between $\Phi$ and the locations.
During the positioning stage, the online PCs can be extracted directly from the online RSS measurements using the trained matrix $W$:

$$\psi_t = F_t W.$$

With the extracted $\psi_t$, the regression model can be used to determine the position of the target.
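The offline and online PCA steps can be sketched in NumPy as follows. The function names `fit_pca` and `project` are hypothetical, the covariance uses a 1/N normalization, and the sample mean is subtracted before projection; both of the latter are common conventions assumed here rather than details confirmed by the text.

```python
import numpy as np

def fit_pca(F, k):
    """Offline phase: learn the K' x k transformation matrix W.

    F: (N, K') radio map, one fingerprint per row
    k: number of principal components to keep
    """
    F = np.asarray(F, dtype=float)
    mean = F.mean(axis=0)
    centered = F - mean

    # Covariance matrix of the RSS samples (K' x K').
    cov = centered.T @ centered / F.shape[0]

    # eigh is meant for symmetric matrices and returns ascending eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(cov)

    # Sort in descending order and keep the k leading eigenvectors.
    order = np.argsort(eigvals)[::-1]
    W = eigvecs[:, order[:k]]
    return W, mean, eigvals[order]

def project(F, W, mean):
    """Map RSS vectors (offline or online) to their principal components."""
    return (np.atleast_2d(np.asarray(F, dtype=float)) - mean) @ W
```

In use, `W` and `mean` are learned once from the radio map, and the same `project` call is applied to the radio map to obtain the PC-based fingerprints and to each online RSS vector during positioning.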

KRR Approach Based on PCs.
This subsection briefly reviews the theory of the kernel ridge regression adopted in our method. Given a training set $\{(\psi_1, p_1), \cdots, (\psi_N, p_N)\}$, where $N$ is the number of samples, each $\psi_i \in \mathbb{R}^k$ is a row vector of $\Phi$ denoting an input sample with a corresponding output $p_i \in \mathbb{R}$. The ridge regression algorithm entails solving the following optimization problem [33]:

$$\min_{\omega} \sum_{i=1}^{N} (p_i - \psi_i \omega)^2 + \lambda \lVert \omega \rVert^2,$$

where $\omega \in \mathbb{R}^k$ is the weight vector and $\lambda \ge 0$ is a regularization parameter tuned to control the compromise between the training error and the complexity of the solution. The resulting regression model describes a linear relationship between the input vector $\psi$ and the output $p$:

$$\hat{p} = \psi \omega,$$

where $\omega = \Phi^T (\Phi \Phi^T + \lambda I_N)^{-1} P$ and $I_N$ is the $N \times N$ identity matrix, whose size matches that of $\Phi \Phi^T$. In nonlinear cases, a kernel-based method is generally introduced, which maps the samples into a higher-dimensional feature space where the problem becomes linear:

$$\phi : \mathbb{R}^k \rightarrow \mathbb{R}^{\Lambda}, \quad \psi \mapsto \phi(\psi),$$

where the superscript $\Lambda$ stands for the higher-dimensional space. The mapping $\phi$, which is chosen to convert the nonlinear relation between the output and the independent input variables into a linear one, need not be known explicitly. The regression can then be constructed in the feature space, and the solution to the regression problem depends only on dot products in the feature space. A kernel function satisfying Mercer's condition is introduced in the form of a dot product, $\kappa(\psi_i, \psi_j) = \phi(\psi_i) \cdot \phi(\psi_j)$. Now, the weight vector can be rewritten as

$$\omega = \phi(\Phi)^T (K + \lambda I_N)^{-1} P = \phi(\Phi)^T \alpha,$$

where $\phi(\Phi)$ is the matrix whose rows are $\phi(\psi_i)$, $K$ is the $N \times N$ kernel matrix with elements $\kappa(\psi_i, \psi_j)$, $P$ is the set of coordinates of the RPs, and $\alpha = (K + \lambda I_N)^{-1} P$ can be determined from the PC-based fingerprint $\Phi$.
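In the online phase, the target's position can be directly estimated as

$$\hat{p}_t = \sum_{i=1}^{N} \alpha_i \kappa(\psi_i, \psi_t).$$

In our method, a Gaussian kernel is used:

$$\kappa(\psi_i, \psi_j) = \exp\left(-\frac{\lVert \psi_i - \psi_j \rVert^2}{2\sigma^2}\right),$$

where $\sigma$ is the bandwidth of the Gaussian kernel. Because our output consists of two coordinates, $\hat{x}$ and $\hat{y}$, two parameters, $\sigma_x$ and $\sigma_y$, should be set and determined from the training RSS data.
Wireless Communications and Mobile Computing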
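A minimal sketch of the KRR training and prediction steps is given below. It assumes the Gaussian kernel is scaled as exp(-d^2 / (2 sigma^2)), which is one common convention, and fits one model per coordinate; the function names `gaussian_kernel`, `krr_fit`, and `krr_predict` are hypothetical.

```python
import numpy as np

def gaussian_kernel(A, B, sigma):
    """Kernel matrix between the rows of A (M, k) and the rows of B (N, k)."""
    A = np.atleast_2d(np.asarray(A, dtype=float))
    B = np.atleast_2d(np.asarray(B, dtype=float))
    # Squared Euclidean distances via the expansion |a|^2 + |b|^2 - 2 a.b
    sq = (A**2).sum(axis=1)[:, None] + (B**2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    # Clip tiny negative values caused by rounding.
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def krr_fit(Phi, p, lam, sigma):
    """Solve (K + lam * I_N) alpha = p for the dual coefficients alpha."""
    K = gaussian_kernel(Phi, Phi, sigma)
    return np.linalg.solve(K + lam * np.eye(K.shape[0]), np.asarray(p, dtype=float))

def krr_predict(Phi_train, alpha, psi_t, sigma):
    """Estimate one coordinate: sum_i alpha_i * kappa(psi_i, psi_t)."""
    return gaussian_kernel(psi_t, Phi_train, sigma) @ alpha
```

For two-dimensional positioning, the same routines would be called twice, once with the x coordinates and `sigma_x` and once with the y coordinates and `sigma_y`.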

Experimental Results and Analysis
In this section, we introduce the experimental setups and evaluate the performance of the proposed algorithm by comparing it with other algorithms. The performance of the proposed method is measured by the average positioning error (AE) over all test points and the empirical cumulative distribution function (CDF) of the errors. The former averages the Euclidean distance between the estimated and the actual locations of the test points, while the latter shows how the errors are distributed, including the maximum and minimum errors.
Figure 3 shows the number of received FPs at each RP of the two sites in the offline phase. It can be seen from Figure 2 ... signals on both maps. In these two typical indoor environments, especially that of Map 1, the dense distribution of rooms and walls greatly attenuated or obstructed the radio signals. Although a large number of FPs could be detected in the whole area, the number of FPs available at each RP was greatly reduced.
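The two evaluation metrics can be computed as in the following sketch, which assumes that the estimated and actual test-point locations are given as arrays of 2-D coordinates; the function names are hypothetical.

```python
import numpy as np

def positioning_errors(estimated, actual):
    """Euclidean distance between each estimated and actual location."""
    estimated = np.asarray(estimated, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return np.linalg.norm(estimated - actual, axis=1)

def empirical_cdf(errors):
    """Return sorted errors x and the fraction of test points with error <= x."""
    x = np.sort(np.asarray(errors, dtype=float))
    y = np.arange(1, x.size + 1) / x.size
    return x, y
```

The AE is then `positioning_errors(estimated, actual).mean()`, and the first and last entries of the sorted errors returned by `empirical_cdf` are the minimum and maximum errors.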

Experimental Results with Different FP Selections.
Localization performance is always related to the number of FPs ($K'$) used for positioning. In our method, FPs with certain coverage areas were first selected using a spatial filter. The threshold $C_0$ varied with the layout of the map, as well as with the distribution of the RPs and FPs. Figure 4 shows the CDF of the positioning error when different thresholds were applied to filter the FPs. On Map 1, although there are up to 198 FPs, only 52 FPs can be received by more than 15 RPs out of a total of 92. The number $K'$ increases to 89 when the threshold drops to 9. With the FP selection criterion, 80 FPs were used for positioning, and almost no performance difference was observed, as shown in Figure 4(a). Figure 4(b) shows the experimental results for Map 2. When $C_0 \le 50$, all 18 FPs were received by all RPs. This may be attributed to the regular distribution of WSN anchor nodes and an indoor environment without many barriers. Considering this, no spatial filter was used in the following experiments for Map 2. Figure 5 depicts the average location errors under the three different FP selection criteria, namely, the Strongest FPs, Least-variance FPs, and Combined criterion, when a proper subset of PCs is used for KRR-based localization.
It can be seen that no matter which FP selection criterion is adopted, the positioning method with more selected FPs obtains higher positioning accuracy at both sites. As shown in Figure 5(a), when 80 FPs are adopted, that is, 80 FPs are selected from the spatial filtered 98 FPs according to the FP selection criterion, the AE under the three criteria is all ... This is a completely different approach from those that apply FP selection in the online phase, as proposed in [23,24,31]. A larger value of $k$ seems to retain more information after the PCA transformation, but it may also retain redundant information and noise, which can be seen from the experimental results. Figure 6 compares the average positioning errors for different numbers of FPs versus the number of PCs at both sites. As illustrated in Figure 6, no matter how many FPs are used in the experiments, the positioning accuracy varies with the number of PCs, and there is always an optimal value $k_p$
for the highest positioning accuracy, but this value is certainly not the maximal one. For example, in the case of the 80 FPs adopted in Map 1, the optimal $k_p$ is 23. Further investigation found that this value of $k_p$ ensured that the selected PCs retained more than 85% of the total information when eigenvalues were used to quantify the information contribution of each PC, which provides a relatively simple way to determine this optimal value. It is also worth mentioning that no matter how many PCs are extracted, high positioning accuracy presupposes sufficient FP information. From Figure 6(a), the best performance in the case of 40 FPs is 2.77 m, while with 80 FPs, the worst performance is 2.56 m. The same is true for Map 2, where the corresponding values for the best and worst performance are 2.28 m and 1.88 m. It can also be inferred from Figure 5 that the number of FPs has a greater impact on performance than the number of PCs, which means that enough FPs should be secured before the number of PCs is chosen.
Figure 7 shows the comparison of positioning error among Euclidean-WKNN, LASSO, KRR, and the proposed PCA-KRR method with respect to the number of RPs. For Map 1, a total of 80 FPs selected with the Combined criterion were used with the optimal value of $k_p$. The number of FPs is 18 for Map 2. From the results displayed in Figure 7, we can see that the positioning accuracy gradually drops for both maps as the number of RPs used for positioning decreases, while the proposed PCA-KRR method achieves significantly better accuracy for both maps than the other methods regardless of the number of RPs. For instance, when the number of RPs used was 92 and a quarter of RPs (23) ...

Tuning the Parameters.
There are several ways to tune the regularization parameter $\lambda$ and the bandwidth $\sigma$ of the Gaussian kernel involved in the proposed method. One effective approach for setting these parameters is the well-known cross-validation (CV).
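The cross-validation route can be sketched as a small grid search over candidate values of the regularization parameter and the kernel bandwidth. This is an illustrative sketch only: the fold count, the random shuffling, the RMSE score, and the names `cv_error` and `grid_search` are assumptions, and the Gaussian kernel is again assumed to be scaled as exp(-d^2 / (2 sigma^2)).

```python
import numpy as np

def gaussian_kernel(A, B, sigma):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq = (A**2).sum(axis=1)[:, None] + (B**2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def cv_error(Phi, p, lam, sigma, n_folds=5, seed=0):
    """Root-mean-square prediction error of KRR under n_folds-fold CV."""
    Phi = np.asarray(Phi, dtype=float)
    p = np.asarray(p, dtype=float)
    idx = np.random.default_rng(seed).permutation(len(p))
    sq_err = 0.0
    for held_out in np.array_split(idx, n_folds):
        train = np.setdiff1d(idx, held_out)
        # Fit on the training folds.
        K = gaussian_kernel(Phi[train], Phi[train], sigma)
        alpha = np.linalg.solve(K + lam * np.eye(len(train)), p[train])
        # Predict the held-out fold.
        pred = gaussian_kernel(Phi[held_out], Phi[train], sigma) @ alpha
        sq_err += np.sum((pred - p[held_out]) ** 2)
    return np.sqrt(sq_err / len(p))

def grid_search(Phi, p, lams, sigmas, n_folds=5):
    """Return the (lam, sigma) pair with the lowest CV error and all scores."""
    scores = {(lam, s): cv_error(Phi, p, lam, s, n_folds)
              for lam in lams for s in sigmas}
    best = min(scores, key=scores.get)
    return best, scores
```

Running the search once per coordinate would yield separate bandwidths for the x and y models.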
Another approach is to use the training data to find the values that make the positioning algorithm most accurate, which is the approach applied in this paper. We use some of the collected RSS data at the test points for parameter training. The experimental results show that the best value of $\lambda$ changes only with the positioning site and does not change with the number of RPs and PCs used in the method. This value is 0.001 and 0.0001 in Map 1 and Map 2, respectively. As mentioned earlier, there are also two kernel widths, $\sigma_x$ and $\sigma_y$, for the two coordinates. Table 1 shows the positioning error for the different $\sigma$ values used for positioning in the two maps. When $\sigma_x$ and $\sigma_y$ take the same value, there is no significant impact on the positioning accuracy, which can be observed from the results for both maps. Whether the two parameters choose the same value

Conclusion
In this paper, we have proposed a new RSS fingerprinting-based positioning method for indoor localization. We showed that PCA-based feature reduction can efficiently extract the RSS features related to positions, and that the nonlinear ridge regression can build a proper model between the RSS features and the positions. The proposed positioning method, PCA-KRR, provides high positioning accuracy in various indoor environments. Extensive experimental results have demonstrated that the performance of the proposed method is sensitive to the number of APs or anchor nodes, while the sparsity of the RPs did not reduce the accuracy of

Data Availability
All data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.