A new Wi-Fi dynamic selection of nearest neighbor localization algorithm based on RSS characteristic value extraction by hybrid filtering

Fingerprinting localization based on Wi-Fi received signal strength (RSS) is the most widely used indoor localization method. It typically includes offline training and online matching phases. The selection of the RSS characteristic value is a key step. The weighted K nearest neighbor (WKNN) algorithm is the most commonly used position-determination algorithm. The mean value of the RSS data collected over a time interval is usually taken as its characteristic value. However, the RSS measurements contain Gaussian and non-Gaussian noise, which cannot be filtered out effectively by the mean value method. The traditional WKNN algorithm adopts a fixed K . However, reference points far away from the test point (TP) may be selected as the nearest neighbors to participate in the position calculation, which may result in accuracy degradation. This paper proposes the weighted dynamic K nearest neighbor algorithm (WDKNN-HF), which utilizes a hybrid of particle filtering and Kalman filtering to extract the RSS characteristic value. In the online matching phase, a dynamic K matching algorithm based on Euclidean distances is developed to determine the coordinates of TPs. Two experiments are conducted in two different indoor scenes. Experimental results demonstrate that the proposed algorithm can obtain better positioning accuracy than existing algorithms, such as KNN, WKNN, enhanced-WKNN (EWKNN) and self-adaptive weighted K nearest neighbor (SAWKNN).


Introduction
Indoor positioning systems (IPSs) have become a research hotspot in recent years and many IPS technologies have been proposed. These technologies are divided into infrastructure-based and infrastructure-free technologies [1]. Infrastructure-based technologies mainly include UWB (ultra-wideband) [2], Bluetooth [3] and RFID (radio frequency identification) [4] techniques, whereas infrastructure-free technologies typically include Wi-Fi [5,6], magnetic field [7] and PDR (pedestrian dead reckoning) [8,9] techniques. Infrastructure-based technologies require pre-installed facilities and specialized hardware, whereas infrastructure-free technologies require no additional hardware or facilities. Wi-Fi positioning technology based on received signal strength (RSS) [10] is the most commonly used indoor positioning technology because of the wide deployment of Wi-Fi access points (APs) and the popularity of mobile devices such as mobile phones.
Wi-Fi positioning methods mainly include Wi-Fi round trip time [11], channel state information [12] and Wi-Fi RSS. Wi-Fi RSS-based positioning methods can be divided into two categories: the triangulation method and the fingerprinting method. The triangulation positioning method converts RSS observations into distances between APs and the test point (TP) using a signal propagation model; therefore, it needs to know the locations of at least three APs. The fingerprinting method based on Wi-Fi RSS is the most frequently used method [13] because it does not need the locations of APs or a signal propagation model. The fingerprinting method typically includes two phases: offline training and online matching [14]. In the offline phase, the task is to collect RSS values of all reference points (RPs) in the location area and then extract RSS characteristic values to establish the fingerprint database; the database contains RSS characteristic values of all RPs and their corresponding physical information, such as serial number, coordinates, etc. In the online phase, the RSS characteristic value of the TP is measured and then the location of the TP can be calculated via a probability algorithm and deterministic algorithm. The probability algorithm requires knowledge of the signal propagation model and repeated observations of each physical point to establish an empirical distribution. Traditional deterministic algorithms include the NN (nearest neighbor), KNN (K nearest neighbor) [15,16] and WKNN (weighted K nearest neighbor) [17] algorithms. WKNN is the most frequently used algorithm, for which K is the number of nearest RPs selected to calculate the coordinates of the TP. The selection of the K value in the positioning phase can affect the positioning accuracy.
The selection of the RSS characteristic value in fingerprint positioning is a necessary procedure for both the offline phase and online phase. Different RSS characteristic values may result in diverse positioning results. In general, traditional RSS characteristic values include the mean value, median value, mode value, maximum value, and so on. The mean value is the most frequently used one. As indoor signal propagation often has a multipath effect [18] and NLOS (non-line-of-sight) propagation [19][20][21], there are both Gaussian and non-Gaussian interference signals in the indoor RSS signal. The mean value algorithm does not deal with interference signals effectively. The Kalman filtering (KF) algorithm [22,23] has better performance in processing a signal containing Gaussian noise, whereas the particle filtering (PF) algorithm [24,25] can filter out non-Gaussian noise. A single filtering algorithm cannot completely eliminate the interference signals. In this paper, a hybrid filtering algorithm combining PF and KF is proposed to extract the RSS characteristic values for both RPs and TPs.
In the traditional WKNN algorithm, a fixed number of RPs that have the minimum Manhattan distances or Euclidean distances from the TP are usually selected to calculate the coordinates of the TP. However, not all TPs can obtain the best positioning accuracy with the same K value; for example, in [26], the best positioning accuracy is obtained when K is 4, but in [27], K is 3. The major cause for this phenomenon is that there is a certain defect in using the fixed K value: at one TP, the RPs with larger actual distances from the TP may be included for position calculation, whereas at another TP, the RPs with smaller actual distances from the TP are excluded. Thus, the selection of RPs in indoor areas should be carefully made for both good accuracy and simple implementation. Different numbers of RPs can be selected for position calculation at different TPs, meaning that the value of K is dynamically adjusted. In this paper, a new dynamic K selection algorithm WDKNN (weighted dynamic-K-nearest neighbor) based on Euclidean distance is proposed in the online matching phase. Different from the traditional KNN or WKNN algorithms, the proposed WDKNN algorithm can dynamically adjust the number of RPs selected for position calculation and it does not require any prior knowledge in the offline phase or additional hardware from the positioning system.
The rest of this article is structured as follows. The related works are introduced in section 2. Section 3 provides the details of our proposed WDKNN-HF algorithm, and experiments and the result analyses are conducted in section 4. Section 5 presents the conclusion of the entire work and suggestions for future research.

Related works
Indoor environments are complex and refraction, reflection, occlusion and other factors may cause multipath and NLOS propagation in the RSS signal propagation process. In [28], different RSS characteristic values, including the mean value, median value, mode value and maximum value, are selected, and their positioning performances are compared and analyzed. Kaemarungsi and Krishnamurthy [29] analyze the influence of environment and equipment on the RSS characteristic value and point out that the RSS signal characteristic value distributions vary in different environments. The authors suggest that in addition to the mean value, the standard deviation (STD) and the distribution of RSS should be included. Sun and Zhu [30] put forward a hybrid filter algorithm based on the mean filter, median filter and Gaussian filter and RSS data are optimized by using this algorithm. Zhang et al [31] first analyze the influence of the environment on RSS and then the performances of three experimental data processing methods are compared; the Gaussian filter algorithm is considered to be the best. In [32], to optimize RSS observations, the author uses a filter algorithm mixed by a mean filter, a Gaussian filter, a KF and linear interpolation to process RSS observations. Song et al [33] develop an RSS-based indoor positioning algorithm using a PF; the RSS measured at the TP is used as a direct observation in the correction stage of the filtering algorithm and the PF is applied to the nonlinear tracking mode. Zhuang et al [34] propose a two-filter integration algorithm for motion sensor-based positioning and RSS-based fingerprinting. A KF is first adopted to obtain a smoothed constrained Wi-Fi fingerprinting and then an extended KF is used to integrate smoothed constrained Wi-Fi fingerprinting and sensor-based positioning. Huang et al [4] utilize a KF to process RSS observations. 
They compare the results of RSS raw data, RSS data through traditional statistics (the mean RSS value) and RSS data through the KF, finding that the KF algorithm can significantly remove the drift effect from RSS raw data, but its result is similar to the mean value.
Traditional online matching algorithms include the NN algorithm, KNN algorithm and WKNN algorithm. First, the Manhattan distances or Euclidean distances $D_{iT}$ from the TP to all RPs are calculated via:

$$D_{iT} = \left( \sum_{j=1}^{m} \left| RSS_T^{j} - RSS_i^{j} \right|^{r} \right)^{1/r}, \quad i = 1, 2, \ldots, n \tag{1}$$

In (1), $RSS_T^{j}$ represents the online RSS observations of the jth AP at the TP, $RSS_i^{j}$ represents the offline RSS observations of the jth AP at the ith RP and it is assumed that there are m APs and n RPs; when r is one, $D_{iT}$ represents the Manhattan distance, and when r is two, $D_{iT}$ represents the Euclidean distance. Then, all Manhattan distances or Euclidean distances are sorted in ascending order.
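As a minimal sketch of the distance described for (1), the following Python snippet treats a fingerprint as a plain list of per-AP RSS values in dBm. The function name `rss_distance` and the sample values are illustrative assumptions and are not taken from the paper.

```python
def rss_distance(tp_rss, rp_rss, r=2):
    """Distance of order r between two RSS vectors (one value per AP).

    r=1 gives the Manhattan distance, r=2 the Euclidean distance.
    """
    if len(tp_rss) != len(rp_rss):
        raise ValueError("both fingerprints must cover the same APs")
    return sum(abs(t - p) ** r for t, p in zip(tp_rss, rp_rss)) ** (1.0 / r)


# Hypothetical fingerprints over m = 3 APs (dBm).
tp = [-60.0, -72.0, -80.0]
rp = [-63.0, -68.0, -80.0]

manhattan = rss_distance(tp, rp, r=1)  # 3 + 4 + 0
euclidean = rss_distance(tp, rp, r=2)  # sqrt(9 + 16 + 0)
```

Sorting the RPs by this value in ascending order gives the ranking used by the NN, KNN and WKNN algorithms.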
In the NN algorithm, the RP that has the smallest Manhattan distance or Euclidean distance from the TP is selected and its coordinates are taken as the TP's estimated coordinates. However, in complex and dynamic indoor environments, the positioning result of the NN algorithm is highly contingent; that is, an RP far away from the TP may have the smallest Manhattan distance or Euclidean distance rather than the real nearest RP, which may produce a significant position error. The KNN algorithm selects the K-nearest RPs and takes the average of their coordinates as the estimated coordinates of the TP, reducing the contingency of taking a single nearest point. The WKNN algorithm selects the K-nearest RPs to calculate the TP's coordinates, as well. Considering that the Manhattan distances or Euclidean distances between the TP and these selected RPs are different, WKNN gives different weights to RPs according to their different contributions to the positioning result. The WKNN algorithm can obtain a better positioning accuracy than those of the KNN and NN algorithms. In the WKNN algorithm, the weights of the selected RPs are calculated as inversely proportional to the Manhattan distance or Euclidean distance; that is, an RP with a small RSS distance is assigned a larger weight, while an RP with a large RSS distance is assigned a smaller weight. The weight of the RPs is calculated by:

$$w_i = \frac{1 / D_{iT}}{\sum_{k=1}^{K} 1 / D_{kT}}, \quad i = 1, 2, \ldots, K \tag{2}$$

The estimated coordinates of the TP are the weighted average of the coordinates $(x_i, y_i)$ of these selected RPs and are calculated by:

$$(x, y) = \sum_{i=1}^{K} w_i \, (x_i, y_i) \tag{3}$$

In both the KNN algorithm and the WKNN algorithm, the value of K affects the positioning accuracy. Different K values may lead to different positioning results. In [26], the best positioning accuracy is obtained when K is 4; in [27], the best positioning accuracy is obtained when K is 3; however, in [35], K is 13. This phenomenon is mainly caused by the difference of environments and the distributions of RPs. Some researchers have studied dynamic K-value selection methods.
Shin et al [27] propose the EWKNN (enhanced weighted K-nearest neighbor) algorithm. The EWKNN utilizes two comparisons to select RPs for position calculation; compared with the traditional WKNN algorithm, the mean position error (ME) is reduced from 2.87 m to 2.11 m. Hu [36] proposes the SAWKNN (self-adaptive weighted K-nearest neighbor) algorithm, which can achieve better positioning precision than the traditional WKNN. The experimental results show that the SAWKNN outperforms the traditional WKNN by 21% in positioning accuracy. The EWKNN and SAWKNN algorithms both adopt Manhattan distance. Positioning performances of the KNN, Euclidean-WKNN, Manhattan-WKNN and EWKNN algorithms are compared and analyzed in [37].
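The fixed-K WKNN estimate described in this section can be sketched as follows. This is a minimal illustration under stated assumptions: Euclidean RSS distance, inverse-distance weights normalized to sum to one, and a small `eps` term that guards against division by zero. The function name, the `eps` guard and the toy fingerprint database are hypothetical.

```python
import math


def wknn(tp_rss, fingerprints, k=4, eps=1e-9):
    """Fixed-K WKNN position estimate.

    fingerprints: list of (rss_vector, (x, y)) pairs, one per RP.
    eps avoids division by zero when a distance is exactly 0 (assumption).
    """
    # Rank all RPs by Euclidean RSS distance and keep the K nearest.
    ranked = sorted((math.dist(tp_rss, rss), xy) for rss, xy in fingerprints)[:k]
    # Inverse-distance weights; dividing by their sum normalizes them.
    inv = [1.0 / (d + eps) for d, _ in ranked]
    total = sum(inv)
    x = sum(w * xy[0] for w, (_, xy) in zip(inv, ranked)) / total
    y = sum(w * xy[1] for w, (_, xy) in zip(inv, ranked)) / total
    return x, y


# Toy database: three RPs on a line, two APs (values are made up).
rps = [
    ([-50.0, -70.0], (0.0, 0.0)),
    ([-60.0, -60.0], (2.0, 0.0)),
    ([-70.0, -50.0], (4.0, 0.0)),
]
tp = [-55.0, -65.0]  # equally far (in RSS space) from the first two RPs

nearest = wknn(tp, rps, k=1)
midpoint = wknn(tp, rps, k=2)
```

With K = 2 the two nearest RPs receive equal weights, so the estimate falls midway between them; with K = 3 the far RP would pull the estimate away, which is the effect the dynamic K methods try to avoid.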

The weighted dynamic-K nearest neighbor algorithm based on hybrid filtering RSS characteristic value extraction
This section presents the WDKNN-HF algorithm (weighted dynamic-K nearest neighbor algorithm that uses hybrid filtering to extract the RSS characteristic value).
The architecture of the proposed WDKNN-HF algorithm is illustrated in figure 1. The WDKNN-HF algorithm includes an offline training phase and an online matching phase. In the offline phase, RSS observations are collected at all RPs by a mobile device. To extract RSS characteristic values, the hybrid filtering algorithm is adopted to process RSS observations. Then, all of the RPs' RSS characteristic values and their corresponding spatial information are stored to establish the fingerprint database. In the online phase, the RSS characteristic values of the TP are measured and instead of using a fixed K value, a dynamic K-value selection algorithm based on Euclidean distance is adopted to calculate the TP's coordinates.

Hybrid filtering algorithm based on particle filtering and Kalman filtering
The framework of the WDKNN-HF algorithm indicates that the RSS characteristic value extraction is a key step in both offline and online phases. In an indoor environment, the RSS observations contain Gaussian and non-Gaussian noise. To deal with both of these noise signals, a hybrid filtering algorithm that combines PF and KF is adopted for the extraction of the RSS characteristic value.

Hybrid filtering algorithm.
Step 1: The PF is applied to RSS observation values. PF is a process of approximating the probability density function by finding a group of random samples propagating in the state space, and replacing the integral operation with the sample mean value to obtain the minimum variance distribution of the state [24,25].
The state space model of the nonlinear dynamic system is expressed as follows:

$$x_i = f(x_{i-1}, v_{i-1}), \qquad z_i = h(x_i, u_i) \tag{4}$$

where $x_i$ is the state of the system at time $i$, $z_i$ is the observation vector at time $i$, $f$ is the state transfer function of the system, $h$ is the system measurement function and $v_i$ and $u_i$ represent the process noise and the observation noise, respectively.

$$RSS = \{ rss_1, rss_2, \ldots, rss_N \} \tag{5}$$

Equation (5) represents a list of RSS observation values from an AP at the sampling point.
First, M particles $\{p_k, w_k\}$, normally distributed, are generated; the RSS observation $rss_i$ is taken as the expected value and $V$ is the variance of the particles $\{p_k\}$:

$$p_k \sim N(rss_i, V), \quad k = 1, 2, \ldots, M \tag{6}$$

Then, the differences between the $(i + 1)$th RSS observation and each particle are calculated, and equation (7) is used to determine the weight $w_k$ of each particle. The weight is then normalized as:

$$\bar{w}_k = \frac{w_k}{\sum_{l=1}^{M} w_l} \tag{8}$$

Particle degradation is a common defect in PF; to reduce it, the particles are resampled. However, resampling may reduce the particles' diversity. To deal with this problem, a criterion $N_{eff}$, the number of valid particles, is used to determine whether to perform resampling:

$$N_{eff} = \frac{1}{\sum_{k=1}^{M} \bar{w}_k^{2}} \tag{9}$$
If $N_{eff}$ is less than a certain threshold $N_{threshold}$, the system resampling [38] is adopted to select the particles, which are denoted as $p_k$, and the particles with small weight are replaced by particles with large weight. $N_{threshold}$ can be given as an empirical value. In this article, $N_{threshold}$ is set to $\frac{2}{3}M$. Finally, the weighted average of these particles is calculated as the estimated value of the ith RSS after PF, which is denoted as $\overrightarrow{rss}_i$:

$$\overrightarrow{rss}_i = \sum_{k=1}^{M} \bar{w}_k \, p_k \tag{10}$$

where $\bar{w}_k$ denotes the normalized weight of particle $p_k$. All RSS observations of (5) are processed by PF, and the processed observations are denoted as $\overrightarrow{RSS}$:

$$\overrightarrow{RSS} = \{ \overrightarrow{rss}_1, \overrightarrow{rss}_2, \ldots, \overrightarrow{rss}_N \} \tag{11}$$

Step 2: KF is applied to the RSS values $\overrightarrow{RSS}$ processed by the PF. The KF [22,23] takes the minimum mean square error as the optimal estimation criterion, adopts the state space model of signal and noise, updates the estimation of state variables with the estimated value of the previous moment and the observed value of the current moment, and calculates the estimated value of the current moment.
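The system process and system measurement equations of the KF are described as follows:

$$x_i = A x_{i-1} + w_{i-1}, \qquad z_i = H x_i + v_i \tag{12}$$

The discrete process system is one-dimensional without control input, so $A = 1$ and $H = 1$. $w_i$ and $v_i$ represent the process noise and the measurement noise, respectively; they follow the normal distributions:

$$w_i \sim N(0, Q), \qquad v_i \sim N(0, R) \tag{13}$$

In equation (13), $Q$ is the variance of process noise and $R$ is the variance of measurement noise. All $\overrightarrow{RSS}$ components are taken as system measurements and the variance $R$ is calculated:

$$R = \frac{1}{N} \sum_{i=1}^{N} \left( \overrightarrow{rss}_i - \overline{\overrightarrow{rss}} \right)^{2} \tag{14}$$

In equation (14), $\overline{\overrightarrow{rss}}$ is the average value of $\overrightarrow{RSS}$. $Q$ can be given as an empirical value; in this process, we set $Q = 2R$.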
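$\overrightarrow{rss}_1$ is taken as the initial value of the system state $x_0$, and the predicted value of the state at time $i$ is calculated:

$$x_i^{-} = x_{i-1} \tag{15}$$

The initial covariance $P_0$ is set to 1; the covariance $P_i^{-}$ corresponding to the state prediction value is:

$$P_i^{-} = P_{i-1} + Q \tag{16}$$

Then, the optimal estimated RSS value $x_i$ at time $i$ is calculated by:

$$x_i = x_i^{-} + Kg_i \left( z_i - x_i^{-} \right) \tag{17}$$

The Kalman gain $Kg$ can be calculated via equation (18), and the updated $P_i$ can be calculated via equation (19):

$$Kg_i = \frac{P_i^{-}}{P_i^{-} + R} \tag{18}$$

$$P_i = \left( 1 - Kg_i \right) P_i^{-} \tag{19}$$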
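As a side calculation (not a result stated in the paper), the choice $Q = 2R$ fixes the long-run Kalman gain. Assuming the standard scalar recursion with $A = H = 1$, that is, $P^{-} = P + Q$, $Kg = P^{-} / (P^{-} + R)$ and $P = (1 - Kg) P^{-}$, the steady state can be worked out as follows, where $s$ is an auxiliary symbol introduced here for the ratio $P^{-}/R$:

```latex
\begin{aligned}
P &= (1 - Kg)\,P^{-} = \frac{P^{-} R}{P^{-} + R},
\qquad s := \frac{P^{-}}{R} \\
s &= \frac{P}{R} + \frac{Q}{R} = \frac{s}{s + 1} + 2
\;\Longrightarrow\; s^{2} - 2s - 2 = 0
\;\Longrightarrow\; s = 1 + \sqrt{3} \\
Kg &= \frac{s}{s + 1} = \frac{1 + \sqrt{3}}{2 + \sqrt{3}} = \sqrt{3} - 1 \approx 0.732
\end{aligned}
```

Under these assumptions the gain settles near 0.73 whatever the value of $P_0$, so each estimate leans mostly on the current PF-processed measurement and the KF stage acts as a comparatively light smoother.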
Finally, the RSS characteristic value is obtained via equation (20).

Step 3: The RSS values collected from all APs at all sampling points are processed by steps 1 and 2 to obtain all of the RSS characteristic values.
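Steps 1 to 3 can be sketched in plain Python as below. This is an illustration under several labeled assumptions, not the authors' implementation: the particle weight is taken to be a Gaussian likelihood of the next observation, resampling is implemented as systematic resampling, R is the population variance of the PF output, the last sample is passed through the PF unchanged, and the final characteristic value is taken as the mean of the KF output. The parameter values `m`, `v` and all function names are hypothetical.

```python
import math
import random
import statistics


def systematic_resample(particles, weights, rng):
    """Draw len(particles) samples using evenly spaced pointers (assumed scheme)."""
    m = len(particles)
    start = rng.random() / m
    picked, j, cum = [], 0, weights[0]
    for k in range(m):
        u = start + k / m
        while u > cum and j < m - 1:
            j += 1
            cum += weights[j]
        picked.append(particles[j])
    return picked


def particle_filter(rss, m=200, v=4.0, rng=None):
    """Step 1: PF over a 1-D RSS sequence from one AP."""
    rng = rng or random.Random(0)
    out = []
    for i in range(len(rss) - 1):
        # Particles drawn around the current observation with variance v.
        particles = [rng.gauss(rss[i], math.sqrt(v)) for _ in range(m)]
        # ASSUMPTION: Gaussian likelihood of the next observation as weight.
        weights = [math.exp(-((rss[i + 1] - p) ** 2) / (2.0 * v)) for p in particles]
        total = sum(weights)
        if total == 0.0:  # every likelihood underflowed: fall back to uniform
            weights = [1.0 / m] * m
        else:
            weights = [w / total for w in weights]
        n_eff = 1.0 / sum(w * w for w in weights)
        if n_eff < 2.0 * m / 3.0:  # threshold 2M/3 as stated in the text
            particles = systematic_resample(particles, weights, rng)
            weights = [1.0 / m] * m
        out.append(sum(w * p for w, p in zip(weights, particles)))
    out.append(rss[-1])  # ASSUMPTION: no successor, so keep the last sample
    return out


def kalman_filter(z):
    """Step 2: scalar KF with A = H = 1, Q = 2R, P0 = 1 and x0 = z[0]."""
    r = statistics.pvariance(z)  # ASSUMPTION: population variance as R
    if r == 0.0:
        return list(z)  # constant input needs no smoothing
    q, x, p, out = 2.0 * r, z[0], 1.0, []
    for zi in z:
        p_pred = p + q
        gain = p_pred / (p_pred + r)
        x = x + gain * (zi - x)
        p = (1.0 - gain) * p_pred
        out.append(x)
    return out


def hybrid_characteristic_value(rss, seed=0):
    """PF followed by KF; returns (characteristic value, smoothed series)."""
    smoothed = kalman_filter(particle_filter(rss, rng=random.Random(seed)))
    # ASSUMPTION: the characteristic value is the mean of the KF output.
    return statistics.fmean(smoothed), smoothed


# Synthetic RSS: -70 dBm plus Gaussian noise (made-up data, 200 samples).
demo_rng = random.Random(1)
raw = [-70.0 + demo_rng.gauss(0.0, 3.0) for _ in range(200)]
value, smoothed = hybrid_characteristic_value(raw)
```

Applying `hybrid_characteristic_value` to the series of every AP at every sampling point corresponds to step 3.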

Test of the hybrid filtering algorithm on RSS data process.
To test the performance of the hybrid filtering algorithm on RSS data, an experiment was conducted. At the sampling point, RSS data from two different APs (AP1 and AP2) are collected by the same smartphone. AP1 is close to the sampling point and AP2 is far away from it; each AP is measured 200 times. Figure 2 shows the results of different processing methods on RSS observations from these two APs. The fluctuation amplitude of the RSS data after hybrid filtering is visibly smaller than that of the raw RSS data and of the RSS data after PF alone. Taking AP2 as an example, the minimum and maximum RSS values after hybrid filtering are −79.4 dBm and −67.7 dBm, a narrower range than the −83 dBm and −64 dBm of the raw RSS data and the −82 dBm and −65 dBm of the PF algorithm. It can be concluded that the hybrid filtering method can deal with the signal noise better.
The STD reflects the degree of dispersion of a set of data; the smaller the STD is, the smaller the dispersion of the data is.
Christoph [39] proposed the smoothness index S to quantify the smoothness of the curve. The smaller S is, the smoother the curve is and the better the signal quality is.
S is calculated by equation (21), where N is the number of observations. Table 1 shows the comparison of the STD and S of RSS observations under different processing methods. Both the STD and S of the hybrid filtering method are the smallest for AP1 and AP2, quantitatively illustrating the effect of the hybrid filtering method on mitigating signal noise.

Weighted dynamic-K nearest neighbor (WDKNN) algorithm
As the distance between the sampling point and the AP increases, the signal strength value received by the sampling point gradually weakens [40][41][42][43]. Equation (22) is the wireless signal attenuation model [44][45][46]:

$$P(d) = P(d_0) - 10 \eta \log_{10}\left( \frac{d}{d_0} \right) + \omega \tag{22}$$

in which $P(d)$ and $P(d_0)$ respectively represent the RSS values measured at distances $d$ and $d_0$ from an AP, $\eta$ is the path loss exponent, $d_0$ is a reference distance, usually with a value of 1 m, and $\omega$ is the environment noise.
Equation (22) indicates that the P(d) of the sampling points in the same space may depend on the value of d and that the closer TP and RP have similar P(d) values. Moreover, the Euclidean distance reflects the similarity degree of P(d) values between the two points.
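To make the dependence of P(d) on d concrete, the sketch below assumes the common log-distance form of the attenuation model, P(d) = P(d0) − 10·η·log10(d/d0) + ω. The default values (P(d0) = −40 dBm, η = 3) and the function names are illustrative assumptions, not values from the paper.

```python
import math


def expected_rss(d, p_d0=-40.0, eta=3.0, d0=1.0, omega=0.0):
    """RSS in dBm at distance d (metres) under the log-distance model."""
    return p_d0 - 10.0 * eta * math.log10(d / d0) + omega


def distance_from_rss(rss, p_d0=-40.0, eta=3.0, d0=1.0):
    """Invert the noise-free model to recover a distance estimate."""
    return d0 * 10.0 ** ((p_d0 - rss) / (10.0 * eta))


rss_at_10m = expected_rss(10.0)        # one decade of distance costs 10*eta dB
gap_near = expected_rss(10.0) - expected_rss(11.0)
gap_far = expected_rss(10.0) - expected_rss(20.0)
recovered = distance_from_rss(rss_at_10m)
```

In this noise-free setting two points that are 1 m apart differ by well under 2 dB, whereas points 10 m apart differ by about 9 dB, which is the intuition behind using RSS similarity as a proxy for physical proximity.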
From equation (22), we can ascertain that if the actual distance between two sampling points is small, then the gap between their RSS values is small, and thus their Euclidean distance (23) is also small. However, this is not always true in a real indoor environment. The experimental result in [36] indicated that the chance of having $d_{1T} < d_{2T}$ when $D_{1T} < D_{2T}$ is about 97%. We can infer that if $d_{1T} < d_{2T}$, then $RSS_{1T} < RSS_{2T}$, and then $D_{1T} < D_{2T}$ in general. In this article, Euclidean distance is used to select RPs for position calculation. Now, the WDKNN algorithm is described as follows:

Step 1: Equation (23) is used to calculate the Euclidean distances from the TP to all RPs, denoted as $d_i$:

$$d_i = \sqrt{ \sum_{j=1}^{m} \left( RSS_T^{j} - RSS_i^{j} \right)^{2} }, \quad i = 1, 2, \ldots, n \tag{23}$$

where n is the number of RPs and m is the number of APs whose signals can be received at both the TP and the RP. Then all Euclidean distances are arranged in ascending order such that $d_1$ is the minimum Euclidean distance, denoted as $d_{min}$.
Step 2: Equation (24) is used to calculate the ratio $\gamma_j$ of the remaining $d_j$ ($j = 2, 3, 4, \ldots, n$) to $d_{min}$:

$$\gamma_j = \frac{d_j}{d_{min}}, \quad j = 2, 3, \ldots, n \tag{24}$$

Equation (25) is used to select RPs to participate in the position calculation; the threshold value $\gamma_{threshold}$ is used as the metric for whether the RP is selected or discarded. The number of selected RPs is K and its value varies for different TPs.

$$\begin{cases} \gamma_j \le \gamma_{threshold}, & \text{the corresponding RP is selected for matching} \\ \gamma_j > \gamma_{threshold}, & \text{the corresponding RP is discarded} \end{cases} \tag{25}$$

The $\gamma_{threshold}$ varies with different equipment and environments, and it can be given as an empirical value.
Step 3: The RPs adopted for position calculation are selected by step 2, and their weights are calculated via:

$$w_i = \frac{1 / d_i}{\sum_{j=1}^{K} 1 / d_j}, \quad i = 1, 2, \ldots, K \tag{26}$$

Then, the weighted average of the selected RPs' coordinates is calculated by equation (27) as the estimated coordinates of the TP:

$$(x_{estimation}, y_{estimation}) = \sum_{i=1}^{K} w_i \, (x_i, y_i) \tag{27}$$

In (27), $(x_i, y_i)$ represents the coordinates of the ith RP. Finally, steps 1, 2 and 3 are repeated until the estimated coordinates of all TPs are calculated.
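Steps 1 to 3 of the WDKNN algorithm can be sketched as follows. The sketch assumes inverse-distance weights normalized to sum to one and adds a small `eps` guard against zero distances; the function name, the guard and the toy fingerprint database are hypothetical, and the default threshold of 1.3 is simply the value reported for the 3F scene.

```python
import math


def wdknn(tp_rss, fingerprints, gamma_threshold=1.3, eps=1e-9):
    """Dynamic-K weighted nearest neighbour estimate.

    fingerprints: list of (rss_vector, (x, y)) pairs, one per RP.
    Returns ((x, y), K) where K is the number of RPs actually used.
    """
    if gamma_threshold < 1.0:
        raise ValueError("gamma_threshold must be >= 1 so the nearest RP is kept")
    # Step 1: Euclidean RSS distances in ascending order.
    ranked = sorted((math.dist(tp_rss, rss), xy) for rss, xy in fingerprints)
    d_min = ranked[0][0] + eps
    # Step 2: keep every RP whose distance ratio to d_min is within the threshold.
    selected = [(d, xy) for d, xy in ranked if (d + eps) / d_min <= gamma_threshold]
    # Step 3: inverse-distance weighted average of the selected coordinates.
    inv = [1.0 / (d + eps) for d, _ in selected]
    total = sum(inv)
    x = sum(w * xy[0] for w, (_, xy) in zip(inv, selected)) / total
    y = sum(w * xy[1] for w, (_, xy) in zip(inv, selected)) / total
    return (x, y), len(selected)


# Toy database: four RPs on a line, two APs (values are made up).
rps = [
    ([-50.0, -70.0], (0.0, 0.0)),
    ([-52.0, -68.0], (2.0, 0.0)),
    ([-70.0, -50.0], (4.0, 0.0)),
    ([-80.0, -45.0], (6.0, 0.0)),
]
tp = [-51.0, -69.0]

estimate, k_used = wdknn(tp, rps, gamma_threshold=1.3)
_, k_loose = wdknn(tp, rps, gamma_threshold=100.0)
```

With the tight threshold only the two RPs that are about equally close in RSS space are kept, whereas a very loose threshold degenerates to using every RP.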

Evaluation indicators of positioning performance
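The proposed WDKNN-HF algorithm is compared with traditional algorithms in terms of the ME, STD, root-mean-square error (RMSE), 90th percentile error and cumulative distribution function (CDF). The ME, STD and RMSE are calculated as follows:

$$\begin{aligned} ME &= \frac{1}{n} \sum_{i=1}^{n} \delta_i, \qquad STD = \sqrt{ \frac{1}{n} \sum_{i=1}^{n} \left( \delta_i - ME \right)^{2} }, \qquad RMSE = \sqrt{ \frac{1}{n} \sum_{i=1}^{n} \delta_i^{2} }, \\ \delta &= \sqrt{ \left( x_{estimation} - x_{real} \right)^{2} + \left( y_{estimation} - y_{real} \right)^{2} } \end{aligned} \tag{28-31}$$

In (28)-(31), $\delta$ is the position error of the TP, $(x_{estimation}, y_{estimation})$ stands for the TP's estimated coordinates, $(x_{real}, y_{real})$ are the TP's real coordinates and n is the total number of TPs.

[...] adjacent points is approximately 2.2 m, and 75 TPs are selected. In both the 3F and the office, each RP is measured 60 times and each TP is measured 40 times.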
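The evaluation indicators named at the start of this subsection can be computed as in the sketch below. Two choices here are assumptions rather than definitions from the paper: the STD uses the population form (division by n), and the 90th percentile uses the nearest-rank rule. The function names and sample coordinates are hypothetical.

```python
import math


def position_errors(estimated, real):
    """Per-TP position error: Euclidean distance between estimate and truth."""
    return [math.dist(e, r) for e, r in zip(estimated, real)]


def error_statistics(errors):
    """ME, STD, RMSE and 90th percentile of a list of position errors."""
    n = len(errors)
    me = sum(errors) / n
    std = math.sqrt(sum((e - me) ** 2 for e in errors) / n)  # population form
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    ranked = sorted(errors)
    p90 = ranked[max(0, math.ceil(0.9 * n) - 1)]  # nearest-rank percentile
    return {"ME": me, "STD": std, "RMSE": rmse, "P90": p90}


# Made-up estimates and ground truth for four TPs (metres).
estimated = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0), (6.0, 8.0)]
real = [(0.0, 0.0), (0.0, 0.0), (1.0, 1.0), (0.0, 0.0)]

stats = error_statistics(position_errors(estimated, real))
```

With the population form of the STD, the three indicators satisfy RMSE² = ME² + STD², which gives a quick consistency check on reported values.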

Experimental setup
To avoid the variability of heterogeneous devices, all experimental data were collected by the same smartphone, a Huawei P9 running the Android operating system. For convenience, an independent coordinate system was established in each scene. In the schematic diagrams of the experimental environments shown in figures 3(a) and (b), a plane Cartesian coordinate system is established in each scene with O as the origin, the horizontal axis as the X-axis and the vertical axis as the Y-axis. The coordinates of sampling points were measured by using the Leica TS60 total station, which can achieve 3 millimeter precision.

Figure 4 shows the trend of the MEs of the WDKNN algorithm with the change of γ threshold in the 3F (a) and the office (b), respectively. Figure 4 indicates that with the increase of γ threshold , the ME on the whole tends to decrease first and then increase. At a certain γ threshold , the ME reaches its minimum; that is, the best position accuracy is obtained at this γ threshold . Figure 4 demonstrates that the best position accuracy is achieved when γ threshold is 1.3 in the 3F and 1.13 in the office. The reason for this difference is that the indoor environments and the spacing of RPs differ between the two scenes. In this paper, γ threshold is set to 1.3 for the 3F and 1.13 for the office.

Positioning performance of the proposed hybrid filtering algorithm.
In this section, the position results of the hybrid filtering algorithm are compared with those of the traditional KNN and WKNN algorithms. WKNN-HF denotes the variant in which the RSS characteristic values are extracted by the hybrid filtering algorithm and the online phase adopts the traditional WKNN algorithm. Figure 5 shows the position accuracy of the KNN algorithm, the WKNN algorithm and the WKNN-HF algorithm with K ranging from 2 to 10 for the 3F (a) and the office (b), respectively. As the figures show, the ME tends to decrease first and then increase, and the position error of WKNN-HF is on the whole smaller than that of KNN and WKNN. In the office, when K is small, such as 2, 3, 4 and 5, the position accuracy of the proposed WKNN-HF algorithm improves markedly. With the increase of K, the gaps between the position results of the WKNN and WKNN-HF algorithms are reduced, and some position errors of WKNN-HF are even slightly larger than those of WKNN, such as when K is 7. The reason for this phenomenon is that, when K is large, the RPs selected for the position calculation will inevitably include RPs that have large actual distances from the TP. The WDKNN-HF algorithm proposed in this paper can mitigate this phenomenon, and specific experimental analysis is presented below.

Positioning performance of the WDKNN algorithm.
In this section, the positioning performance of WDKNN (taking the mean RSS as the characteristic value) is compared with that of the traditional KNN and WKNN. The last section indicates that in the 3F, the KNN and WKNN algorithms obtain the best position accuracy when K is 4, whereas in the office, the best K is 6. In this paper, K for the KNN and WKNN algorithms is therefore set to 4 for the 3F and 6 for the office. Figure 6 shows boxplots of these three algorithms; the boxplots display some positioning error statistics including the max, 90th, 75th, 25th and 10th percentile errors, median, min and mean error (the red line).

Positioning performance of the proposed WDKNN-HF algorithm.
In this section, the positioning performances of the WDKNN-HF algorithm and other algorithms, including KNN, WKNN, WKNN-HF and WDKNN, are compared. Table 2 depicts some positioning error statistics, including the ME, STD, RMSE and 90th percentile error.
From the results displayed in table 2, we can observe that the proposed WDKNN-HF algorithm obtains better position performance than the other four algorithms in terms of the ME, STD, RMSE and 90th percentile error. In the 3F, the ME of WDKNN-HF is 0.32 m better than that of KNN, 0.23 m better than that of WKNN, 0.17 m better than that of WKNN-HF and 0.1 m better than that of WDKNN. The RMSE of WDKNN-HF is 2.10 m, which is better than the 2.48 m, 2.36 m, 2.28 m and 2.17 m of the KNN, WKNN, WKNN-HF and WDKNN, respectively. In the office, the ME of WDKNN-HF is 0.44 m better than that of KNN, 0.29 m better than that of WKNN [...]

The 90th percentile error is a frequently used indicator to evaluate position performance. From table 2, we can observe that in the 3F, the proposed WDKNN-HF algorithm produces a 90th percentile error of 3.12 m, which is significantly better than the 3.71 m, 3.58 m, 3.55 m and 3.45 m of KNN, WKNN, WKNN-HF and WDKNN, respectively. In the office, the proposed WDKNN-HF algorithm produces a 90th percentile error of 2.60 m, which is significantly better than the 2.96 m, 2.93 m, 2.86 m and 2.73 m of KNN, WKNN, WKNN-HF and WDKNN, respectively. Figure 7 shows the CDF of the position errors of these five algorithms in the office. When the error thresholds are 1 m, 2 m [...]

Some existing dynamic K selection algorithms (EWKNN and SAWKNN) and the proposed WDKNN-HF [...]

The results displayed in table 3 indicate that the proposed WDKNN-HF algorithm obtains better position performance than the other three algorithms. In the 3F, the ME of the proposed WDKNN-HF is 1.73 m, which is better than the 1.95 m and 1.86 m of the EWKNN and SAWKNN, respectively. The RMSE of WDKNN-HF is 2.10 m, which is better than the 2.52 m and 2.22 m of the EWKNN and SAWKNN, respectively. WDKNN-HF produces a 90th percentile error of 3.12 m, which is better than the 3.71 m and 3.54 m of the EWKNN and SAWKNN, respectively.
In the office, the WDKNN-HF algorithm produces an ME of 1.47 m, which is better than the 1.63 m and 1.59 m of the EWKNN and SAWKNN, respectively. The RMSE of WDKNN-HF is 1.67 m, which is better than the 1.83 m and 1.99 m of the EWKNN and SAWKNN, respectively. WDKNN-HF produces a 90th percentile error of 2.60 m, which is better than the 2.90 m and 3.15 m of the EWKNN and SAWKNN, respectively. Figure 8 shows the CDF of the position error of these three algorithms in the office, and we can observe that all position errors of the WDKNN-HF and EWKNN algorithms are less than 4 m, whereas the max position error of the SAWKNN algorithm is 5.39 m. When the error thresholds are 1 m, 2 m and 3 m, the CDF values of WDKNN-HF are respectively 30.67%, 78.67% and 97.67%, which are higher than the 20%, 66.67% and 90.67% of the EWKNN and the 28%, 65.33% and 88% of the SAWKNN. Therefore, compared to the existing dynamic K selection algorithms, the proposed WDKNN-HF algorithm can achieve a performance gain. Figure 9 displays the location performance of the seven algorithms in terms of the error vector at each TP in the office. The error vector is represented by an arrow pointing from the real position to the estimated position. Figure 9 demonstrates that the proposed WDKNN-HF outperforms the other algorithms. The position results of the traditional KNN, WKNN and WKNN-HF algorithms exhibit the centralization phenomenon, whereas the results of SAWKNN exhibit polarization; the WDKNN-HF algorithm can mitigate these phenomena. EWKNN and WDKNN perform relatively well, close to WDKNN-HF, but their MEs are larger than that of WDKNN-HF.
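The CDF values quoted above are fractions of TPs whose position error does not exceed a threshold. A minimal sketch of that computation follows; the function name `cdf_at` is hypothetical, and the error list is synthetic, built only so that 23 and 59 of 75 TPs fall within 1 m and 2 m.

```python
def cdf_at(errors, threshold):
    """Empirical CDF: fraction of TPs whose position error is <= threshold."""
    return sum(1 for e in errors if e <= threshold) / len(errors)


# Synthetic errors for 75 TPs (metres); not measured data.
errors = [0.5] * 23 + [1.5] * 36 + [2.5] * 14 + [3.5] * 2

pct_1m = round(100.0 * cdf_at(errors, 1.0), 2)  # 23 of 75 TPs
pct_2m = round(100.0 * cdf_at(errors, 2.0), 2)  # 59 of 75 TPs
pct_4m = round(100.0 * cdf_at(errors, 4.0), 2)  # all TPs
```

Because each TP contributes 1/75 of the total, CDF values for 75 TPs move in steps of about 1.33 percentage points.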

Conclusion
This paper has presented a novel WDKNN-HF algorithm, which adopts a hybrid filtering algorithm to extract RSS characteristic values and a better dynamic K value selection matching algorithm in the online phase. Field tests were carried out to verify the proposed WDKNN-HF algorithm. Experimental results demonstrated that compared with the traditional mean RSS value, the hybrid filtering algorithm can effectively improve the quality of the RSS signals, which considerably improves the positioning accuracy. The proposed dynamic K value selection algorithm based on Euclidean distance is intended to eliminate RPs that have large actual distances from the TP in the online positioning phase. According to the experimental results of two scenes, the improvement of the proposed WDKNN-HF algorithm in localization performance is diverse in different environments, and the performance of the office is better than that of the 3F. Generally, the position performance of the proposed WDKNN-HF algorithm is considerably better than that of traditional algorithms, such as the KNN and WKNN algorithms, and it also outperforms existing dynamic K selection algorithms such as EWKNN and SAWKNN. Possible directions for future work include the selection of APs to further improve position accuracy, the rapid construction of a fingerprint database and an efficient position calculation algorithm to improve the efficiency of the positioning system.