A Light-Weight Technique to Detect GPS Spoofing Using Attenuated Signal Envelopes

Global Positioning System (GPS) spoofing attacks have attracted increasing attention as one of the most effective attacks on GPS. Because the signals from an authentic satellite and from a spoofer undergo different attenuation, the captured envelope of fake GPS signals exhibits distinctive transmission characteristics due to the short transmission path. This can be utilized for GPS spoofing detection. Existing techniques for GPS spoofing detection are either computationally too expensive, require specialized hardware or software updates, or are not accurate enough. To address these issues, we propose a light-weight GPS spoofing detection method based on a dynamic threshold and the captured signal envelope. We validate the proposed technique using experiments based on actual GPS signals and hardware. The relation between the envelope characteristics and the distance between a GPS transmitter and receiver is revealed. Inspired by this relation, a threshold approach to the detection of GPS spoofing is developed. The proposed approach features a dynamic threshold, determined by the dispersion value of the variance of the signal envelope, instead of a fixed threshold, to maximize detection performance in multiple attack scenarios. The results show that the proposed technique can effectively detect GPS spoofing attacks with better accuracy and lower computational complexity than existing techniques.


I. INTRODUCTION
The Global Positioning System (GPS) is widely used in navigation systems, communication systems, engineering surveys, financial institutions, and critical infrastructures such as the power grid. It provides precise location and time information based on the satellite positions in space and the signal propagation range. The signal transmission time is synchronized because GPS satellites are equipped with atomic clocks. The satellite position is calculated from the GPS ephemeris and almanac. A civil GPS receiver decodes the navigation data to obtain the GPS time and the satellite positions, and the propagation range is measured using a coarse/acquisition (C/A) code.
The generation and processing procedure of civil GPS signals is transparent and detailed in [1]. The C/A code of each satellite is public, the GPS ephemeris and almanac are openly accessible, and GPS signals are not encrypted. These properties make the system vulnerable to spoofing attacks. A GPS spoofing attack refers to an adversary gaining control over the calculated location and time of the victim receiver by broadcasting fake signals at the GPS frequency. Todd Humphreys's team demonstrated successful spoofing attacks on unmanned aircraft [2] and surface vessels [3]. In their experiments, the fake signals are first aligned with the authentic signals; the power of the fake signals is then increased to attract the victim receiver; and once the receiver starts tracking the fake signal, the signal is designed to gradually deviate the receiver from its actual position. The authors in [4] show that a spoofing attack can be implemented by broadcasting fake signals through RF devices, which makes spoofing attacks easier. Fake GPS signals can be generated by software [5] for any target time and location. Spoofing attacks are discussed theoretically in [6] and [7]; [6] studies spoofing attacks based on ephemeris manipulation. Spoofing attacks can be catastrophic to systems that rely on GPS time or location. One such example is the modern power grid [8]. A clock offset greater than 26.5 μs breaks the stability of the power grid and can cause a blackout. Similarly, GPS spoofing attacks can deceive the navigation units of vehicles and mislead the vehicles into restricted areas [2]. In addition, with the development of the Internet of Things (IoT), GPS services have already penetrated many aspects of our lives. For example, precision agriculture, intelligent transportation systems, and commercial activities (e.g., shared bicycles, Pokemon GO) all depend on GPS.
The importance and widespread applications of GPS make the detection of GPS spoofing crucial. There are many existing works, which we categorize as encryption- and authentication-based techniques, signal quality and signal processing based techniques, and assistance-based techniques. Encryption methods [9], [10], [11], [12] insert special information into GPS signals, while authentication methods [13], [14], [15], [16], [17] take advantage of the unpredictable nature of GPS signals. Although encryption- or authentication-based methods are more robust than others, their higher computational complexity and their implementation requirements in terms of the design of the GPS scheme make them less feasible. Assistance-based techniques, such as using directional antennas to examine the direction of arrival of the signal [18], [19], [20], [21], [22] or using sensors and clocks to provide a reference time [23], [24], [25], require additional hardware support. Moreover, assistance-based techniques rely on synchronized reference information and caching, which not only results in a complex system but also increases the detection latency. Finally, there are techniques based purely on the received signals, including signal quality [26], [27], [28], [29] and signal processing methods [30], [31], [32], [33], [34]. More details about related works are given in Section VI.
Targeting the effectiveness and timeliness of GPS spoofing detection, we work from a basic feature of communication transmission: signal attenuation. Authentic GPS signals are transmitted from satellites in space and are reflected by the ionosphere, the troposphere, and urban environments with dense buildings. Since authentic signals traverse significantly longer distances than fake signals, the envelopes of authentic and fake signals differ significantly. In this article, we propose a distribution-based spoofing detection method and a dynamic threshold selection method to improve the overall performance. Our main contributions are as follows: 1) An analytical model for the distribution of a signal's envelope based on the distance between the transmitter and receiver. 2) A light-weight threshold technique based on the distribution of signal envelopes to detect GPS spoofing attacks. 3) A dynamic threshold selection mechanism based on the dispersion of the variance of a signal's envelope. The rest of this article is organized as follows. We first introduce the system model in Section II. Section III describes the proposed technique, and Section IV describes the experimental design. The detection results and the proposed dispersion-value-based threshold selection mechanism are presented in Section V. A comparison of the proposed detection method with related literature is presented in Section VI. Finally, we conclude the article in Section VII.

II. SYSTEM MODEL
We consider the system model shown in Fig. 2. GPS satellites orbit at 19300 km above the earth's surface, while the receiver and the adversary are located on the earth's surface. We consider the following two scenarios: 1) Legitimate Scenario: The GPS receiver tracks authentic GPS signals that are transmitted by GPS satellites in space. The signals experience reflections within the ionosphere and atmosphere before reaching a receiver. At the surface of the earth, the average signal power is around −171 dBW and the signal-to-noise ratio is around 30 dB. 2) Attack Scenario: The GPS receiver tracks fake signals that are transmitted by an adversary. Fake signals are transmitted at a higher signal power to overlay the authentic ones. The adversary broadcasts the fake signals through an antenna that is placed close to the victim receiver and kept at a relatively stable distance to maintain a continuous and stable attack [3].

III. PROPOSED SPOOFING DETECTION TECHNIQUE
Denoting the transmitter and receiver by Tx and Rx, respectively, the signal received by the Rx can be modeled as follows [35]: where $x_{n,i}(t)$ and $x_{n,j}(t)$ are the in-phase and quadrature components of $x_n(t)$, respectively, $\eta(t)$ is the additive white Gaussian noise, and $N$ is the total number of symbols. Let $I(t)$ and $Q(t)$ be the in-phase and quadrature components of $y(t)$; we get where $r(t)$ and $\theta(t)$ are the envelope and phase of the received signal, and $\eta_i(t)$ and $\eta_j(t)$ are the in-phase and quadrature components of the signal noise $\eta(t)$. The white Gaussian noise can be ignored for the signal envelope calculation since it is independent of the authentic and fake GPS signals. As the proposed technique is based on the signal envelope, without loss of generality, the received signal envelope is given as follows: $r(t) = \sqrt{I^2(t) + Q^2(t)}$ (5). Considering a Rayleigh distributed channel, the mean and variance of the received signal envelope will vary with the distance ($d_i$) between the Tx and Rx as follows: Thus, the probability density functions (pdfs) for a spoofed signal envelope and an authentic signal envelope will be significantly different, given that the GPS satellites are 19300 km away from the earth's surface while the adversary cannot be too far from the receiver due to the capability of attack devices [1]. Accordingly, the probability of missed detection ($P_{MD}$), probability of false alarm ($P_{FA}$), and probability of detection ($P_D$) for attack scenarios that have different distances between the Tx and Rx can be calculated as follows: where $\gamma$ is the detection threshold, $d_i$ is the distance between the adversary's antenna and the Rx, and $d_0$ is the distance between the GPS satellites and the Rx. Moreover, $f_L(r)$ and $f_A(r)$ represent the pdf of the envelope $r(t)$ for legitimate signals and spoofed signals, respectively. The pdf of the envelope of a signal can be considered Rayleigh distributed or normally distributed [36].
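As an illustration of how the envelope and phase are obtained from the in-phase and quadrature samples, the following minimal sketch computes both quantities per sample. The function name `envelope_and_phase` and the list-based sample format are assumptions made for this example and are not part of the original experimental software.

```python
import math


def envelope_and_phase(i_samples, q_samples):
    """Return per-sample envelope r and phase theta from I/Q samples.

    Hypothetical helper: r = sqrt(I^2 + Q^2) and theta = atan2(Q, I).
    """
    if len(i_samples) != len(q_samples):
        raise ValueError("I and Q sequences must have the same length")
    # math.hypot avoids overflow/underflow in the intermediate squares.
    envelope = [math.hypot(i, q) for i, q in zip(i_samples, q_samples)]
    phase = [math.atan2(q, i) for i, q in zip(i_samples, q_samples)]
    return envelope, phase


if __name__ == "__main__":
    r, theta = envelope_and_phase([3.0, 0.0], [4.0, 2.0])
    print(r, theta)
```

Only the envelope is used by the detection technique; the phase is returned here for completeness.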

A. RAYLEIGH DISTRIBUTED SIGNAL ENVELOPE
A Rayleigh distribution is characterized by the following pdf: $f(r) = \frac{r}{\sigma_R^2} \exp\left(-\frac{r^2}{2\sigma_R^2}\right),\ r \ge 0$ (11), where the scale parameter $\sigma_R$ is related to $d_i$, the distance between the Tx and Rx [37]. An attack with a larger $d_i$ leads to a greater $\sigma_{R,i}$ and vice versa, i.e., the pdf of the envelope will vary according to the distance between a transmitter and receiver. Using this insight, the pdfs of the envelope under attack with the attacker located at $d_i \in \{0.2, 0.5, 1, 5\}$ m and the pdf under the legitimate scenario, i.e., $d_0 = 19300$ km, are plotted in Fig. 3(a). We observe different pdf curves for different values of $d_i$.
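To make the dependence on the scale parameter concrete, the sketch below evaluates the standard Rayleigh pdf, its cumulative distribution function, and the probability mass above a threshold. The function names are hypothetical, and the scale values used in the demonstration are arbitrary placeholders rather than values measured in this work.

```python
import math


def rayleigh_pdf(r, sigma):
    """Standard Rayleigh pdf with scale parameter sigma (zero for r < 0)."""
    if r < 0:
        return 0.0
    return (r / sigma**2) * math.exp(-(r**2) / (2 * sigma**2))


def rayleigh_cdf(r, sigma):
    """Standard Rayleigh CDF: 1 - exp(-r^2 / (2 sigma^2))."""
    if r < 0:
        return 0.0
    return 1.0 - math.exp(-(r**2) / (2 * sigma**2))


def rayleigh_tail(gamma, sigma):
    """Probability that a Rayleigh envelope exceeds the threshold gamma."""
    return 1.0 - rayleigh_cdf(gamma, sigma)


if __name__ == "__main__":
    # Placeholder scale values, chosen only to show the effect of sigma.
    for sigma in (0.5, 1.0, 2.0):
        print(sigma, rayleigh_pdf(1.0, sigma), rayleigh_tail(1.0, sigma))
```

Integrating the pdf of one scenario on either side of a threshold, as done by `rayleigh_cdf` and `rayleigh_tail`, is the basic operation behind threshold-based error probabilities; which side corresponds to a false alarm or a missed detection depends on the decision rule.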

B. NORMALLY DISTRIBUTED ENVELOPE
If the envelope of a received signal is assumed to be normally distributed, then the pdf is as follows: $f(r) = \frac{1}{\sqrt{2\pi\sigma_N^2}} \exp\left(-\frac{(r-\mu)^2}{2\sigma_N^2}\right)$ (12), where $\mu$ is the mean of the envelope and $\sigma_N^2$ is the variance of the envelope, given by (6) and (7), respectively. Then, the variance of the envelope can be re-written as: Thus, the mean and variance of the envelope in (12) will vary according to the distance of the attacker from the Rx. The pdf curves for $d_0 = 19300$ km and $d_i$ = 0.2 m, 0.5 m, 1 m, 5 m are shown in Fig. 3(b). We observe that as $d_i$ changes, the mean and variance of the pdfs also change.
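The normal envelope model can be explored with Python's standard `statistics.NormalDist` class, as in the following sketch. The helper names and the numeric mean and variance are illustrative assumptions; they are not the values obtained from (6) and (7).

```python
from statistics import NormalDist


def normal_envelope_model(mean, variance):
    """Build a normal model of the envelope from its mean and variance."""
    return NormalDist(mu=mean, sigma=variance ** 0.5)


def mass_above(model, gamma):
    """Probability that the modeled envelope exceeds the threshold gamma."""
    return 1.0 - model.cdf(gamma)


if __name__ == "__main__":
    # Placeholder parameters, chosen only for illustration.
    model = normal_envelope_model(mean=1.0, variance=0.25)
    print(model.pdf(1.0), mass_above(model, 1.5))
```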

IV. EXPERIMENT DESIGN
We conducted experiments to capture authentic GPS signals and also generated fake signals to be transmitted by an adversary. The experimental setup is illustrated in Fig. 4. The victim receiver side consists of a computer, an NI USRP device, and a GPS antenna. The computer controls the USRP-2943R to capture the signals at the civil GPS frequency of 1575.42 MHz with the GPS antenna. The adversary side consists of a laptop, a BladeRF, and an antenna. The laptop controls the BladeRF to broadcast the generated fake GPS signals.
The fake GPS signals were generated by the open source software GPS-SIM-SDR [5]. The spoofed position is set to a route in China, while the receiver is actually located at a fixed position in Singapore. The GPS time is also spoofed to an earlier time. The BladeRF board is configured with a transmit frequency of 1575.42 MHz, a sampling rate of 2.5 MHz, and a transmit gain of 73 dB. The USRP is configured with a receive frequency of 1575.42 MHz and a sampling rate of 16 MHz. In the experiment, a phone with the 'GPS test' and 'Baidu Map' applications installed was used to verify the attack. GPS test shows the visible satellites, their signal strength, and the GPS time, while Baidu Map shows the position. As shown in Fig. 5(a), without the spoofing attack, the signal-to-noise ratio (SNR) of different satellites varies, the calculated time coincides with the actual time, and the calculated location is at the National University of Singapore (NUS) campus in Singapore. Fig. 5(b) shows the results when there is a spoofing attack at 5 m: the SNRs of different satellites are similar and above 30, the calculated time is the spoofed time, and the calculated location is in China. This shows that our experiment design can successfully generate spoofed signals that deceive actual Android applications. Thus, the results presented in this paper are representative of actual GPS spoofing attacks.
The signal's envelope is calculated by (5) from the I & Q samples. A window size of 500 samples was used to calculate the mean of the envelope. Fig. 6 shows the mean values of 500 windows observed for attacks at different distances. The average values of the envelope means are $3.441 \times 10^{-4}$, $1.423 \times 10^{-4}$, $1.194 \times 10^{-4}$, $7.592 \times 10^{-5}$, and $7.151 \times 10^{-5}$ for attacks at 0.01 m, 0.2 m, 0.5 m, 1 m, and 5 m, respectively. The envelope mean for authentic signals is $6.248 \times 10^{-5}$. We fitted the average envelope mean value over a window of 500 samples with $r = a e^{bl}$ (14) in Fig. 7, where $a = 1.405 \times 10^{-4}$ and $b = 9.795 \times 10^{-2}$. We observe that the average value of the mean of a signal's envelope decreases exponentially as the distance between the receiver's antenna and the attacker's antenna increases.
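The windowed mean used above can be sketched as follows. The sketch assumes consecutive, non-overlapping windows and discards a trailing partial window; both choices, as well as the function name `window_means`, are assumptions made for illustration, since the windowing details are not specified in the text.

```python
from statistics import fmean


def window_means(envelope, window=500):
    """Mean of each consecutive, non-overlapping window of envelope samples.

    Assumption: a trailing partial window is dropped.
    """
    if window <= 0:
        raise ValueError("window must be a positive integer")
    n_full = len(envelope) // window
    return [
        fmean(envelope[k * window:(k + 1) * window])
        for k in range(n_full)
    ]


if __name__ == "__main__":
    # Tiny synthetic example with a window of 2 samples.
    print(window_means([1.0, 2.0, 3.0, 4.0, 5.0], window=2))
```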

V. RESULTS
In this section, we evaluate the performance of the proposed spoofing detection mechanism. Considering the Rayleigh distribution for the signal envelope, the misclassification probabilities can be described by: Fig. 8 shows the receiver operating characteristic (ROC) curves for the different $d_i$ values. We observe that the area under the ROC curve increases as $d_i$ decreases. The changes of $P_D$, $P_{MD}$, and $P_{FA}$ over various thresholds are plotted in Fig. 9. As $d_0$ is constant, we only get one curve for $P_{FA}$. However, the $P_D$ and $P_{MD}$ curves vary according to the distance between the attacker and the Rx. We observe that a threshold between 1 and 2 with $d_i > 0.2$ m leads to better results.
Assuming the signal envelope to be normally distributed, Fig. 10 shows the variance calculated directly from the signal's envelope and the average of the variance of 500 samples over various distances $d_i$. We observe that, for each $d_i$, the variance calculated directly from the signal's envelope is approximately the same as the average variance calculated over a window of samples. Thus, we do not need to apply averaging. The ROC for the proposed scheme is shown in Fig. 11. The misclassification rates are shown in Fig. 12 and given as follows: where $\mathrm{erf}^{-1}(z)$ is the inverse error function, which can be expanded as a Maclaurin series [38]. We observe that the optimal value for the threshold is $\gamma > 0.5$. Figs. 9 and 12 show that the performance of detecting spoofed signals depends on the threshold $\gamma$. Thus, to obtain a balanced performance criterion, we define a new metric, the effective detection rate, taking into account $P_{MD}$ and $P_{FA}$ as follows: with where $\lambda_{MD}$ and $\lambda_{FA}$ are weights defining the importance of $P_{MD}$ and $P_{FA}$, respectively. As a higher $P_{MD}$ is typically more harmful, we chose $\lambda_{MD} = 0.7$ and $\lambda_{FA} = 0.3$, for a $P_{FA} < 0.8$. The corresponding plots of the effective detection rate versus threshold values are shown in Fig. 13. For $P_{FA} > 0.8$, we observe that in Fig. 13(a) the minimum required threshold value is $\gamma > 0.54$, while in Fig. 13(b) the minimum required threshold value is $\gamma > 0.42$. Similarly, as $d_i$ increases, the peak of the curves moves to the right, i.e., the optimal threshold value increases. Thus, the upper bound on $\gamma$ is defined by the curve corresponding to $d_i = 0.01$ m. This shows that the acceptable range for $\gamma$ is [0.54, 0.89] and [0.42, 1.40] when taking the signal envelope as Rayleigh distributed and normally distributed, respectively. The values of the performance criterion when assuming the signal envelope to follow a Rayleigh or normal distribution for different values of $d_i$ are given in Tables 1(a) and (b), respectively.
We observe that the best value of the effective detection rate is obtained using a threshold value that depends on $d_i$. Therefore, using a fixed threshold to detect attacks in different scenarios may not lead to the best performance, i.e., the threshold needs to be adjusted according to the distance of the attacker from the Rx.
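The search for the best threshold under a weighted criterion can be sketched as a simple sweep. The exact expression of the effective detection rate is not reproduced here, so the sketch assumes the form $1 - (\lambda_{MD} P_{MD} + \lambda_{FA} P_{FA})$; this form, the function names, and the toy error curves in the demonstration are all assumptions made for illustration and should not be read as the definition used in this work.

```python
def effective_rate(p_md, p_fa, w_md=0.7, w_fa=0.3):
    """ASSUMED metric: one minus the weighted sum of the two error rates."""
    return 1.0 - (w_md * p_md + w_fa * p_fa)


def best_threshold(thresholds, p_md_fn, p_fa_fn, w_md=0.7, w_fa=0.3):
    """Sweep candidate thresholds and return (best gamma, its metric value).

    p_md_fn and p_fa_fn map a threshold to the corresponding error rate.
    """
    def score(gamma):
        return effective_rate(p_md_fn(gamma), p_fa_fn(gamma), w_md, w_fa)

    best = max(thresholds, key=score)
    return best, score(best)


if __name__ == "__main__":
    # Toy error curves (hypothetical): missed detections grow and false
    # alarms shrink as the threshold increases.
    candidates = [k / 10 for k in range(11)]
    gamma, rate = best_threshold(
        candidates, lambda g: g**2, lambda g: (1 - g) ** 2
    )
    print(gamma, rate)
```

Because the error curves change with the attack distance, rerunning such a sweep for each distance yields a different best threshold, which is the motivation for a dynamic threshold.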
The I & Q samples of signals captured from attacks at various distances are shown in Fig. 16. This shows that the variance of the envelope of the received signals alone may not be a good predictor, as the variances of attacks from $d_i > 0.2$ m do not have clear boundaries. This is also evident from Fig. 15(a). To maximize the performance of the proposed technique in detecting attacks from different distances, we analyzed the dispersion of the received signals to devise an optimal threshold value in Fig. 15(b). The dispersion value is defined as the distance between the confidence bounds of the variances. As shown in Fig. 14, the variance of attacks at closer distances is scattered over a larger area. We fitted the variance using an exponential curve with 95% confidence bounds. The distances between the confidence bounds are marked in Fig. 14. In order to have a clear view of all scenarios, the variances of attacks at 1 m, at 5 m, and in the legitimate scenario are drawn separately. We observe that the dispersion value drops from $7.7056 \times 10^{-9}$ to $6.4072 \times 10^{-10}$ when the attacking distance increases from 1 m to 5 m, while the value for the legitimate scenario is $5.4716 \times 10^{-10}$. Fig. 15(b) shows that the dispersion value decreases when the attacker moves away from the Rx. We define the optimal threshold that leads to the best performance as $\gamma^*$. For this purpose, we fit a curve as follows: where $\gamma$ is the threshold, $d_v$ is the dispersion value, and $a$, $b$, $c$ are the coefficients. Note that the coefficients are dynamically renewed according to the distance of the attacker from the receiver. In Fig. 17, the optimal thresholds are plotted against the dispersion values. The fitted curve is generated by (23) with 95% confidence. For detection based on the Rayleigh distribution in Fig. 17, the resulting coefficients are $a = 1.29 \times 10^{3}$, $b = 1.982 \times 10^{-4}$, and $c = -1.289 \times 10^{3}$. The same fit is applied for detection based on the normal distribution in Fig. 17.
Fig. 18 shows the value of the effective detection rate plotted against the log of $d_i$ for the optimal threshold and the fitted threshold. We observe that the performance of the fitted threshold values is approximately the same as that of the optimal threshold values. We also present the plot for the worst performance, generated using the lower bound of the threshold range, i.e., $\gamma = 0.54$ and $\gamma = 0.42$ for the Rayleigh distributed and normally distributed signal envelope, respectively.
To summarize these results, Table 2 presents the comparison of the fitted threshold performance with the optimal threshold and worst threshold performance. We observe that the fitted thresholds yield significant performance gains over the worst threshold, while the performance loss relative to the optimal threshold is not significant. We summarize the performance of spoofing detection using the optimal threshold and the dynamic threshold obtained after curve fitting in Table 3 to compare the proposed scheme under the Rayleigh distribution method and the normal distribution method. We observe that, overall, using the normal distribution to characterize the received signal envelope results in better performance. Moreover, the performance of the optimal threshold and the dynamic threshold is approximately the same. However, the Rayleigh distribution method is more suitable for situations where the attacker is extremely close to the receiver.

VI. COMPARISON WITH EXISTING TECHNIQUES
In this section, we compare the proposed detection methods with existing works. Table 4 compares the related works in terms of complexity, efficiency, update requirement, data source, and technique basis. In Table 4, two works, [18] and [19], rely on additional antennas. [19] detects spoofing by monitoring the direction of arrival using one directional antenna, while [18] employs multiple distributed directional antennas to analyze the different pseudo-range residuals and estimate the spoofed time error. Other detection methods require firmware updates. Among these, four works [42], [43], [44], [45] are based on data from power grids. [42] uses the inherent hardware oscillator in power grids as the frequency state reference and performs spoofing detection by monitoring the state changes. [43] uses the rotor angles of the generator buses of power grids as features to train a neural network. [44] uses multiple features of power grids, such as bus voltage magnitudes, phase angles, and generator speed, to compute a quasi-dynamic estimate for spoofing classification. Similarly, [45] employs the rotor angle, rotor speed, and internal voltage to perform a generalized likelihood ratio-based hypothesis classification. These power grid data based techniques and [40] are less efficient in terms of processing, since they need to wait for information from other sources. Although [40] does not use additional antennas or data from power grids, it collects GPS signals from multiple receivers and uses the extracted P(Y) signal to form a network for spoofing detection. Only [26], [30], [34], [39], [41] are based solely on the received GPS signals. However, [39] and [41] build neural network models for signal processing. Although they are accurate in predicting spoofing, they require extra resources for data collection and training to generate a fitted model. [34] proposed a method based on sensing the distortion of signal correlation peaks and power.

A. GENERAL COMPARISON
To further evaluate the proposed technique, we compare it with the works in [26] and [30]; both are based on the correlation process during signal tracking. The technique in [26] detects spoofing by monitoring the first-order derivative of the S-curve bias (SCB). In [26], the non-coherent discriminator, $(I_E^2 + Q_E^2) - (I_L^2 + Q_L^2)$, is used as the code loop discriminator (CLD) in the tracking loop, and the outputs of the CLD are collected as an S-curve. As shown in Fig. 19, theoretically, when the local replica is promptly aligned with the incoming signal (the code phase offset is zero), the value of the S-curve is zero. Due to noise and distortion introduced by front-end processing, the code phase offset at which the S-curve is zero is usually not zero. [26] refers to this offset deviation as the SCB. As mentioned in [26], the SCBs fluctuate around zero without a spoofing attack, while they fall or increase significantly under a spoofing attack. Hence, a proper threshold is expected to include the first-order derivative of the SCBs without spoofing and exclude that with spoofing. The technique in [30] discriminates the correlation peaks based on the least absolute shrinkage and selection operator (LASSO). The incoming signals are multiplied with the local replicas for code phase selection. As shown in Fig. 20, there is only one peak when an authentic signal correlates with the local replicas. However, the received signal is a combination of the authentic signal and signals from multipath or spoofing, which leads to a correlation result with many peaks, as shown by the orange curve. This correlation result can be broken down into a summation of correlations of authentic signals at different delays. To calculate the optimal combination of authentic correlation results, [30] uses LASSO to solve the convex optimization problem.
The outputs are the coefficients of each early or late authentic signal replica. In the legitimate scenario, the coefficients of the delayed replicas are significantly smaller than the coefficient of the authentic signal, since multipath signals usually experience more attenuation. In the attack scenario, the coefficients that correspond to spoofing signals are noticeably greater than the coefficients that correspond to multipath signals, since the spoofing signal is transmitted at a higher power so that it is tracked by the victim receiver. Hence, as proposed by [30], the spoofing attack can be detected by monitoring the ratio of the coefficients.
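For reference, the non-coherent early-minus-late power discriminator underlying the S-curve in [26], together with a first-order difference of a sequence of bias values, can be sketched as follows. The function names are hypothetical, and the sketch is only a generic illustration of these two operations, not a reproduction of the implementation in [26].

```python
def early_minus_late_power(i_e, q_e, i_l, q_l):
    """Non-coherent discriminator: early power minus late power."""
    return (i_e**2 + q_e**2) - (i_l**2 + q_l**2)


def first_differences(values):
    """First-order differences of a sequence (discrete derivative)."""
    return [b - a for a, b in zip(values, values[1:])]


if __name__ == "__main__":
    # Equal early and late correlator power gives a zero discriminator
    # output, i.e., the replica is aligned with the incoming code.
    print(early_minus_late_power(1.0, 2.0, 1.0, 2.0))
    print(first_differences([0.0, 0.1, -0.2]))
```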

B. MISCLASSIFICATIONS
For comparison, we consider the worst case in our results, where $P_{FA}$ is around 70%. When applying the SCB method [26] to our data set, $1.1 \times 10^{-3}$ is used as the detection threshold for $P_{FA} \approx 70\%$. When applying the LASSO method [30] to our data set, the threshold is 0.73. Other settings are the same as in [26] and [30]. The detection results are listed in Table 5. The proposed method outperforms the others in all attack scenarios. Moreover, the detection methods in [26] and [30] make use of the correlator output of the tracking loop, while the proposed method does not have this requirement. This leads to a significant reduction of the detection time for the proposed technique.

C. COMPUTATIONAL COMPLEXITY
The proposed technique is based on comparing the variance of the signal envelope against a threshold. Therefore, the only quantity that needs to be computed to detect spoofing is the variance of the signal envelope. Many algorithms for calculating the variance of a set of samples of size $n$ have a worst-case running time of $O(n)$ [46]. As the computational complexity of comparing two values is $O(1)$, the computational complexity of the proposed technique is $O(n + 1) = O(n)$. Table 6 presents a comparison of the proposed technique with those techniques in Table 4 that have a low computational complexity. We observe that the proposed technique clearly outperforms the existing lightweight techniques for spoofing detection.
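The linear running time can be illustrated with a single-pass variance computation followed by one comparison. The sketch below uses Welford's algorithm, which is one of several $O(n)$ variance algorithms and is chosen here only as an example. The function names are hypothetical, and interpreting a variance above the threshold as an indication of spoofing is an assumption of this sketch.

```python
def running_variance(samples):
    """Sample variance in one pass over the data (Welford's algorithm)."""
    count = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in samples:
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)
    if count < 2:
        raise ValueError("at least two samples are required")
    return m2 / (count - 1)


def variance_exceeds(samples, gamma):
    """O(n) variance followed by an O(1) comparison against gamma.

    Assumption: the caller decides how to map the result to a decision.
    """
    return running_variance(samples) > gamma


if __name__ == "__main__":
    envelope = [1.0, 2.0, 3.0, 4.0]
    print(running_variance(envelope), variance_exceeds(envelope, 1.0))
```

Each sample is visited once and only three scalars are kept in memory, so the detector can run on streaming envelope samples without buffering a full window.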

VII. CONCLUSION
This paper presented a light-weight technique for detecting GPS spoofing attacks. The proposed technique is based on an analytical model of the distribution of a signal's envelope. The variance of the received signal's envelope is shown to be significantly different between attack and legitimate scenarios. Thus, the proposed technique applies a threshold to the variance of the samples in a signal envelope. We also observed that the threshold for the variance is sensitive to the distance of the attacker. Therefore, we presented a technique to dynamically select the threshold based on the dispersion value of the variance. Experiments on actual hardware show the effectiveness of the proposed technique. We observe that the proposed technique can detect GPS spoofing with a probability of detection greater than 90%.