SPARROW: A Novel Covert Communication Scheme Exploiting Broadcast Signals in LTE, 5G & Beyond

This work proposes a novel framework to identify and exploit vulnerable MAC layer procedures in commercial wireless technologies for covert communication. Examples of covert communication include data exfiltration, remote command-and-control (CnC) and espionage. In this framework, the SPARROW schemes use the broadcast power of incumbent wireless networks to covertly relay messages across a long distance without connecting to them. This enables the SPARROW schemes to bypass all security and lawful-intercept systems and gain ample advantage over existing covert techniques in terms of maximum anonymity, more miles per Watt and less hardware. The SPARROW schemes can also serve as an efficient solution for long-range M2M applications. This paper details one recently disclosed vulnerability in the common random-access procedure in the LTE and 5G standards. This work also proposes a rigorous remediation for similar access procedures in current and future standards that disrupts the most sophisticated SPARROW schemes with minimal impact on other users.


I. INTRODUCTION
Covert communication is a suitable term for a wide array of security threats, such as data exfiltration, remote command-and-control (CnC) and espionage. The parties engaging in covert communication strive to stay anonymous and circumvent security and lawful-interception systems that actively inspect the incumbent means of communication. The rapid adoption of connectivity solutions has made covert communication schemes an integral part of most advanced security threats [2]. The published literature on covert communication schemes can be logically split between two dominant viewpoints: exploiting existing software protocols and designing new radio access solutions.
The first viewpoint includes data exfiltration and CnC techniques that are well-known topics in the cybersecurity community. Data exfiltration involves covertly extracting and communicating sensitive information from a compromised system. Exemplary CnC implementations include malicious software or hardware agents that are configured to covertly communicate across an Internet protocol (IP) network. These techniques usually involve exploiting application or network protocols to tunnel messages between two hosts connected to the Internet. Well-known examples include ICMP and DNS tunneling [3] [4]. To counter such threats, the cybersecurity industry constantly monitors emerging techniques and adopts countermeasures to detect and block them. Once disclosed, these techniques rapidly lose their potency as vulnerable systems and security devices install software updates that implement the countermeasures.
On the other hand, designing radios for covert communication has long been of research interest in the wireless community. Covert communication devices usually access the radio spectrum without acquiring a license and generally employ low-power, ad-hoc radios that use PHY-layer technologies such as spread-spectrum. Low-power commercialized ad-hoc technologies such as LoRa and ham radios can be engineered for covert communication, but unlike commercial radios, these systems usually sacrifice transmit power and data rate in favor of defeating spectrum monitoring and jamming systems [5]. These power and data-rate limitations, along with a lack of access to elevated antennas or high transmission power, significantly reduce the operational range of these devices, particularly in indoor-to-outdoor communication scenarios [6].
To counter this type of security threat, spectrum monitoring and intelligence systems are constantly evolving to disrupt and locate the radios used for covert communication. Drawing upon elements of these previous approaches, this work introduces a novel methodology for identifying and mitigating a form of covert communication referred to herein as exploiting radio infrastructure. Section II presents a comprehensive framework to identify and exploit vulnerable procedures in the MAC protocols of Wireless Carrier Networks (WCNs), such as cellular and satellite communication. It shows how WCN user devices can be co-opted and transformed into SPARROW devices that exploit the broadcast power of a WCN radio access node for the purposes of covert communication. Figure 1 illustrates the implicit communication channel between SPARROW devices and a victim cellular station, which acts as an unwitting message relay.
Section III details a new, responsibly disclosed security vulnerability example, which involves the LTE/5G random-access procedures. This vulnerability enables SPARROW mobile devices to exchange short messages within a cell without connecting to the cellular network. The SPARROW devices can be made identical to other user devices as far as size and transmit power are concerned. As detailed in Section IV, the SPARROW covert communication scheme can be shown to outperform existing covert techniques in terms of maximum anonymity, operational range and hardware footprint. These techniques can bypass all current security and lawful-interception systems, as well as all current spectrum monitoring and intelligence systems. This enables SPARROW attack techniques to be used in a wide variety of covert communication scenarios. Nevertheless, these techniques can also be used in good faith with the consent of incumbent network operators in scenarios such as connection-less M2M communication and disaster recovery efforts.
Random-access procedures, which facilitate wireless link establishment, are common in many wireless MAC protocols and, in fact, the vulnerable one in 5G/LTE has been implemented in the standards for over a decade. This fact was one of the primary motivations for developing a rigorous remediation scheme capable of hardening random-access procedures with respect to SPARROW threats in current and future wireless standards. Section V explores possible remediation strategies and makes the case to move forward with a novel entropy-leveraging strategy, which involves the use of content obfuscation in unprotected broadcast messages. Section VI provides the mathematical foundation to analyze this entropy-leveraging strategy and understand the inherent trade-off between protection and performance. This section culminates with the analysis of two simple examples of the entropy-leveraging scheme and their respective limitations in combating more sophisticated SPARROW attack schemes. Section VII proposes an enhanced entropy-leveraging scheme called ELISHA (entropy-leveraged irreversible security hashing algorithm), which is used to efficiently disrupt the most advanced SPARROW attacks with minimal impact on the network performance for other users. This minimally disruptive performance is achieved through a novel combination of cryptographic hash functions and random bit operations. A rigorous analysis of this advanced technique and the associated efficacy and overheads is provided to support the consideration and potential adoption of the disclosed SPARROW mitigation approach(es) by the relevant wireless standard setting organizations. The numerical results presented in Section VIII illustrate how to optimize the design parameters in an ELISHA-based mitigation scheme to achieve desired levels of protection against SPARROW attacks while concurrently preserving user performance. Finally, the concluding remarks are presented in Section IX along with acknowledgment notes.

II. GENERAL EXPLOITATION MODEL
This section provides an overview of Wireless Carrier Network (WCN) architectures and their associated resources. Also provided is a novel methodology that can be used to identify weaknesses in the MAC layer protocols utilized in these networks, which can be exploited by malicious actors to leverage a WCN's broadcast resources for SPARROW covert communication.

A. Overview of WCNs
WCN is the general term adopted in this work to reference technologies such as cellular (3G/LTE/5G), WiMAX and Satellite Internet. WCNs are deployed by service providers (operators) to offer secure wireless data connectivity to a large number of users in wide geographical areas, often via a subscription model. WCNs differ from end-user wireless technologies, such as wireless local area networks (WLANs), in several key aspects including resources, architecture and user control.
A WCN can often be broken into two components: the Radio Access Network (RAN) and the Core Network (CN). The RAN consists of a network of radio access nodes (cellular stations in cellular terminology), which provide wireless connectivity between user devices and the CN. The CN hosts servers, which manage RAN operation and connect user devices to other networks, such as the Internet. The standard governing a WCN typically involves several protocol layers, collectively called the control-plane, which define the interaction of user devices with the RAN and CN components. The control-plane procedures are abstracted away from user applications, which are only concerned with data connectivity to the Internet (data-plane). Within the scope of this work, the Medium Access Control (MAC) protocol layer defines the interaction of a single radio access node with the user devices in its coverage area.
To enhance signal radiation, the radio access nodes can have antennas mounted on tall structures or be installed on satellites. This enables them to clearly communicate with user devices miles away over radio frequency (RF) signals that are otherwise blocked by terrestrial clutter, such as foliage and buildings. High transmit power, licensed spectrum and higher altitudes are luxuries available to WCNs compared to other technologies, such as WLAN, that offer limited coverage indoors or under terrestrial clutter. Nowadays, it is hard to find a neighborhood without WCN service coverage. Nevertheless, WCN operators must comply with numerous government regulations concerning their resources, infrastructure, and the users' activity.
In most WCNs, the user devices have to authenticate with a CN entity (e.g., AAA servers) before accessing any of the network services. In addition to user-credential registries, there are CN servers constantly collecting user activity metadata, such as service usage and location. The metadata are then consumed internally by the WCN operator or shared with government authorities in compliance with Lawful Intercept (LI) regulations [7]. It should be noted that the MAC layer protocols implemented by WCNs are meant to prohibit user devices from engaging in untraceable peer-to-peer wireless communications; this work aims to show that such communication is nevertheless still possible.

B. Exploitation Scenario
For ease of illustration, a generic unidirectional covert communication scheme is formulated for the following hypothetical scenario exploiting cellular technology. It can be extended to other WCN technologies without loss of generality.
Scenario 1. Trudy intrudes into a cyber air-gapped facility under heavy surveillance and wishes to send a set of covert messages to her counterpart Ricky, who has a passive receiver outside. To maintain their cover, neither agent can connect to any IP network. They also cannot use any ad-hoc radios due to spectrum surveillance and insufficient signal range. However, both are equipped with low-power radio devices that can interact with a nearby cellular station. Trudy programs her device to exploit the vulnerability in its MAC layer protocol to implicitly relay messages to Ricky without authenticating with the carrier's network.
MAC layer protocol procedures can be expressed as a flow of messages exchanged between each user device and a WCN radio access node. In this case, Trudy and Ricky are assumed to have constructed a code-book for their communication scheme. This code-book consists of a set of possible messages (codewords) that Trudy can transmit to Ricky. Let M = {m_1, m_2, ..., m_(2^M)} denote the code-book, a set of 2^M distinct MAC layer messages, where each m_i encodes M bits of information. The messages in M can trigger the set of distinct response messages B, whose elements b_i are broadcast by the cellular station.
3) Anonymous Uplink: Sending messages in M does not reveal the transmitter's identity to the WCN.
4) Stateless Uplink: The protocol does not mandate any correlation between consecutive uplink messages in M.
The condition for passive reception guarantees anonymity for Ricky, implying that the cellular station broadcasts messages with the most basic modulation and coding scheme at the PHY layer. The second condition requires a sufficiently strong bijective correlation between the messages in Trudy's code-book and the resulting downlink broadcast messages. Ideally, a deterministic bijectivity can be found in the protocol standard documentation. However, the same cellular station is serving other users, who may confuse Ricky by accidentally triggering broadcast messages in B. Since the presence of Ricky is unknown to the cellular station, he cannot rely on the reliability mechanisms built into most wireless standards. It is therefore recommended that the codewords in M include integrity-check features to minimize the degradation of bijectivity caused by other users' activities and PHY layer noise.
Fig. 1 depicts how the vulnerability outlined in Proposition 1 leads to the execution of Scenario 1. Trudy and Ricky decide on their optimal code-book and the target cellular station. Trudy then transmits one codeword m_i at a time, each of which triggers a distinct broadcast signal b_i from the cellular station that is received by Ricky. According to the bijectivity condition, this notifies Ricky of the transmission of m_i by Trudy. Thus, a virtual low-power covert communication channel (dotted red line) is established from Trudy to Ricky. This channel takes advantage of the relatively high broadcast transmit power of the cellular station. It also allows Trudy to bypass network security devices, LI mechanisms in the operator network, and spectrum monitoring systems. The anonymous uplink condition could be relaxed in some other scenarios. If required, it limits the search space for M to the early control messages that devices exchange with the cellular station before authentication and data connection. Assuming Ricky is able to correctly decode the messages in near real-time, the stateless uplink condition allows Trudy to send a message every τ seconds. Therefore, they can achieve data rates on the order of M/τ bits per second.

III. SPARROW SCHEME IN LTE & 5G
This section shows a simple example of a SPARROW attack scheme, which exploits the random-access (RA) procedure in the LTE and 5G protocol standards (defined in section 5.1.5 of 3GPP specification document TS36.321 [8]). This leads to a realization of Scenario 1 using the resources of any currently deployed LTE or 5G cellular station worldwide. Adopting the 3GPP standard terminology moving forward, e/gNB refers to an LTE or 5G target cellular station, specifically the serving sector of the cellular station covering both Ricky and Trudy. Any user device interacting with an e/gNB is called a UE (User Equipment). This procedure has been present in the 3GPP standard since the early releases of LTE (Release 8) and may possibly be used in other non-LTE/5G wireless technologies as well.
A. Normal RA Procedure
Fig. 2: Random-access (RA) procedure in LTE/5G including the contention resolution.
Figure 2 illustrates the initial messages exchanged between an e/gNB and a UE trying to connect to it for the first time. The first four messages (Msg1 to Msg4) are of particular interest, as they do not involve any authentication or encryption. Regardless of its type or registration identity, any UE can send Msg1 and Msg3 to any e/gNB, which responds with Msg2 and Msg4 in the basic transmission mode (like broadcast SRBs). Thus, Msg2 and Msg4 are passively receivable anywhere in the cell coverage area.
After downlink synchronization and decoding of the system information broadcast, a visiting UE starts RA by sending Msg1, which contains a randomly selected RACH preamble sequence identified by RAPID (RACH preamble ID). There is a limited set of RACH preambles allocated to each cell. Upon receiving Msg1, the e/gNB allocates an RA-RNTI (RA radio network temporary identity) to the UE, which is directly computed from the Msg1 transmission time-slot. It then uses the RA-RNTI to signal Msg2 to the UE, which is actively decoding the DCI (downlink control information) blocks associated with its pre-computed RA-RNTI. The pseudo-random property of RACH preambles enables an e/gNB to estimate the TA (timing advance). The UE has to synchronize its uplink according to the TA for any subsequent messages. The e/gNB releases the RA-RNTI and allocates a TC-RNTI (temporary cell RNTI) to the UE. It includes the TA, the TC-RNTI and some other configuration messages in the Msg2 sent to the UE.
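To make the link between the Msg1 time-slot and the RA-RNTI concrete, the following minimal sketch uses the LTE rule from TS 36.321, RA-RNTI = 1 + t_id + 10 * f_id, where t_id is the index of the first subframe of the PRACH resource and f_id is its frequency index within that subframe. The function name `lte_ra_rnti` is hypothetical, and 5G NR uses a different formula with additional terms that is not shown here.

```python
def lte_ra_rnti(t_id: int, f_id: int = 0) -> int:
    """RA-RNTI for an LTE PRACH resource: 1 + t_id + 10 * f_id."""
    if not 0 <= t_id < 10:
        raise ValueError("t_id must be in 0..9")
    if not 0 <= f_id < 6:
        raise ValueError("f_id must be in 0..5")
    return 1 + t_id + 10 * f_id


# In FDD there is a single PRACH frequency resource per subframe (f_id = 0),
# so only ten distinct RA-RNTI values exist in a cell.
fdd_values = [lte_ra_rnti(t) for t in range(10)]
print(fdd_values)
```

Because the value depends only on when and where Msg1 was sent, any receiver that knows the PRACH configuration can compute the same RA-RNTI without further signaling.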
The UEs without a prior network connection have to engage in a procedure known as contention resolution, involving Msg3 and Msg4. To confirm the successful reception of Msg1, the UE transmits a randomly generated Contention Resolution Identity (CRI) in Msg3 according to sections 6.1.3.3-4 in [8]. More precisely, the CRI is a 48-bit datum containing 40 randomly selected bits. The e/gNB acknowledges Msg3 by broadcasting the received CRI value in Msg4, which is signaled via the TC-RNTI sent in Msg2. The UE compares the CRI value in Msg4 against its randomly selected value. If they match, it assumes that RA was successful and proceeds with the next steps to connect to the network. Otherwise, the UE has to back off and retry the RA procedure. Per section 5.1.4 in [8], a UE can freely re-attempt RA after a randomly selected backoff time. However, there is no practical way for the e/gNB to enforce the backoff value. Consequently, a UE can always select a minimum value between consecutive RA attempts. The purpose of contention resolution is further detailed in Section V-A.
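The UE-side check described above can be sketched in a few lines. This is a simplified model under stated assumptions: only the 40 random bits of the CRI are represented, Msg1/Msg2 are omitted, and all function names are hypothetical rather than taken from the standard.

```python
import secrets

CRI_RANDOM_BITS = 40  # random part of the contention resolution identity


def new_cri() -> int:
    """Draw the random 40-bit value a UE places in Msg3."""
    return secrets.randbits(CRI_RANDOM_BITS)


def station_msg4(received_cri: int) -> int:
    """The station echoes the CRI decoded from Msg3 back in Msg4."""
    return received_cri


def ra_succeeded(own_cri: int, msg4_cri: int) -> bool:
    """UE-side decision: proceed only if Msg4 carries this UE's own CRI."""
    return own_cri == msg4_cri


ue_a, ue_b = new_cri(), new_cri()
msg4 = station_msg4(ue_a)  # assume the station decoded UE A's Msg3
print(ra_succeeded(ue_a, msg4), ra_succeeded(ue_b, msg4))
```

The station never interprets the CRI; it only mirrors it, which is why the Msg4 content is fully determined by the Msg3 content.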

C. Implementation
Trudy can break longer messages into chunks of 40 bits (or less) and transmit them in consecutive attempts. Several resources, including [10], estimate the average RA procedure duration (from Msg1 to Msg4) at around 30 ms in typical LTE deployments. Taking this estimate and accounting for an additional 10 ms of backoff between attempts, Ricky and Trudy can achieve a throughput of nearly 1 kbps (40 bits every 40 ms) in this scheme. The offered throughput suits IoT and M2M (machine-to-machine) applications that currently use low-power technologies such as LoRa [6]. However, the SPARROW scheme can achieve longer range in cluttered environments without any direct access to the RF spectrum. Section IV expands further on its features and applications.
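The throughput figure follows from simple arithmetic, which the sketch below reproduces with the numbers quoted in the text (40 bits per attempt, 30 ms per RA exchange, 10 ms backoff). The helper names and the 100-byte example message are illustrative assumptions.

```python
def throughput_bps(bits_per_attempt: int, ra_duration_s: float, backoff_s: float) -> float:
    """Bits delivered per second when one attempt takes ra_duration + backoff."""
    return bits_per_attempt / (ra_duration_s + backoff_s)


def attempts_needed(message_bits: int, bits_per_attempt: int = 40) -> int:
    """Number of RA attempts needed to carry a message (ceiling division)."""
    return -(-message_bits // bits_per_attempt)


rate = throughput_bps(40, 0.030, 0.010)
print(round(rate))               # about 1000 bits per second
print(attempts_needed(8 * 100))  # attempts for a 100-byte message
```

Any integrity-check bits added to each codeword, as recommended in Section II, would reduce the payload per attempt and hence the effective rate.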
The RA procedure is agnostic to the PHY layer frequency band. However, the lower frequency bands in LTE and 5G WCNs better suit the objectives of Scenario 1. As far as RA is concerned, the cell range depends on the PRACH preamble zero-correlation-zone configuration (Ncs) of the e/gNB (illustrated in section 24.8 of [11]). For typical outdoor LTE macro cells, Ncs is set to 9 or larger values, which enables UEs to perform RA as far as 5 miles from the cell. The 5G-NR (new radio) standards enable utilizing higher frequency bands above 6 GHz (FR2) that rely on beam-forming and multiple-antenna transmission modes. Nevertheless, the underlying RA procedure and PHY layer are still very similar to LTE in sub-6 GHz (FR1) and are therefore more promising. Depending on the application, SPARROW UEs (Ricky and Trudy) can exploit multiple cells for throughput or operational range enhancements. Figure 4 shows how two cells can be exploited to achieve parallel covert communication channels. With the exception of very rural environments, UEs within a range of a few miles can be covered by multiple overlapping LTE or 5G sectors, which can be exploited for more throughput or a reverse link from Ricky to Trudy.
Figure 5 depicts a more interesting case involving a relay UE to extend the operational range beyond a single cell's coverage. Relay UEs are placed in the handover (coverage overlap) region between adjacent cells. These relays can be configured to act as a proxy for Ricky, receiving a message in one cell and transmitting it in an adjacent cell. The SPARROW UEs are effectively low-power cellular modems that can operate off rechargeable batteries. Thus, a rechargeable relay UE can operate from any inconspicuous location in between cells. One can create a wide-area IoT mesh using relay UEs communicating via SPARROW [12]. They can autonomously operate on batteries or harvest energy from the environment. Thus, they deserve to be named after Sparrows!
It is worth mentioning that the operational range of SPARROW UEs has the potential to dramatically increase with the emergence of new satellite-based technologies such as 5G-NTN [13], should the same vulnerable contention resolution procedure remain in the standard. Bypassing most known measures against covert communication, the SPARROW scheme can remain a potential threat until the standards patch the MAC protocol procedures subject to Proposition 1. Prominent threat scenarios include the covert communication use cases named in Section I: data exfiltration, remote command-and-control (CnC) and espionage.
With the consent of the incumbent network operators, the SPARROW scheme can also be used in good-faith scenarios such as:
• Connection-less M2M Communication: Some M2M (machine-to-machine) applications require extremely low latency and power consumption. The SPARROW scheme can complement existing solutions such as [14], which enables devices to communicate by encoding information in LTE Msg1 and Msg2.
• Disaster Recovery: SPARROW devices can be used to take advantage of partially functional radio access nodes to exchange critical messages in disaster situations without requiring authentication or a back-haul connection to a core network.

V. SPARROW REMEDIATION STRATEGIES
The contention resolution procedure facilitated by Msg3 and Msg4, which was exploited in Section III, plays an essential part in the cellular RA procedure and may have analogues in other WCN protocol standards. Therefore, the presented remediation strategies focus on thwarting its exploitation while preserving its role in contention resolution. This is a more challenging problem than offering a generic remediation to the generic scheme in Section II. After the constraints of the contention resolution mechanism are explained, it will become clear that the available remediation strategies are limited.

A. Contention Resolution Mechanism
There are a limited number of RACH preambles (e.g., 64 for LTE) available at each cell for random selection by UEs attempting to access the network. Depending on the duplexing configuration mode, there is a limited set of PRACH resources associated with RA-RNTIs. Most low-frequency LTE/5G bands operate in FDD (frequency division duplexing) mode, where there are only 10 PRACH resources [11]. Consequently, there is always a real possibility that multiple UEs end up with the same RA-RNTI and the same random preamble, causing a resource contention event. In most cases, the cellular station can only decode a single preamble transmission and is oblivious to any underlying contention event. To avoid any subsequent uplink interference between the UEs, the cellular station needs a mechanism to immediately signal the UE with the successful Msg1 to proceed and all other unsuccessful UEs to back off and retry RA. At this early stage of RA, there are no unique identities assigned to the UEs, and the contending UEs follow the same protocol procedure in isolation. Going back to Msg3 and Msg4 in Figure 2, the only plausible resolution is to have each UE test the success of its Msg1 by sending a purely random N-bit value within the CRI in Msg3. The cellular station then most likely receives Msg3 from the UE with the successful Msg1 and acknowledges it by rebroadcasting its CRI in Msg4. Since the cellular station does not have any knowledge of the distance or channel conditions of the contending UEs, it transmits Msg4 like other cell broadcast messages, which are receivable everywhere in its coverage area. The value N also has to be large enough (N = 40 in current 3GPP standards) to minimize the probability of identity collision; otherwise the contention may drag on beyond the RA procedure. Contention is ultimately resolved when each UE compares the CRI in Msg4 against what it transmitted in Msg3.
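To illustrate why N has to be large, the sketch below computes the probability that at least two of k contending UEs draw the same N-bit identity, assuming independent uniform draws. The function name and the generalization to k UEs are illustrative assumptions; for two UEs the result reduces to 2^-N.

```python
def identity_collision_probability(n_bits: int, k_ues: int = 2) -> float:
    """Probability that at least two of k UEs pick the same n-bit identity."""
    space = 2 ** n_bits
    p_all_distinct = 1.0
    for i in range(k_ues):
        # The (i+1)-th UE must avoid the i identities already taken.
        p_all_distinct *= (space - i) / space
    return 1.0 - p_all_distinct


print(identity_collision_probability(40))     # two UEs, N = 40
print(identity_collision_probability(8))      # two UEs, N = 8
print(identity_collision_probability(8, 4))   # four UEs, N = 8
```

With N = 40 the two-UE collision probability is below one in a trillion, whereas a short identity such as N = 8 would leave contention unresolved in roughly one of every 256 two-UE events.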

B. Strategy Paths
Before presenting the proposed strategy of this work, it is worth noting that alternative strategies can carry significant performance overhead or risks, or offer limited protection against SPARROW in the contention resolution procedure. One example is countering exploitation by monitoring the random-access activity patterns in the cell. However, the cellular station does not have a reliable way of differentiating uplink messages from SPARROW UEs. The SPARROW UEs can adopt various evasion techniques, such as slowing their activity, routinely changing their CRI code-book and using different initial random-access resources. Thus, this strategy can risk the service availability of the cellular station without offering quantifiable levels of protection.
The remainder of this work focuses on the entropy-leveraging remediation strategy. Compared to other strategies, it can practically prevent exploitation with minimal performance impact and no collateral security risks. In the context of Proposition 1, it aims at mitigating the deterministic bijectivity condition between Msg3 and Msg4. The contention resolution procedure in a vulnerable MAC protocol should be modified as follows:
• Entropy-Leveraging: This strategy allows the UEs to randomly select the CRI content in Msg3. However, it requires the cellular station to obfuscate the content received in Msg3 with a random pattern before broadcasting it in Msg4. The Msg4 broadcast has two components: the obfuscated message and some helper information, i.e. a hint. Each UE should process the received Msg4 and its own Msg3 with a decision function defined in the standard to determine the next RA step. The hint plays an essential role in the decision function. It is designed to ensure that the intended UE proceeds while the other contending UEs back off. On the other hand, the obfuscated content of Msg4 should prevent the SPARROW UEs from forming stable code-books M and B.

VI. ANALYSIS OF ENTROPY-LEVERAGING STRATEGY
This section is dedicated to analyzing potential remediation schemes following the entropy-leveraging strategy. It first seeks to quantify the impact of the strategy on contention resolution performance and explores the theoretical trade-off between the remediation objectives: disrupting the SPARROW scheme while preserving contention resolution performance. Section VII then details the elements needed to build a practically optimal scheme.

A. Formulation
Expanding the notation in Section II and Scenario 1, the following steps detail the contention resolution process in an entropy-leveraging scheme:
1) Uplink Message: The random N-bit identities selected by contending UEs form a set of independent identically distributed (i.i.d.) random variables with a uniform distribution on the support set U_N = {0, 1}^N. Let X_i ∼ U(2^-N) be the discrete random variable denoting the Msg3 transmitted by the i-th contending UE in the cell. Since a single exchange is analyzed, the time index is omitted for brevity. On the other hand, the random variable X' denotes Trudy's Msg3 transmission from the code-book M ⊂ U_N.
2) Obfuscated Broadcast: The cellular station receives only one of the Msg3 transmissions, denoted by X ∈ {X_1, X_2, X'}. The cellular station cannot detect the source of X. It then derives Msg4, modeled by the random variable Y = [B(X), h], where B is the broadcast obfuscation function defined in the standard and h is the hint value. The cellular station includes h in the Msg4 broadcast to facilitate contention resolution by helping UEs make correct RA decisions. Depending on the desired level of protection, there could be multiple pre-defined choices of B in the standard, with the cellular station announcing its choice in the periodic broadcast signals or in Msg2. This ensures that the UEs adjust their decision functions accordingly to process Msg4.
3) Downlink Processing: Any choice of B should be accompanied by a well-defined UE decision function D = D(Y, X_i) ∈ {0, 1}, where 0 and 1 are respectively interpreted as RA failure and RA success commands for the i-th UE. Given Y, an ideal choice of D should almost surely evaluate to 1 for no more than one of the contending UEs. The decision function should also have the following property to eliminate the possibility of a livelock in which all UEs arrive at a failed RA decision.

$$\Pr\bigl(D(Y, X_i) = 0 \ \text{for all } i\bigr) = 0 \qquad (1)$$
Knowing the choice of B and the code-book M, Ricky attempts to recover X' from Msg4 by devising an estimation function. Let X'' = E(Y) be the random variable representing Ricky's estimated codeword. According to Proposition 1, Ricky should design E(Y) to minimize the estimation error probability, Pr(X'' ≠ X'). He also has to keep the code-book small enough to distinguish between broadcasts triggered by Trudy and those triggered by other UEs, since there is always a chance that another UE in the cell randomly selects X_i ∈ M. As described in Section V-A, the performance of the contention resolution process requires a low identity collision probability to ensure that only one of the contending UEs succeeds in RA. In practice, having more than two UEs simultaneously attempting RA with the same preamble is a rare event that may occur while a cellular station recovers from a maintenance outage. Thus, moving forward, the contention scenario is considered to involve only two UEs as the most probable case, i.e. i ∈ {1, 2}. The identity collision probability P_C for this scenario is defined in (2), under the assumption that both UEs can decode Msg4 error-free. Considering the uniform i.i.d. property of X_i, the expression in (2) can be further expanded into (3), which implies that 2^-N is the minimum achievable value for P_C when B(X) is an injective function. For instance, the current state of the standard described in Section III fits in this model with B(X) = X and the identity-check decision function D(Y, X_i) = δ(X − X_i).
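The claim about injective B can be checked numerically for small N. The sketch below is an illustration under an assumed reading of P_C: the station hears X_1, and a collision occurs when the other UE also evaluates D(Y, X_2) = 1. It handles deterministic B only (no hint), and the `drop_lsb` obfuscation is a made-up non-injective example, not a scheme from the text.

```python
from itertools import product


def collision_probability(n_bits, obfuscate, decide):
    """Exact Pr(D(Y, X2) = 1) when the station hears X1; X1, X2 uniform i.i.d."""
    space = range(2 ** n_bits)
    hits = sum(decide(obfuscate(x1), x2) for x1, x2 in product(space, space))
    return hits / (2 ** n_bits) ** 2


N = 6  # small enough to enumerate all 2^N * 2^N pairs


def identity(x):
    """B(X) = X, the current behavior of the standard."""
    return x


def identity_check(y, x_i):
    """D(Y, X_i) = 1 only when the broadcast equals the UE's own identity."""
    return int(y == x_i)


def drop_lsb(x):
    """A non-injective B: the lowest bit is discarded before broadcast."""
    return x >> 1


def drop_lsb_check(y, x_i):
    """Matching decision for drop_lsb: compare after the same truncation."""
    return int(y == x_i >> 1)


print(collision_probability(N, identity, identity_check))   # equals 2**-N
print(collision_probability(N, drop_lsb, drop_lsb_check))   # equals 2**-(N-1)
```

Discarding a single bit doubles the collision probability in this toy case, which is the cost side of the trade-off discussed next.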

B. Trade-Off Bounds
Introducing any obfuscation function should reduce the achievable error-free data rate for SPARROW UEs while maintaining a low P_C. The data rate of SPARROW UEs depends on their strategy in selecting the code-book M and the estimation function E(Y) to overcome the channel entropy introduced by B(X). Theoretically, it is desired to minimize the SPARROW UEs' maximum achievable data rate, called the channel capacity in the context of information theory. SPARROW UEs may achieve the channel capacity by employing sophisticated forward error-correcting (FEC) code-books in the long run. The channel capacity may not be achievable in practice, but it certainly highlights the inherent trade-off between protection and performance. Given X = X', the channel capacity for SPARROW UEs is the maximum mutual information, as defined below:
$$C = \max_{P(X)} I(X; Y) = \max_{P(X)} \bigl[ H(X) - H(X \mid Y) \bigr] \qquad (4)$$
where H(.) denotes Shannon entropy. So, the remediation schemes should aim at designing B(X) so that H(X|Y) is maximized. On the other hand, applying Fano's inequality to the expression in (3), it can be shown that H(X|Y) directly contributes to a lower bound on P_C [16]. Hence, the entropy-leveraging strategy is bound to a trade-off between the contention resolution performance (low H(X|Y)) and blocking the SPARROW UEs (high H(X|Y)).
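For reference, the standard form of Fano's inequality is reproduced below for an N-bit identity X and any estimator of X computed from Y. The symbols P_e, g and H_b (binary entropy) are notation introduced here for illustration; how the inequality is combined with (3) to bound P_C follows [16] and is not re-derived in this sketch.

```latex
H(X \mid Y) \;\le\; H_b(P_e) + P_e \log_2\bigl(2^{N} - 1\bigr),
\qquad P_e = \Pr\bigl(\hat{X} \neq X\bigr), \quad \hat{X} = g(Y)

% Since H_b(P_e) \le 1 and \log_2(2^N - 1) < N, a weaker but simpler form is
P_e \;\ge\; \frac{H(X \mid Y) - 1}{N}
```

The simpler form makes the tension explicit: any obfuscation that raises H(X|Y) also raises the floor on the error probability of every receiver that must identify X from Y.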
The same trade-off can also be derived from a more intuitive viewpoint. The SPARROW UEs send Msg3 similarly to the other UEs in the cell. There is also a noticeable correlation between the ways they process Msg4 in the entropy-leveraging strategy. Ricky optimizes the estimation function E(Y) to process Msg4 and recover Trudy's message among all the candidates in the code-book M. Now consider the rarest contention resolution situation, where 2^M normal UEs are in contention and each happens to pick a distinct Msg3 identity from M. To resolve the contention, D(Y, X_i) should evaluate to 1 only for the intended UE and to 0 for the rest of them. Thus, the collective outcome of D(Y, X_i) serves the same purpose as E(Y) in this hypothetical scenario. This leads to a very tight trade-off in designing the broadcast obfuscation function B(X) to disrupt the functionality of E(Y) while preserving the resolution condition for D(Y, X_i).
In reality, we consider contention scenarios between as few as two UEs, which relaxes the trade-off. Also, this trade-off does not account for the fact that M should be small enough to reduce the chance of the Msg4 intended for other UEs being misinterpreted by Ricky as legitimate messages from Trudy. The channel capacity derivation in (4) does not account for the minimum viable information needed per attempt for synchronized reception. Ricky will need to reliably identify messages sent from M in every attempt. In this context, the presented trade-off implies that the entropy-leveraging schemes impact the performance of contention resolution rather than dismissing their feasibility.
Remark. It is worth noting that using Cryptographic Hash Functions (CHF) to obfuscate Msg4 will not serve as a proper remediation since H(X|Y) = 0. Even with random salting, Msg4 has to include the salt as the hint for the normal UEs to repeat the same computation with their Msg3 in the downlink processing step. For any given M, Ricky can compute the hash values of all of its elements (a preimage table), which form the set B of all possible Msg4 values in one-to-one correspondence with M. Using a CHF only imposes some modest computational complexity on Ricky.
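The preimage-table argument in the remark can be illustrated with a short sketch. The hash choice (SHA-256 over salt followed by identity), the 8-byte salt, and the 256-entry toy code-book are assumptions made for illustration only; they are not taken from the LTE/5G standards or from this work.

```python
import hashlib
import os

def salted_digest(identity: bytes, salt: bytes) -> bytes:
    """Hypothetical obfuscation B(X): SHA-256 over salt || identity."""
    return hashlib.sha256(salt + identity).digest()

def build_preimage_table(codebook, salt):
    """Map every possible Msg4 digest back to the code-book entry behind it."""
    return {salted_digest(x, salt): x for x in codebook}

# Toy code-book: 256 five-byte (40-bit) identities agreed on by Trudy and Ricky.
codebook = [bytes([0, 0, 0, 0, i]) for i in range(256)]

# Station side: hash Trudy's Msg3 identity with a fresh salt.
trudy_msg3 = codebook[0x5A]
salt = os.urandom(8)
msg4 = (salt, salted_digest(trudy_msg3, salt))  # the salt travels as the hint

# Ricky's side: rebuild the table from the broadcast salt and look up the digest.
hint, digest = msg4
recovered = build_preimage_table(codebook, hint)[digest]
```

Because the salt must be broadcast for legitimate UEs, Ricky's cost is one hash per code-book entry per attempt, which matches the "modest computational complexity" noted in the remark.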

C. Solution Examples
Examples of entropy-leveraging schemes should involve increasing H(X|Y) in (4). The following simple schemes are inspired by known binary noisy channel models in communications theory [17]:

• K-errors: Similarly to a Binary Symmetric Channel (BSC), the cellular station induces bit errors at K random positions of the UE identity X in Msg3. Each time it generates a random N-bit error mask e_K ∈ U_N with Hamming weight (number of set bits) K that is used to derive the Msg4 broadcast message:

$$B(X) = X \oplus e_K$$

where ⊕ denotes the bit-wise XOR operator. The value of K should be previously signaled to UEs or included as a hint to facilitate the decision function of the normal UEs:

$$D(Y, X_i) = \begin{cases} 1, & d_H(Y, X_i) = K \\ 0, & \text{otherwise} \end{cases}$$

implying that each UE computes the Hamming distance between Msg4 and its Msg3, denoted by d_H(., .), and compares the result against K.

• K-erasures: Inspired by the Binary Erasure Channel (BEC), the cellular station omits bits at K random positions of the UE identity X in Msg3. Each time it generates a random N-bit erasure mask e_K ∈ U_N with Hamming weight (number of set bits) K that is used to derive the Msg4 broadcast message:

$$B(X) = \{ X \oslash e_K,\; e_K \}$$

where ⊘ denotes the bit-wise erasure operator producing the N − K remaining bits of X. It is worth noting that the size of Msg4 is 2N − K, which depends on K as depicted in Figure 6. The contending UEs will need e_K as a hint to extract the corresponding bits from their Msg3 in their decision function:

$$D(Y, X_i) = \begin{cases} 1, & X_i \oslash e_K = X \oslash e_K \\ 0, & \text{otherwise} \end{cases}$$

The value of K in both schemes does not have to be random for each message, and it can be selected to balance the trade-off between remediation and contention resolution performance.
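The two schemes above can be sketched in a few lines of Python. This is an illustrative model only: identities are held as integers, the function names are hypothetical, and the K-errors decision rule assumes that "compares against K" means equality with K.

```python
import random

def random_mask(n_bits, k, rng):
    """Return an n_bits-wide integer mask with exactly k randomly placed set bits."""
    mask = 0
    for pos in rng.sample(range(n_bits), k):
        mask |= 1 << pos
    return mask

def erase(x, mask, n_bits):
    """Bit-erasure: keep only the bits of x at positions where mask is 0."""
    return tuple((x >> pos) & 1 for pos in range(n_bits) if not (mask >> pos) & 1)

def k_errors_broadcast(x, n_bits, k, rng):
    """Msg4 for K-errors: identity XOR a weight-k error mask (mask is not sent)."""
    return x ^ random_mask(n_bits, k, rng)

def k_errors_decide(y, x_i, k):
    """A UE proceeds when the Hamming distance between Msg4 and its Msg3 equals k."""
    return bin(y ^ x_i).count("1") == k

def k_erasures_broadcast(x, n_bits, k, rng):
    """Msg4 for K-erasures: surviving bits plus the erasure mask as the hint."""
    mask = random_mask(n_bits, k, rng)
    return erase(x, mask, n_bits), mask

def k_erasures_decide(y, x_i, n_bits):
    """A UE proceeds when its identity matches on every surviving position."""
    surviving, mask = y
    return erase(x_i, mask, n_bits) == surviving

N, K = 40, 8
rng = random.Random(2021)
x = rng.getrandbits(N)                      # Msg3 identity of the intended UE
y_err = k_errors_broadcast(x, N, K, rng)    # K-errors Msg4
y_era = k_erasures_broadcast(x, N, K, rng)  # K-erasures Msg4 (bits, mask)
```

In both cases the intended UE always passes its own decision function, while a different UE passes only if its identity happens to agree with the obfuscated broadcast, which is the source of the added identity collision probability.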
Using (3), it is straightforward to derive the identity collision probability for each of the K-errors and K-erasures schemes. These are plotted for different values of K in Figure 7. For both schemes it can be shown that the SPARROW channel capacity is inversely related to P_C, conforming to the design trade-off in Section VI-B.
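A rough way to evaluate these collision probabilities is sketched below. It assumes exactly two contending UEs and that the other UE's identity is uniformly random over all N-bit strings; the expressions are derived under that assumption and are not necessarily the exact forms obtained from (3).

```python
from math import comb

def collision_prob_k_errors(n_bits, k):
    """Fraction of N-bit identities at Hamming distance exactly k from Msg4."""
    return comb(n_bits, k) / 2 ** n_bits

def collision_prob_k_erasures(n_bits, k):
    """A random identity must match on all N - K surviving bit positions."""
    return 2.0 ** -(n_bits - k)

for k in (0, 4, 10, 20):
    print(k, collision_prob_k_errors(40, k), collision_prob_k_erasures(40, k))
```

Under these assumptions, K-errors with N = 40 and K = N/2 gives a collision probability of about 0.125, the same order as the P_C ≈ 0.1 quoted for that setting.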
SPARROW UEs can use FEC codes, which are a well-studied topic in these scenarios. For example, to circumvent the K-erasures scheme, Trudy can retransmit the messages multiple times to increase the chance of Ricky recovering all of the randomly erased bits. To survive a K-errors scheme, Trudy can construct a code-book with minimum Hamming distance larger than K whose code words can still be uniquely distinguished by Ricky despite the errors. This can be impractical if the K-errors scheme employs K = N/2. However, this is not a good choice for contention resolution performance since it leads to a large P_C ≈ 0.1, as shown in Figure 7. After all, the subpar effectiveness of these schemes in blocking SPARROW UEs relative to their impact on the contention resolution performance does not make them very attractive.

VII. ENTROPY-LEVERAGED IRREVERSIBLE SECURITY HASHING ALGORITHM (ELISHA)
The effectiveness of the schemes in Section VI-C can be significantly improved by preventing SPARROW UEs from employing FEC code-books. The entropy-leveraging scheme proposed herein achieves this goal by taking advantage of the irreversible properties of randomly salted (nonced) CHFs.

A. Design Architecture
Compared to the previous schemes, ELISHA applies random bit-erasures (or bit-errors) to a CHF digest of the Msg3 identity. Figure 8 illustrates the elements of the ELISHA broadcast obfuscation function built on the K-erasures scheme. Moving forward, this will serve as the reference model for ease of analysis. The received N-bit identity in Msg3 is processed through a CHF denoted by C(X, s) with an optional randomly generated salting nonce s (S bits in size) to produce an L-bit hash digest. CHFs are designed to be collision-resistant (in practice, unique outputs for unique inputs) and computationally irreversible. Here, salting refers to mixing the input with a random s before computing the hash, so the same input produces a different hash digest every time. There is a variety of choices for C(X, s), ranging from the sophisticated SHA family to the simpler MD family, that usually result in L > N. The choice of C(X, s) should be communicated to UEs in prior broadcast messages. Choosing the right CHF involves other practical considerations that are beyond the scope of this work.
The CHF output then undergoes a K-erasures process to generate the obfuscated broadcast message of size L − K. The cellular station uses a randomly generated L-bit erasure mask e_K of Hamming weight K (K set bits) every time.
Both s and e_K are encoded in the hint section of Msg4, resulting in a broadcast message with a total size on the order of 2L + S − K bits. The increase in broadcast message size can be addressed during implementation, although it is insignificant for most modern wireless technologies operating with large transmission bandwidths. The i-th contending UE computes B(X_i) using the hint information and proceeds if it equals the B(X) value received in Msg4:

$$D(Y, X_i) = \begin{cases} 1, & C(X_i, s) \oslash e_K = C(X, s) \oslash e_K \\ 0, & \text{otherwise} \end{cases}$$

Using C(X_i, s) with negligible hash collision probability, the impact of this scheme on contention resolution performance is similar to that of the K-erasures scheme in (9). It can be shown that the identity collision probability depends on the choice of L and K.
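The ELISHA processing chain described above (salted hash, K-erasures, hint, UE-side comparison) can be sketched as follows. The concrete choices here are assumptions for illustration: SHA-256 truncated to L = 40 bits stands in for C(X, s), the salt is 4 bytes, and the function names are hypothetical. If the digest of another UE's identity behaves like a uniformly random L-bit string, that UE would pass the comparison with probability about 2^−(L−K).

```python
import hashlib
import os
import random

L_BITS = 40   # digest length L (truncated SHA-256; an illustrative choice)
K = 8         # number of erased digest bits
S_BYTES = 4   # salt size S, in bytes

def digest_bits(identity: bytes, salt: bytes) -> int:
    """C(X, s): salted SHA-256 truncated to its top L_BITS bits."""
    full = hashlib.sha256(salt + identity).digest()
    return int.from_bytes(full, "big") >> (256 - L_BITS)

def erase(value: int, mask: int) -> tuple:
    """Keep the digest bits at positions where the erasure mask is 0."""
    return tuple((value >> p) & 1 for p in range(L_BITS) if not (mask >> p) & 1)

def elisha_broadcast(identity: bytes, rng: random.Random):
    """Station side: returns (surviving bits, salt, mask), together forming Msg4."""
    salt = os.urandom(S_BYTES)
    mask = sum(1 << p for p in rng.sample(range(L_BITS), K))
    return erase(digest_bits(identity, salt), mask), salt, mask

def elisha_decide(msg4, own_identity: bytes) -> bool:
    """UE side: recompute B(X_i) from the hint and compare with the broadcast."""
    surviving, salt, mask = msg4
    return erase(digest_bits(own_identity, salt), mask) == surviving

rng = random.Random(7)
msg3 = bytes.fromhex("0123456789")   # 40-bit contention resolution identity
msg4 = elisha_broadcast(msg3, rng)
```

The intended UE always resolves successfully because it can repeat the same hash and erasure steps from the hint, while an observer who controls only the Msg3 input has no practical way to steer which digest bits survive.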

B. Remediation Strength
As discussed in Section VI-C, Trudy and Ricky seek to construct the code-book M of Msg3 identity messages with FEC properties to recover their messages through K-erasures (or K-errors) and approach the maximum theoretical bit rate in (10). However, with the ELISHA scheme, they lose their ability to employ FEC coding schemes, as described in the following remark.

Remark. Given any choice of SPARROW code-book M, let C_M = {C(X, s) : X ∈ M} denote the corresponding set of hash digests. The irreversible property of C(X, s) makes it computationally infeasible to exert any control over the elements in C_M, including FEC properties against bit-erasure (or bit-error). Exploiting the scheme illustrated in Figure 8, communicating M bits of information per attempt requires almost all elements in C_M to produce 2^M unique B(X) output symbols through the K-erasures process. Let P_D denote the SPARROW communication disruption rate (probability) due to symbol aliasing. The value of P_D is imposed on Ricky and Trudy regardless of their choice of M. They have to sacrifice their bit rate, which is determined by M, to barely reach P_D ≪ 1 for any reliable communication.
Another advantage of ELISHA is the ability to derive the protection metric P_D from the design parameters. This derivation can later be used to balance the trade-off between low P_C and high P_D depending on the cell load and the required level of protection against SPARROW. Calculating P_D relies on the fact that C_M is a collection of random L-bit strings that lose K bits at randomly selected positions through the K-erasures process. The entire U_L space (all possible L-bit strings) is randomly divided into 2^(L−K) cut-sets that each contain 2^K strings producing the same K-erasures output symbol. The definition of P_D indicates the complement event of selecting 2^M elements from U_L where each appears in a distinct cut-set. This can be computed in closed form and further reduced to the expression in (15).
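The cut-set argument above can be turned into a small numerical sketch. It assumes the 2^M digests are distinct, uniformly random L-bit strings and that one erasure mask is fixed for the attempt, so the i-th digest must avoid the i cut-sets already occupied. This sequential-selection product is one reading of the description and may differ in form from the closed-form expression; the function name is hypothetical.

```python
import math

def disruption_probability(l_bits: int, k: int, m: int) -> float:
    """P_D: probability that at least two of 2^m random digests share a cut-set."""
    total = 2 ** l_bits      # size of U_L
    cut_set = 2 ** k         # strings per cut-set
    log_p_distinct = 0.0
    for i in range(2 ** m):
        if i * cut_set >= total:
            return 1.0       # more digests than cut-sets: aliasing is certain
        # Factor (total - i*cut_set) / (total - i), accumulated in log space
        # so that very small values of P_D are not lost to rounding.
        log_p_distinct += math.log1p(-i * (cut_set - 1) / (total - i))
    return -math.expm1(log_p_distinct)

for k in (0, 6, 10, 20):
    print(k, disruption_probability(40, k, 8), disruption_probability(40, k, 16))
```

Under these assumptions, L = 40 with K = 6 and M = 16 yields a disruption probability of roughly 0.12, while K = 0 yields exactly zero because every cut-set then holds a single string.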

VIII. NUMERICAL RESULTS
Computing P_D in (15) is challenging for large values of L and M and may require numerical approximations. Thus, all the following results are computed for L = 40. This can also represent a practical example of ELISHA with N = 40. One may use a random permutation as a proven collision-free alternative to a randomly salted CHF. It randomly permutes its input string, producing an output of the same size. Considering the N! possible permutations, it requires at least O(N log N) additional bits to encode the permutation parameter as a hint in Msg4. Figure 9 indicates how increasing M rapidly increases P_D, particularly for large K. It demonstrates how ELISHA forces SPARROW UEs to cut their data rate to an impractical level to achieve reliability. For K ≥ 20, they significantly lose the ability to reliably communicate a single byte, which is desired for Ricky to distinguish Trudy's messages from other cell activity.
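For the random-permutation alternative mentioned above, the size of the permutation hint can be checked with a short calculation. The function name and the naive per-position encoding used for comparison are illustrative assumptions.

```python
import math

def permutation_hint_bits(n_bits: int) -> int:
    """Minimum whole number of bits needed to index one of n_bits! permutations."""
    return (math.factorial(n_bits) - 1).bit_length()

N = 40
exact_bits = permutation_hint_bits(N)
# Naive encoding for comparison: one ceil(log2 N)-bit position index per bit.
naive_bits = N * math.ceil(math.log2(N))
print(exact_bits, naive_bits)
```

For N = 40 the permutation index needs 160 bits, against 240 bits for the naive per-position encoding, both of which grow as N log N.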
Figure 10 demonstrates the key design trade-off as expected: imposing a higher P_D on SPARROW UEs results in a higher P_C. Setting a desired P_C will determine K from (13). The current standard provides SPARROW UEs with M = N = 40 and P_D = 0, and keeps P_C ≈ 10^−12, which is more than sufficient to ensure successful contention resolution even in a massive cell reboot event. For P_D = 0.1 as the maximum tolerable disruption for SPARROW UEs, they have to operate at a much lower data rate with M = 16, at a slight expense of P_C ≈ 10^−10. Note that all of the graphs reach a plateau around P_C ≈ 10^−5, where P_D approaches 1 (impossible communication). The maximum tolerable P_C can vary depending on the cell traffic and the level of desired protection against SPARROW.
Figure 11 demonstrates the strength of ELISHA in abating the throughput of SPARROW UEs defined by M that is [...] The SPARROW UEs have to use smaller codewords of size M = N bits and the only cost would be a direct P_C increase. The purple dotted line in Figure 11 shows the performance of such a system. However, the results confirm the advantage of ELISHA over such a scheme in terms of much more protection (lower M) for the same sacrificed performance (increased P_C). The presented results seem to indicate that ELISHA is an efficient solution to protect contention resolution procedures in LTE/5G and other technologies against SPARROW exploitation schemes. Instances of ELISHA can potentially be adopted in the WCN protocol standards as a secure contention resolution option that is enabled on access radio nodes in the vicinity of targets sensitive to covert communication.

IX. CONCLUSION
This work proposed a novel framework to identify and exploit vulnerable MAC layer procedures in commercial wireless technologies for covert communication. In this framework, the SPARROW schemes use the broadcast power of incumbent wireless networks to covertly relay messages across a long distance without connecting to them. This enables the SPARROW schemes to bypass all security and lawful-intercept systems and gain ample advantage over existing covert techniques in terms of maximum anonymity, longer range and less hardware. This paper detailed CVD-2021-0045, disclosed through the GSMA coordinated vulnerability disclosure program [1]. This vulnerability has been present in the common random-access procedure of the LTE and 5G standards for a long time. Hence, this work investigated remediation strategies tailored for this procedure, including ELISHA, which is a rigorous remediation that can suit other protocols as well. It can effectively disrupt the most sophisticated SPARROW schemes with manageable system performance overhead.
Researchers are encouraged to investigate the SPARROW vulnerability conditions, outlined in Proposition 1, in other wireless MAC protocols or other aspects of LTE/5G. Also, the framework can be expanded beyond just the broadcast signals to include other measurable implicit changes in the cell operation state that can be controlled by one SPARROW device and detected by another one. Finally, it is recommended to incorporate this framework in the security evaluation of emerging non-terrestrial wireless standards such as 5G-NTN that can potentially be exploited for very long range covert communication.

Fig. 3: SPARROW scheme exploiting RA procedure in LTE/5G.

According to Proposition 1, Msg3 and Msg4 meet all of the conditions for a SPARROW scheme to work. The code-book M can be any collection of 40-bit binary data in the Msg3 CRI. Transmitting Msg3 does not require a network connection or revealing Trudy's identity. Upon receiving Msg3, the victim e/gNB broadcasts the same CRI in Msg4, implying B = M. As illustrated in Figure 3, a passively scanning device can recover the CRI from Msg4. To be more specific, Ricky and Trudy have a prior agreement on RAPID and RA-RNTI. Ricky then passively scans and decodes DCI values with the expected RA-RNTI for Msg2. Upon receiving a matching Msg2, it extracts its TC-RNTI content to detect and decode the subsequent Msg4. Trudy can use all or a portion of the 40-bit CRI content to encode her data. She can also employ some integrity check mechanism to help Ricky filter out Msg4 transmissions belonging to other UEs in the cell.

from the cellular station. Let the random variable X_t ∈ M stand for the message Trudy sends at time slot t. The cellular station response to X_t can be modeled with another random variable Y_(t+τ) ∈ B, where τ is the time lapse from the moment Trudy intends to send X_t until the cellular station broadcasts the response message.

• Data Exfiltration: as outlined in Scenario 1, SPARROW attack schemes can be an effective alternative to known data exfiltration techniques by leveraging vulnerabilities in existing network access protocols. Made in a small form factor, SPARROW devices can easily be used to smuggle data out of restricted facilities.
• Command & Control: SPARROW devices can anonymously communicate with remote malicious IoT devices to trigger unwelcome events using apparently benign WCN radio signals.
• Clandestine Operations: agents can anonymously communicate with SPARROW devices in hostile areas without broadcasting noticeable signals or directly accessing the incumbent networks.