Performance Analysis of a Sound-Based Steganography Wireless Sensor Network to Provide Covert Communications

Abstract: Given the existence of techniques that disrupt conventional RF communication channels, the demand for innovative alternatives to electromagnetic-based communications is clear. Covert communication, which aims to conceal the communication channel, has been explored using bio-inspired sounds in aquatic environments, but its application in terrestrial areas is largely unexplored. This work develops a mathematical analysis of a wireless sensor network that operates stealthily in outdoor environments by using birdsong audio signals from local birds for covert communication. Stored bird sounds are modified to insert sensor data while altering the sound minimally, both in its characteristics and in its random silence/song patterns. This paper introduces a technique that modifies the fourth-level detail coefficients of a wavelet transform and then applies the inverse transform to achieve imperceptible audio modifications. The mathematical analysis includes a statistical study of the On/Off periods of different birds' songs and a Markov chain capturing the system's main dynamics. We derive the system throughput to highlight the potential of using birdsong as a covert communication medium in terrestrial environments. Additionally, we compare the performance of the sound-based network to that of an RF-based network to identify the proposed system's capabilities.


Introduction
Wireless Sensor Networks (WSNs) are composed of a set of sensors capable of collecting and transmitting data in a given area to sense a specific phenomenon. WSNs can be applied in a great number of fields, such as e-health, e-logistics, and e-agriculture [1]. These networks are crucial for tracking environmental parameters, enabling early detection of changes and efficient resource management. Given this, the use of WSNs stands out as a critical technical development, providing real-time remote data collection and analysis to improve our comprehension of dynamic ecological systems and assist preventative conservation measures [2][3][4].
The primary goal of this study is to develop a covert communication system for signaling alerts about environmental threats to protected ecosystems, such as illegal logging and species extraction. This system transmits acoustic signals from various nodes within the network to a central base station, which decodes the messages and relays them to the designated environmental protector. It ensures regular monitoring while providing immediate alerts during unauthorized activities.
Most WSNs communicate via electromagnetic signals, such as radio frequency (RF) or infrared [5], using commercial protocols like Bluetooth and IEEE 802. However, attackers can disrupt RF communications to avoid detection. Therefore, using alternative communication methods, such as sound, provides an additional security layer, making the system more robust against attacks. There is a need to explore and analyze new alternatives to electromagnetic-based communication to ensure covert communications. Covert communication aims to hide the communication channel to prevent eavesdropping and jamming attacks [6,7].
Covert communication plays a critical role in ensuring the security and reliability of sensitive data transmission. The primary significance of covert communications lies in their ability to conceal the existence of the communication itself, thereby preventing detection by unauthorized parties. This is particularly crucial in scenarios involving the protection of critical infrastructure, military operations, and environmental monitoring, where the interception of communications can lead to severe consequences. One of the main challenges in covert communications is maintaining a balance between data hiding and communication efficiency. The communication system must ensure that the hidden messages are not detectable by adversaries while also guaranteeing timely and reliable data transmission. This involves sophisticated techniques to embed data within carrier signals in a manner that is imperceptible yet robust against various types of attacks. Another significant challenge is the potential for increased complexity and energy consumption. Covert communication systems often require additional processing to encode and decode hidden messages, which can strain the resources of sensor nodes in WSNs. Therefore, designing energy-efficient methods that minimize the impact on node performance while maintaining effective covert communication is essential.
This study focuses on implementing an undetected communication system to prevent attackers from disrupting critical environmental alerts. By maintaining the element of secrecy, the system aims to provide an uninterrupted line of defense against activities that could harm the ecosystem, allowing timely responses from guardians.
In this work, we focus on bird sounds as the main communication element because they are omnipresent in natural environments during daylight and because, in the majority of cases, the size of birds and their ability to fly let them hide in their surroundings. To effectively conceal information in the birds' sounds, we first characterize the periods when birds remain silent (Off periods) and when they make sounds (On periods). Nodes then play the previously recorded sounds closely following the statistical properties of these On/Off periods and transmit data packets only during the On periods.
At present, among the different proposals for creating a covert communication channel, one approach focuses on using bio-inspired sounds [8][9][10][11][12][13][14][15][16][17][18]; these works demonstrate the efficiency of the proposed communication schemes through computer simulations or physical prototypes. The authors often consider only a single sender and receiver, without delving into how such a scheme would behave in a networked setting. Conversely, we develop a mathematical analysis of a communication system that not only considers a single sender and receiver but also provides network metrics, such as the impact of the number of nodes in the system. Also, in our work, we consider the main statistical characteristics of the specific birds' sounds to hide the information better, i.e., to make the prerecorded sounds exhibit the same statistical behavior as the birds. Specifically, we model these songs as an On/Off process using phase-type distributions to vary the Coefficient of Variation accordingly, and we consider that the information from nodes is transmitted only during the On periods while nodes remain silent during the Off periods. The use of phase-type distributions, such as the Erlang and Hyper-Exponential distributions, allows the use of Markov chains to model the system. This produces a covert channel that closely follows the specific patterns of the birds' songs, making the communication system imperceptible to potential attackers. Besides, this natural On/Off process entails an important energy reduction in the system that we can measure using the derived mathematical model.
In addition to closely matching the sound/silence (On/Off) periods to conceal the data transmission effectively, we also avoid distortion in the birds' sounds by means of steganographic techniques. Steganography is a set of processes for hiding a message in a container signal, such as text, images, audio, video, or a specific protocol, and is employed to protect the exchanged data from malicious attacks [19]. The majority of the works mentioned above use synthetic sounds that closely resemble the original sounds as a carrier medium. In our case, however, we use the original audio as the container of the hidden message. Among the techniques used for steganography is modification in the frequency domain using the discrete wavelet transform (DWT), which provides a resolution similar to the time-frequency perception of the human ear [19,20]. The DWT produces two sets of coefficients: the approximation coefficients (cA) and the detail coefficients (cD), which are related to low and high frequencies, respectively.
The information sent by the nodes in the WSN comprises, in our case, a small number of bits indicating the presence of unusual activity, detected through shrill sounds such as an electric saw or through human detection sensors. To hide this information, we insert the data while birds are emitting their characteristic sounds (which we call songs throughout this work), in such a way as to modify the songs as little as possible. To this end, we develop a teletraffic analysis that captures the main dynamics of the system, such as system throughput, idle time probability, collision probability, and an energy consumption estimation. A network model based on Continuous Time Markov Chains is developed below, in which pre-recorded bird songs are used as carrier signals that are modified with steganographic techniques to share messages using sound waves as the transmission medium. Using a novel technique based on the wavelet transform, the audio signal is decomposed into coefficients [21], and a modification is made to the detail coefficients. Then, the signal is reconstructed using the inverse wavelet transform. After this, using the statistics of the song and silence times of the modified audios, we associate them with phase-type distributions. These are built from exponential distributions and hence inherit the memoryless property, which means that the future evolution of the process depends only on its current state. These distributions accurately model the behavior of the nodes in the sensor network, which are designed to reproduce a bio-inspired behavior.
Random medium access protocols involve N nodes transmitting data on a single channel, with potential signal distortions from simultaneous transmissions. Timing can be continuous or slotted, and detection of ongoing transmissions may or may not be possible. The simplest protocols are Pure ALOHA, which allows transmission at any time, and its variant, Slotted ALOHA (S-ALOHA), which aligns transmissions with timed slots for increased efficiency. More complex protocols, such as carrier sensing, are not considered here. The S-ALOHA protocol is simple and widely used in ad hoc networks and sensor networks. It is also used in competing 802.11 hotspots that cannot sense each other's presence. This protocol is well studied, which makes it suitable for identifying critical quantities such as collision rates, free slots, throughput, and estimated energy consumption. It is used here for system analysis and to consider new alternatives for communication.
Building on this, the performance of the system is evaluated considering that the nodes transmit their data packets using the S-ALOHA protocol, a MAC protocol in which the shared channel is slotted, i.e., nodes can only transmit at the beginning of a slot [22]. In our case, it provides communication capabilities over a hidden channel, offering a different form of communication than radio frequency. The system evaluation shows transmission rates above those reported in previous works that use bio-inspired audio; additionally, the proposed analysis allows the system administrator to predict the system's behavior under different conditions, such as different numbers of nodes and transmission rates.
In summary, this work introduces an innovative approach to wireless sensor network (WSN) communications by utilizing bird song audio for covert communication through audio steganography techniques, thus offering a unique alternative to traditional radio communication methods. Diverging from existing methods that focus on specific insertion points in natural sounds, our project stands out by altering the entire spectrum of bird song audios for embedding secret messages. This strategy not only diversifies the communication media used in WSNs but also provides a solution for combating electromagnetic jamming attacks, a prevalent issue in wireless networks. Additionally, the project proposes a method based on statistical behavior for energy efficiency in the Medium Access Control (MAC) layer. This method, inspired by the sound and silence patterns in bird songs, aligns the sleep/awake cycles of network nodes with these natural patterns. By integrating the unique properties of natural sounds into technological applications, our approach marks a distinct departure from conventional techniques in covert communication and WSN functionality. Building on this, the main contributions of this work are as follows:

• We conduct a comprehensive statistical analysis of bird sounds to determine the Coefficient of Variation (CoV) of active and inactive periods. This allows us to accurately model and account for effective data transmission within a covert channel communication system.

• We perform a teletraffic analysis to calculate the optimal number of nodes that can transmit simultaneously. We align our model closely with the natural behavior of bird sounds by employing phase-type distributions, such as Erlang and Hyper-Exponential distributions.

• We develop a novel steganography technique that inserts information into bird sounds with minimal distortion. This technique ensures that the covert nature of the communication channel is maintained, avoiding detection by potential attackers.

• We investigate the Slotted ALOHA protocol to derive key performance metrics for a WSN utilizing a covert channel based on sound. This analysis provides insights into the proposed system's throughput, transmission rate, and energy consumption.
The rest of the paper is organized as follows. Section 2 presents a survey of related work. In Section 3, we review a reference framework for our study on information insertion and wireless communication. Then, in Section 4, we introduce a mathematical model based on Markov chains. We also discuss the insertion of information in this section. In Sections 7 and 8, we present our tests and results, where we analyze the different metrics mentioned earlier. Finally, Section 9 concludes this paper and outlines future research directions.

Related Work
Our project is intended to create an alternative to radio communication in WSNs through covert communication based on audio steganography techniques. In the literature, various studies have focused on concealing information in audio for covert communication. For instance, Kasetty et al. [23] used the Discrete Wavelet Transform (DWT) and Singular Value Decomposition (SVD) to hide dialogue within musical audio. They applied SVD to the approximation coefficients, effectively embedding secret audio signals. Similarly, Alsabhany et al. [24] proposed Adaptive Multi-Level Phase Coding (AMPC) to address performance and security issues, achieving 24 kbps at 32 dB SNR using Interval-Centered Quantization (ICQ). Bharti et al. [25] developed a method that is robust against LSB removal and resampling attacks, demonstrating that their approach maintains audio quality and withstands Gaussian white noise.
In terrestrial environments, researchers have explored two natural sounds for covert communication. Jiang et al. [8] used cricket sounds, achieving a rate of 46.9 bit/s by encoding information into pulse sequences. Additionally, Jiang et al. [16] proposed a method for covert communication by mimicking the songs of the Verditer Flycatcher, achieving a transmission rate of about 58 bps with high covertness and low bit error rates in forest environments. However, they did not consider specific On/Off periods. Bird songs, particularly from Passeriform birds, offer a wider frequency range (2-7 kHz) compared to cricket sounds (around 4.3 kHz), making them suitable for our purposes [26][27][28][29].
In underwater scenarios, researchers have used aquatic animal sounds for covert communication. Liu et al. [10] mimicked dolphin whistles and clicks for synchronization and information encoding, achieving 37 bit/s. They also used whale sounds in a Direct Sequence Spread Spectrum (DSSS) signal for covert communication [12]. Jia et al. [11] used sea lion sounds, achieving a rate of 20 bit/s. Jiang et al. [9] and Li et al. [17] used sperm-whale call pulses, achieving transmission rates of 37.5 bit/s and 27 bit/s, respectively. These works focused on synchronization and information codes, while our approach modifies the entire spectrum of bird songs during On periods. Xie et al. [18] also proposed a method to enhance underwater communication security by mimicking dolphin whistles, achieving a rate of 45.28 bps.
Ahn et al. [13] used machine learning to classify dolphin whistles for covert communication, achieving low Bit Error Rates (BER). They later modeled Right Humpback Whales' whistles for information hiding [15]. Xie et al. [18] used Time-Frequency Contour techniques to improve signal-to-noise ratios in dolphin whistles. Unlike these works, we use real bird sounds and modify original recordings to insert information.
A jamming attack disrupts communication channels by introducing external electromagnetic energy. Mihajlov et al. [30] showed how jamming attacks affect WSNs using MAC protocols. Pirayesh et al. [31] highlighted the vulnerability of wireless networks to jamming attacks due to open wireless channels. Our work proposes an alternative communication method to avoid electromagnetic jamming attacks.
Kafetzoglou et al. [32] proposed an energy-saving method in the MAC layer using sleep/awake schedules, synchronizing nodes only with their parent and child nodes. Our approach instead derives the statistical behavior of the sleep/awake periods from the natural behavior of bird sounds to better conceal the covert channel.
In summary, while significant progress has been made in the field of covert communication using various natural sounds across different environments, our approach stands out by leveraging real bird songs for terrestrial wireless sensor networks. Unlike previous works, which have focused on synthetic signals or aquatic environments, our method modifies original bird recordings by inserting information into them, thus enhancing the covert nature of the communication channel. By overcoming weaknesses such as those associated with electromagnetic jamming attacks and by minimizing energy consumption through natural behavior patterns, our technique presents itself as a viable solution for security in WSNs.

Wireless Sound-Based Channel
For this work, we considered birds that are usually found in the Meztitlan desert region or the Pine-Oak Forest, namely: Pheucticus melanocephalus, Myadestes occidentalis, Turdus migratorius, Toxostoma curvirostre, Campylorhynchus brunneicapillus, and Cardinalis cardinalis. In these regions, the aforementioned species are omnipresent. Also, according to the species taxonomy, the passeriform order was considered because the birds that belong to it are known to be songbirds. We collected many recordings of these birds and analyzed both the statistical properties of the active/silent (On/Off) periods and the frequency components of the sounds they emit, in order to use them as carrier signals for the communication method. The birds' songs are structured as follows: the simplest sound made by a bird is called a "note", a series of notes is called a "syllable", and a set of syllables is called a "phrase" [33]. In our work, the On period corresponds to the syllables. To this end, we obtained the histograms of both the On and Off periods, where the On period starts when the first note is audible. We then approximate the probability density function of these periods by an Exponential distribution when the Coefficient of Variation (CoV) is equal to 1, by an Erlang distribution when the CoV < 1, and by a Hyper-Exponential distribution when the CoV > 1. The reason for using these phase-type distributions is to enable a teletraffic analysis based on Continuous Time Markov Chains that yields the probability of having different sounds simultaneously, as explained in a subsequent section.

Audio Selection
The sounds that we considered differ in the duration of the song and silence periods, in frequency, and in amplitude. Hence, our work considers a two-level steganography system that hides information efficiently both in terms of the dynamics of the songs (active/inactive periods) and the signal characteristics. This leads to a model that is not focused on the specific characteristics of a single bird but instead encompasses the species considered. The recordings of the birds' sounds were extracted from the xenocanto.org webpage, a collaborative project dedicated to uploading recordings of different bird species, where the labels of the birds are verified. However, these recordings do not always provide high audio quality, and the majority of them include environmental noise.

Wavelet Transform
Unlike the Fourier transform, which decomposes a signal solely into its frequency components to analyze the spectral content, the wavelet transform breaks down a signal into various scales, reflecting the level of resolution, and time locations, which represent the position of the wavelet function along the time axis. This approach provides a comprehensive analysis that encompasses both the frequency and time domains [34]. At higher frequencies, wavelet transforms achieve high temporal resolution at the cost of frequency resolution. Conversely, at lower frequencies, they offer high frequency resolution but with reduced temporal accuracy, a behavior similar to that of the human ear [20].
This analysis is enabled by the wavelet transform's inherent scaling and translation properties. Specifically, scaling facilitates a detailed examination in terms of frequency, while translation permits temporal analysis. Following this logic, the wavelet series can be expressed as shown in Equation (1), where $c_{jk}$ are the wavelet coefficients at each scale and temporal position and $\psi_{jk}$ is the scaled and translated wavelet function given by Equation (2). To calculate the wavelet coefficients $c_{jk}$, it is essential to select a wavelet function along with the corresponding low-pass and high-pass filters. For our analysis, we have chosen the Haar wavelet due to its simplicity and its capacity to detect transients or sudden changes in an audio signal. The associated filters for the Haar wavelet, which belongs to the family of orthogonal wavelets, are given in Equations (3) and (4). Here, the low-pass filter computes the sum of adjacent elements, while the high-pass filter calculates their difference. The output of the high-pass filter yields the detail coefficients (cD), and the post-filtered signal contains residual noise. Passing these coefficients through a high-pass reconstruction filter produces the signal details (D). On the other hand, the output of the low-pass filter yields the approximation coefficients (cA). These coefficients store the smoothed signal, and passing them through a low-pass reconstruction filter yields the approximation of the signal (A) [21].
Equation (1) can then be rewritten as Equation (5), where $\phi_{jk}(t)$ and $\psi_{jk}(t)$ are the scaling and wavelet functions that correspond to the cA and the cD, respectively.
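For reference, the textbook forms of the expressions that Equations (1) to (5) refer to can be sketched as follows. This is only a sketch: the orthonormal $1/\sqrt{2}$ normalization of the Haar filters, the symbols $h$, $g$, and $J$, and the convention that a larger index $j$ denotes a coarser scale are assumptions, and the conventions of the original equations may differ.

```latex
\begin{align*}
% Wavelet series (cf. Equation (1))
f(t) &= \sum_{j}\sum_{k} c_{jk}\,\psi_{jk}(t) \\
% Scaled and translated wavelet (cf. Equation (2)); larger j = coarser scale (assumed)
\psi_{jk}(t) &= 2^{-j/2}\,\psi\!\left(2^{-j}t - k\right) \\
% Haar low-pass (sum) and high-pass (difference) filters (cf. Equations (3) and (4))
h &= \tfrac{1}{\sqrt{2}}\,[\,1,\; 1\,], \qquad
g = \tfrac{1}{\sqrt{2}}\,[\,1,\; -1\,] \\
% J-level decomposition into approximation and detail parts (cf. Equation (5))
f(t) &= \sum_{k} cA_{J,k}\,\phi_{J,k}(t)
      + \sum_{j=1}^{J}\sum_{k} cD_{j,k}\,\psi_{j,k}(t)
\end{align*}
```

With $J = 4$, the last expression corresponds to the four-level analysis used in this work, in which only the deepest detail term is modified.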

Information Coverage
In this section, we explain in detail how the audio is processed. Each recording is a stereo signal with two channels, both of which were used for the cover signal. For the wavelet analysis, we applied the Haar wavelet function to the sample array of each audio channel, as previously mentioned, which generates the cA1 and cD1 coefficients. Subsequently, we applied another wavelet analysis to the resulting cA1 coefficients, and continued in this way on the successive approximation coefficients up to four levels of analysis. Figure 1 illustrates this multilevel analysis. The fourth-level detail coefficients (cD4) serve as the carrier in which the information is concealed. The hiding procedure for the nodes' information starts by selecting a modification percentage for each audio; this percentage varies from audio to audio, and Table 1 presents the percentages selected for some of the audios in the dataset. At higher modification percentages, the audio signal becomes distorted and provides less cover; none of the audios could support more than 2% modification of the fourth-level detail coefficients. The selected percentage is transformed into the number of bits to modify (C) in the fourth-level detail coefficients. The coefficient array is segmented into C chunks, and the last value of each chunk is replaced by $x_{new} = (x_{original} + m) \times \alpha$ (see Figure 2), where $m$ is the message bit. The test message to be hidden was a message of all 1s of length C, so $m = 1$. The value of $\alpha$ is set to make the modification more discreet; it was obtained empirically to allow a modification that goes unnoticed, and for this test $\alpha = 0.009$. The inverse transform is computed after replacing the original fourth-level detail coefficients with the modified ones (Figure 3). The results for the modified audios are reported in Table 1. Finally, in Figure 4, we present the reconstructed signal separated by channel. It is important to note that the sounds used in this study are real and have not been simulated or synthesized.
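The insertion procedure described above can be sketched in a few lines of Python for a single audio channel. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`haar_step`, `haar_inverse_step`, `decompose`, `embed`) are hypothetical, the Haar filters use the orthonormal $1/\sqrt{2}$ scaling, the chunks are taken as equal-length consecutive blocks, and the channel length is assumed to be a multiple of $2^4$.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_step(x):
    """One Haar analysis level on an even-length list: returns (cA, cD)."""
    cA = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    cD = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    return cA, cD

def haar_inverse_step(cA, cD):
    """Invert haar_step by rebuilding each pair of adjacent samples."""
    x = []
    for a, d in zip(cA, cD):
        x.extend(((a + d) / SQRT2, (a - d) / SQRT2))
    return x

def decompose(samples, levels=4):
    """Return (cA_levels, [cD1, ..., cD_levels]).

    Assumes len(samples) is a multiple of 2**levels.
    """
    approx, details = list(samples), []
    for _ in range(levels):
        approx, cD = haar_step(approx)
        details.append(cD)
    return approx, details

def embed(samples, bits, alpha=0.009, levels=4):
    """Hide `bits` in the deepest detail coefficients and rebuild the audio.

    The last value of each chunk becomes (x_original + m) * alpha.
    Assumes 1 <= len(bits) <= number of deepest detail coefficients.
    """
    approx, details = decompose(samples, levels)
    deepest = details[-1]                  # cD4 when levels == 4
    chunk = len(deepest) // len(bits)      # equal-length chunks (assumption)
    for i, m in enumerate(bits):
        last = (i + 1) * chunk - 1         # index of the last value in chunk i
        deepest[last] = (deepest[last] + m) * alpha
    for cD in reversed(details):           # inverse transform, deepest level first
        approx = haar_inverse_step(approx, cD)
    return approx

if __name__ == "__main__":
    cover = [math.sin(0.1 * i) for i in range(64)]  # stand-in for one audio channel
    stego = embed(cover, bits=[1, 1])
    print(max(abs(a - b) for a, b in zip(cover, stego)))
```

Because the Haar transform used here is orthonormal, decomposing the stego signal again returns the modified coefficients, which is what a receiver would inspect; how the sink recovers the bits in practice is not specified in this section.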

Network Description
The system comprises N nodes distributed in the region of interest to detect certain events. These nodes transmit data packets regarding the environmental conditions (movement, presence of people, etc.) to a sink node in a single hop using the Slotted ALOHA protocol and sound as the communication channel instead of the usual RF signals. Building on this, nodes store prerecorded bird sounds (preferably of birds from the same region where the WSN is installed, to avoid potential suspicion). The nodes insert their data into those sounds using the abovementioned procedure to avoid any perceptible signal distortion. Additionally, nodes emit the bird sounds (and, thus, send the packet data) and remain silent according to the statistical characteristics of these active/inactive periods, effectively implementing an On/Off scheme.
As such, we have $N_{On}$ active nodes and $N_{Off} = N - N_{On}$ inactive nodes at any instant of the system operation, where $0 \le N_{On}, N_{Off} \le N$. These nodes change state (On/Off) following a Poisson process, with rate $\lambda_{On}$ to activate a node and rate $\lambda_{Off}$ to deactivate it. These variables are calculated from the expected value ($E[x]$), the variance ($\sigma_x^2$), and the coefficient of variation (CoV) of the measured periods. The transmission of data packets is measured using the following teletraffic metrics: throughput ($S$), the probability of having a free slot ($FS$), and the probability of packet collisions ($P_{collided}$). These metrics depend on the probability $\tau$, which represents the likelihood of a transmission occurring in the current slot, the probability that only one node transmits during the current slot ($P_S$), and the probability of an idle slot ($P_F$).
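To make the role of $\tau$, $P_S$, and $P_F$ concrete, the per-slot probabilities of Slotted ALOHA can be sketched as below. This is a minimal sketch assuming that each of the currently active nodes transmits independently in a slot with probability $\tau$, which is the usual S-ALOHA assumption; the exact expressions used in this work are not reproduced in this section, and the function name `slot_probabilities` is hypothetical.

```python
def slot_probabilities(n_on, tau):
    """Per-slot S-ALOHA probabilities for n_on active nodes.

    Assumes each active node transmits independently with probability tau.
    Returns (p_success, p_free, p_collided).
    """
    p_free = (1.0 - tau) ** n_on                       # nobody transmits
    if n_on > 0:
        p_success = n_on * tau * (1.0 - tau) ** (n_on - 1)  # exactly one transmits
    else:
        p_success = 0.0
    p_collided = 1.0 - p_free - p_success              # two or more transmit
    return p_success, p_free, p_collided

if __name__ == "__main__":
    for n_on in (1, 2, 5, 10):
        print(n_on, slot_probabilities(n_on, tau=0.1))
```

Under this assumption the success probability for a fixed number of active nodes $n$ is largest at $\tau = 1/n$, which is why the number of simultaneously active nodes matters for the throughput.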
To calculate the system's lifespan, we determine the total energy available to the system (Total_Energy), the energy consumption of transmitting nodes (Energy_tx and tx respectively), the energy consumption of nodes that are turned on but not transmitting (Energy_On), and the energy consumption of nodes in an idle state (Energy_Off).
Finally, we assess the system's capability to transmit bits per second (bps) by considering the bps capacity that each audio file can hide and the system's throughput.

Characterization of the On/Off Periods
In this section, we first characterize the active and inactive periods of the selected birds' sounds.
We first obtain the absolute value of the amplitude of the spectrogram, as shown in Figure 5. Observing this bird song, we can see that even in the silent period the signal is different from 0, although clearly the bird is not singing during this time. As such, we have to establish a threshold to digitize the signal and obtain the lapses of time when the bird is singing or silent. Figure 6a shows the magnified analog signal, and Figure 6b shows the digital representation, where the On/Off periods are clearly identified. Note that the digital representation is only used to characterize active and inactive periods, not to digitize the bird's song, which will be done in a subsequent section. We measured each of these active (silent) times, which represent the times when sensor nodes will be able to transmit data packets (remain silent), thereby closely following the behavior of the natural bird song and concealing the system from potential intruders. From these measurements, we obtain the histograms, and from the histograms we obtain the probability density function (pdf).
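The thresholding step described above can be sketched as follows. This is a minimal illustration, assuming the amplitude envelope is available as a non-empty list of non-negative values sampled every `dt` seconds; the function name `on_off_durations` and the threshold value are hypothetical, since the threshold used in this work is not stated here.

```python
def on_off_durations(envelope, threshold, dt):
    """Split an amplitude envelope into On and Off period durations (seconds).

    A sample is On when it exceeds `threshold`; consecutive samples in the
    same state form one period. Assumes `envelope` is non-empty.
    """
    on, off = [], []
    state = envelope[0] > threshold   # state of the current run
    run = 0                           # number of samples in the current run
    for value in envelope:
        current = value > threshold
        if current == state:
            run += 1
        else:
            (on if state else off).append(run * dt)
            state, run = current, 1
    (on if state else off).append(run * dt)  # close the final run
    return on, off

if __name__ == "__main__":
    env = [0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0]
    print(on_off_durations(env, threshold=0.5, dt=1.0))
```

The two returned lists are the raw measurements from which the histograms of the On and Off periods would be built.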
From this, we can characterize the random variables of both the active and inactive times by measuring all the On and Off times of the birds' sounds and then calculating the mean (Equations (6) and (7)), variance (Equations (8) and (9)), standard deviation (Equation (10)), and Coefficient of Variation (Equation (11)). By identifying the statistical characteristics of these periods, we now propose different phase-type distributions to approximate these empirical distributions. Namely, we use the Exponential distribution, given by (12), to approximate distributions that have a CoV = 1; the Hyper-Exponential distribution, given by (15), to approximate probability distributions with a CoV > 1; and an Erlang distribution, given by (20), to approximate distributions that have a CoV < 1. To characterize these distributions, we match the first moment and the second central moment of these distributions to the observed measurements.
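The pdf of the Exponential distribution with rate $\lambda$ is given by Equation (12):

$$f(x) = \lambda e^{-\lambda x}, \quad x \ge 0, \tag{12}$$

whose mean is $E[x] = 1/\lambda$. The second central moment (variance) is calculated as $\sigma_x^2 = E[x^2] - (E[x])^2$; since $E[x^2] = 2/\lambda^2$, we have Equation (13):

$$\sigma_x^2 = \frac{2}{\lambda^2} - \frac{1}{\lambda^2} = \frac{1}{\lambda^2}. \tag{13}$$

Hence, the CoV is given in Equation (14):

$$CoV = \frac{\sigma_x}{E[x]} = 1. \tag{14}$$

The Hyper-Exponential distribution is composed of two exponential distributions with parameters $\lambda_1$ and $\lambda_2$, which are selected with probabilities $p_1$ and $p_2$, respectively, where $p_1 = p$ and $p_2 = 1 - p$. As such, the pdf of the Hyper-Exponential distribution can be expressed as Equation (15):

$$f(x) = p\,\lambda_1 e^{-\lambda_1 x} + (1 - p)\,\lambda_2 e^{-\lambda_2 x}, \quad x \ge 0, \tag{15}$$

and the mean is obtained as in Equation (16):

$$E[x] = \frac{p}{\lambda_1} + \frac{1 - p}{\lambda_2}. \tag{16}$$

To calculate the variance, we first obtain $E[x^2] = 2p/\lambda_1^2 + 2(1 - p)/\lambda_2^2$, substitute this value to find $\sigma_x^2$, and then obtain $\sigma_x$, as demonstrated by Equation (17):

$$\sigma_x = \sqrt{\frac{2p}{\lambda_1^2} + \frac{2(1 - p)}{\lambda_2^2} - \left(\frac{p}{\lambda_1} + \frac{1 - p}{\lambda_2}\right)^2}. \tag{17}$$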
The Erlang distribution corresponds to the sum of $k$ independent exponential random variables. Its pdf is therefore obtained as the $(k - 1)$-fold convolution of exponential distributions with parameter $\lambda$. Thus, the Erlang distribution is obtained as Equation (20):

$$f(x) = \frac{\lambda^k x^{k-1} e^{-\lambda x}}{(k - 1)!}, \quad x \ge 0. \tag{20}$$

The mean of an Erlang random variable is calculated as $k$ times the mean of the exponential distribution, $1/\lambda$, as presented in Equation (21):

$$E[x] = \frac{k}{\lambda}, \tag{21}$$

and, likewise, the variance is $k$ times the variance $1/\lambda^2$ of the exponential distribution, so we get Equation (22):

$$\sigma_x^2 = \frac{k}{\lambda^2}. \tag{22}$$

From this, we can obtain the CoV for the Erlang distribution as presented in Equation (23):

$$CoV = \frac{\sigma_x}{E[x]} = \frac{1}{\sqrt{k}}. \tag{23}$$

Note that we already have the values of the mean, variance, and CoV of the empirical data, directly obtained from the recordings of the birds' sounds. Then, we can obtain the parameters of these distributions by solving for $\lambda$ when the CoV = 1, for $\lambda_1$, $\lambda_2$, and $p$ when the CoV > 1, and for $k$ and $\lambda$ when the CoV < 1. Once we have found the parameters of the proposed distributions to approximate the On/Off distributions, we calculate the accuracy of the approximation using the Chi-square method, as shown in Equation (24):

$$\chi^2 = \sum_{i} \frac{(O_i - E_i)^2}{E_i}, \tag{24}$$

where $O_i$ is the observed frequency in the $i$-th histogram bin and $E_i$ is the frequency expected under the proposed distribution. The Chi-square analysis is evaluated through a p-value, which depends on $\chi^2$ and on the degrees of freedom, taken as the number of independent variables minus one (IV − 1). The accuracy of the fit with the proposed phase-type distributions is then assessed as follows: if the p-value is less than the significance level, the hypothesis that the data follow the proposed distribution should be rejected. In our project, we selected a significance level of 0.05.
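The moment-matching step described above can be sketched as follows. This is a minimal illustration, not the fitting procedure of this work: the function names are hypothetical, the tolerance used to decide that the CoV equals 1 is an arbitrary choice, the Erlang order $k$ is rounded to the nearest integer, and, because the two-phase Hyper-Exponential has three parameters but only two moments are matched, the common "balanced means" condition $p/\lambda_1 = (1-p)/\lambda_2$ is assumed as the third equation.

```python
import math
import statistics

def coefficient_of_variation(durations):
    """CoV = standard deviation / mean of the measured period durations."""
    return statistics.pstdev(durations) / statistics.fmean(durations)

def fit_phase_type(mean, cov, tol=1e-2):
    """Match mean and CoV with an Exponential, Erlang, or Hyper-Exponential.

    tol is an assumed tolerance for treating the CoV as equal to 1.
    """
    if abs(cov - 1.0) <= tol:
        return {"type": "exponential", "lam": 1.0 / mean}
    if cov < 1.0:
        # Erlang: CoV = 1/sqrt(k), so k ~ 1/CoV^2 (rounded), and mean = k/lam.
        k = max(1, round(1.0 / cov ** 2))
        return {"type": "erlang", "k": k, "lam": k / mean}
    # Hyper-Exponential with balanced means (assumption): p/lam1 = (1-p)/lam2.
    s = math.sqrt((cov ** 2 - 1.0) / (cov ** 2 + 1.0))
    p = 0.5 * (1.0 + s)
    return {
        "type": "hyper-exponential",
        "p": p,
        "lam1": 2.0 * p / mean,
        "lam2": 2.0 * (1.0 - p) / mean,
    }

if __name__ == "__main__":
    # CoV value reported for the General stratum; the mean of 1.0 s is illustrative.
    print(fit_phase_type(mean=1.0, cov=2.264716))
```

Substituting the returned parameters into the mean and variance expressions of the Hyper-Exponential distribution recovers the input mean and CoV, which is a quick sanity check before running the Chi-square test.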
We analyzed the audio recordings by categorizing them into statistical subgroups, which we refer to as strata. The 'General Stratum' encompasses all analyzed audios. The 'Pine-Oak Forest Stratum' includes recordings of the bird species Pheucticus melanocephalus, Myadestes occidentalis, and Turdus migratorius. The 'Meztitlan Stratum' comprises audios featuring Toxostoma curvirostre, Campylorhynchus brunneicapillus, and Cardinalis cardinalis.
We calculated the parameters for the different birds and the distributions used to approximate both the On and Off periods. For all the birds considered, both active and inactive times have a CoV > 1. Hence, we propose the Hyper-Exponential distribution to characterize both On and Off times. The minimum CoV reported was for Pheucticus melanocephalus in the silence times, with CoV = 1.065180, and the maximum CoV reported was for the General stratum in the singing times, with CoV = 2.264716. Additional data have been included in Appendix A.

Teletraffic Analysis of the WSN
In this section, we develop a mathematical analysis based on Continuous Time Markov Chains (CTMC) that captures the main dynamics of the system, such as the nodes that become active (inactive) according to the statistical characteristics of the particular birds. As described in the previous section, the CoV of all the selected bird sounds is higher than 1. Hence, we use a Hyper-Exponential distribution to approximate the On/Off periods of the nodes in the WSN. However, we first develop the mathematical analysis for the exponential case, since the Hyper-Exponential case is an extension of this exponential model.
From this, the Markov chain corresponds to an irreducible chain with valid states $\{\Omega_{N_{On}, N_{Off}}\}$, where N is the total number of nodes in the system and $N_{On}$ ($N_{Off}$) is the number of active (inactive) nodes, i.e., nodes in the On (Off) mode. Note that $N = N_{On} + N_{Off}$. The valid transitions from state $(N_{On}, N_{Off})$ are shown in Figure 7 and described in Table 2. This Markov chain is numerically solved to find the probability that there are $N_{On}$ and $N_{Off}$ nodes in the WSN, i.e., the steady-state probabilities, $\pi_{N_{On}, N_{Off}}$.

For the Hyper-Exponential case, when the CoV > 1, recall that this distribution is formed by two exponential distributions, with parameters $\lambda_{On}^{(1)}, \lambda_{On}^{(2)}$ and $\lambda_{Off}^{(1)}, \lambda_{Off}^{(2)}$ for the two phases, which are selected with probabilities $P_{On}$, $1 - P_{On}$ and $P_{Off}$, $1 - P_{Off}$, respectively. Then, a node in phase 1 (2) of the On (Off) state can transit to phase 1 (2) of the Off (On) state and vice versa. Building on this, the Markov chain also corresponds to an irreducible chain with valid states $\{\Omega_{N_{On}^{(1)}, N_{On}^{(2)}, N_{Off}^{(1)}, N_{Off}^{(2)}}\}$, where N is the total number of nodes in the system and $N_{On} = N_{On}^{(1)} + N_{On}^{(2)}$ ($N_{Off} = N_{Off}^{(1)} + N_{Off}^{(2)}$) is the number of active (inactive) nodes, i.e., nodes in the On (Off) mode. Note that $N = N_{On}^{(1)} + N_{On}^{(2)} + N_{Off}^{(1)} + N_{Off}^{(2)}$. From state $(N_{On}^{(1)}, N_{On}^{(2)}, N_{Off}^{(1)}, N_{Off}^{(2)})$ the valid transitions are as follows (see Figure 8 and Table 3):

• To state $(N_{On}^{(1)} - 1, N_{On}^{(2)}, N_{Off}^{(1)} + 1, N_{Off}^{(2)})$ when a node is active in phase 1 and turns off in phase 1 of the Off mode. This occurs with rate $N_{On}^{(1)} \lambda_{Off}^{(1)} \times P_{Off}$, which is why there is one node less in the On mode with phase 1 and one more node in the Off mode with phase 1.

• To state $(N_{On}^{(1)} - 1, N_{On}^{(2)}, N_{Off}^{(1)}, N_{Off}^{(2)} + 1)$ when a node is active in phase 1 and turns off in phase 2 of the Off mode. This occurs with rate $N_{On}^{(1)} \lambda_{Off}^{(1)} \times (1 - P_{Off})$, which is why there is one node less in the On mode with phase 1 and one more node in the Off mode with phase 2.

• To state $(N_{On}^{(1)}, N_{On}^{(2)} - 1, N_{Off}^{(1)} + 1, N_{Off}^{(2)})$ when a node is active in phase 2 and turns off in phase 1 of the Off mode. This occurs with rate $N_{On}^{(2)} \lambda_{Off}^{(2)} \times P_{Off}$, which is why there is one node less in the On mode with phase 2 and one more node in the Off mode with phase 1.

• To state $(N_{On}^{(1)}, N_{On}^{(2)} - 1, N_{Off}^{(1)}, N_{Off}^{(2)} + 1)$ when a node is active in phase 2 and turns off in phase 2 of the Off mode. This occurs with rate $N_{On}^{(2)} \lambda_{Off}^{(2)} \times (1 - P_{Off})$, which is why there is one node less in the On mode with phase 2 and one more node in the Off mode with phase 2.

• To state $(N_{On}^{(1)} + 1, N_{On}^{(2)}, N_{Off}^{(1)} - 1, N_{Off}^{(2)})$ when a node is turned off in phase 1 and turns on in phase 1 of the On mode. This occurs with rate $N_{Off}^{(1)} \lambda_{On}^{(1)} \times P_{On}$, which is why there is one node less in the Off mode with phase 1 and one more node in the On mode with phase 1.

• To state $(N_{On}^{(1)} + 1, N_{On}^{(2)}, N_{Off}^{(1)}, N_{Off}^{(2)} - 1)$ when a node is turned off in phase 2 and turns on in phase 1 of the On mode. This occurs with rate $N_{Off}^{(2)} \lambda_{On}^{(2)} \times P_{On}$, which is why there is one node less in the Off mode with phase 2 and one more node in the On mode with phase 1.

• To state $(N_{On}^{(1)}, N_{On}^{(2)} + 1, N_{Off}^{(1)} - 1, N_{Off}^{(2)})$ when a node is turned off in phase 1 and turns on in phase 2 of the On mode. This occurs with rate $N_{Off}^{(1)} \lambda_{On}^{(1)} \times (1 - P_{On})$, which is why there is one node less in the Off mode with phase 1 and one more node in the On mode with phase 2.

• To state $(N_{On}^{(1)}, N_{On}^{(2)} + 1, N_{Off}^{(1)}, N_{Off}^{(2)} - 1)$ when a node is turned off in phase 2 and turns on in phase 2 of the On mode. This occurs with rate $N_{Off}^{(2)} \lambda_{On}^{(2)} \times (1 - P_{On})$, which is why there is one node less in the Off mode with phase 2 and one more node in the On mode with phase 2.
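The eight transitions of the Hyper-Exponential chain can be generated programmatically, which is convenient when building the generator matrix for a numerical solution. The following Python sketch is illustrative only: the function name and argument layout are hypothetical, and it assumes that a node in On phase i turns off at rate `lam_off[i]` and enters Off phase 1 with probability `p_off`, and that a node in Off phase i turns on at rate `lam_on[i]` and enters On phase 1 with probability `p_on`.

```python
def hyperexp_transitions(state, lam_off, lam_on, p_off, p_on):
    """Return {next_state: rate} from state (on1, on2, off1, off2).

    ASSUMED conventions (hypothetical names):
      lam_off[i]: rate at which a node in On phase i+1 turns off.
      lam_on[i]:  rate at which a node in Off phase i+1 turns on.
      p_off/p_on: probability of entering phase 1 of the Off/On mode.
    """
    counts = list(state)  # [on1, on2, off1, off2]
    moves = {}
    for i in (0, 1):
        # A node in On phase i+1 turns off into Off phase j+1.
        for j, prob in ((0, p_off), (1, 1.0 - p_off)):
            if counts[i] > 0:
                nxt = counts.copy()
                nxt[i] -= 1
                nxt[2 + j] += 1
                moves[tuple(nxt)] = counts[i] * lam_off[i] * prob
        # A node in Off phase i+1 turns on into On phase j+1.
        for j, prob in ((0, p_on), (1, 1.0 - p_on)):
            if counts[2 + i] > 0:
                nxt = counts.copy()
                nxt[2 + i] -= 1
                nxt[j] += 1
                moves[tuple(nxt)] = counts[2 + i] * lam_on[i] * prob
    return moves
```

Calling the function on a state with at least one node in every phase returns all eight destinations; states on the boundary (for example, all nodes in On phase 1) return only the transitions that are actually possible.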
And for the general case, the number of valid configurations follows Equation (25):

$$4 + 6(N - 1) + 2(N - 1)(N - 2) + \frac{(N - 1)(N - 2)(N - 3)}{6} \qquad (25)$$

where 4 corresponds to the base case in which all the nodes are in the same state with the same phase, $6(N - 1)$ to the case in which only 2 states and phases hold all the nodes in the system, $2(N - 1)(N - 2)$ to the case in which all the nodes are distributed in 3 of the states and phases, and $\frac{(N - 1)(N - 2)(N - 3)}{6}$ to the case in which all the nodes are distributed in all four possible states and phases.

Finally, for completeness, since no bird song has either On or Off modes with a CoV < 1, we also develop the Erlang (Hypo-exponential) model, considering that there may be bird songs whose On/Off periods follow this Erlang distribution and that can be used in future works. This model also corresponds to an irreducible Markov chain with valid states $\{\Omega_{N_{On}^{(1)}, \ldots, N_{On}^{(k_{On})}, N_{Off}^{(1)}, \ldots, N_{Off}^{(k_{Off})}}\}$, where N is the total number of nodes in the system and $N_{On} = N_{On}^{(1)} + \cdots + N_{On}^{(k_{On})}$ ($N_{Off} = N_{Off}^{(1)} + \cdots + N_{Off}^{(k_{Off})}$) is the number of active (inactive) nodes, i.e., nodes in the On (Off) mode. Note that $N = N_{On} + N_{Off}$. Note also that the Erlang model corresponds to the sum of $k_{On}$ ($k_{Off}$) exponential random variables with parameter $\lambda_{On}$ ($\lambda_{Off}$). As such, in the On (Off) state, nodes have to transit $k_{On}$ ($k_{Off}$) exponentially distributed random times before going to the Off (On) state. Hence, valid transitions from state $(N_{On}^{(1)}, \ldots, N_{On}^{(k_{On})}, N_{Off}^{(1)}, \ldots, N_{Off}^{(k_{Off})})$ are as follows (see Figure 9 and Table 4):

• To state $(N_{On}^{(1)} - 1, N_{On}^{(2)} + 1, \ldots, N_{On}^{(k_{On})}, N_{Off}^{(1)}, \ldots, N_{Off}^{(k_{Off})})$ when a node is active in phase 1 and becomes active in phase 2 of the On mode. This occurs with rate $N_{On}^{(1)} \times \lambda_{On}$, which is why there is one node less in the On mode with phase 1 and one more node in the On mode with phase 2.

• To state $(N_{On}^{(1)}, \ldots, N_{On}^{(k_{On}-1)} - 1, N_{On}^{(k_{On})} + 1, N_{Off}^{(1)}, \ldots, N_{Off}^{(k_{Off})})$ when a node is active in phase $k_{On} - 1$ and becomes active in phase $k_{On}$ of the On mode. This occurs with rate $N_{On}^{(k_{On}-1)} \times \lambda_{On}$, which is why there is one node less in the On mode with phase $k_{On} - 1$ and one more node in the On mode with phase $k_{On}$.

• To state $(N_{On}^{(1)}, \ldots, N_{On}^{(k_{On})} - 1, N_{Off}^{(1)} + 1, \ldots, N_{Off}^{(k_{Off})})$ when a node is active in phase $k_{On}$ and becomes inactive in phase 1 of the Off mode. This occurs with rate $N_{On}^{(k_{On})} \times \lambda_{Off}$, which is why there is one node less in the On mode with phase $k_{On}$ and one more node in the Off mode with phase 1.

• To state $(N_{On}^{(1)}, \ldots, N_{On}^{(k_{On})}, N_{Off}^{(1)} - 1, N_{Off}^{(2)} + 1, \ldots, N_{Off}^{(k_{Off})})$ when a node is inactive in phase 1 and becomes inactive in phase 2 of the Off mode. This occurs with rate $N_{Off}^{(1)} \times \lambda_{Off}$, which is why there is one node less in the Off mode with phase 1 and one more node in the Off mode with phase 2.

• To state $(N_{On}^{(1)}, \ldots, N_{On}^{(k_{On})}, N_{Off}^{(1)}, \ldots, N_{Off}^{(k_{Off}-1)} - 1, N_{Off}^{(k_{Off})} + 1)$ when a node is inactive in phase $k_{Off} - 1$ and becomes inactive in phase $k_{Off}$ of the Off mode. This occurs with rate $N_{Off}^{(k_{Off}-1)} \times \lambda_{Off}$, which is why there is one node less in the Off mode with phase $k_{Off} - 1$ and one more node in the Off mode with phase $k_{Off}$.

• To state $(N_{On}^{(1)} + 1, \ldots, N_{On}^{(k_{On})}, N_{Off}^{(1)}, \ldots, N_{Off}^{(k_{Off})} - 1)$ when a node is inactive in phase $k_{Off}$ and becomes active in phase 1 of the On mode. This occurs with rate $N_{Off}^{(k_{Off})} \times \lambda_{On}$, which is why there is one node less in the Off mode with phase $k_{Off}$ and one more node in the On mode with phase 1.
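As a sanity check on the size of the Hyper-Exponential state space (Equation (25)), the valid configurations can be enumerated by brute force and compared with the four-term count. The sketch below is illustrative and uses hypothetical function names; it assumes that the last term of Equation (25) is $(N-1)(N-2)(N-3)/6$, in which case the total also equals the binomial coefficient $\binom{N+3}{3}$.

```python
from itertools import product
from math import comb


def enumerate_configurations(n_nodes, n_phases=4):
    """All tuples of non-negative phase occupancies that sum to n_nodes."""
    return [
        cfg
        for cfg in product(range(n_nodes + 1), repeat=n_phases)
        if sum(cfg) == n_nodes
    ]


def closed_form_count(n):
    """Four-term count: nodes spread over 1, 2, 3, or 4 phases."""
    one_phase = 4
    two_phases = 6 * (n - 1)
    three_phases = 2 * (n - 1) * (n - 2)
    # ASSUMPTION: this is the fourth term of Equation (25).
    four_phases = (n - 1) * (n - 2) * (n - 3) // 6
    return one_phase + two_phases + three_phases + four_phases


for n in range(1, 7):
    print(n, len(enumerate_configurations(n)), closed_form_count(n), comb(n + 3, 3))
```

The cubic growth of this count is consistent with the O(N^3) cost attributed to iterating over the valid configurations in the simulation.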

Throughput Analysis
Now we can obtain the main performance metrics of the system, namely, throughput, idle slot probability, and collision slot probability. Hence, we first calculate the probability that there are $N_{On}$ active nodes, which are the nodes that can transmit data to the sink node. This probability is given by the steady-state probability of the Hyper-Exponential Markov chain described above. Recall that, under an S-ALOHA random access protocol, a packet transmission is successful when only one of the active nodes transmits in a slot.

As can be seen in Algorithm 1, the simulation first determines the energy decrease based on the different operations of the nodes. Namely, in the On state, nodes activate their sensing, processing, and communication capabilities, while in the Off mode, nodes consume a small amount of energy, related mainly to the minimal electronic systems that allow the node to wake up in the following seconds. Additionally, in the On mode, when nodes transmit a packet, with probability τ, they consume an extra amount of energy. Initially, the algorithm sets up the simulation environment, initializing parameters like energy levels and slot time, and preparing the initial states of the nodes. It also initializes counters for tracking the number of free slots, successful transmissions, and collisions.
When the simulation is asked to evaluate a system with no nodes, a try-catch sequence automatically terminates the program, since there are no nodes to perform any action, and returns results based on the Result variable. Otherwise, it enters a loop that runs for a predefined number of iterations or until a specified slot count threshold is reached. During each slot, the algorithm calculates the number of transmissions, taking into account the active nodes and the transmission probability τ. It updates the energy consumption for both transmitting and non-transmitting nodes and adjusts the counters for free slots, successful transmissions, and collisions according to the transmission count. Progress is printed at regular intervals to monitor the simulation.
Finally, the algorithm concludes by returning free slots, collisions, and successful transmissions relative to the total number of slots, or the average energy consumption per slot.

Algorithm 1: System Simulation
Input: valid_configurations, τ, Energy_tx, Energy_On, Energy_Off
Output: Simulation results
Total_Energy, Free_channel, Success, Collision ← 0, 0, 0, 0;
for i ←

Algorithm 1 aims to evaluate the energy consumption and communication performance of different configurations in a network. It takes as inputs a set of valid configurations, a transmission probability (τ), and energy consumption values for the transmission, active, and idle states. The algorithm initializes counters for total energy, free channel occurrences, successful transmissions, and collisions. For each valid configuration, it runs a large number of iterations (100,000) to reach a steady state and to simulate the network behavior. In each iteration, it calculates the number of active nodes and randomly determines whether each active node transmits based on τ. Depending on the number of transmissions, it updates the success, collision, or free channel counters and adjusts the total energy consumption accordingly. Finally, it returns the average energy consumption and the proportions of free channels, successful transmissions, and collisions over the simulated iterations. This simulation provides insights into the efficiency and reliability of different network configurations in terms of energy usage and communication success rates.
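The per-slot logic described for Algorithm 1 can be illustrated with a short Python sketch for a single fixed configuration. This is a simplified sketch and not the authors' implementation: the function name, the energy values, and the way energy is accumulated (a base cost per On or Off node plus an extra cost per transmission) are assumptions made for illustration.

```python
import random


def simulate_slots(n_active, n_inactive, tau, e_tx, e_on, e_off,
                   n_slots=100_000, seed=0):
    """Monte Carlo slot simulation for one configuration (hypothetical helper)."""
    rng = random.Random(seed)
    free = success = collision = 0
    total_energy = 0.0
    for _ in range(n_slots):
        # Each active node transmits independently with probability tau.
        transmissions = sum(rng.random() < tau for _ in range(n_active))
        # ASSUMED energy model: base cost per node plus a cost per packet.
        total_energy += n_active * e_on + n_inactive * e_off + transmissions * e_tx
        if transmissions == 0:
            free += 1
        elif transmissions == 1:
            success += 1
        else:
            collision += 1
    return {
        "free": free / n_slots,
        "success": success / n_slots,
        "collision": collision / n_slots,
        "energy_per_slot": total_energy / n_slots,
    }
```

For example, `simulate_slots(3, 2, 0.3, e_tx=2.0, e_on=1.0, e_off=0.1)` estimates the slot outcome proportions for three active and two inactive nodes; the full algorithm would repeat this over every valid configuration.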
The overall complexity of the algorithm is the product of the complexities of its loops: the outer loop is O(N^3), the middle loop is O(1), since it runs a fixed number of iterations (100,000), and the inner loop is O(N). Thus, the overall time complexity of the algorithm is O(N^3) × O(1) × O(N) = O(N^4).

After validating the behavior of the system, the objective is to determine the efficiency and reliability of the birdsong-based communication system in real-world scenarios by analyzing the simulation and analytical results for the network metrics: throughput, collision rate, and free slot rate. The efficiency of the system is measured by evaluating its lifespan and throughput capabilities, given the transmission probability (τ) and the number of nodes (N). This is represented as Lifespan(τ, N) × Throughput(τ, N).
Furthermore, energy consumption tests are conducted to evaluate the power efficiency of the proposed method by comparing the energy consumption of sensor nodes that operate intermittently with that of a system that remains constantly in the On state. However, we have to emphasize that a system that is always turned on implies that the nodes are constantly emitting the bird sound, which we believe would be very suspicious to people in the surroundings, effectively weakening the concealment of the network.
Additionally, comparative tests are conducted to analyze the throughput capability of the birdsong-based communication system against the ZigBee protocol, which is a short-range wireless communication protocol. This test aims to highlight the differences of the proposed approach and provide insights for further improvements of the communication system. The hidden information is included in the analysis by considering the maximum number of bps that can be concealed in each audio file, as reported in Table 1. This value is then multiplied by the system's throughput, which is determined by its specific transmission probability (τ) and the number of nodes (N).
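The analytical metrics described in this section can be sketched as follows, using the standard S-ALOHA expressions: with n active nodes, a slot is successful with probability $n\tau(1-\tau)^{n-1}$ and idle with probability $(1-\tau)^n$, and these values are averaged over the steady-state probability of having n active nodes. The function names are hypothetical, and the distribution `pi_on` is assumed to come from the numerical solution of the Markov chain; the sketch is not the exact formulation used in this work.

```python
def slot_probabilities(pi_on, tau):
    """Average S-ALOHA slot outcomes over the number of active nodes.

    pi_on[n] is the (assumed given) steady-state probability that
    exactly n nodes are in the On mode.
    """
    idle = sum(p * (1 - tau) ** n for n, p in enumerate(pi_on))
    success = sum(
        p * n * tau * (1 - tau) ** (n - 1)
        for n, p in enumerate(pi_on)
        if n > 0
    )
    return {"throughput": success, "idle": idle, "collision": 1 - idle - success}


def hidden_rate_bps(throughput, bps_per_audio):
    """Concealed data rate: capacity of the audio file times the throughput."""
    return throughput * bps_per_audio
```

As a usage example, with two nodes that are always active (`pi_on = [0, 0, 1]`) and τ = 0.5, the throughput is 0.5, and an audio file able to conceal 2060 bps would deliver `hidden_rate_bps(0.5, 2060)` = 1030 bps of hidden data.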

Numerical Results
In this section we present the main results obtained through the analytical model developed in this work.First, we validate our analytic results comparing them to simulation results.Then, we obtain the main system performance parameters, such as throughput, time slot collision, and idle probabilities.
Figure 10a-c shows the system throughput, idle time slot probability, and collision time slot probability, respectively, for both analytical and simulation results. We can observe a very good match between these results, which validates the mathematical model.
To quantify the differences between both representations of the system, Figure 11 shows the mean square error (MSE) for each of the graphs previously displayed. The maximum error reported is 13% in the Idle Slot graph (Figure 11b). For the Collision MSE, at a probability of τ = 0.9, there is a 10.5% error between both analyses. The Throughput MSE is the measure that fits best, with only a 0.17% difference between the simulation and the analytical solution of the model.

Now, we analyze the behavior of the proposed system, where nodes turn On and Off according to the statistical characteristics of the bird sounds. Figure 12a shows the throughput behavior when varying the number of nodes in the network or the transmission probability. This graph presents, in black, a surface for the proposed model, where nodes turn On and Off according to the characteristics of the bird sounds, and, in red, the case when all the nodes are always active in the On mode. We can see that the throughput of both schemes is very close when the transmission probability is low (lower than 0.3). This is because, even if the nodes remain in the On state all the time, they seldom transmit, which generates traffic very similar to the case when nodes turn On and Off. However, as τ increases, nodes in the On state attempt many more transmissions, consequently increasing the throughput. As such, when nodes do not go to the Off mode, there are many more transmissions but also many more collisions, as observed in Figure 12c, especially as τ approaches 1.
Building on this, we can clearly see that a higher throughput does not always entail good system performance, since there are many packet collisions and high energy consumption with non-useful data transmission. Also note in Figure 12b that as τ and N increase, the probability of finding empty slots reduces considerably, which is important to consider for a concealed channel, since nodes would constantly generate sound, exposing the presence of a surveillance system.
The results presented in Figure 13a compare the lifespan of the system, as determined by its energy consumption, for the proposed On/Off scheme and for the case when nodes are always active. It can be seen that, even though the throughput (shown in the previous figure) is higher at certain points for the always On scheme, the energy consumption is lower in all cases for our proposed scheme. Indeed, the concealed channel approach has the main objective of turning nodes Off according to the natural behavior of birds, and it also achieves an important reduction in energy consumption (higher system lifetime). Figure 13b presents the efficiency analysis, which examines the relationship between throughput and the system's lifespan. This relationship is analyzed by multiplying the throughput and lifespan results of both systems (Lifespan(N, τ) × Throughput(N, τ)). It can be observed that efficiency increases as the probability of transmission increases. However, an increase in the number of nodes in the network has the opposite effect, causing efficiency to decrease.

Now, we compare the performance of the sound-based system to a conventional RF-based network. We can see in Figure 14a that the proposed covert channel achieves a lower transmission rate compared to a classic RF channel using ZigBee at 20 and 200 Kbps. However, this lower transmission rate occurs in a system where communications are hidden using a different communication medium (sound) that is robust against the conventional jamming attacks expected in many surveillance applications. ZigBee-based systems would certainly collapse under such attacks, while our proposal remains operational even under DoS-based attacks. Hence, these results clearly show the capacity of the covert system and provide clear guidelines for the system administrator to know the transmission rate in advance. For instance, if the system is based on small packet transmissions, such as presence detection or variations in lighting or sound, the achieved data transmission of 978 bps would be more than enough to provide adequate monitoring services. However, if the system has to transmit images or HD video at 5 Mbps, then the proposed WSN would not be adequate due to its low transmission rate.
The main limitations of this work are related to the transmission distance, which is significantly lower (hundreds of meters) than that of RF-based transmissions (several kilometers), as well as to the transmission capacity, as discussed before, since only small packets can be transmitted. As such, a practical implementation of this system should consider a hierarchical architecture such as cluster-based WSNs, where nodes only transmit to a nearby cluster head, while the cluster head transmits using RF signals to a more distant sink. Another alternative could consider the use of mobile sinks (motorized sinks such as terrestrial robots or drones) that recover the information using sound while passing close to the nodes. However, these issues fall outside the scope of this work.

Conclusions
In this work, we proposed, analyzed, and studied a WSN using a covert channel based on bird sounds to hide the presence of the monitoring system. The proposed system closely follows the natural On/Off behavior of birdsong, effectively hiding the monitoring capabilities of the WSN by only transmitting when the sound is present and turning the nodes to the Off mode when the bird recording is silent. To achieve this, we propose the use of phase-type distributions to model these On/Off periods. Also, information is inserted in the sound using the wavelet transform in order to reduce the distortion of the original audio signal, generating sounds with underlying data and almost no perceptible modifications compared to the original audio. This approach, however, imposes limitations on the transmission rate in three significant ways: sound operates at a much lower frequency than RF signals; nodes can only transmit when sound is present; and bits are only inserted as long as they do not cause significant distortion to the recorded audio. In view of this, our mathematical analysis provides important performance metrics, clearly demonstrating the system's capabilities in terms of throughput, transmission rate, and energy consumption. Specifically, the data hiding process is contingent on the audio coverage rather than the duration of the audio. For instance, a 10-s audio file could hide 2060 bps and a 13-s file could conceal 3980 bps, whereas a 55-s file might only conceal 57 bps. While all the strata studied follow a hypothesized Hyper-Exponential distribution, further studies are necessary to determine whether all bird species conform to this distribution.
Moreover, the proposed On/Off node scheme distributed across the network improves the system performance as the number of nodes and/or the probability of transmission (τ) increases. For example, with one node and τ = 1, the always On scheme reaches 90% throughput; adding one more node (N = 2) at the same transmission probability, the proposed scheme yields a maximum throughput of 49.9%, compared to 18% for the always On scheme. This method conserves energy, extends the network's lifespan, and proves more effective in the power consumption and network throughput analyses.
Therefore, despite conventional RF systems achieving higher transmission rates, our proposal is robust against classical jamming attacks and operates imperceptibly to unauthorized intruders searching for RF transmissions.
For future work, we plan to conduct practical tests to measure the transmission capacity in physical applications. We aim to verify the feasibility and limitations of this solution, which appears promising thus far, and to explore alternatives to address the significant reduction in transmission rates compared to RF transmissions.

Figure 1. Signal decomposition in four levels after applying a wavelet transformation. The coefficient in gray is the Detail coefficient that hosts the modifications.

Figure 2. The cA coefficient is divided into C chunks; then we select the last value of each of the chunks, and the insertion is made by calculating $x_{new} = (x_{original} + m) \times \alpha$.

Figure 3. Signal reconstruction after modifying the Detail coefficient at the fourth transformation level. The spread of the modification is indicated by the degradation of the modification.

Figure 4. Reconstructed signal of the XC3951Solitario audio recording before modification (top part of the figure) and after modification (bottom part of the figure), separated by channel.

Figure 5. Spectrum analysis and absolute value of the amplitude of the spectrum of the Myadestes occidentalis birdsong. (a) Audio spectrum of the Myadestes occidentalis birdsong. (b) Absolute value of the spectrum of the Myadestes occidentalis birdsong.

Figure 6. Sampled spectrum and digitized spectrum of the Myadestes occidentalis birdsong. (a) Sampled spectrum of the Myadestes occidentalis birdsong. (b) Digitized spectrum of the Myadestes occidentalis birdsong.

Figure 7. Markov chain for the case when the On/Off periods are exponentially distributed.

Figure 8. Markov chain of the transition process of a node, which depends on the birdsong behavior, Hyper-Exponential case.

Figure 9. Markov chain of the transition process of a node, which depends on the birdsong behavior, Erlang case.

Figure 10. Comparison between analytical and simulation results for (a) Throughput, (b) Idle time slot probability, and (c) Collision probability.

Figure 12. Comparison between the system with nodes turning On/Off and nodes always in On mode: (a) Throughput, (b) Idle time slot, and (c) Collision.

Table 1. Report of Modifications.

Valid transitions from state $(N_{On}, N_{Off})$ in the exponential model:

• To state $(N_{On} + 1, N_{Off} - 1)$ when an inactive node becomes active and $N_{Off} > 0$. This occurs with rate $N_{Off} \times \lambda_{On}$. As such, the mean time of the node in the Off state is considered to be $1/\lambda_{On}$.

• To state $(N_{On} - 1, N_{Off} + 1)$ when an active node becomes inactive and $N_{On} > 0$. This occurs with rate $N_{On} \times \lambda_{Off}$. As such, the mean time of the node in the On state is considered to be $1/\lambda_{Off}$.
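With only these two transitions, the exponential model is a birth-death chain, so its steady-state probabilities can be obtained from the balance between neighboring states instead of a general linear solver. The Python sketch below illustrates this under the stated rates (up-rate $N_{Off} \times \lambda_{On}$, down-rate $N_{On} \times \lambda_{Off}$); the function name is hypothetical, and the sketch does not reproduce the numerical procedure used in this work.

```python
from math import comb


def exponential_steady_state(n_nodes, lam_on, lam_off):
    """Return pi, where pi[n] is the probability that n nodes are On.

    Balance between neighbors: pi[n] * (N - n) * lam_on
                               == pi[n + 1] * (n + 1) * lam_off.
    """
    weights = [1.0]
    for n in range(n_nodes):
        up_rate = (n_nodes - n) * lam_on      # one more node turns On
        down_rate = (n + 1) * lam_off         # one node turns Off again
        weights.append(weights[-1] * up_rate / down_rate)
    total = sum(weights)
    return [w / total for w in weights]


# Illustrative values only: 5 nodes, mean Off time 1, mean On time 1/3.
pi = exponential_steady_state(5, lam_on=1.0, lam_off=3.0)
p_on = 1.0 / (1.0 + 3.0)
binomial = [comb(5, n) * p_on**n * (1 - p_on) ** (5 - n) for n in range(6)]
print(max(abs(a - b) for a, b in zip(pi, binomial)))
```

The comparison at the end reflects the fact that, since nodes switch independently, the result coincides with a binomial distribution with per-node On probability $\lambda_{On} / (\lambda_{On} + \lambda_{Off})$.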

Table 2. Description of the possible transitions in the proposed network model.

Table 3. Description of the possible transitions in the proposed chain.

Table 4. Valid State Transitions for the Erlang Model.

Table A1. Statistical Values by Stratum of Singing Times.

Table A2. Statistical Values by Stratum of Silence Times.

Table A3. Proposed values of $\lambda_1$, $\lambda_2$, and $p$ for each of the singing and silence times, together with the p values ($p_{val}$) obtained with the Chi-square method.