Biologically Inspired Intercellular Slot Synchronization

The present article develops a decentralized interbase station slot synchronization algorithm suitable for cellular mobile communication systems. The proposed cellular firefly synchronization (CelFSync) algorithm is derived from the theory of pulse-coupled oscillators, which is commonly used to describe synchronization phenomena in biological systems, such as the spontaneous synchronization of fireflies. In order to maintain synchronization among base stations (BSs), even when there is no direct link between adjacent BSs, some selected user terminals (UTs) participate in the network synchronization process. Synchronization emerges by exchanging two distinct synchronization words, one transmitted by BSs and the other by active UTs, without any a priori assumption on the initial timing misalignments of BSs and UTs. In large-scale networks with inter-BS site distances up to a few kilometers, propagation delays severely affect the attainable timing accuracy of CelFSync. We show that by an appropriate combination of CelFSync with the timing advance procedure, which aligns uplink transmissions of UTs to arrive simultaneously at the BS, a timing accuracy within a fraction of the inter-BS propagation delay is retained.


Introduction
Slot synchronization is an enabling component for cellular systems. It is a prerequisite for advanced intercellular cooperation schemes, such as interference suppression between neighboring cells, as well as for multicast and broadcasting services. The problem of intercell slot synchronization is to align the internal timing references of all nodes, so that base stations (BSs) and user terminals (UTs) agree on a common reference instant that marks the start of a transmission slot. In the context of cellular systems, a slot is composed of a number of successive uplink and downlink frames, referred to as a superframe.
Network synchronization in cellular systems is commonly performed in a master-slave manner: BSs synchronize to an external timing reference, known as the primary reference clock, and transfer this timing to UTs. This reference clock can be acquired through the global positioning system (GPS) or through the backbone connection. The first method requires the installation of a GPS receiver at each BS, which increases costs and, more importantly, does not work in environments where GPS signals cannot be received.
For high accuracy, the second method requires precise delay compensation, and the accuracy severely decreases when clocks are chained [1].
Over-the-air decentralized intercell slot synchronization that avoids the need for an external timing reference was pioneered in [2], and further elaborated in [3,4]. Its basic principle is summarized as follows: a BS emits a pulse indicating its own timing reference and is receptive to pulses from surrounding BSs; internal timing references are adjusted based on the power-weighted average of received pulses. Conditions for convergence were derived in [5], which reveals that convergence and stability are tightly linked to the intersite propagation delays between neighboring BSs. This is a critical issue, as inter-BS propagation delays are not known a priori. Furthermore, in [2], direct communication between BSs is required, and for the exchange of synchronization pulses, a separate frequency band is assumed to be available.
In the present paper a different approach is taken based on the theory of pulse-coupled oscillators (PCOs), which is commonly used to describe self-organized synchronization of biological systems such as swarms of fireflies, heart cells, or neurons. Mirollo and Strogatz [6] derived a theoretical framework for the convergence to synchrony. Various aspects regarding the application of the PCO model to wireless networks are addressed in literature: radio effects such as propagation delays [7], channel attenuation, and noise [8,9], and allowing for long synchronization words [10]. The rules that govern the PCO synchronization model are intriguingly simple and serve as a basis for inter-BS synchronization. The proposed cellular firefly synchronization (CelFSync) algorithm adapts the PCO model to account for constraints of cellular networks. CelFSync operates over-the-air, in a decentralized manner; no constraints are imposed on the availability of an external timing reference. As BSs and UTs typically transmit on successive downlink and uplink frames, two groups need to be distinguished; the BS group transmitting on the downlink and the UT group transmitting on the uplink. To facilitate the formation of two groups, two synchronization words are specified, one associated to BSs and the other to UTs. UTs transmit an uplink sync word based on their internal timing reference, which is received by BSs to update their own timing; in return UTs adjust their timing reference upon reception of downlink sync words from neighboring BSs. Thus, unlike [2], no separate frequency band is required as sync words are transmitted in-band with data. Moreover direct communication among BSs is not mandatory as synchronization is performed by hopping over UTs. As the downlink sync word is mandatory for conventional cellular systems to align the timing of UTs with the BS, the only overhead for inter-BS synchronization is the insertion of the uplink sync word. 
Thanks to the proposed strategy, the network is able to synchronize starting from an arbitrary misalignment, and propagation delays only affect the achieved accuracy but do not compromise the convergence to synchrony.
When considering a scenario where BSs are separated by several hundred meters up to a few kilometers, propagation delays severely affect the attainable timing accuracy. We propose to combine CelFSync with the timing advance procedure, which ensures that UT uplink transmissions arrive simultaneously at the BS. Compensating intracell propagation delays with the timing advance procedure, as well as selecting cell edge users to participate in CelFSync, are effective means to substantially improve the achieved interbase station timing accuracy.
The remainder of the paper is structured as follows. In Section 2 the PCO model and its achieved synchronization accuracy in the presence of delays are presented. In Section 3 CelFSync is developed by adapting the rules that govern the synchronization of PCOs to cellular networks, and Section 4 combines CelFSync with timing advance to compensate for the effects of propagation delays. Practical constraints regarding the implementation in cellular networks are addressed in Section 5. Simulation results in Section 6 investigate the time to convergence and the achieved accuracy for an indoor office environment as well as for an urban macrocell deployment composed of hexagonal cells.

Synchronization of Pulse-Coupled Oscillators
Pulse-coupled oscillators (PCOs) describe systems where individual nodes periodically emit pulses and adjust their internal time reference upon reception of pulses from neighboring oscillators. In this section the rules that govern the PCO model [6] are summarized, and the achieved accuracy in the stable state is elaborated.

Phase Function. A PCO is described by its phase function $\phi_i(t)$, $i \in \{1, \ldots, N\}$, where $N$ is the number of oscillators. This function evolves linearly over time with natural period $T$:

$$\frac{d\phi_i(t)}{dt} = \frac{1}{T}. \tag{1}$$

Whenever $\phi_i(t) = 1$ at reference instant $t = \tau_i$, the PCO is said to fire: it transmits a pulse and resets its phase to 0. Then $\phi_i(t)$ increases again linearly, and so on. Figure 1(a) plots the evolution of the phase function (1) during one period with initial condition $\phi_i(0) = 0$. The phase function can be seen as an internal counter that dictates the emission of pulses. In the following, we consider that all nodes have the same dynamics, that is, clock jitter is considered negligible.

Synchronization Rules. The goal of slot synchronization is to align the internal time references of all nodes, so that all PCOs fire simultaneously. To this end, an oscillator $i$ that is coupled to others is receptive to the pulses of its neighbors and adjusts its phase $\phi_i(t)$ whenever a pulse is received. When node $j$ fires at instant $\tau_j$, the phase of node $i$ instantly increases by a value $\Delta\phi$ that depends on its current value $\phi_i(\tau_j)$:

$$\phi_i(\tau_j) \rightarrow \phi_i(\tau_j) + \Delta\phi(\phi_i(\tau_j)). \tag{2}$$

The phase increment $\Delta\phi$ is determined by the phase response curve, which in [6] was chosen to be a linear function:

$$\phi_i(\tau_j) + \Delta\phi(\phi_i(\tau_j)) = \min\bigl(\alpha\,\phi_i(\tau_j) + \beta,\ 1\bigr), \tag{3}$$

where the coupling parameters $\alpha$ and $\beta$ determine the coupling between oscillators. Figure 1(b) plots the time evolution of the phase when receiving a pulse at $t = \tau_j$. The received pulse causes the oscillator to fire early. Provided that $\alpha > 1$ and $0 < \beta < 1$, a system of $N$ identical oscillators coupled all-to-all is always able to synchronize, so that all PCOs agree on a common reference instant, independent of initial timing misalignments [6].

Convergence. An example of the synchronization of pulse-coupled oscillators is shown in Figure 2. Initially all nodes start with a random phase, which increments according to (1) until one phase reaches the threshold. At this instant, and each time a phase reaches 1, neighboring nodes increment their phase according to (3). Over time, order emerges from a seemingly chaotic situation where nodes fire randomly, and in Figure 2, all nodes fire in synchrony within five periods.
A key feature in the synchronization of PCOs is that, over time, nodes cluster into groups of oscillators. This phenomenon is referred to as absorption and occurs when a pulse forces nodes to exceed their firing threshold, causing them to fire immediately. The absorption limit $\phi_{\mathrm{abs}}$, that is, the smallest phase at which a received pulse triggers immediate firing, is derived from (3) by setting $\alpha\,\phi_{\mathrm{abs}} + \beta = 1$:

$$\phi_{\mathrm{abs}} = \frac{1 - \beta}{\alpha}. \tag{4}$$

As nodes have the same internal dynamics, and if they are coupled all-to-all, absorptions are permanent (see Figure 2). Therefore nodes following the PCO rules first gather into groups that gradually absorb one another and, after some time, always coalesce into one synchronized group.
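The clustering and absorption behavior described above can be reproduced with a small event-driven simulation. The following Python sketch assumes a phase response of the form min(αφ + β, 1), all-to-all coupling, no delays, and a normalized period T = 1. The function names `absorption_limit` and `simulate_pco` and the default parameter values are illustrative assumptions, not taken from the paper.

```python
import random


def absorption_limit(alpha, beta):
    """Smallest phase at which a received pulse triggers immediate firing."""
    return (1.0 - beta) / alpha


def simulate_pco(n=10, alpha=1.2, beta=0.01, max_events=20000, seed=1):
    """Return the number of firing events until all n oscillators fire together.

    Phases are normalized to [0, 1] with period T = 1 (illustrative assumption).
    Returns None if synchrony is not reached within max_events.
    """
    rng = random.Random(seed)
    phase = [rng.random() for _ in range(n)]
    for event in range(1, max_events + 1):
        # Advance all phases linearly until the leading oscillator reaches 1.
        step = 1.0 - max(phase)
        phase = [p + step for p in phase]
        fired = set()
        pending = [i for i, p in enumerate(phase) if p >= 1.0 - 1e-12]
        while pending:
            j = pending.pop()
            if j in fired:
                continue
            fired.add(j)
            # The pulse of node j increments the phase of every node that has
            # not fired yet; nodes pushed to the threshold are absorbed.
            for i in range(n):
                if i in fired:
                    continue
                phase[i] = min(alpha * phase[i] + beta, 1.0)
                if phase[i] >= 1.0 - 1e-12:
                    pending.append(i)
        for j in fired:
            phase[j] = 0.0  # firing nodes reset their phase
        if len(fired) == n:
            return event
    return None


if __name__ == "__main__":
    print("absorption limit:", absorption_limit(1.2, 0.01))
    print("events until synchrony:", simulate_pco())
```

Nodes that are absorbed reset together and then evolve identically, which is why groups never split again in this delay-free setting.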
In [11] Lucarelli and Wang extended the demonstration of [6] to remove the all-to-all assumption. Under weak coupling assumptions, that is, α close to 1 and β close to 0 in (3) (no proof for strong coupling exists), equivalent phase deviation variables are derived for each node (each variable represents the mean local interactions over one period) and are shown to asymptotically converge to the same value [11].
Unfortunately the analysis in [11] is not applicable when delays are introduced. Izhikevich showed that there is no equivalent phase deviation variable when interactions are delayed [12]. As the proposed inter-BS synchronization scheme always delays interactions (see Section 3), an analytical convergence study appears infeasible. Convergence is consequently studied through simulations in Section 6.

Impact of Delays.
When delays are introduced, such as propagation delays, the coupling between two nodes $i$ and $j$ is delayed by $\nu_{ij}$. In the presence of coupling delays a network of PCOs may become unstable, so that the network is unable to synchronize [13]. Stability is regained by introducing a refractory period of duration $T_{\mathrm{refr}}$ after reference instant $\tau_i$ [7]. In refractory, when $\phi_i(t) < \phi_{\mathrm{refr}}$ with $\phi_{\mathrm{refr}} = T_{\mathrm{refr}}/T$, no phase increment is possible, so that received pulses are not acknowledged. The duration of the refractory period needs to be at least twice the maximum delay between two nodes, so that echoes are not acknowledged [7]:

$$T_{\mathrm{refr}} \ge 2 \max_{i,j} \nu_{ij}. \tag{5}$$

Because of delays, nodes are no longer able to perfectly align their reference instants $\tau_i$ [7]. Nevertheless nodes converge to a stable state where reference instants are spread within an interval limited only by the coupling delays $\nu_{ij}$, as detailed for networks of two and three nodes in the remainder of this section. Further discussion on the achieved accuracy of the PCO scheme in the presence of delays is available in [14].

Two Nodes.
The accuracy for a network of $N = 2$ nodes is bounded by the interval of reference instants leading to a stable state [7]. Suppose that the reference instants of two nodes $i$ and $j$ are such that $\tau_j > \tau_i + \nu_{ij}$; then node $i$ is the forcing node that imposes its delayed reference onto node $j$. After coupling, node $j$ is pulled to the delayed timing of node $i$, $\tau_j = \tau_i + \nu_{ij}$ (as shown for nodes $i = 1$ and $j = 2$ in Figure 3), as long as the pulse of node $i$ falls within the absorption interval (4) of node $j$, that is, as long as the phase of node $j$ upon reception, $\phi_j(\tau_i + \nu_{ij})$, is at or above the absorption limit. If $\tau_i > \tau_j + \nu_{ij}$, the roles are reversed: node $j$ imposes its delayed timing onto node $i$, so that after coupling $\tau_i = \tau_j + \nu_{ij}$. On the other hand, if the reference instant of node $i$ is within the range

$$\tau_j - \nu_{ij} \le \tau_i \le \tau_j + \nu_{ij}, \tag{6}$$

the pulses from node $j$ fall into the refractory period of node $i$, and vice versa, and are thus not acknowledged. This corresponds to the stable state where the phases of both nodes are not adjusted. According to (6) the achieved accuracy is bounded by the propagation delay $\nu_{ij}$ and is given by [7]:

$$\epsilon_{ij} = |\tau_i - \tau_j| \le \nu_{ij}. \tag{7}$$

The introduction of a refractory period thus may result in a state where one node imposes its timing onto the other, in a similar way to a master-slave synchronization scheme. However, the achieved state is random: it depends on the initial condition and on interactions with other nodes in the network. Therefore the role of the forcing node is arbitrary, and PCO synchronization is still considered decentralized.

EURASIP Journal on Wireless Communications and Networking

Three Nodes. The analysis of [7] is extended to a network of N = 3 nodes in the following. Two cases are distinguished.
(i) The forcing node is directly connected with all nodes.
(ii) The forcing node is the edge node of a line topology and imposes its timing to the other edge node by hopping over the center node.
Considering (i), suppose that node 1 is the forcing node that imposes its delayed timing onto nodes 2 and 3. This state is shown in Figure 3: node 1 fires at instant $t = \tau_1$, which causes nodes 2 and 3 to increment their phases at instants $\tau_1 + \nu_{12}$ and $\tau_1 + \nu_{13}$, respectively. Assuming that their phases exceed the absorption limit (4), nodes 2 and 3 fire at instants $\tau_2 = \tau_1 + \nu_{12}$ and $\tau_3 = \tau_1 + \nu_{13}$, and subsequently enter refractory. No further phase increments occur because the pulses from nodes 2 and 3 are received when nodes are in refractory (5). Therefore the network is in a stable state, and the achieved accuracies of node 1 relative to nodes 2 and 3 amount to $\epsilon_{12} = \nu_{12}$ and $\epsilon_{13} = \nu_{13}$, respectively. Interestingly, the accuracy between nodes 2 and 3 is equal to the difference in delays with forcing node 1, that is, $\epsilon_{23} = |\nu_{12} - \nu_{13}|$. Thus this achieved accuracy does not depend on the direct delay $\nu_{23}$ but on the delay difference with the forcing node 1.
In case (ii) the considered nodes form a line topology, where the edge nodes 1 and 3 cannot communicate directly. Suppose that node 1 is the forcing node that imposes its timing onto node 3 via the center node 2. As the accuracy between adjacent nodes is bounded by (7), that is, $\epsilon_{12} \le \nu_{12}$ and $\epsilon_{23} \le \nu_{23}$, the resulting accuracy over two hops between edge nodes 1 and 3 is bounded by the sum of delays: $\epsilon_{13} \le \nu_{12} + \nu_{23}$.
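The two cases above reduce to simple arithmetic on the coupling delays. The following Python sketch evaluates the stable-state accuracies for both topologies, together with the minimum refractory duration (at least twice the maximum delay). The function names and the example distances are hypothetical and serve only as illustration.

```python
C = 299_792_458.0  # speed of light in m/s


def delay(distance_m):
    """Propagation delay in seconds for a given distance in meters."""
    return distance_m / C


def min_refractory(delays):
    """Refractory duration must be at least twice the maximum coupling delay."""
    return 2.0 * max(delays)


def star_accuracy(nu_12, nu_13):
    """Case (i): node 1 is the forcing node and reaches nodes 2 and 3 directly."""
    return {
        "eps_12": nu_12,
        "eps_13": nu_13,
        "eps_23": abs(nu_12 - nu_13),  # independent of the direct delay nu_23
    }


def line_accuracy_bound(nu_12, nu_23):
    """Case (ii): upper bound between edge nodes 1 and 3 over two hops."""
    return nu_12 + nu_23


if __name__ == "__main__":
    # Hypothetical distances: 300 m and 450 m from the forcing node.
    nu_12, nu_13 = delay(300.0), delay(450.0)
    print(star_accuracy(nu_12, nu_13))
    print(line_accuracy_bound(nu_12, delay(200.0)))
    print(min_refractory([nu_12, nu_13]))
```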

Decentralized Intercell Synchronization
This section presents an adaptation of the PCO model to perform intercell synchronization. To facilitate reliable exchange of reference instants in the presence of signal fading, interference, and noise, long synchronization sequences that are transmitted in-band with data are considered instead of pulses. Furthermore, half-duplex transmission is considered, which implies that nodes cannot receive whilst transmitting. To this end, when two nodes transmit sync words that partially overlap, both nodes are unable to detect the sync word sent by the other node, referred to as deafness between nodes. Hence both nodes are effectively uncoupled, an effect which may severely disrupt intercell synchronization. Further accounting for constraints in cellular systems, the frame structure does not allow for overlapping downlink and uplink slots. Thus synchronized BSs and UTs should not transmit simultaneously. The proposed cellular firefly synchronization (CelFSync) scheme takes into account these fundamental constraints, by resorting to an out-of-phase synchronization regime, introduced in Section 3.1. CelFSync relies on two synchronization sequences, one transmitted by BSs to adjust timing references of UTs, and a second one transmitted by UTs to adjust timing references of BSs, based on rules that are established in Section 3.2. The detection of the two distinct synchronization sequences in an asynchronous environment is discussed in Section 3.3. For ease of explanation, propagation delays are neglected in this section and are treated specifically in Section 4.

Synchronization Regimes.
A system of PCOs is said to be synchronized when all nodes have reached a stable state where their internal timing references are aligned, constrained to the considered synchronization regime [15]. The synchronization regime is characterized by the phase difference $\Delta = \tau_1 - \tau_2$ between two synchronized groups in the stable state, where members of the same group are perfectly aligned. Depending on the phase difference $\Delta$, three synchronization regimes are distinguished [15], as illustrated in Figure 4. If there is no phase shift, $\Delta = 0$, the regime is said to be in-phase. If the phase shift is exactly equal to half a period, $\Delta = T/2$, nodes have reached an antiphase synchronization regime. Finally, if the phase difference is $\Delta \neq 0$ and $\Delta \neq T/2$ between the first and second groups (and $T - \Delta$ between the second and first groups), then oscillators are out-of-phase synchronized.
The in-phase regime is the most common form of synchronization; pacemaker cells pulse simultaneously to pump the heart, fireflies emit light at the same time. Antiphase synchronization is also familiar; when walking, our legs are antiphase synchronized: the left foot touches the ground half a period after the right one, and vice versa.
Following the frame structure of cellular systems composed of successive downlink and uplink frames, BSs are to be synchronized out-of-phase with UTs. Out-of-phase synchronization ensures that uplink and downlink transmissions in the steady state do not overlap, so that detrimental effects of deafness between nodes, inherent to half-duplex transmission, are mitigated.

Figure 5: Cellular network topology with two BSs and one UT.
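The three regimes are distinguished solely by the phase difference Δ relative to the period T. As a minimal sketch, the classification can be written as follows; the function name `sync_regime` and the numerical tolerance `tol` are assumptions introduced for illustration.

```python
def sync_regime(delta, period, tol=1e-9):
    """Classify the synchronization regime from the phase difference delta."""
    d = delta % period  # phase differences are only defined modulo the period
    if min(d, period - d) <= tol:
        return "in-phase"
    if abs(d - period / 2.0) <= tol:
        return "antiphase"
    return "out-of-phase"


if __name__ == "__main__":
    for delta in (0.0, 0.5, 0.3):
        print(delta, sync_regime(delta, period=1.0))
```

With a frame structure of successive uplink and downlink frames, CelFSync targets the third outcome, with Δ fixed by the relative position of the two sync words.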

Cellular Firefly Synchronization.
The goal of CelFSync is to synchronize in time the transmission slots of a cellular network, so that neighboring BSs mutually align the start of the superframe preamble. The timing information between BSs is conveyed by implicitly hopping over mobiles close to the cell edge, as exemplified in Figure 5. Hopping over the UT extends the reception range of sync words, and thus allows for robust intercell synchronization, even when neighboring base stations do not hear one another. CelFSync adapts the PCO synchronization model to establish an out-of-phase synchronization regime. The desired stable state is illustrated for one user terminal UT $i$ and one base station BS $a$ in Figure 6. Unlike the PCO model, instead of pulses, nodes transmit long synchronization sequences denoted by UL Sync and DL Sync of duration $T_{UL,Sync}$ and $T_{DL,Sync}$, respectively. For slot synchronization three states are distinguished: transmission of the sync word, the refractory period, and the listen state. Transmission starts when a node fires (see $\tau_{UT,i}$ for UT $i$ in Figure 6). Half-duplex transmission is considered: when a node transmits, its receiver is switched off. After transmission of the sync word, nodes enter the refractory period, where detected sync words are not acknowledged. In the listen state, nodes maintain a phase function that is adjusted upon detection of a sync word. The separation of nodes into two predefined groups is achieved by three types of interactions, as follows.
UT-BS Coupling. Base station BS $a$ estimates the reference instant of UT $i$ by detecting its sync word UL Sync; the estimate of this reference instant is denoted by $\hat{\tau}_{UT,i}$. In order to establish the desired out-of-phase synchronization regime, BS $a$ adjusts its phase function $\phi_{BS,a}$ exactly $\Delta$ seconds after UT $i$ has fired, at instant $\theta_{UT,i} = \hat{\tau}_{UT,i} + \Delta$. If the coupling instant $\theta_{UT,i}$ falls within the listen state of BS $a$, the receiving BS increments its phase:

$$\phi_{BS,a}(\theta_{UT,i}) \rightarrow \phi_{BS,a}(\theta_{UT,i}) + \Delta\phi_{BS}\bigl(\phi_{BS,a}(\theta_{UT,i})\bigr). \tag{8}$$

The phase response curve $\Delta\phi_{BS}$ is chosen according to (3), such that phase increments are strictly positive:

$$\phi + \Delta\phi_{BS}(\phi) = \min(\alpha_{BS}\,\phi + \beta_{BS},\ 1). \tag{9}$$

The coupling parameters are chosen in accordance with the PCO synchronization model: $\alpha_{BS} > 1$ and $0 < \beta_{BS} < 1$.
The BS decoding delay $T_{BS,dec}$, shown in Figure 6, specifies the interaction delay between the instant at which UL Sync is detected, $\hat{\tau}_{UT,i} + T_{UL,Sync}$, and the coupling instant $\theta_{UT,i} = \hat{\tau}_{UT,i} + \Delta$. It is an important parameter for two reasons. Firstly, $T_{BS,dec}$ allows for a processing delay at the receiver in order to perform link-level synchronization. Secondly, $T_{BS,dec}$ needs to be appropriately chosen, so that the desired out-of-phase synchronization regime is reached. As BSs fire $\Delta$ after UTs, the BS decoding delay yields

$$T_{BS,dec} = \Delta - T_{UL,Sync}. \tag{10}$$

BS-UT Coupling. The considered user terminal UT $i$ obtains an estimate $\hat{\tau}_{BS,a}$ of the reference instant of BS $a$. If the coupling instant $\theta_{BS,a} = \hat{\tau}_{BS,a} + T - \Delta$ associated with the reception of DL Sync from BS $a$ falls within the listen state of UT $i$, the receiving UT increments its phase:

$$\phi_{UT,i}(\theta_{BS,a}) \rightarrow \phi_{UT,i}(\theta_{BS,a}) + \Delta\phi_{UT}\bigl(\phi_{UT,i}(\theta_{BS,a})\bigr). \tag{11}$$

Again the phase response curve for BS-UT coupling $\Delta\phi_{UT}$ is chosen according to (3):

$$\phi + \Delta\phi_{UT}(\phi) = \min(\alpha_{UT}\,\phi + \beta_{UT},\ 1), \tag{12}$$

with the coupling parameters $\alpha_{UT} > 1$ and $0 < \beta_{UT} < 1$. The UT decoding delay that forces UTs to fire $T - \Delta$ after BSs is equal to (see Figure 6):

$$T_{UT,dec} = T - \Delta - T_{DL,Sync}. \tag{13}$$

Thanks to this strategy, the formation of two groups is controlled. Starting from an arbitrary initial misalignment, where all reference instants $\tau_{UT,i}$, $\tau_{BS,a}$ are randomly distributed within $[0, T]$, by following simple coupling rules, reference instants of UTs and BSs separate over time into two groups; all BSs fire $\Delta$ after UTs, and all UTs fire $T - \Delta$ after BSs. This state corresponds to the synchronized state shown in Figure 6. Convergence is verified through simulations in Section 6; by appropriately selecting the coupling parameters, it is shown that synchronization is always accomplished.
To speed up the convergence of CelFSync, two enhancements are possible, namely BS-BS and UT-UT couplings and the selection of active UTs.

BS-BS and UT-UT Coupling.
In case BSs can communicate directly or UTs are placed close to one another, convergence may be accelerated by allowing coupling between nodes of the same group. Moreover, the occurrence of deafness between nodes decreases because the number of nodes that are potentially coupled is increased. As half-duplex transmission is considered, BS-BS and UT-UT couplings are useful only during the coarse synchronization phase, that is, among nodes whose reference instants are misaligned by more than the sync word length.
Phase adjustments are made similarly to (8) and (11) for BSs and UTs; however, decoding delays are different, as nodes need to align in time with other nodes from their own group. Therefore the interaction delay upon detection of DL Sync and UL Sync needs to be equal to one period $T$, giving a decoding delay of $T_{BS\text{-}BS,dec} = T - T_{DL,Sync}$ for BSs and $T_{UT\text{-}UT,dec} = T - T_{UL,Sync}$ for UTs.
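The decoding delays for all four coupling types follow from the requirement that the interaction delay equals Δ, T − Δ, or T, measured from the start of the received sync word. The following Python sketch collects them in one helper. It assumes that a sync word is detected at the end of its reception, so that the BS and UT decoding delays are Δ − T_UL,Sync and T − Δ − T_DL,Sync, respectively; the function name and the numerical values of Δ and T_DL,Sync in the usage example are hypothetical.

```python
def decoding_delays(period, delta, t_ul_sync, t_dl_sync):
    """Return the decoding delay for each coupling type (same unit as inputs)."""
    delays = {
        "UT-BS": delta - t_ul_sync,           # BS fires delta after the UT
        "BS-UT": period - delta - t_dl_sync,  # UT fires period - delta after the BS
        "BS-BS": period - t_dl_sync,          # same-group alignment: one full period
        "UT-UT": period - t_ul_sync,
    }
    for name, value in delays.items():
        if value < 0:
            raise ValueError(f"negative decoding delay for {name} coupling")
    return delays


if __name__ == "__main__":
    # Superframe of 5.8 ms and UL Sync of 45 us; delta and the DL Sync
    # duration are placeholder values chosen only for this example.
    print(decoding_delays(period=5.8e-3, delta=2.0e-3,
                          t_ul_sync=45e-6, t_dl_sync=45e-6))
```

The check for negative values reflects that Δ must be at least as long as the uplink sync word, and T − Δ at least as long as the downlink sync word.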
Active UT Selection. Since uplink sync words UL Sync should be heard by multiple BSs, it is reasonable to select a subset of UTs close to the cell boundary to participate in intercell synchronization. Therefore, in each cell, the base station selects the $N_{UT}$ UTs with the largest propagation delay among the $N_{UT,tot}$ UTs in the cell. The remaining $N_{UT,tot} - N_{UT}$ UTs are not active in CelFSync and follow the timing reference dictated by their closest BS, by aligning their local clocks based on DL Sync.
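The selection rule amounts to ranking the UTs of a cell by their propagation delay to the serving BS. A minimal sketch, assuming that the BS already holds a delay estimate per UT (the dictionary layout and the function name are hypothetical):

```python
def select_active_uts(delays, n_active):
    """Return the ids of the n_active UTs with the largest propagation delay.

    delays maps a UT identifier to its propagation delay to the serving BS.
    """
    ranked = sorted(delays, key=lambda ut: delays[ut], reverse=True)
    return ranked[:n_active]


if __name__ == "__main__":
    cell = {"ut1": 0.4e-6, "ut2": 1.9e-6, "ut3": 1.2e-6, "ut4": 0.1e-6}
    print(select_active_uts(cell, n_active=2))
```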

Synchronization Word Detection.
CelFSync relies on the detection of transmitted DL Sync and UL Sync sequences. In the following, we assume that uplink and downlink sync words are two different random sequences, each composed of $M$ symbols. Sync word detection is carried out by the link-level synchronization unit, which cross-correlates the received signal stream $x(t)$ with the sync word $s(t)$, where $s(t) = s_{UL}(t)$ if uplink sync words are to be detected, and $s(t) = s_{DL}(t)$ otherwise. The output of the link-level synchronization unit $i$ is denoted by $r_i(t) = \int x(t - \tau)\, s^{*}(\tau)\, d\tau$. The correlator output produces a series of peaks, in a similar way to the emission of pulses in the PCO model, and detection of a sync word is declared when $r_i(t)$ exceeds the detection threshold $R$ [16].
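The detection principle can be illustrated with a discrete-time toy example. The sketch below is not the detector of the paper: it assumes a real-valued binary sync word of M = 128 symbols, one sample per symbol, additive Gaussian noise, a sliding correlation of the form r[n] = Σ_k x[n + k] s[k], and a threshold R = 0.75 M. All names and parameter values are illustrative.

```python
import random


def make_sync_word(m, rng):
    """Random binary (+1/-1) sync word of m symbols."""
    return [rng.choice((-1.0, 1.0)) for _ in range(m)]


def correlate(x, s):
    """Sliding correlation r[n] = sum_k x[n + k] * s[k] for real sequences."""
    m = len(s)
    return [sum(x[n + k] * s[k] for k in range(m))
            for n in range(len(x) - m + 1)]


def detect(x, s, threshold):
    """Return all sample indices at which the correlator exceeds the threshold."""
    return [n for n, r in enumerate(correlate(x, s)) if r > threshold]


rng = random.Random(7)
M = 128
s_ul = make_sync_word(M, rng)
offset = 200  # true start of the sync word within the received stream
x = [rng.gauss(0.0, 0.3) for _ in range(600)]  # noise-only stream
for k, symbol in enumerate(s_ul):
    x[offset + k] += symbol  # embed the sync word
hits = detect(x, s_ul, threshold=0.75 * M)
print("detected at:", hits)
```

Lowering the threshold raises the chance of spurious peaks, while raising it increases the risk of missing an attenuated sync word.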
Signal fading may attenuate the received signal $x(t)$, which may result in a missed detection. The probability that reference instants $\tau_{UT,i}$ and $\tau_{BS,a}$ are correctly detected is defined as [17]

$$P_d = \Pr\{\, r_i(t) > R \mid H \,\}, \tag{14}$$

where $H$ is the hypothesis that a sync word is present at the receiver. On the other hand, as sync words are transmitted in-band, cross-correlation of $s(t)$ with other sync words, payload data, or noise produces spurious peaks, so that detection of a sync word may be declared although no sync word is present, giving rise to a false alarm. The false alarm probability is defined as [17]

$$P_{fa} = \Pr\{\, r_i(t) > R \mid \bar{H} \,\}, \tag{15}$$

where $\bar{H}$, the hypothesis that no sync word is present at the receiver, is the complement of $H$. The Neyman-Pearson criterion is used to design the sync word detector [17]: the detection threshold $R$ is set according to the desired false alarm rate $P_{fa}$; once $R$ is set, the detection rate $P_d$ is determined. The impact of false alarm and detection rates on an adaptation of the PCO model to ad hoc networks was studied for a multicarrier system in [18]. It was shown that false alarms have a higher impact on the convergence than missed detections, which occur with probability $1 - P_d$. Hence, it is necessary to maintain a sufficiently low false alarm rate [18].
The reliability of the link-level synchronization unit can be enhanced by increasing the length of the sync word M. Increasing M improves the detection rate for a given false alarm rate, at the expense of higher overhead [18].

Compensation of Propagation Delays
The accuracy of CelFSync is limited by propagation delays, similarly to the PCO model discussed in Section 2. In an indoor environment where distances between nodes are typically small, propagation delays are negligible. However, for cellular systems where the inter-BS distance is up to a few kilometers, Section 4.1 reveals that propagation delays cannot be ignored. A common procedure to align uplink transmissions is the timing advance procedure, described in Section 4.2. Timing advance is combined with CelFSync in Section 4.3 to achieve a timing accuracy within a fraction of the inter-BS propagation delays.

Achieved Accuracy in the Stable State.
After CelFSync converges and reaches a stable state, reference instants of BSs and UTs are out-of-phase synchronized (see Figure 6), and no phase increments occur. In the following discussion a sufficient refractory period (5) is assumed; then stability is maintained, and the achieved timing accuracy in the stable state between any two nodes is bounded by (7). In the presence of propagation delays, the stable state condition (6) in terms of the reference instants of BS $a$ and UT $i$ translates to

$$\tau_{UT,i} + \Delta - \nu_{ai} \le \tau_{BS,a} \le \tau_{UT,i} + \Delta + \nu_{ai}, \tag{16}$$

where $\nu_{ai}$ is the propagation delay between BS $a$ and UT $i$. When the upper bound in (16) is reached, $\tau_{BS,a} = \tau_{UT,i} + \Delta + \nu_{ai}$, UT $i$ is the forcing node that imposes its timing onto BS $a$. Likewise, the lower bound in (16) is reached, $\tau_{UT,i} = T - \Delta + \tau_{BS,a} + \nu_{ai}$, when BS $a$ is the forcing node that imposes its timing onto UT $i$.
The effect of propagation delays on the achieved inter-BS accuracy in the stable state is analyzed with the aid of a case study, where two BSs are synchronized via one UT, as depicted in Figure 5. This case study resembles the discussion for a network with $N = 3$ nodes presented in Section 2.4.2. Clearly, the worst case inter-BS timing misalignment is encountered when one BS is the forcing node. Then the two end nodes BS $a$ and BS $b$ synchronize by hopping over UT $i$, so that the timing misalignments over two hops add up. Applying the bound (16), the inter-BS accuracy is upper bounded by the sum of the BS $a$ to UT $i$ and UT $i$ to BS $b$ propagation delays:

$$\epsilon_{ab} = |\tau_{BS,a} - \tau_{BS,b}| \le \nu_{ai} + \nu_{bi}. \tag{17}$$

Given that in cellular networks the inter-BS distance is up to a few kilometers, propagation delays have a major impact on the achieved accuracy in the stable state.

Timing Advance Procedure.
As UTs are arbitrarily distributed within the cell, the distance $d_{ai}$ between UT $i$ and BS $a$ varies. Since propagation delays are distance dependent through $\nu_{ai} = d_{ai}/c$, where $c$ is the speed of light, the observed timing references of BS $a$ measured at different UTs, denoted $\tilde{\tau}_{BS,a} = \tau_{BS,a} + \nu_{ai}$, are mutually different.
To ensure that uplink transmissions arrive simultaneously at their own base station, timing advance is a common procedure in current cellular systems [19] and in wired telecommunication systems [20]. For timing advance, UT $i$ advances its transmission by $\nu_{ai}$, the propagation delay to its serving BS, taken to be BS $a$ (see Figure 5). The uplink reference instant of UT $i$ including timing advance is given by

$$\tau_{UTA,i} = \tau_{UT,i} - \nu_{ai}. \tag{18}$$

The propagation delay $\nu_{ai}$ may be determined by estimating the round trip delay between BS $a$ and UT $i$ [21]. Upon reception of DL Sync from BS $a$, UT $i$ responds with the transmission of a random access preamble (RAP) at $\tau_{RAP,i} = \tilde{\tau}_{BS,a} + T_{RAP}$. Since $T_{RAP}$ is a constant known to BS $a$, the round trip delay $2\nu_{ai}$ is determined by detecting the received timing of the RAP at BS $a$. In addition, the RAP identifies UT $i$, so that BS $a$ can distribute the estimate of $\nu_{ai}$ to UT $i$.
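The round trip measurement can be sketched in a few lines. The example below assumes an idealized exchange without detection errors: the BS transmits DL Sync at its reference instant, the UT answers with the RAP a fixed time T_RAP after the observed BS timing, and the BS halves the excess delay of the received RAP. Function and variable names are illustrative.

```python
def estimate_propagation_delay(tau_bs, t_rap_rx, t_rap):
    """Estimate the one-way delay from the RAP arrival time at the BS.

    The RAP arrives at t_rap_rx = tau_bs + 2 * nu + t_rap, so the one-way
    delay nu is half of the excess over the known constant t_rap.
    """
    return (t_rap_rx - tau_bs - t_rap) / 2.0


def advanced_uplink_instant(tau_ut, nu):
    """Uplink reference instant of the UT after applying timing advance."""
    return tau_ut - nu


if __name__ == "__main__":
    nu_true = 2.0e-6            # hypothetical one-way delay (about 600 m)
    tau_bs, t_rap = 0.0, 1.0e-3
    observed = tau_bs + nu_true             # BS timing as seen by the UT
    rap_tx = observed + t_rap               # UT sends the RAP
    rap_rx = rap_tx + nu_true               # RAP arrives at the BS
    nu_hat = estimate_propagation_delay(tau_bs, rap_rx, t_rap)
    print("estimated delay:", nu_hat)
    print("advanced instant:", advanced_uplink_instant(3.0e-3, nu_hat))
```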

CelFSync with Timing Advance.
In order to combat propagation delays, we propose to combine CelFSync with the timing advance procedure. If UT $i$ knows the propagation delay to its serving base station BS $a$, the corresponding round trip delay of $2\nu_{ai}$ can be compensated. Owing to the multipoint-to-point topology specific to cellular networks, BS $a$ of cell $A$ typically serves several mobiles UT $i$, $i \in A$, each with a specific propagation delay $\nu_{ai}$. Hence, all timing inaccuracies, that is, the propagation delays from BS $a$ to UT $i$ and back from UT $i$ to BS $a$, must be compensated for at the mobile UT $i$. This is accomplished by advancing both the transmitted UL Sync and the coupling of the received DL Sync at UT $i$ by the BS-UT propagation delay $\nu_{ai}$.
For the following discussion, suppose that UT $i$ has carried out the timing advance procedure with BS $a$, but its UL Sync transmission is received by BS $b$.
UT-BS Coupling. For CelFSync with timing advance, UT $i$ sends the uplink sync word UL Sync at the advanced reference instant $\tau_{UTA,i} = \tau_{UT,i} - \nu_{ai}$ in (18). Then a phase increment occurs at BS $b$ at instant $\theta_{UTA,i} = \tau_{UTA,i} + \Delta + \nu_{bi}$, so that (8) is transformed to

$$\phi_{BS,b}(\theta_{UTA,i}) \rightarrow \phi_{BS,b}(\theta_{UTA,i}) + \Delta\phi_{BS}\bigl(\phi_{BS,b}(\theta_{UTA,i})\bigr), \tag{19}$$

with $\theta_{UTA,i} = \tau_{UT,i} + \Delta + \nu_{bi} - \nu_{ai}$.

BS-UT Coupling. For BS-UT coupling (11), we propose to also advance the coupling by the propagation delay. Given that UT $i$ is timing aligned to BS $a$ but receives DL Sync from BS $b$, the mobile UT $i$ advances its coupling by $\nu_{ai}$. Then the received DL Sync from BS $b$ leads to a phase increment at UT $i$ at instant $\theta_{BSA,b} = \theta_{BS,b} - \nu_{ai}$, so that (11) changes to

$$\phi_{UT,i}(\theta_{BSA,b}) \rightarrow \phi_{UT,i}(\theta_{BSA,b}) + \Delta\phi_{UT}\bigl(\phi_{UT,i}(\theta_{BSA,b})\bigr), \tag{20}$$

with $\theta_{BSA,b} = \tau_{BS,b} + T - \Delta + \nu_{bi} - \nu_{ai}$.

Figure 7 summarizes the proposed combination of CelFSync with timing advance: UT $i$ starts transmission at $\tau_{UTA,i} = \tau_{UT,i} - \nu_{ai}$, so that the coupling at BS $a$ occurs exactly at $\tau_{BS,a} = \tau_{UT,i} + \Delta$; in return, BS $a$ starts transmission of its sync word, whose decoding time is reduced at UT $i$ by $\nu_{ai}$, so that UT $i$ fires exactly $T - \Delta$ after BS $a$. Hence, all entities within one cell are perfectly timing aligned, and thus the only remaining source of timing inaccuracies is between entities of neighboring cells.

Figure 7: Combination of CelFSync with timing advance.
In the synchronized steady state, sync words observed at $\theta_{UTA,i}$ and $\theta_{BSA,b}$ must fall into the refractory period, such that $\tau_{BS,b} \le \theta_{UTA,i} < \tau_{BS,b} + T_{\mathrm{refr}}$ for UT-BS coupling, and $\tau_{UT,i} \le \theta_{BSA,b} < \tau_{UT,i} + T_{\mathrm{refr}}$ for BS-UT coupling. The steady state accuracy between BS $b$ and UT $i$ is bounded by the two extreme cases when either BS $b$ or UT $i$ is the forcing node. In case UT $i$ is forcing, the observed timing at BS $b$ yields $\tau_{BS,b} = \tau_{UT,i} + \Delta + \nu_{bi} - \nu_{ai}$. Otherwise, if BS $b$ is forcing, the timing imposed on UT $i$ amounts to $\tau_{UT,i} = \tau_{BS,b} - \Delta + \nu_{bi} - \nu_{ai}$. This means that the achieved accuracy in the steady state between BS $b$ and UT $i$ is bounded by

$$|\tau_{BS,b} - (\tau_{UT,i} + \Delta)| \le |\nu_{bi} - \nu_{ai}|. \tag{21}$$

Therefore combining timing advance with CelFSync always achieves an accuracy that is bounded by the difference of UT-BS propagation delays.
In order to analyze the achieved inter-BS accuracy, the case study depicted in Figure 5 and discussed in Section 4.1 is revisited. Given that UT $i$ is time aligned to BS $a$, that is, $\tau_{UTA,i} = \tau_{UT,i} - \nu_{ai}$, the only remaining source of inaccuracies is the link from UT $i$ to BS $b$, so that the UT-BS accuracy bound (21) can be directly applied. Substituting $\tau_{UT,i} = \tau_{BS,a} - \Delta$ into (21), the inter-BS accuracy between BS $a$ and BS $b$ over two hops is bounded by

$$\epsilon_{ab} = |\tau_{BS,a} - \tau_{BS,b}| \le |\nu_{ai} - \nu_{bi}|. \tag{22}$$

Provided that UT $i$ is located near the cell boundary, its propagation delays to BS $a$ and BS $b$ are similar, so that the difference $|\nu_{ai} - \nu_{bi}|$ is much smaller than the individual delays $\nu_{ai}$ and $\nu_{bi}$. This is in sharp contrast to the achieved accuracy without timing advance in (17), which is bounded by the sum of propagation delays. Increasing the UT density per cell $N_{UT,tot}$ increases the probability that selected UTs are close to the cell edge, which has the appealing effect that the inter-BS accuracy (22) improves. The accuracy bound is extended to multiple UTs in the Appendix. The working principle of CelFSync including timing advance is summarized as follows.
(i) UT $i$ connects to the BS with the strongest received signal strength, assumed to be BS $a$.
(ii) UT $i$ aligns its timing to BS $a$ by carrying out a timing advance procedure, as described in Section 4.2.
(iii) If identified as active, UT $i$ emits UL Sync at the reference instants $\tau_{\mathrm{UTA},i}$ in (18) and adjusts its phase $\phi_{\mathrm{UT},i}$ upon reception of DL Sync according to (20).
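The effect of timing advance on the accuracy bound can be illustrated numerically. The following is a minimal sketch, not part of the original evaluation: it compares the two bounds stated in the text, namely the sum of the two UT-BS propagation delays without timing advance and their absolute difference with timing advance. The node coordinates and the function names are hypothetical and chosen only for illustration.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def propagation_delay(p, q):
    """Propagation delay in seconds between two points given in metres."""
    return math.dist(p, q) / C


def accuracy_bounds(bs_a, bs_b, ut):
    """Return (bound without timing advance, bound with timing advance).

    Without timing advance the bound is the sum of the two UT-BS delays;
    with timing advance it is their absolute difference.
    """
    nu_ai = propagation_delay(bs_a, ut)
    nu_bi = propagation_delay(bs_b, ut)
    return nu_ai + nu_bi, abs(nu_bi - nu_ai)


# Hypothetical layout: two BSs 2 km apart, one UT near the cell boundary
# and one UT close to BS a.
bs_a, bs_b = (0.0, 0.0), (2000.0, 0.0)
ut_edge = (950.0, 100.0)
ut_inner = (300.0, 100.0)

for name, ut in [("edge", ut_edge), ("inner", ut_inner)]:
    without_ta, with_ta = accuracy_bounds(bs_a, bs_b, ut)
    print(f"{name}: sum = {without_ta * 1e6:.2f} us, "
          f"difference = {with_ta * 1e6:.2f} us")
```

For the UT near the boundary the difference of delays is far smaller than for the UT close to BS a, which is why selecting cell edge UTs is beneficial.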

Implementation Aspects
In order to integrate CelFSync into a cellular mobile radio standard, several practical constraints need to be taken into consideration. Constraints regarding the frame structure and the chosen duplexing scheme are addressed in this section.

Frame Structure.
CelFSync is implemented and verified based on the frame structure taken from the specifications of the Wireless World Initiative New Radio (WINNER, http://www.ist-winner.org) system concept [22]. Consecutive downlink and uplink slots constitute one frame, and a number of successive frames form one superframe of duration $T$. One uplink sync word, UL Sync, and one downlink sync word, DL Sync, are placed into the superframe with a relative spacing of $\Delta$, as illustrated in Figure 4. DL Sync allows UTs to synchronize to their BS and is therefore essential for cellular networks. Unlike DL Sync, the insertion of UL Sync adds overhead, as an uplink sync word is typically not required in current cellular networks. Fortunately, this overhead is modest, as UL Sync is typically transmitted at a low rate. For the WINNER system, the durations of the superframe and of UL Sync are 5.8 ms and 45 μs, respectively. Hence the resulting overhead is less than 1% [22].
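As a quick check of the overhead figure, the ratio of the two durations quoted above can be worked out directly. The symbol $T_{\mathrm{UL\,Sync}}$ for the UL Sync duration is introduced here only for this illustration, and the calculation assumes one UL Sync per superframe, as described above.

```latex
\[
\frac{T_{\mathrm{UL\,Sync}}}{T}
  = \frac{45\,\mu\mathrm{s}}{5.8\,\mathrm{ms}}
  = \frac{45}{5800}
  \approx 0.0078
  \approx 0.78\% < 1\% .
\]
```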

Acquisition and Tracking Modes.
An intrinsic property of PCO synchronization is that coupling between nodes effectively shortens the period $T$. However, cellular systems typically rely on a fixed frame structure, which specifies how uplink and downlink slots are arranged to exchange payload data. A conflict arises because, while the reception of payload data is still ongoing, CelFSync may shorten the interval between two successive reference instants to an effective period $\tilde{T} \le T$, which effectively shortens the duration of the superframe.
As long as the effective period is only slightly shortened, such that $T - \tilde{T} \le \varepsilon$, inserting a guard time of duration $T_G > \varepsilon$ ensures that reception of payload data is completed before a sync word is transmitted. The condition $T - \tilde{T} \le \varepsilon$ corresponds to the tracking mode in the steady synchronization state, where small offsets due to clock skew, which cause the natural oscillation period $T$ to deviate between nodes, are compensated.
In case of coarse timing misalignments between cells, so that $T - \tilde{T} > \varepsilon$, the network is in acquisition mode. Potential conflicts in acquisition mode are avoided by either (i) suspending payload data transmission while intercell synchronization is in progress, or (ii) shortening the superframe duration to $T_{\mathrm{sf}} < T$.
Scheme (i) does not allow payload data to be exchanged before CelFSync has reached a steady state. Given that a steady state is likely to be maintained for hours or even days, while CelFSync typically converges within a fraction of a second, the loss in system throughput due to suspended data transmission may be acceptable. For instance, scheme (i) is applied to facilitate the synchronization procedure in the wireless LAN standard 802.11 [23,24]: periodically, data transfer is preempted, and the access point transfers its clock value, known as the timing synchronization function (TSF), to the network's participants.
Scheme (ii) avoids conflicts by forcing the effective period $\tilde{T}$ to be at least as long as $T_{\mathrm{sf}}$. By doing so, continuous exchange of payload data is maintained, at the expense of reducing the throughput during acquisition by about $(T - T_{\mathrm{sf}})/T$.
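The distinction between the two modes and the throughput cost of scheme (ii) can be sketched in a few lines. This is an illustrative sketch only: the function names, the variable `t_eff` for the effective period, and the numerical values of the threshold and of the shortened superframe are assumptions made here; only the superframe duration of 5.8 ms is taken from the text.

```python
def sync_mode(t, t_eff, eps):
    """Classify the mode from the shortening t - t_eff of the period.

    t     : nominal superframe period in seconds
    t_eff : effective period after coupling (hypothetical name), t_eff <= t
    eps   : largest shortening that the guard time can absorb
    """
    if t_eff > t:
        raise ValueError("coupling can only shorten the period")
    return "tracking" if t - t_eff <= eps else "acquisition"


def guard_time_sufficient(t_guard, eps):
    """The guard time must exceed eps so that payload reception completes."""
    return t_guard > eps


def acquisition_throughput_loss(t, t_sf):
    """Relative throughput loss (t - t_sf) / t of scheme (ii)."""
    if not 0.0 < t_sf < t:
        raise ValueError("the shortened superframe must satisfy 0 < t_sf < t")
    return (t - t_sf) / t


T = 5.8e-3      # WINNER superframe duration quoted in the text
EPS = 10e-6     # assumed tracking threshold, for illustration only
T_SF = 5.0e-3   # assumed shortened superframe, for illustration only

print(sync_mode(T, 5.795e-3, EPS))   # shortening of 5 us
print(sync_mode(T, 5.0e-3, EPS))     # shortening of 0.8 ms
print(f"{acquisition_throughput_loss(T, T_SF):.1%}")
```

With these assumed values, a shortening of a few microseconds stays in tracking mode, whereas a shortening of 0.8 ms requires acquisition mode and, under scheme (ii), costs roughly one seventh of the throughput while acquisition lasts.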

Duplexing Scheme.
CelFSync is applicable to both time division duplex (TDD) and frequency division duplex (FDD). Nodes adjust their internal clocks based on received sync words; whether the uplink and downlink sync words are transmitted on different frequency bands or not is irrelevant. The discussion in this paper targets half-duplex transmission, where nodes cannot receive and transmit at the same time, applicable to TDD and half-duplex FDD. Full-duplex FDD benefits CelFSync, since nodes can transmit and receive simultaneously, which eliminates deafness due to missed sync words whilst transmitting.

Imposing a Global Timing Reference.
An inherent problem of any distributed synchronization procedure is that nodes agree on a relative time reference that is valid only among the considered nodes and has no external tie. Such a relative reference stands in contrast to a global time reference such as Coordinated Universal Time, which is provided, for example, by GPS. Furthermore, as the size of the network increases, it becomes increasingly difficult to synchronize the entire network in a completely decentralized manner. To avoid this difficulty, a scenario was considered in [25] where only a few nodes have access to a global time reference. The PCO model was extended such that these master nodes impose a global time reference on the entire network, even though the master nodes were only a small fraction of the total number of nodes in the network. Furthermore, the behavior of normal nodes that do not have access to a global time reference is not modified at all.
Applied to CelFSync, a subset of BSs has access to a global time reference. These master BSs emit downlink sync words DL Sync with a slightly shortened period $T_{\mathrm{ma}} < T$ and are not receptive to sync words from other nodes [25]. Neighboring cells then align their reference instants following the synchronization rules outlined in Section 3.2. It was demonstrated in [25] that for $0.9T \le T_{\mathrm{ma}} < T$, arbitrarily large networks are reliably synchronized. By doing so, the problem of synchronizing large networks with a distributed algorithm is reduced to synchronizing a number of cells (typically up to 2 or 3 tiers) around a master BS.

Performance Evaluation
To evaluate the performance of CelFSync, two deployment scenarios are considered: first, an indoor office scenario in Section 6.1, and second, a macrocell deployment modeled by a hexagonal cell structure in Section 6.2 [26]. All nodes transmit with the same power $P_t$. The propagation channel between nodes $i$ and $j$ is modeled as a distance-dependent pathloss channel. Node $j$ receives the transmission of node $i$ at distance $d_{ij}$ with power $P_t d_{ij}^{-\chi}$, where $\chi$ is the pathloss exponent. The signal-to-interference-plus-noise ratio (SINR) of a received sync word is the received power of the sync word divided by the sum of the interference power and the thermal noise power $N_0$. The detection threshold is set for a given false alarm rate, which enables the computation of the detection probability $P_d$ for each received sync word as a function of the current SINR (see Section 3.3). Unless otherwise stated, the parameters shown in Table 1 are used in the simulations.
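The pathloss and SINR model described above can be written down compactly. The sketch below is an illustration under stated assumptions: the function names and all numerical values (transmit power, pathloss exponent, distances, noise power) are hypothetical and are not the parameters of Table 1.

```python
import math


def received_power(p_t, distance, chi):
    """Received power P_t * d^(-chi) of the distance-dependent pathloss model."""
    return p_t * distance ** (-chi)


def sinr(p_t, d_signal, d_interferers, chi, n0):
    """SINR of a sync word sent from distance d_signal.

    d_interferers lists the distances of all nodes transmitting at the
    same time; n0 is the thermal noise power.
    """
    signal = received_power(p_t, d_signal, chi)
    interference = sum(received_power(p_t, d, chi) for d in d_interferers)
    return signal / (interference + n0)


# Hypothetical example: wanted node at 100 m, two interferers at 200 m and
# 400 m, pathloss exponent 3, unit transmit power.
value = sinr(p_t=1.0, d_signal=100.0, d_interferers=[200.0, 400.0],
             chi=3.0, n0=1e-9)
print(f"SINR = {10 * math.log10(value):.2f} dB")
```

Mapping this SINR to the detection probability $P_d$ requires the detector model of Section 3.3, which is not reproduced here.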
The two environments impose different strains on CelFSync. In the indoor environment, sync words are subject to a high level of interference from other transmitting UTs. In the outdoor environment, the large distance between UTs and BSs results in higher channel attenuation and a more sparsely connected network, which implies that network synchronization has to be carried out over multiple hops.
In both scenarios, Monte Carlo simulations are conducted for 5000 sets of initial conditions: all BSs initially commence with uniformly distributed internal timing references, while UTs are locally synchronized to their closest BS. Synchronization is declared when two groups have formed, so that the reference instants of UTs are aligned and out-of-phase synchronized with the reference instants of BSs, with a relative timing difference of $\Delta$.

Indoor Office Environment.
An indoor office with two corridors and ten offices on each side is considered. This setting was defined for the local area scenario in WINNER [27]. The network topology with $N_{\mathrm{BS}} = 4$ BSs and $N_{\mathrm{UT}} = 15$ UTs participating in CelFSync is depicted in Figure 8. The selected UTs (marked as bold circles) can communicate directly with all BSs (marked as squares). UTs that do not participate in the network synchronization procedure do not transmit UL Sync and adjust their slot oscillator based on received DL Sync.
Results plotted in Figure 9 elaborate on the time taken for the entire network to synchronize. The time to synchrony $T_{\mathrm{sync}}$ is normalized to the duration of a superframe $T$. Figure 9 plots the cumulative distribution function (CDF) of the normalized time to synchrony for different values of the BS-UT coupling factor $\alpha_{\mathrm{UT}}$.
The performance of the proposed inter-BS synchronization scheme can be controlled by the coupling factor $\alpha_{\mathrm{UT}}$. For a high coupling value, $\alpha_{\mathrm{UT}} > 1.3$, synchronization is reached quickly, but convergence to a synchronized stable state is not always achieved. The fraction of initial conditions that do not converge to this state is due to deafness among nodes: some part of the network transmits partially overlapping DL Sync and UL Sync sequences, and due to the half-duplex assumption, some nodes are thus not able to synchronize. The deafness probability increases with the coupling factor $\alpha_{\mathrm{UT}}$, and for $\alpha_{\mathrm{UT}} = 1.5$, it is approximately 10%. If the coupling is low, $\alpha_{\mathrm{UT}} \le 1.3$, synchronization is always reached within $T_{\mathrm{sync}} = 10$ periods, and for $\alpha_{\mathrm{UT}} = 1.3$, 80% of initial conditions lead to synchrony within $T_{\mathrm{sync}} = 5$ periods. This is encouraging given the fact that deafness among nodes does not occur when $\alpha_{\mathrm{UT}} \le 1.3$, even though nodes start with a random initial timing reference. Setting $\alpha_{\mathrm{UT}}$ sufficiently low reduces the absorption limit (4), which allows nodes to receive more sync words in the synchronization phase. This lowers the deafness probability and enables the network to synchronize starting from any initial timing misalignment.

Macrocell Deployment.
For cellular networks, a hexagonal cell structure is considered, as shown in Figure 10. One or two tiers of BSs are placed around a center BS, resulting in a network of $N_{\mathrm{BS}} = 7$ and $N_{\mathrm{BS}} = 19$ BSs, respectively, with a cell radius of $d_{\mathrm{cell}} = 1$ km. The number of active UTs per cell, $N_{\mathrm{UT}}$, specifies the number of UTs that participate in CelFSync. Among the $N_{\mathrm{UT,tot}}$ UTs randomly placed in each cell, the $N_{\mathrm{UT}}$ UTs closest to the cell edge are selected as active.
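The selection of active UTs can be sketched as follows. This is a simplified illustration, not the simulation setup of the paper: the cell is approximated by a disc of radius d_cell instead of a hexagon, so that "closest to the cell edge" reduces to "farthest from the BS", and the function names and the random seed are hypothetical.

```python
import math
import random


def place_uts(n_total, d_cell, rng):
    """Place n_total UTs uniformly in a disc of radius d_cell around the BS.

    Simplification: a disc is used instead of a hexagonal cell.
    """
    uts = []
    for _ in range(n_total):
        r = d_cell * math.sqrt(rng.random())   # sqrt gives uniform area density
        phi = 2.0 * math.pi * rng.random()
        uts.append((r * math.cos(phi), r * math.sin(phi)))
    return uts


def select_active_uts(uts, n_active):
    """Select the n_active UTs farthest from the BS located at the origin."""
    return sorted(uts, key=lambda p: math.hypot(p[0], p[1]), reverse=True)[:n_active]


rng = random.Random(0)                 # fixed seed for reproducibility
uts = place_uts(25, 1000.0, rng)       # 25 UTs in a cell of radius 1 km
active = select_active_uts(uts, 3)     # 3 active UTs per cell
for x, y in active:
    print(f"active UT at distance {math.hypot(x, y):.0f} m from the BS")
```

With more UTs per cell, the farthest few lie closer to the boundary on average, which is the mechanism by which a higher UT density tightens the accuracy bound.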

Time to Synchrony.
In a similar manner to Figure 9, the results plotted in Figure 11 depict the time to synchrony of CelFSync in a hexagonal cell deployment for $N_{\mathrm{BS}} = 7$ BSs and $N_{\mathrm{BS}} = 19$ BSs. Coupling among UTs is also considered, with strength $\alpha_{\mathrm{UT\text{-}UT}} = 1.05$.
As expected, networks of $N_{\mathrm{BS}} = 19$ BSs converge less rapidly than smaller networks of $N_{\mathrm{BS}} = 7$ BSs. This degradation is due to the increase in network diameter from 4 hops to 8 hops. Moreover, the number of UTs per cell participating in CelFSync, $N_{\mathrm{UT}}$, does not significantly change the time to synchrony, and a synchrony rate of 80% is achieved within $12T$ when $N_{\mathrm{BS}} = 7$ BSs and within $25T$ when $N_{\mathrm{BS}} = 19$ BSs. In all cases, a synchronization rate of 100% is achieved within $T_{\mathrm{sync}} = 50$ periods, which means that deafness between nodes, due to partially overlapping sync words, does not corrupt the convergence of CelFSync.

Achieved Inter-BS Accuracy.
While propagation delays are typically negligible in an indoor environment, the opposite is true for the macrocell deployment, see (17). The achieved inter-BS accuracy $|\tau_{\mathrm{BS},b} - \tau_{\mathrm{BS},a}|$ of CelFSync including timing advance is verified in Figure 12 for various node densities $N_{\mathrm{UT,tot}}$. Simulations are conducted over 100 random network topologies, each with 200 sets of initial conditions. It is assumed that UTs are time aligned with their closest BS, and that the number of active UTs is set to $N_{\mathrm{UT}} = 3$ UTs per cell. As the accuracy bound (22) suggests, the inter-BS accuracy improves significantly as the node density $N_{\mathrm{UT,tot}}$ increases. Increasing $N_{\mathrm{UT,tot}}$ raises the probability that the selected UTs are close to the cell edge, which decreases the delay difference $\nu_{bi} - \nu_{ai}$ in (22). For a UT density of $N_{\mathrm{UT,tot}} \ge 25$ UTs per cell, the achieved accuracy is below 0.5 μs. This is a significant achievement, as the propagation delay for an inter-BS distance of $2d_{\mathrm{cell}} = 2$ km is $\nu_{ab} \approx 6.67$ μs.
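The quoted inter-BS propagation delay and the relative size of the achieved accuracy follow from a short calculation, assuming free-space propagation at $c \approx 3 \times 10^{8}$ m/s:

```latex
\[
\nu_{ab} = \frac{2\,d_{\mathrm{cell}}}{c}
         \approx \frac{2000\,\mathrm{m}}{3 \times 10^{8}\,\mathrm{m/s}}
         \approx 6.67\,\mu\mathrm{s},
\qquad
\frac{0.5\,\mu\mathrm{s}}{6.67\,\mu\mathrm{s}} \approx 0.075 .
\]
```

An accuracy below 0.5 μs therefore corresponds to less than about 7.5% of the direct inter-BS propagation delay.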

Conclusion
This paper studied the application of self-organized synchronization, inspired by the theory of pulse-coupled oscillators, to cellular systems. The original algorithm was modified to align the timing references of base stations, so that they transmit simultaneously on downlink frames, and of user terminals, so that they transmit simultaneously on uplink frames. With the proposed decentralized cellular firefly synchronization (CelFSync) algorithm, a local area wireless network composed of 4 base stations and 15 user terminals always synchronizes within 10 periods, provided that the coupling factor is chosen sufficiently low. In large-scale networks, where propagation delays are typically non-negligible, the timing advance procedure, common in current cellular networks, was combined with CelFSync to combat the effect of propagation delays. By compensating intra-cell propagation delays with timing advance and selecting cell edge users to participate in CelFSync, the detrimental effects of large propagation delays are substantially reduced. Simulation results demonstrated that the achieved inter-BS timing accuracy is always below 1 μs when at least 10 users are randomly distributed per cell, which corresponds to approximately 15% of the direct propagation delay for an inter-BS spacing of 2 km.