1 Introduction

Bluetooth low energy (BLE) is the power-optimized alternative to the basic rate/enhanced data rate (BR/EDR) Bluetooth protocol [1]. Developed by the Bluetooth Special Interest Group (SIG) and officially introduced in Specification V4.0 in 2010 (Footnote 1), BLE was conceived to achieve ultra-low power consumption, making it suited to applications whose devices are fed by limited power sources.

Over the past few years, numerous studies have explored the capabilities of BLE. For example, Kamath [3] and Kindt et al. [4] studied the energy consumption, Gomez et al. [5] modeled the maximum throughput, Kalaa et al. [6] analyzed channel utilization and the implemented adaptive frequency hopping scheme, and Mikhaylov [7, 8] and Liu et al. [9] characterized the neighbor discovery and connection establishment procedures. Results have shown that the protocol offers far superior throughput [10] and significantly lower energy consumption [11] than other widely used low-energy wireless protocols, such as ZigBee, and provides compatibility with a broad spectrum of mobile devices. In addition, IP support was included in version 4.2 of the protocol [12], which has turned BLE into a potential candidate for a vast range of applications, including health care, wearable devices, home automation, the Internet of Things (IoT) [13, 14] and, more recently, industrial wireless sensor networks (IWSN) and the Industrial IoT (IIoT) [15]. Each application presents different challenges and requirements with regard to performance metrics such as throughput, energy consumption, reliability, and delay [16].

In this study, we focus on the suitability of BLE for industrial implementations of wireless sensor networks, which present highly stringent Quality of Service requirements. In particular, the upper bound on the transmission delay, also known as the worst-case transmission delay, must be deterministic and predictable, and cannot exceed the limits imposed by the regular operation of the system. In contrast to the considerably high fault tolerance and relative flexibility in terms of latency of classic wireless sensor networks, IWSN must ensure reliable real-time communication among the devices involved in the network. Data transmission in IWSN is therefore mission and time critical: violating the delay bounds can result in severe system failures or even threats to human safety [17].

In BLE, however, a maximum transmission delay cannot be fixed. This limitation is analyzed by Rondón et al. in [18], extending previous research by Arzad et al. [19] and Xhafa et al. [20]. That work presents a thorough model of the delay performance of BLE. The model takes into account the effect of the occurrence time of an Application Layer (AL) event on the overall behavior of the transmission process, as well as the consecutive retransmissions of a failed packet, to derive a mathematical representation of the average transmission delay. Rondón et al. considered an unbounded retransmission scheme in which all packets are retransmitted until success. A 100% reliable behavior is therefore achieved but, in turn, no bound on the transmission delay can be predicted.

So far, the effect of modifications to the BLE retransmission scheme on the reliability and timeliness performance has not been analyzed. Expanding upon the mathematical model of the average packet transmission process presented in [18], this article is the first to explore the potential of the protocol to meet the real-time requirements found within the IWSN and IIoT field, more specifically in industrial process automation applications. For this purpose, three different bounded retransmission schemes are evaluated using modified versions of the aforementioned model. The obtained results, in terms of packet loss rate and worst-case transmission delay, are compared against typical demands of the targeted IWSN in order to analyze the feasibility of the proposed modified retransmission models under realistic configurations.

The remainder of this paper is organized as follows. Section 2 offers an overview of the BLE protocol stack and insight into the protocol operation mechanism under connection-oriented scenarios. Section 3 summarizes the analytical model of the BLE average delay performance presented in [18], which is the basis for the rest of this article. In Sect. 4, three possible bounded adaptations of the BLE retransmission mechanism are proposed, studied and validated, with the goal of showing the suitability of BLE for time-critical industrial applications. Furthermore, the obtained results are analyzed and compared against stringent requirements commonly found in applications within the process automation field. Finally, conclusions and ideas for future work are presented in Sect. 5.

2 BLE Protocol Overview

As in BR/EDR Bluetooth, the BLE stack comprises two main components: the Host and the Controller. The Controller is often integrated into a System-on-Chip (SoC) and encompasses the Physical Layer and the Link Layer. The Host is usually found on a separate chip and is in control of the higher-level functionalities: the Logical Link Control and Adaptation Protocol (L2CAP), the Attribute Protocol (ATT), the Generic Attribute Profile (GATT), the Security Manager Protocol (SMP) and the Generic Access Profile (GAP). The uppermost layer of the stack is the Application Layer (AL), which includes all the non-core profiles, i.e., those not defined by the Bluetooth specification (see Fig. 1).

BLE provides a bit rate of 1 Mbps and operates in the 2.4 GHz Industrial Scientific Medical (ISM) band, using 40 channels spaced 2 MHz apart from each other. There are two types of BLE RF channels: advertising channels, which are reserved for device discovery, connection establishment, and broadcasting, and data channels, which are used for the communication between two connected devices. To reduce interference with IEEE 802.11-based applications, an Adaptive Frequency Hopping (AFH) mechanism is provided.

Fig. 1 BLE protocol stack

A device can be assigned one of four roles: peripheral, central, broadcaster and observer. Devices with broadcaster and observer roles use the advertising channels to transmit and receive broadcast data, respectively. A device in the central role is responsible for the creation and management of connections with devices in the peripheral role. After a connection between two devices is initiated, they acquire the roles of master or slave according to their role during the connection establishment procedure. The communication is then carried out in separate, periodic Connection Events, spaced according to the connInterval parameter, which is defined as a multiple of 1.25 ms in the 7.5 ms to 4 s range. For the sake of energy efficiency, a slave can ignore a number of Poll frames sent by the master, determined by the connSlaveLatency parameter. Finally, a connection loss due to interference or an out-of-range position is detected if a period longer than connSupervisionTimeout passes without receiving any packet. The timeout value can range between 100 ms and 32 s.

The slave waits for \(1+connSlaveLatency\) Connection Events, then must wake up and start listening for a Poll frame sent by the master on the chosen channel. Subsequently, the slave has to respond with either a Data frame or a Null frame. The master receives the response and sends an acknowledgment in the case of a Data frame, or closes the Connection Event if the response is a Null frame. As long as the two devices have data to send, a Connection Event stays open. Its duration is constrained by the connInterval minus one Inter Frame Space \(T_{Ifs}\).
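The timing parameters described above can be checked with a few lines of code. The following is a minimal sketch that assumes only the ranges quoted in the text; the function names are hypothetical and do not belong to any BLE stack API.

```python
CONN_INTERVAL_UNIT_MS = 1.25  # connInterval granularity quoted in the text


def is_valid_conn_interval(interval_ms: float) -> bool:
    """True if interval_ms is a multiple of 1.25 ms within 7.5 ms .. 4 s."""
    units = interval_ms / CONN_INTERVAL_UNIT_MS
    in_range = 7.5 <= interval_ms <= 4000.0
    return in_range and abs(units - round(units)) < 1e-9


def is_valid_supervision_timeout(timeout_ms: float) -> bool:
    """True if timeout_ms lies within the 100 ms .. 32 s range."""
    return 100.0 <= timeout_ms <= 32000.0


def slave_listen_period_ms(interval_ms: float, slave_latency: int) -> float:
    """Longest gap between two wake-ups of the slave:
    (1 + connSlaveLatency) Connection Events, each connInterval apart."""
    return (1 + slave_latency) * interval_ms
```

For instance, with a connInterval of 30 ms and a connSlaveLatency of 3, the slave may stay asleep for up to four Connection Events before it has to listen for a Poll frame again.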

Two bits of the Protocol Data Unit (PDU) are reserved for data flow control: the SequenceNumber (SN) and the NextExpected SequenceNumber (NESN). These values are increased or decreased following each transmission/reception, according to the success status of the frame. Thanks to these values, it is possible to determine whether a retransmission is needed during the same Connection Interval in which the frame was sent. That said, a Connection Event can still end ahead of time due to a failed transmission.

The format of a BLE PDU is depicted in Fig. 2. In order to verify the PDU integrity, the Access Address (AA) field is first evaluated and then a Cyclic Redundancy Check (CRC) of order 24 is computed to detect bit errors. If an invalid AA is found, or two frames in a row fail because of a corrupted CRC, the transmission fails and the Connection Event is closed. The PDU then has to be retransmitted in the following Connection Event, so the overall over-the-air delay grows with the probability of a premature end of a Connection Event and, ultimately, with the ConnInterval value.

Fig. 2 BLE PDU format

3 BLE Average Delay Performance

In this article, the analysis of the suitability of BLE for time-critical applications expands on results previously obtained in [18]. In the latter, an analytical model that accurately predicts the average effective delay performance of BLE for connection-oriented scenarios was presented. The model, which was validated through extensive simulation, describes the effective delay that accounts for the delay before transmission, dependent on the processing capabilities of the transmitting device, as well as the physical transmission delay. In addition, a critical time interval before reception of a Poll frame, in which an arriving event experiences a further delay before the first transmission attempt, was defined.

With probability \(P_{CritArrival}\), a PDU of an AL event that arrives in the critical interval, in addition to the mean inherent waiting time, experiences a further delay of one Connection Interval due to timing issues in the polling process (Footnote 2). The total delay introduced as a result of the event occurrence time can be written as

$$\begin{aligned} D_{Slave} = {\left\{ \begin{array}{ll} T_{wait_{min}} &{} \text {with probability } 1-P_{CritArrival}\\ T_{wait_{min}}+ConnInterval &{} \text {with probability } P_{CritArrival} \end{array}\right. }, \end{aligned}$$
(1)

where \(T_{wait_{min}}\) corresponds to the minimum mean waiting time until the first transmission attempt takes place. After this, the involved devices begin an alternating exchange of packets until at least one of the conditions that end the Connection Event is met, either an error condition or the absence of further data to send.
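Since (1) describes a two-point distribution, its mean follows directly. The short derivation below is added for illustration and uses only the symbols defined above:

$$\begin{aligned} E[D_{Slave}]&= (1-P_{CritArrival})\,T_{wait_{min}} + P_{CritArrival}\,(T_{wait_{min}}+ConnInterval) \\&= T_{wait_{min}} + P_{CritArrival}\times ConnInterval \end{aligned}$$

In other words, the critical interval adds on average a fraction \(P_{CritArrival}\) of one ConnInterval to the delay before the first transmission attempt.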

The Core Specification for BLE states no fixed limit on the number of transmissions within a single Connection Event, although the last frame must be received at least \(T_{Ifs}\) before the start of the following Connection Event. A single transmission is composed of one Data frame, one ACK frame, and the corresponding \(T_{Ifs}\) intervals. The minimum time for a complete transmission, or round-trip time, is defined in (2), where \(T_{Data}\) and \(T_{ACK}\) are the durations of a Data and an ACK frame, respectively.

$$\begin{aligned} T_{min}=T_{Data}+T_{Ifs}+T_{ACK}+T_{Ifs} \end{aligned}$$
(2)
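As a rough numerical illustration of (2), the sketch below evaluates \(T_{min}\) at 1 Mbps. The frame sizes are assumptions made for the example and are not taken from the text: a 37-byte Data frame and a 10-byte empty frame used as ACK. \(T_{Ifs}\) is set to 150 µs, the Inter Frame Space of the BLE specification. The helper that counts the round trips fitting in one Connection Event simply applies the constraint that the event lasts at most connInterval minus one \(T_{Ifs}\); it ignores the Poll frame and is therefore only an upper bound.

```python
import math

BIT_RATE_BPS = 1_000_000  # 1 Mbps, as stated in the text
T_IFS_US = 150.0          # Inter Frame Space in microseconds (BLE specification)
DATA_FRAME_BYTES = 37     # assumed Data frame size, illustrative only
ACK_FRAME_BYTES = 10      # assumed empty-frame (ACK) size, illustrative only


def frame_time_us(n_bytes: int) -> float:
    """On-air duration of a frame of n_bytes at BIT_RATE_BPS, in microseconds."""
    return n_bytes * 8 * 1e6 / BIT_RATE_BPS


def t_min_us(data_bytes: int = DATA_FRAME_BYTES,
             ack_bytes: int = ACK_FRAME_BYTES) -> float:
    """Eq. (2): T_min = T_Data + T_Ifs + T_ACK + T_Ifs."""
    return frame_time_us(data_bytes) + T_IFS_US + frame_time_us(ack_bytes) + T_IFS_US


def max_round_trips(conn_interval_ms: float) -> int:
    """Upper bound on complete round trips inside one Connection Event,
    whose duration is limited to connInterval minus one T_Ifs."""
    usable_us = conn_interval_ms * 1000.0 - T_IFS_US
    return math.floor(usable_us / t_min_us())
```

Under these assumed frame sizes a round trip takes well under 1 ms, so even the shortest connInterval of 7.5 ms leaves room for several attempts within a single Connection Event.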

Both Data and ACK frames must be correctly received for a transmission to be considered successful. Bit errors in a PDU are classified into two scenarios: either the AA field contains corrupted bits, or the AA field is error-free but the CRC shows corrupted bits. In both cases the transmission fails and the packet is retransmitted, but the effects differ. When the AA contains errors, the Connection Event is immediately terminated and the PDU is retransmitted after the next Poll frame. In contrast, if only the CRC shows corrupted bits, the retransmission attempt is performed immediately, since the Connection Event remains open in this case.
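The two error scenarios can be turned into probabilities under a simple channel assumption. The sketch below assumes independent bit errors with probability \(P_{Bit}\), a 32-bit Access Address, and treats every bit after the AA as protected by the CRC (the preamble is ignored). The function name and the frame length used in the usage note are hypothetical, and the expressions are not claimed to be those of [18].

```python
AA_BITS = 32  # length of the Access Address field in bits


def frame_outcome_probs(p_bit: float, frame_bits: int):
    """Return (p_ok, p_aa_error, p_crc_error) for a single frame.

    frame_bits counts the AA plus all bits that follow it.
    p_aa_error  : at least one corrupted bit in the AA (event is closed).
    p_crc_error : AA intact, at least one corrupted bit afterwards
                  (event stays open, immediate retransmission).
    """
    if frame_bits <= AA_BITS:
        raise ValueError("frame must be longer than the Access Address")
    p_aa_ok = (1.0 - p_bit) ** AA_BITS
    p_rest_ok = (1.0 - p_bit) ** (frame_bits - AA_BITS)
    p_ok = p_aa_ok * p_rest_ok
    p_aa_error = 1.0 - p_aa_ok
    p_crc_error = p_aa_ok * (1.0 - p_rest_ok)
    return p_ok, p_aa_error, p_crc_error
```

Calling `frame_outcome_probs(1e-3, 288)` for a hypothetical 288-bit frame returns three values that sum to one. Because the AA is short compared with the rest of the frame, CRC failures are the more likely of the two error cases under this assumption.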

Fig. 3 Graphical representation of the possible outcomes of a single transmission

A Connection Event can also be closed prematurely as a consequence of consecutive failed retransmissions, provided that the two failures affect frames from the same device, i.e., two consecutive corrupted packets were received from the same device. To describe the transmission delay, the possible outcomes of a single transmission were thoroughly studied and expressed analytically in [18] (Footnote 3): successful transmission, unsuccessful transmission that ends the current Connection Event, and unsuccessful transmission that does not force the end of the current Connection Event. These outcomes are shown in Fig. 3.

The Bluetooth specification does not limit the number of retransmissions for a given frame; packets are therefore retransmitted until success and, as a result, the communication is reliable (Footnote 4). The mathematical modeling and simulation process respected this fact, hence only the average performance was considered.

Fig. 4 Markov chain representation of the model

The round-trip transmission process of a given PDU was then represented with a finite-state Markov chain with \(N=N_{Max}\) transient states and one absorbing state. In the resulting chain, depicted in Fig. 4, states 1, 2,..., N each represent a single transmission attempt and the absorbing state represents the success of the transmission. Expressing the transition matrix P in canonical form, submatrices R and Q can be identified. These contain the transition probabilities from transient to absorbing states and from transient to transient states, respectively. The steady-state behavior is given by the limiting matrix \(P^\infty = \lim _{n\rightarrow \infty } P^n\).

$$\begin{aligned} P = \begin{bmatrix} I&\vdots&0\\ \dots&\dots&\dots \\ R&\vdots&Q\\ \end{bmatrix} \qquad P^\infty = \begin{bmatrix} I&\vdots&0\\ \dots&\dots&\dots \\ FR&\vdots&0\\ \end{bmatrix} \end{aligned}$$

The product of the submatrices F and R appears in \(P^\infty\), where F is known as the fundamental matrix and its entries F(i, j) represent the expected number of periods spent in the jth transient state before absorption, given that the chain starts in the ith transient state. F can be obtained as follows:

$$\begin{aligned} F= \sum \limits _{n=0}^{\infty } Q^{n} = (I-Q)^{-1} \end{aligned}$$
(3)

The number of times that a Connection Event ends without delivering a successful transmission is equivalent to the number of times that state 1 of the Markov Chain is visited after the initial transmission. Each time this state is reached, the transmission delay is increased by another ConnInterval time slot. Defining state 1 as the starting point, F(1, 1) represents the mean time spent in this state, and the values F(1, 2 : N) correspond to the mean time spent on each consecutive retransmission state. Using this information of the system behavior, as well as the respective occurrence probabilities, the average transmission delay was deduced.
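To make the role of F concrete, the following sketch computes \((I-Q)^{-1}\) for a small toy chain. The chain structure and the three outcome probabilities are illustrative assumptions and do not reproduce the exact chain or values of [18]: from each attempt the PDU succeeds, fails with the event left open (next attempt), or fails with the event closed (back to state 1). The inverse is obtained by plain Gauss-Jordan elimination so that no external library is needed.

```python
def fundamental_matrix(Q):
    """Return F = (I - Q)^-1 using Gauss-Jordan elimination with pivoting."""
    n = len(Q)
    # Augmented matrix [I - Q | I]
    A = [[(1.0 if i == j else 0.0) - Q[i][j] for j in range(n)]
         + [1.0 if i == j else 0.0 for j in range(n)]
         for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        scale = A[col][col]
        A[col] = [v / scale for v in A[col]]
        for r in range(n):
            if r != col:
                factor = A[r][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [row[n:] for row in A]


def toy_transient_matrix(n_states, p_open, p_close):
    """Transient-to-transient matrix Q of the assumed toy chain."""
    Q = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states):
        if i + 1 < n_states:
            Q[i][i + 1] += p_open   # failure, event stays open: next attempt
            Q[i][0] += p_close      # failure closes the event: back to state 1
        else:
            Q[i][0] += p_open + p_close  # last slot: any failure restarts
    return Q


# Hypothetical outcome probabilities (they sum to one).
P_SUCCESS, P_OPEN, P_CLOSE = 0.90, 0.07, 0.03

Q = toy_transient_matrix(3, P_OPEN, P_CLOSE)
F = fundamental_matrix(Q)

# F*R with R a column of P_SUCCESS: probability of eventual success per start state.
absorb = [sum(f * P_SUCCESS for f in row) for row in F]

# Expected visits to state 1 when starting there; the excess over 1 counts
# the Connection Events that ended without a successful delivery.
extra_conn_events = F[0][0] - 1.0
```

In this toy setting every entry of `absorb` equals one, which mirrors the 100% reliability of the unbounded scheme, while `extra_conn_events` multiplied by ConnInterval gives the average extra delay caused by prematurely closed Connection Events.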

An event-driven simulator of BLE communication in the connected state was developed to validate the accuracy of the model. The simulator takes as inputs the ConnInterval parameter and the bit error rate \(P_{Bit}\) of the channel, and schedules the AL event arrivals as a Poisson process. The specifications of the inputs used are given in the original publication. The resulting average over-the-air transmission delay can be seen in Fig. 5.
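The arrival scheduling mentioned above can be sketched in a few lines. The code below is not the simulator of [18]; it is a minimal illustration that assumes Connection Events start at integer multiples of ConnInterval and ignores the critical interval and any processing delay. All names are hypothetical.

```python
import math
import random


def poisson_arrivals(rate_per_s, duration_s, rng):
    """Arrival times (in seconds) of a Poisson process, generated from
    exponentially distributed inter-arrival gaps."""
    t, arrivals = 0.0, []
    while True:
        t += rng.expovariate(rate_per_s)
        if t >= duration_s:
            return arrivals
        arrivals.append(t)


def wait_until_next_event(arrival_s, conn_interval_s):
    """Time from an AL event arrival to the start of the next Connection
    Event, assuming events start at multiples of conn_interval_s."""
    next_start = math.ceil(arrival_s / conn_interval_s) * conn_interval_s
    return max(0.0, next_start - arrival_s)


rng = random.Random(1)  # fixed seed so the run is repeatable
arrivals = poisson_arrivals(rate_per_s=10.0, duration_s=5.0, rng=rng)
waits = [wait_until_next_event(t, conn_interval_s=0.03) for t in arrivals]
```

Each waiting time lies between zero and one ConnInterval, which is the contribution that a full simulator would then extend with the polling and retransmission behavior described in this section.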

Fig. 5 Average round-trip delay for different values of ConnInterval and \(P_{Bit}\)

It was observed that for the lowest simulated \(P_{Bit}\) (\(10^{-6}\)), the average round-trip delay did not exceed 2 ms for any of the possible ConnInterval values. In contrast, in the presence of a higher \(P_{Bit}\), a larger ConnInterval greatly increases the average delay, which reaches a maximum average value of over 1 s for a \(P_{Bit}\) of \(10^{-3}\).

This tendency is also seen in the results obtained by Gomez [21], and is consistent with the previous mathematical formulation. With a greater \(P_{Bit}\), Connection Events end prematurely more frequently and, consequently, it is more likely that several Connection Events are required for a single successful transmission. The choice of the ConnInterval is therefore critical for the delay performance and also strongly influences the power usage. A lower average delay is achieved with a smaller ConnInterval, even for the worst \(P_{Bit}\), but at the cost of a higher energy demand.

Due to the consideration of an infinite number of retransmission attempts allowed per packet, no packet loss is observed in the presented results and the reliability is, therefore, 100%. However, in time-critical applications, limits have to be set depending on the desired performance metrics. One important remark of this study is that the fundamental matrix F, derived from the absorbing chain shown in Fig. 4, reveals that in the long run, the average time spent in the states beyond the eighth retransmission is zero for the highest \(P_{Bit}\).

4 BLE Adaptation for Real-Time Applications

Hereafter, the mathematical model of the BLE transmission delay, described in the previous section, will be used to analyze the effect of limitations in the retransmission process on the performance metrics. Since the focal interest of this article is to explore the suitability of BLE in time-critical IIoT applications, the study will be evaluated using as indicators the packet loss and the maximum, or worst case, round-trip delay.

4.1 Determinism Versus Reliability

Determinism and reliability are two of the most critical factors when implementing IIoT applications. Timely end-to-end data transmission is mandatory for real-time systems since the data loses its relevance after a certain time period, rendering the gathered information obsolete and, in turn, degrading the overall performance of the system. This is the motivation to study the determinism that can be achieved with BLE. Industrial applications do not tolerate missed deadlines; hence, if a predictable upper bound for the transmission delay cannot be identified, the protocol will not be a good fit for the domain of interest in this study, regardless of its excellent energy efficiency.

On the other hand, if data packets are often lost, the protocol will not be suitable for industrial applications, even when a fixed maximum transmission delay can be guaranteed. The main cause of non-deterministic behavior is the retransmission scheme. The most influential factor for the transmission delay is the additional time needed to retransmit and, in the cases in which the transmission failure caused the end of the Connection Event, to wait for the next Poll frame.

When the retransmission process is unbounded, which is the case for BLE, a deterministic performance cannot be achieved. The clear solution for mitigating this problem is to limit the number of retransmissions per data packet. By doing this, the maximum transmission delay can be predicted; however, this also increases the packet loss. If, for example, retransmissions are not allowed for any data packet, the resulting behavior will be considerably less reliable. Finding the best setting requires knowing the specific requirements of the desired application, since some systems are more fault tolerant while others are more flexible from the timeliness point of view.

Table 1 Typical application requirements in the domain of process automation

4.2 IIoT Applications Requirements

IWSN can be implemented in a broad range of applications with different environmental and technical challenges and requirements in each case [22]. Since most industrial processes are relatively complex, there is an inherent requirement for communication systems that not only link the various elements of the industrial process but are also tailor-made for the specific industrial environment. Common scenarios in which IWSN are employed include building automation and factory automation [23]; in recent years, significant attention has also been given to IIoT and process automation [24].

The case study considered in this article focuses on applications in the industrial process automation range.

4.2.1 Industrial Process Automation

IWSN can be applied to enable condition-based maintenance and remote management of industrial equipment and processes by continuously monitoring time-critical process information, such as temperature, pressure, humidity, vibration, and energy usage. Oil tankers, automobiles, electric motors, conveyor belts and pumps are some examples in which system state information is gathered by IWSN for maintenance and monitoring purposes [25]. These applications are characterized by having a time-critical and mostly fault-intolerant behavior, therefore specific requirements must be guaranteed in order to provide a safe operation.

In [26], Åkerberg et al. provide an in-depth description of the challenges that WSN face in networks deployed for industrial automation applications. Table 1 highlights some of the specific requirements commonly found in process automation applications such as open-loop/closed-loop control and monitoring and diagnostics. These values will be used as a reference point to evaluate the results obtained with the adaptations of the retransmission scheme of BLE, introduced hereafter.

4.3 Retransmission-Bounded Schemes

For the purpose of exploring the determinism and reliability characteristics of BLE, three retransmission-bounded schemes will be introduced and evaluated. Each scheme represents a different alternative for limiting the maximum number of retransmission attempts allowed for the transmission of a single data packet, as well as the way in which those are performed. At the end of this section, the three proposed schemes will be compared in terms of worst case delay performance and data packet loss rate.

4.3.1 Retransmission-Bounded Scheme A

The first proposed retransmission scheme, referred to as scheme A, is the simplest one, allowing retransmission of a certain packet to take place only within the lifespan of the Connection Event in which it was initially transmitted.

Fig. 6 Retransmission scheme A

Based on the mathematical model of the transmission process summarized in Sect. 3, the Markov chain representation of this scheme was derived, as shown in Fig. 6. In this case, each node has the same structure as the one depicted in Fig. 3. The main difference from the former Markov chain model is the inclusion of a new state: Packet Lost. Like the Success state, the Packet Lost state is an absorbing state.

For scheme A, the retransmission process takes place only during the first Connection Event used. If this Connection Event ends without the packet being successfully delivered, the packet is no longer retransmitted; in other words, the following Connection Events are not involved. Consequently, the maximum transmission delay obtained with scheme A is independent of the chosen ConnInterval.

In terms of the maximum or worst case delay, scheme A is expected to provide the shortest delay values. However, it is also expected to show the least reliable performance. The worst case delay for scheme A can be predicted with the following expression:

$$\begin{aligned} D_{Worst Case} = D_{Slave} + T_{Poll} + N \times T_{min}, \end{aligned}$$
(4)

where \(D_{Slave}\), \(T_{Poll}\), and \(T_{min}\) represent the delay before the first transmission, the time needed to transmit a Poll frame, and the minimum time required for one round-trip, respectively. N is the number of possible transmission attempts within a single Connection Event.

4.3.2 Retransmission-Bounded Scheme B

In contrast to scheme A, retransmission scheme B includes an additional Connection Event for performing retransmission attempts of a given data packet. This is shown in Fig. 7, where \(P_{A}\), \(P_{B}\), and \(P_{C}\) are \(P_{Success}\), \(P_{Fail_{Open}}\), and \(P_{Fail_{Close}}\), respectively. The latter three variables are defined in [18] and shown in Fig. 3. In this case, since a second Connection Event is involved in the process, when the Connection Event in which the packet was originally transmitted closes without successful delivery, the packet transmission is reattempted after the waiting period for the starting Poll frame of the next Connection Event. Clearly, the worst case delay depends on the value of the ConnInterval parameter that is used.

Fig. 7 Retransmission scheme B

In comparison with scheme A, retransmission scheme B is expected to improve the reliability performance, at the expense of a naturally higher worst case delay. Considering that the ConnInterval parameter ranges from 7.5 ms to 4 s, a worst case delay low enough for the requirements of IIoT applications can be attained, even when two Connection Events are needed to transmit a single packet. It is then reasonable to consider this alternative as a viable option. The opposite can be said regarding the inclusion of more than two Connection Events, since the worst case delay would exceed most limits in real-time systems.

The worst case delay for scheme B is then:

$$\begin{aligned} D_{Worst Case}=D_{Slave}+ConnInterval+N_{2}\times T_{min}, \end{aligned}$$
(5)

where, as can be seen in Fig. 7, \(N_{2}\) represents the maximum number of retransmission attempts within the second Connection Event. Scheme B is designed to be symmetric regarding the number of retransmissions within both of the involved Connection Events, i.e., the maximum number of retransmission attempts in the first Connection Event is equal to that of the second Connection Event.

4.3.3 Retransmission-Bounded Scheme C

Retransmission scheme C is shown in Fig. 8. It shares a similar structure with scheme B but, in contrast to the latter, scheme C is asymmetric, i.e., the number of retransmission attempts within each Connection Event is different. In particular, we considered, as an example, an even number of retransmission attempts in the first Connection Event, \(N_{1}\), which is double the number of retransmission attempts allowed in the second Connection Event, \(N_{2}\).

Fig. 8 Retransmission scheme C

From the point of view of worst case delay and packet loss, scheme C should behave very similarly to scheme B, with scheme C being slightly faster. However, when factors such as energy efficiency and memory management are taken into consideration, scheme C can potentially be more attractive. In the second Connection Event used, the packet that is being retransmitted might not be the most recent or the one with the highest priority; allowing fewer attempts at this point therefore makes it possible for the device to empty the packet queue more efficiently. Also, fewer retransmissions translate to a reduced power consumption.

The worst case delay characteristic of scheme C is, as in scheme B:

$$\begin{aligned} D_{Worst Case}=D_{Slave}+ConnInterval+N_{2}\times T_{min} \end{aligned}$$
(6)
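Expressions (4) to (6) can be compared side by side with a short sketch. The numeric inputs below are hypothetical placeholders chosen only for illustration, and the split of a total budget of six attempts (3 + 3 for scheme B, 4 + 2 for scheme C, following \(N_{1}=2N_{2}\)) is an assumption about how the budget is shared between the two Connection Events.

```python
def worst_case_scheme_a(d_slave_ms, t_poll_ms, n, t_min_ms):
    """Eq. (4): all attempts stay inside the first Connection Event."""
    return d_slave_ms + t_poll_ms + n * t_min_ms


def worst_case_two_events(d_slave_ms, conn_interval_ms, n2, t_min_ms):
    """Eqs. (5) and (6): one ConnInterval of waiting, then up to n2
    attempts in the second Connection Event."""
    return d_slave_ms + conn_interval_ms + n2 * t_min_ms


# Hypothetical inputs, in milliseconds.
D_SLAVE_MS = 1.0
T_POLL_MS = 0.08
T_MIN_MS = 0.676
CONN_INTERVAL_MS = 30.0

delay_a = worst_case_scheme_a(D_SLAVE_MS, T_POLL_MS, n=6, t_min_ms=T_MIN_MS)
delay_b = worst_case_two_events(D_SLAVE_MS, CONN_INTERVAL_MS, n2=3, t_min_ms=T_MIN_MS)
delay_c = worst_case_two_events(D_SLAVE_MS, CONN_INTERVAL_MS, n2=2, t_min_ms=T_MIN_MS)
```

With these inputs the bound of scheme A does not depend on ConnInterval, whereas the bounds of schemes B and C are dominated by it; scheme C is shorter than scheme B by exactly one \(T_{min}\) because it allows one attempt fewer in the second Connection Event.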

4.4 Results Comparison

In order to obtain the characteristic results of each retransmission scheme, the simulator developed for obtaining the results presented in the study discussed in Sect. 3 was modified to fit the structure of the different Markov chains resulting from the proposed retransmission schemes. The observed results are hereby shown in two parts. In the first place, in Sect. 4.4.1, the effect of the variation of the total number of allowed retransmissions on the packet loss probability is tested for schemes A, B, and C. Secondly, in Sects. 4.4.2 and 4.4.3, the three schemes are compared in terms of reliability and determinism for a given number of retransmissions.

4.4.1 Packet Loss Rate Versus Number of Retransmissions

As in the study of the average delay performance, the simulation of the transmission of packets between two devices, with arrivals generated as a Poisson process, required as inputs the ConnInterval value, the varying \(P_{Bit}\) of the channel, and the maximum number of retransmissions per packet. After simulating 5,000,000 packet transmissions under the parameter settings listed in Table 2, the packet loss rate of each of the three retransmission schemes under test was obtained for different numbers of allowed retransmissions. The results are shown in Figs. 9, 10, 11 and 12, which represent the behavior for \(P_{Bit}\) conditions of orders \(10^{-3}\), \(10^{-4}\), \(10^{-5}\), and \(10^{-6}\), respectively.

Table 2 Simulation parameters

It can be concluded that scheme A provides the worst packet loss rate for all the considered scenarios, while schemes B and C, as expected, show very similar results. With the highest \(P_{Bit}\) evaluated, scheme A presents a packet loss rate of almost 10% and, for the same conditions, the other two schemes are below this value by one order of magnitude. This is expected since schemes B and C allow retransmissions to be performed within two Connection Events, thus giving higher probabilities of successful transmission. As the bit error rate conditions improve, the superiority of the second and third schemes over scheme A is more evident, with scheme A achieving its best packet loss rate of around \(10^{-4}\) only for the best bit error rate condition of \(10^{-6}\). As can be seen in Fig. 10, schemes B and C already reach the same performance level for a \(P_{Bit}\) of \(10^{-4}\), and for a \(P_{Bit}\) of \(10^{-6}\) the results are below the order of \(10^{-8}\). It is natural to conclude that scheme A does not support reliable communication requirements.

As mentioned in Sect. 3, an important finding in the study of the average delay performance is that, according to the fundamental matrix F, the mean time spent in the states after the eighth consecutive retransmission within a single Connection Event is statistically zero. This was the motivation for studying the proposed retransmission schemes in terms of the number of retransmissions allowed, ranging from 1 to 8. As depicted in Figs. 9, 10, 11 and 12, the results demonstrate that no remarkable enhancement is achieved beyond the sixth retransmission. With this in mind, for the rest of this analysis, a configuration with a maximum of 6 retransmissions per packet will be used.
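The diminishing return of additional attempts can be illustrated with a simplified closed form. The sketch below is not the simulator used for Figs. 9 to 12: it assumes three fixed outcome probabilities per attempt (success, failure with the event left open, failure with the event closed), ignores the rule on two consecutive failures from the same device, and treats the second Connection Event as an independent fresh start. The probability values are hypothetical.

```python
def loss_single_event(p_success, p_open, n_attempts):
    """Probability of no success within one Connection Event, where attempt
    k + 1 only takes place if the k previous failures left the event open."""
    p_delivered = sum(p_success * p_open ** k for k in range(n_attempts))
    return 1.0 - p_delivered


def loss_two_events(p_success, p_open, n1, n2):
    """Packet is lost only if both Connection Events fail to deliver it
    (assumes the second event behaves as an independent fresh start)."""
    first = loss_single_event(p_success, p_open, n1)
    second = loss_single_event(p_success, p_open, n2)
    return first * second


# Hypothetical outcome probabilities: success 0.90, fail-open 0.07,
# fail-close 0.03 (implied by the other two).
P_SUCCESS, P_OPEN = 0.90, 0.07

# Loss of a single-event scheme for 1..8 allowed attempts.
sweep = [(n, loss_single_event(P_SUCCESS, P_OPEN, n)) for n in range(1, 9)]

# Symmetric (3 + 3) and asymmetric (4 + 2) splits of six attempts.
loss_symmetric = loss_two_events(P_SUCCESS, P_OPEN, 3, 3)
loss_asymmetric = loss_two_events(P_SUCCESS, P_OPEN, 4, 2)
```

In this simplified setting the single-event loss quickly flattens, because failures that close the Connection Event cannot be recovered by further attempts in the same event, while a second Connection Event lowers the loss by more than an order of magnitude. The symmetric and asymmetric splits yield values of the same order, in line with the similar behavior reported for schemes B and C.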

Fig. 9 Packet loss results comparison for \(P_{Bit}=10^{-3}\)

Fig. 10 Packet loss results comparison for \(P_{Bit}=10^{-4}\)

Fig. 11 Packet loss results comparison for \(P_{Bit}=10^{-5}\)

Fig. 12 Packet loss results comparison for \(P_{Bit}=10^{-6}\)

4.4.2 Worst Case Delay

After simulating 24 h of continuous data transfer between two devices implementing schemes A, B, and C, the worst case round-trip transmission delay achieved by each retransmission scheme was obtained. The configuration of the simulation parameters was the same as in the previous experiment, with the exception that the bit error rate was varying throughout the simulation time, ranging from \(10^{-4}\) to \(10^{-6}\), and that a maximum of 6 retransmission attempts per packet were allowed.

Figure 13 represents the worst case delay for each scheme using different values of the ConnInterval parameter. Scheme A clearly provides the lowest maximum delay, of around 5 ms, and, as expected, this value is independent of the ConnInterval used. However, as discussed in Sect. 4.4.1, the performance obtained with this scheme cannot be considered reliable. In contrast, schemes B and C show a maximum delay that increases for higher ConnInterval values. This behavior is due to the waiting period until the beginning of the second Connection Event involved. Both schemes B and C provide similar worst case delays, with a maximum delay smaller than 100 ms for a considerably large ConnInterval value of 90 ms. Referring to Table 1, it can be seen that these maximum delay metrics can be considered a good fit for real-time IIoT applications. Given the time-critical nature of the applications targeted in this analysis, the use of ConnInterval values larger than 100 ms is not considered relevant.

Fig. 13 Worst case delay results comparison

4.4.3 Packet Loss Rate

The packet loss rate provided by the three retransmission schemes was also obtained from the above-mentioned experiment. The configuration is the same as before, given in Table 2, with a maximum of 6 retransmission attempts per packet. Figure 14 shows the average packet loss rate of schemes A, B, and C observed over 24 h of continuous data exchange between two devices, under a bit error rate varying in the range from \(10^{-4}\) to \(10^{-6}\).

Scheme A, as expected and consistent with the results presented above, shows a remarkably high packet loss rate of \(2.8\times 10^{-2}\). Schemes B and C, on the other hand, perform considerably more reliably, with an overall average packet loss below \(2.25\times 10^{-5}\) for both configurations.

At this point, scheme A can clearly be discarded, since it does not comply with the usual reliability requirements of IIoT applications. Schemes B and C behave very similarly; however, scheme C has proven to be slightly superior. If, in addition, resource utilization, such as memory management and energy consumption, is taken into consideration, scheme C is clearly the best option.

4.5 Suitability for Process Automation

In the previous sections, it was shown that retransmission scheme C provides the best balance between reliability and timeliness. This was deduced by simulating the transmission of data packets under general operating conditions with only two devices involved. This section studies the behavior of BLE adopting scheme C under more realistic and application-specific parameter settings.

Fig. 14 Packet loss results comparison

4.5.1 Parameters Configuration

The first consideration concerns the network topology. The simplest and most energy-efficient option is the single-hop star topology with a hub or coordinator, in this case the master device. With a star topology and the configuration proposed in scheme C, the parameters must be set according to the requirements of industrial process automation applications. In the case study tested in this section, the values chosen for the simulation parameters should provide results that suit the most demanding range of applications within process automation and IIoT. In particular, we aim to meet the most stringent set of requirements, those of open-loop/closed-loop control applications. With this in mind, we now derive the values to be used.

Considering that scheme C requires at most two Connection Events for a single data packet transmission, it follows that:

$$\begin{aligned} 2 \times ConnInterval\le {\text {Node}}\,{\text {update}}\,{\text {period}} \end{aligned}$$
(7)

Then, recalling the expression of the worst case delay that characterizes scheme C, it must be guaranteed that its value does not exceed the maximum tolerated end-to-end transmission delay. It follows that:

$$\begin{aligned} D_{Worst Case}\le & {} {\text {Maximum}}\,{\text {transmission}}\,{\text {delay}} \end{aligned}$$
(8)
$$\begin{aligned} D_{Worst Case}= & {} D_{Slave} + ConnInterval + (N_{2} \times T_{min}) \end{aligned}$$
(9)

As seen in Table 1, for the most stringent case within industrial process automation, we consider a maximum transmission delay of 50 ms and a node update period of 100 ms. From (7), it is required that

$$\begin{aligned} 2 \times ConnInterval \le 100\,\hbox {ms} \end{aligned}$$
(10)

and substituting (9) into (8) with \(N_2 = 2\), it follows that

$$\begin{aligned} ConnInterval \le 40\,\hbox {ms} \end{aligned}$$
(11)
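Written out, the step from (8) and (9) to (11) is a rearrangement for ConnInterval with the 50 ms delay bound and \(N_2 = 2\). The values of \(D_{Slave}\) and \(T_{min}\) are those defined earlier in the article and are not repeated here:

$$\begin{aligned} D_{Slave} + ConnInterval + (2 \times T_{min}) \le 50\,\hbox {ms} \;\Rightarrow \; ConnInterval \le 50\,\hbox {ms} - D_{Slave} - (2 \times T_{min}) \end{aligned}$$

Since \(D_{Slave}\) and \(T_{min}\) are positive, this bound is tighter than the \(ConnInterval \le 50\,\hbox {ms}\) implied by (10), so the delay constraint, not the node update period, determines (11).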

Finally, with a maximum of 6 transmission attempts per packet, four attempts are allowed in the first Connection Event and exactly two in the second. The minimum Connection Event span that accommodates at least 4 round-trip times is 3.1 ms.

Fig. 15 Delay results for scheme C

A BLE master device can have active links with as many slave devices as it can physically handle. The master device follows a TDMA approach to communicate with the slave devices; hence, the inclusion of more slave devices does not change the validity of the results presented in the previous sections. The main limitation to consider is that, within a time period of length ConnInterval after a Poll frame is sent, the master device must be able to communicate with every slave for the complete Connection Event span. Therefore, using a ConnInterval of 40 ms and the Connection Event span of 3.1 ms derived above, the maximum number of slaves that can be supported under this configuration without violating the operation mechanics is 12. For BLE implementations, 12 is considered a high number of slaves, and vendors commonly set their own limits on the number of slaves. To stay true to the theory, 12 will be considered the maximum number of slaves.
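One way to arrive at the figure of 12 is to count how many complete Connection Events fit back to back in one ConnInterval. The sketch below assumes that the events are scheduled consecutively with no guard time between them; the function name `max_slaves` is hypothetical.

```python
import math


def max_slaves(conn_interval_ms: float, conn_event_span_ms: float) -> int:
    """Number of whole Connection Events that fit back to back in one ConnInterval.

    Assumes consecutive scheduling with no gap between events.
    """
    if conn_interval_ms <= 0 or conn_event_span_ms <= 0:
        raise ValueError("durations must be positive")
    return math.floor(conn_interval_ms / conn_event_span_ms)


if __name__ == "__main__":
    # ConnInterval and Connection Event span as given in the text.
    print(max_slaves(40.0, 3.1))
```

With 40 ms and 3.1 ms the ratio is about 12.9, so twelve events fit (37.2 ms) while a thirteenth would need 40.3 ms.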

In the following section, the resulting performance under this setting will be presented.

Fig. 16 Delay distribution for scheme C

4.5.2 Maximum Delay and Delay Density

The results presented in this section were obtained by simulating continuous data packet transmission under time-varying bit error rate conditions in the range from \(10^{-4}\) to \(10^{-6}\), over a period of 24 h. In order to study the suitability of the proposed BLE adaptation for process automation in IIoT applications, the simulation was performed using a ConnInterval of 40 ms, and a star-topology network with 12 slave devices connected to a single master. The remaining parameters were set to the values found in Table 2.

Figure 15 shows the different delay-related metrics observed throughout the simulation time. The minimum transmission delay obtained was 4.84 ms, which corresponds to the best case scenario in which the data packet is successfully transmitted in the first attempt.

On average, a considerably low transmission delay, close to 5.6 ms, was observed. This implies that, in most cases, the delay lies in the lower range, which is consistent with the information obtained from the Fundamental Matrix F. As explained in Sect. 3, the probability of requiring additional consecutive retransmission attempts decreases significantly with every attempt, i.e., only a small portion of the samples requires the maximum number of retransmissions to be successful. Finally, a worst case transmission delay of 45.6 ms was observed, which shows that the maximum transmission delay bound of 50 ms was, as predicted by the analytical model, never violated.

In Fig. 16 the results related to the reliability levels achieved by implementing scheme C are depicted. In order to present the behavior pattern in the clearest way possible, the reported transmission delays were grouped into three ranges: below 5 ms, below 10 ms, and below 46 ms. With a probability close to 0.984, data packets were successfully delivered within 5 ms. With a probability of 0.997, the obtained delay is lower than 10 ms, and, finally, data packets were successfully transmitted before 46 ms with a probability of almost 0.999.
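The cumulative probabilities above can be obtained from per-packet delay samples as the fraction of sent packets delivered before each threshold. The following is a minimal sketch of that bookkeeping; the function name, the use of `None` for a lost packet, and the sample values are hypothetical and are not the simulation output.

```python
from typing import Optional, Sequence


def delivery_probability(delays_ms: Sequence[Optional[float]], deadline_ms: float) -> float:
    """Fraction of all sent packets delivered strictly before deadline_ms.

    A lost packet is represented by None and never counts as delivered.
    """
    if not delays_ms:
        raise ValueError("at least one sample is required")
    on_time = sum(1 for d in delays_ms if d is not None and d < deadline_ms)
    return on_time / len(delays_ms)


if __name__ == "__main__":
    # Hypothetical samples for illustration only.
    samples = [4.84, 4.84, 4.9, 7.2, 45.6, None, 4.84, 4.84, 9.5, 4.84]
    for deadline in (5.0, 10.0, 46.0):
        print(deadline, delivery_probability(samples, deadline))
```

Counting lost packets in the denominator keeps the three values directly comparable with the packet loss rate, since whatever is missing from the last threshold is either late or lost.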

In Table 1, it is stated that for open-loop/closed-loop control applications, the maximum tolerated packet loss rate is in the order of \(10^{-5}\). The overall packet loss rate observed in this experiment was close to \(2\times 10^{-5}\). The viability of the proposed adaptation (scheme C) for industrial process automation applications was thus shown. By using a ConnInterval of 40 ms with a star topology supporting a maximum of 12 slave devices, the critical QoS metrics were satisfied. The worst case delay was lower than the upper bound set for the application, and most packets arrived within 10 ms. Finally, the observed behavior was found to be considerably reliable, with a packet loss rate almost one order of magnitude smaller than the maximum allowed.

The proposed BLE adaptation could potentially be implemented for covering non-safety-critical applications in the domain of process automation. Intrinsic limitations of the BLE technology, however, must be considered, such as the limited number of slave devices and coverage range.

5 Conclusions

In this study, a previously developed analytical model that predicts the average delay performance of BLE under connection-oriented scenarios was used as the base to explore and evaluate the potential suitability of the protocol for time-critical IIoT applications, more specifically, for the process automation domain.

For this purpose, three different schemes with a bounded maximum number of retransmissions per packet were proposed and tested. After extensive simulations, the results related to determinism and reliability were presented. It was shown that by adapting the BLE transmission process to follow a structure similar to that of scheme C, the highly demanding requirements of real-time IIoT implementations can be satisfactorily fulfilled. In this scheme, two consecutive Connection Events may be used for the retransmission of a given data packet. The example configuration used in this study permitted a maximum of 6 retransmission attempts, distributed in such a way that most of the attempts take place during the first Connection Event and only a few during the second one. In this way, the best balance between worst case delay and packet loss rate, as well as optimized energy and resource utilization, was achieved, showing that the adaptation is indeed a good fit for the intended goal.

For future studies, it would be interesting to consider the aspects that were excluded from the scope of this article, such as the buffering process of the arriving events and the preamble error detection [27], since these contribute notably to the final performance. Also, expanding upon the models introduced in this work, multi-hop configurations could be explored. Extending the solution to cover multi-hop network topologies would considerably widen the range of applications. Another important topic that can be considered is priority handling, in order to offer immediate wireless channel access to prioritized data-sending devices.