Simulation of the Application Layer in NarrowBand Networks with Conditional Data Injection XML Scheme Based on Universal Data Generator

This article deals with challenges and analysis approaches in the area of narrowband communication networks, especially those that use the TCP/IP protocol family. We present a new universal data generator for the OMNeT++ simulation environment. We created this generator to satisfy the evaluation, stress testing and benchmarking demands of increasingly complex industrial and Internet of Things networks. We also present methods for evaluating and comparing results obtained from simulated and real TCP/IP-based networks.


Introduction
At present, we can see a vast expansion of TCP/IP protocols and their penetration into areas previously dominated by simple binary systems and protocols (IR, Modbus, IEC 60870-5-101, etc.). One of the main reasons is the boom of the Internet of Things (IoT) and the deployment of modern communication [1] and [2] and cybernetic technologies in industry, generally referred to as Industry 4.0 [3]. This expansion is driven by the deployment of a growing number of new services that require the transmission and processing of an ever-increasing amount of data (the Big Data issue [4]).
The diversity of transmission technologies used to carry TCP/IP is also increasing. In addition to conventional Ethernet technology, there are wireless radio technologies (Wi-Fi, WiMAX, LTE), Li-Fi technology, and powerline communication technologies (BPL, NPL). Wireless narrowband transmission technologies also form a large group.
These narrowband technologies are characterized by a narrow transmission channel (up to hundreds of kHz) that provides a data channel with speeds of up to tens of kb·s⁻¹. Latency ranges from hundreds of milliseconds to a few seconds. These transmission technologies are widely used in industrial networks for automatic data transmission and, to a limited extent, for control. Due to the rapid development of SmartGrids and Advanced Metering Management technologies [5] and [6] and their global deployment, there is also a need to deploy TCP/IP services and applications on narrowband transmission technologies.
Applications and services behave significantly differently in narrowband networks than in broadband networks with sufficient transmission capacity. The regulatory mechanisms of the TCP protocol, which are based on feedback information (latency, actual throughput, etc.), are often very unreliable when deployed in narrowband networks. The reason is that both transmission directions (upstream and downstream) are used. The communication paths in both directions can quickly become congested by network traffic, which results in a significant increase in latency (up to tens of minutes) and thus the inevitable breakdown of the TCP connections. The current trend of optimizing the TCP protocol for broadband and quite reliable lines makes the situation even worse. Without proper modification, the deployment of TCP in this type of network is very problematic and inefficient [7] and [8].
Due to the vast number of TCP/IP applications and services, it is complicated to verify their usability for a particular technology. For this reason, discrete simulation tools such as OPNET (see https://www.riverbed.com/gb/products/steelcentral/opnet.html?redirect=opnet), NS-3 (see https://www.nsnam.org/Overview/what-is-ns-3/) and OMNeT++ (see https://omnetpp.org) are used to verify the deployment capability of a protocol.
For the last of these tools, OMNeT++, a universal data generator (UDG) was developed as a simple software tool to simulate and verify a particular communication scheme on a specific communication technology. The UDG can maintain dependencies between protocol packets, and thanks to a certain degree of abstraction, analysis and deployment are much less resource demanding than with a full and complex implementation of the real protocol.
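To give a feeling for what a data rate of tens of kb·s⁻¹ means for TCP/IP traffic, the following sketch computes how long a single TCP segment occupies the channel. The 16 kb·s⁻¹ rate is an assumed example value, the 40-byte figure is the size of the IPv4 and TCP headers without options, and link-layer overhead is ignored; the function name is hypothetical.

```python
TCP_IP_HEADER_BYTES = 40  # 20-byte IPv4 header + 20-byte TCP header, no options


def serialization_delay(payload_bytes, rate_bps):
    """Time in seconds needed to put one TCP/IP packet on the channel.

    Link-layer framing and medium access overhead are deliberately ignored.
    """
    total_bits = (payload_bytes + TCP_IP_HEADER_BYTES) * 8
    return total_bits / rate_bps


ASSUMED_RATE_BPS = 16_000  # example value within "tens of kb/s"

delay_small = serialization_delay(256, ASSUMED_RATE_BPS)
delay_large = serialization_delay(536, ASSUMED_RATE_BPS)
print(f"256-byte segment: {delay_small:.3f} s")
print(f"536-byte segment: {delay_large:.3f} s")
```

Even before any queueing or medium access delay is added, a single full-sized segment takes a noticeable fraction of a second under these assumptions.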

2.
State of the Art
Simulation tools are very useful for telecommunication systems design and optimization. Also, they can be used to diagnose and simulate network problems. The accuracy of the simulation depends on a number of factors [9]. One of the most important tasks is to find and use the most accurate model that matches the real situation.
For simulation of a whole telecommunication network (all ISO/OSI layers), discrete simulation tools are now used exclusively. In this kind of simulation tool, the simulation state changes only upon reception of a message produced by a previously performed step. This is a fundamental difference compared to classical waveform simulators (such as Matlab or Octave), which try to approximate the simulated problem with mathematical formulas. This method of discretization allows the real behavior of communication technologies to be implemented at the level of hardware as well as software and communication protocols (such as TCP/IP, Ethernet).
There are currently many discrete simulation tools. One of the best-known commercial tools is OPNET. In the open source community, there are two very popular tools: NS-3 and OMNeT++. All of the mentioned tools provide implementations of many different technologies such as Ethernet, TCP/IP, Wi-Fi, WiMAX, Bluetooth and more. In addition to these technologies, some services are also implemented.
Simulation tools can be classified by a range of criteria, from the number of available models to the simulation speed. The great challenge for all simulation tools is the complexity of the models and the time required for simulation. The more complex the models and the deeper the simulation, the more accurate the results can be. On the other hand, finishing the simulation task then takes much more time and computational resources. One way to shorten simulation time is to use less complex models and maintain just an acceptable level of accuracy. This can be achieved with data generators that do not work with a real-time communication scheme (protocol), but only inject a specific bitrate into a network. Such a packet stream has a significantly simpler scheme than a real protocol or application (for example RENETO [10] or NS-3 models [11]).
The UDG is a generator of this kind which, unlike its competitors, can ensure inter-linking between messages and thus simulate real-time data traffic more reliably while still operating with a decent level of abstraction.
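The event-driven principle described above can be illustrated with a minimal sketch. This is not OMNeT++ code: the `Simulator` class, its methods and the 0.5 s delay are hypothetical and serve only to show that simulated time jumps from one scheduled message to the next, and that state changes only while a message is being handled.

```python
import heapq
import itertools


class Simulator:
    """Minimal discrete-event kernel (illustrative only, not OMNeT++)."""

    def __init__(self):
        self.now = 0.0
        self._queue = []               # heap of (time, sequence, callback)
        self._seq = itertools.count()  # tie-breaker for events at equal times

    def schedule(self, delay, callback):
        """Schedule callback to run `delay` seconds after the current time."""
        heapq.heappush(self._queue, (self.now + delay, next(self._seq), callback))

    def run(self):
        """Process events in time order until the queue is empty."""
        while self._queue:
            self.now, _, callback = heapq.heappop(self._queue)
            callback(self)


log = []


def send(sim):
    log.append((sim.now, "request sent"))
    sim.schedule(0.5, receive)  # assumed one-way delay of 0.5 s


def receive(sim):
    log.append((sim.now, "request received"))


sim = Simulator()
sim.schedule(1.0, send)
sim.run()
print(log)
```

Nothing happens between the two events; the simulator simply advances its clock to the time of the next scheduled message.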

3.
Universal Data Generator

Description of the Generator
The UDG is developed as a simulation module for OMNeT++. It was designed as a standard TCP or UDP OMNeT++/INET application, similar to TCPAppBase, UDPAppBase, etc. What makes the UDG interesting is its capability of conditional event simulation, which allows us to program the specific behavior of a particular protocol or application. When defining a communication profile, it is possible to make the sending of a specific message dependent on the reception of the previous message. This conditional message injection into a network is not limited to a single profile but can be extended across different profiles. It gives us considerable freedom to define different test scenarios, especially for application layer testing, where we can simulate the application's communication response to messages received from various sources.

Profile Files
The individual communication profiles can be combined into one complex XML file. A short example of a profile definition is shown in Listing 1.
A profile definition starts with the profile ID specification and then defines the basic parameters of the connection: the protocol family ("type"), source/destination addresses ("src_ip", "dst_ip") and ports ("src_port", "dst_port"). The source and destination IP addresses can be replaced with the unique "nodes" names within the OMNeT++ environment (this is very useful when autoconfiguration of IP addresses is used). The profile definition then continues with precise definitions of the sizes of the data messages, their sending times and their dependencies. The timing of a message is interpreted differently when the message depends on the reception of another message. For a dependent message, the "time" parameter is treated as the processing delay of the message; for a message with no dependency, it is the absolute send time. Each message also contains information about its direction: "sd" denotes the source-to-destination direction and "ds" denotes the destination-to-source direction. The parameter "prev" controls the dependency of the message on the reception of the previous message, and the "pprev" parameter controls a potential dependency on a message from a different profile. A payload can also be specified with the parameter "type"; the "rnd" value means that random data are used as the payload. Together, these parameters allow very complex communication profiles to be created.
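The structure described above can be sketched as a small XML fragment together with a parser that resolves the two meanings of "time". This is a hypothetical illustration and not the actual UDG schema: the element names (`profiles`, `profile`, `msg`), the attributes `id`, `size` and `dir`, and all values are assumptions; only the attribute names quoted in the text are taken from the description.

```python
import xml.etree.ElementTree as ET

# Hypothetical profile. Element names and the "id", "size" and "dir"
# attributes are assumptions; the other attribute names follow the text.
PROFILE_XML = """
<profiles>
  <profile id="1" type="tcp" src_ip="node1" dst_ip="server"
           src_port="1024" dst_port="2404">
    <msg id="1" dir="sd" size="64"  time="1.0" type="rnd"/>
    <msg id="2" dir="ds" size="128" time="0.2" prev="1" type="rnd"/>
    <msg id="3" dir="sd" size="64"  time="0.1" prev="2" type="rnd"/>
  </profile>
</profiles>
"""


def load_messages(xml_text):
    """Return a list of message dicts with the timing semantics resolved."""
    messages = []
    for profile in ET.fromstring(xml_text).iter("profile"):
        for msg in profile.iter("msg"):
            dependent = "prev" in msg.attrib or "pprev" in msg.attrib
            messages.append({
                "profile": profile.get("id"),
                "id": msg.get("id"),
                "direction": msg.get("dir"),
                "size": int(msg.get("size")),
                # Dependent message: "time" is a processing delay applied
                # after the awaited message arrives; otherwise it is an
                # absolute send time.
                "timing": "delay" if dependent else "absolute",
                "time": float(msg.get("time")),
                "prev": msg.get("prev"),
                "pprev": msg.get("pprev"),
            })
    return messages


messages = load_messages(PROFILE_XML)
for m in messages:
    print(m["id"], m["direction"], m["size"], m["timing"], m["time"])
```

In this sketch the first message is sent at an absolute time of 1.0 s, while the second and third are each sent after a processing delay once the message they depend on has been received.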

Simulation and Analysis of Narrowband Transmission Technologies
Narrowband transmission technologies are predominantly used to collect data. Occasionally, they can be used for remote commands. The data channel is mostly asymmetric, and most data flows carry data in the upstream direction (collecting data from sensors, etc.). Response times range from hundreds of milliseconds to seconds. If systems use the ISM (Industrial, Scientific and Medical) band, latency can reach up to units or tens of minutes (due to spectral limitations). The range of communication protocols is also very diverse, ranging from binary (M-Bus) to TCP/IP (IEC 60870-5-104, Modbus, etc.) [5] and [12].
To verify narrowband communication technology models, several UDG profile files have been proposed. These profiles can be used to determine the boundary parameters of the tested communication technologies, but also to verify their capabilities for real-world data traffic simulation:
• Real communication according to IEC 60870-5-104: a communication profile based on the analysis of real dump data of the IEC 60870-5-104 protocol, which is commonly used in SCADA (Supervisory Control And Data Acquisition) systems.
• Benchmarking profiles: general test profiles to verify the TCP/IP deployment parameters.
A more detailed description of each profile is given in the following sections.

Real Communication Profile According to IEC 60870-5-104
A special profile was prepared to bring the simulation environment closer to reality, based on a real capture performed on Racom's RipEX (see http://www.racom.eu/eng/products/radio-modem-ripex.html) narrowband wireless network. This network is used to collect data using the IEC 60870-5-104 protocol. The network periodically reads data from devices and, once in a specific interval, transmits the time synchronization. The communication profile shown in Fig. 1 and Fig. 2 was constructed regardless of the exact semantics of the communication protocol used: only the direction (upstream, downstream), the size of the transferred data and the location in time were used. In the first figure, the suggested profile is displayed as a timeline graph, which shows the packets in the upstream and downstream directions. From this, the exact location of each packet in time can be read. The sequence diagram in the second figure shows the exact packet dependencies. The next UDG Response packet can only be sent if the previous one (UDG Request or UDG Response) was received.

1) Dual
The first profile, called "Dual", produces flow in both communication directions (upstream and downstream). The amount of concurrent communication is limited by the number and character of the given profiles. The profile specifies that at every single moment at most one packet (in either direction) is being transmitted.
Due to the strict dependency between packets within a communication profile, this type of profile is not suitable for UDP-based simulations. In the case of a single packet loss, the entire profile will fail. The profile preview is again shown as a timeline and a sequence diagram in Fig. 3 and Fig. 4.
The profile itself is suitable for simulating an application network load where the receipt of a response to the previous request must precede the sending of a new request (TCP flow).

2) ZigZag
The second profile is called "ZigZag". This type of profile, like the previous "Dual", contains dependencies between the packets, but unlike the previous one, each dependency chain covers only a limited number of consecutive messages. The profile also contains precisely defined points in time at which these message sequences start. When the network does not manage to handle sending or receiving messages, the number of packets in the network increases (counting all the packets that are stored in different buffers) and hence heavy network congestion arises.
This profile is more resistant to packet loss than "Dual": when a packet is lost, only a shorter message sequence is interrupted. It is therefore also suitable for use with the UDP protocol.
The described profile is shown in Fig. 5 and Fig. 6. During normal operation (low to moderate network load), the "Dual" and "ZigZag" data flows look very similar. Once the load of the network rises, the "Dual" flow can maintain a low level of congestion over time, as this profile is self-regulated (at most one packet at a time). The "ZigZag" profile structure is based on shorter message sequences. When some of them are delayed because of network load, they begin to overlap with the others.
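The difference between the two profiles can be sketched with a toy model. The sketch below assumes a fixed one-way network delay that does not depend on load, which is a strong simplification compared with the simulated networks; all function names, delays, sequence lengths and start intervals are hypothetical values chosen for illustration.

```python
def chain(start, length, net_delay, proc_delay):
    """Return (send, receive) times for a chain of dependent messages.

    Each message is sent only after the previous one has been received
    and a processing delay has elapsed.
    """
    intervals, t = [], start
    for _ in range(length):
        intervals.append((t, t + net_delay))
        t += net_delay + proc_delay
    return intervals


def max_in_flight(intervals):
    """Largest number of messages that are in the network at the same time."""
    events = sorted([(s, 1) for s, _ in intervals] +
                    [(e, -1) for _, e in intervals])
    peak = current = 0
    for _, step in events:  # at equal times, receptions sort before sends
        current += step
        peak = max(peak, current)
    return peak


def dual(net_delay):
    # One long chain: every message depends on the previous one.
    return chain(0.0, 12, net_delay, 0.1)


def zigzag(net_delay):
    # Four short chains of three messages, started at fixed times 5 s apart.
    intervals = []
    for i in range(4):
        intervals += chain(i * 5.0, 3, net_delay, 0.1)
    return intervals


for delay in (1.0, 4.0):
    print(delay, max_in_flight(dual(delay)), max_in_flight(zigzag(delay)))
```

With the short delay both profiles keep a single message in flight. With the long delay the "Dual" chain still has only one message in the network, while the "ZigZag" sequences start on schedule regardless of whether the earlier ones have finished, so several messages are in flight at once.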

Verification and Evaluation of Simulation Profiles
To demonstrate the benefits of using UDG and designed profiles for simulation, testing, and analysis of narrowband networks, a statistical analysis of a significant number of the simulation results was performed. These narrowband protocol simulations were aimed at analyzing the success of the TCP connections and at optimizing the fair allocation of resources among the individual data streams. The simulations show how the selected profile affects the success of the data delivery, and also how the profile selection can influence the fairness of the resource allocation.
The first method is the well-known Jain's fairness index (JFI). This index can be used in various areas for general fairness evaluation. We used it to evaluate the fairness of resource allocation for each data stream within all measured transferred data. The index ranges from 1/n to 1. As we use a measurement scenario with 16 simultaneously active TCP sessions, n is equal to 16. This approach is very well suited to profiles where a fairly predictable, equitable allocation of resources can be expected ("Dual" and "ZigZag"). Because some profiles are not naturally balanced in this way (IEC 60870-5-104), applying the JFI to them requires a more careful setup of the index.
The second method of analysis is CER/TER. This method provides a reliable determination of the success of the TCP connection or of the transmission of individual TCP/IP data streams within the UDG profiles we used. Compared to the JFI, this method can be used in the same way for naturally balanced and unbalanced communication profiles (such as IEC 60870-5-104). The method can also be parametrized via the Endpoint Reuse Interval (ERI), which specifies the time of connection inactivity after which subsequent traffic is counted as a new connection. This parameter allows us to take a more or less strict view of the analyzed data.
Both of these methods, when applied at the packet level, are best suited to the analysis of limited data sets such as those obtained from narrowband simulations. In the case of data captured from high-speed networks (hundreds or thousands of megabits per second), these methods are only applicable to shorter data captures. Otherwise, the amount of data for analysis can be overwhelming and impossible to process in reasonable time.
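For reference, Jain's fairness index of n allocations x_1, ..., x_n is (x_1 + ... + x_n)² / (n · (x_1² + ... + x_n²)). A minimal sketch of the computation follows; the function name and the per-stream byte counts are illustrative assumptions and not measured results.

```python
def jain_fairness_index(allocations):
    """Jain's fairness index of a list of non-negative allocations.

    Returns 1.0 when all streams receive the same share and 1/n when a
    single stream receives everything.
    """
    n = len(allocations)
    sum_of_squares = sum(x * x for x in allocations)
    if n == 0 or sum_of_squares == 0:
        raise ValueError("at least one non-zero allocation is required")
    total = sum(allocations)
    return total * total / (n * sum_of_squares)


# Illustrative values for 16 TCP sessions (not measured data).
perfectly_fair = [1000] * 16
single_winner = [5000] + [0] * 15

print(jain_fairness_index(perfectly_fair))
print(jain_fairness_index(single_winner))
```

With n = 16 the two extreme cases give 1 and 1/16 = 0.0625, which matches the stated range of the index.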
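The role of the ERI parameter can be sketched as follows. The sketch covers only the inactivity rule described above, not the CER/TER computation itself; the function name, the packet representation and the strict "greater than" comparison are assumptions.

```python
from collections import defaultdict


def count_connections(packets, eri):
    """Count connections per endpoint using an inactivity threshold.

    packets: iterable of (timestamp_seconds, endpoint_key) tuples.
    eri:     Endpoint Reuse Interval in seconds; a gap longer than this
             between two packets of one endpoint starts a new connection.
    """
    last_seen = {}
    counts = defaultdict(int)
    for timestamp, key in sorted(packets):
        if key not in last_seen or timestamp - last_seen[key] > eri:
            counts[key] += 1
        last_seen[key] = timestamp
    return dict(counts)


# Hypothetical capture: endpoint "a" is silent between t = 1 s and t = 10 s.
capture = [(0.0, "a"), (0.5, "b"), (1.0, "a"), (10.0, "a")]

print(count_connections(capture, eri=5.0))
print(count_connections(capture, eri=20.0))
```

A short ERI splits the traffic of endpoint "a" into two connections, while a long ERI treats it as one, which is how the parameter makes the evaluation more or less strict.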

Configuration of the Simulator
The verification of the generated UDG data profiles was performed within the OMNeT++ simulation environment. We used a narrowband communication protocol implemented using the CSMA (Carrier Sense Multiple Access) method and the TDMA (Time Division Multiple Access) method with full transparent TCP/IP transmission support. Two different medium access methods were chosen to observe the differences between a random access method (CSMA) and a deterministic access method (TDMA). With the CSMA approach, our analysis mostly tests congestion control mechanisms (backoff algorithm, collision detection/avoidance, etc.). With the TDMA approach, it tests the quality (responsiveness, fairness, etc.) of the scheduling algorithm. Overview of basic information about the simulated network:
• Topology: star with retranslation; the setup was based on a real network topology with corresponding TX power setup and path attenuation model (see Fig. 7).
• Number of end elements: 16 (One TCP session per element).
• TCP alg: NewReno, CW = 8192, scaling disabled.
For testing purposes, we created a specific simulation setup. This setup consists of two different MAC (Medium Access Control) protocols labeled "Base" and "Poll". The measured optimization property was the TCP MSS (Maximum Segment Size), set to 256 and 536 bytes in both cases. In the case of the "Base" protocol, we also measured the impact of the packet buffer timeout. The label "1T0" refers to a timeout of one second, the label "2T0" refers to a timeout of two seconds, etc. We selected this particular dataset of results to demonstrate the impact of the mentioned profiles on fair resource allocation among TCP/IP streams and on connection stability. The results of the evaluation with the JFI are shown in Fig. 8, Fig. 9, Fig. 10 and Fig. 11, and the results of the evaluation with the CER/TER metric are shown in Fig. 12 and Fig. 13. This evaluation allows us to compare the results of MAC protocol optimization on a statistical basis. It also allows us to compare not only the impact of protocol optimization but also the impact of different protocols.

Interpretation of Achieved Results
In the case of stress testing and benchmarking of the simulated data network, we need to saturate the network with specific network traffic. It is interesting to compare the results of simulations in which "Dual" profiles were used as the source of the load with the results of simulations that used "ZigZag" profiles as the traffic source. Fig. 8 and Fig. 9 show that the type of input data stream has a significant impact on the fair distribution of resources among the data streams. We can also observe the impact on connection stability in the CER/TER metric output shown in Fig. 12 and Fig. 13. It is evident that a stress testing scenario consisting of "Dual" profiles can guarantee a certain degree of equality. This equality is due to the fact that, for each of these profiles, there is always at most one data message present on the network, and other messages within the profile are postponed until the previous message is delivered to its destination. This also limits the congestion of the network to a level directly dependent on the number of active profiles. The situation is different with the "ZigZag" profile. This profile can build up a considerable amount of congestion in a network over time, as it is not self-limited to transmitting only one packet at a time. The greater amount of congestion causes the resources to be distributed unfairly among different data streams.
The situation in which the data profiles corresponded in character to the IEC 60870-5-104 protocol is interesting for other reasons. To demonstrate the basic impact of the specific IEC 60870-5-104 profile on fair resource allocation in a network, we use the same settings of the JFI metric. This is suboptimal for fairness evaluation of this unbalanced profile, but it reveals the profile imbalance. We also used the CER/TER metric, which, as expected, shows that the network does not have any major problems with the transmission of this profile and does not suffer from connection failures. The single connection failure observed in the "Poll_536" data set was caused by an error during one of the first TCP connection establishment phases of one of the connections, immediately after the simulation started. No other failures were observed.

Conclusion
We introduced the universal data generator and its deployment in simulations of wireless narrowband networks. The presented profiles showed in detail the strong points of the universal data generator and its importance for discrete network simulations. Due to the nature of narrowband networks and their deployment in IoT and industry, three TCP/IP data profiles were introduced. The first one was based on the IEC 60870-5-104 flow pattern. The other two were purely synthetic profile patterns ("Dual" and "ZigZag"). These profiles were applied to the narrowband radio network model and were evaluated with the JFI and CER/TER metrics.
The comparison showed that the transmission of TCP/IP data over a narrowband communication system is very sensitive to the nature of the transmitted data and their interconnection. Simulations showed that a poorly chosen structure of transmitted data could significantly affect the fairness among individual data streams (according to the JFI), but also the connection stability (according to CER/TER). We also demonstrated that the JFI needs to be set up carefully to obtain relevant results.
In future research, we would like to enhance our FlowPing tool [15] to take advantage of the UDG XML profile description. This new functionality will enable us to use the UDG in real-world networks and directly compare the results with simulations. We will then not only refine the network models we created, but also improve and streamline the actual testing in real production environments.