A Distributed Covert Channel of the Packet Ordering Enhancement Model Based on Data Compression

Abstract: The covert channel of the packet ordering is an active research topic. Encryption alone is not enough to protect both sides of a communication: a covert channel must hide the fact that data is being transmitted as well as protect its content. Traditional methods usually rely on proxy technology, such as Tor anonymity technology, to hide the communicating parties. However, establishing proxy communication consumes traffic and reduces the communication capacity, and in recent years vulnerabilities in Tor have led to the leakage of secret information. In this paper, the covert channel model of the packet ordering is applied to a distributed system, and a distributed covert channel of the packet ordering enhancement model based on data compression (DCCPOEDC) is proposed. Data compression algorithms are used to reduce the amount of data and the transmission time. Together, the distributed system and the data compression algorithms weaken the statistical probability of the hidden information, make the data harder to predict, and weaken the time distribution characteristics of the data packets. This paper selects a compression algorithm suitable for DCCPOEDC and analyzes DCCPOEDC in terms of anonymity, transmission efficiency, and transmission performance. The analysis shows that DCCPOEDC optimizes the covert channel of the packet ordering: it saves transmission time and improves concealment compared with the original covert channel.

transmitting data to transmit hidden information [Xue, Wang, Zhang et al. (2018)]. Lampson [Lampson (1973)] originally proposed the concept of covert channels in 1973. In 1996, Handel et al. [Handel and Sandford (1996); Wang, Yang, Fu et al. (2016)] introduced covert channels to computer networks for the first time. Unlike traditional secret information transmission methods, the covert channel hides not only the content of the transmission but also the transmission method [Luo, Qin, Xiang et al. (2020)]. Existing network covert channels are divided into covert storage channels [Rios, Onieva, and Lopez (2012); Taheri, Mahdavi and Moghim (2018)] and covert timing channels [García, Zunino and Campo (2014); Wu, Wang, Ding et al. (2012); Zhang, Liang, Zhang et al. (2018)]. This paper mainly studies the covert timing channel, in which the sender embeds information into time-related parameters. The receiver and sender exchange hidden information through pre-set rules on time parameters such as rate of change, order, and interval. Normal network communication has its own regularity, whereas the covert timing channel transmits hidden information by changing the intervals at which packets are sent, which may change the statistical characteristics of the packets. Traditional detection methods exploit this statistical feature to distinguish the covert timing channel from the normal channel. Common methods for detecting the covert timing channel include the information entropy method, the Compressibility-Walk method, and the ranking method [Wang, Ju, Zhou et al. (2009)]. All of these detection methods are essentially statistical analyses of the time distribution of data packets.
These detection methods succeed because existing covert timing channels change the time distribution of packets during communication, which makes the covert timing channel deviate seriously from the statistical characteristics of normal network communication [Liu, Zhai and Dai (2012)]. The model proposed in this paper uses a lossless compression algorithm to reduce the transmission time and weaken the statistical probability of the hidden information. In addition, the distributed system hides the data and weakens the time distribution characteristics of the data packets.

Related works

The covert channel of the packet ordering
The principle of the covert channel of the packet ordering is to derive secret information from the order in which data packets arrive. The sender and the receiver first agree on multiple transmit ports and one receive port, and they establish the connections in turn. The receiver looks up a mapping table using the order in which the packets arrive to obtain the secret data. Suppose n bits of data are transmitted in each round and the communicating parties establish m connections. There are then $m!$ possible arrival orders. For the port ordering to fully express the transmitted data, it must satisfy $m! \geq 2^n$. Therefore, the relationship between the n bits of data transmitted per round and the number of connections m established by the communicating parties is:

$$n \leq \lfloor \log_2 (m!) \rfloor \quad (1)$$

Fig. 1 shows an example of the covert channel of the packet ordering. According to the arrival order of the port data, the receiver obtains different data by retrieving the mapping table.

Figure 1: An example of the covert channel of the packet ordering
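
The relationship between m and n can be illustrated with a small sketch. It assumes the mapping table simply assigns the n-bit values, in ascending order, to port orderings in lexicographic order; the function names and this table layout are hypothetical and are not taken from the paper.

```python
import math
from itertools import permutations


def bits_per_round(m: int) -> int:
    """Largest n with m! >= 2**n, i.e. floor(log2(m!))."""
    return math.factorial(m).bit_length() - 1


def build_mapping(m: int) -> dict[str, tuple[int, ...]]:
    """Assign each n-bit string a distinct ordering of ports 1..m (assumed layout)."""
    n = bits_per_round(m)
    orders = permutations(range(1, m + 1))  # generated in lexicographic order
    return {format(value, f"0{n}b"): next(orders) for value in range(2 ** n)}


def decode(arrival_order, mapping) -> str:
    """Receiver side: look up the bits that correspond to an arrival order."""
    inverse = {order: bits for bits, order in mapping.items()}
    return inverse[tuple(arrival_order)]


table = build_mapping(3)          # 3! = 6 >= 2**2, so each round carries 2 bits
print(bits_per_round(3))          # 2
print(table["10"])                # (2, 1, 3)
print(decode((2, 1, 3), table))   # 10
```

With three ports, only four of the six possible orderings are needed, which is why n is bounded by the floor of $\log_2 (m!)$.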

On\off covert channel
Within a fixed time interval, the receiver identifies bit "1" or bit "0" by judging whether a packet arrives, and in this way obtains the secret information. In the on\off covert channel, the sender and receiver must agree on a fixed time interval, a duration, and a synchronization method. Unlike the covert channel of the packet ordering model, the on\off covert channel relies only on whether packets arrive within a period of time, and can therefore transmit hidden information in a relatively secret way. The on\off covert channel model is shown in Fig. 2. The sender and receiver define a fixed time interval, and the receiver begins to receive secret information after allowing for the transmission delay. If no data is transmitted within the specified time, the hidden information transmitted by the sender is bit "0"; if data is transmitted within the specified time, the hidden information is bit "1".
Figure 2: on\off covert channel model (the sender transmits the example bit sequence 1 0 0 1 to the receiver)
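
The slot rule described above can be sketched as follows. This is a minimal simulation rather than a network implementation: the function names are hypothetical, the sender is assumed to place each packet in the middle of its slot, and the receiver is assumed to know the transmission delay.

```python
def encode_on_off(bits: str, interval: float, start: float = 0.0) -> list[float]:
    """Return send times: one packet in the middle of every slot that carries a '1'."""
    return [start + (i + 0.5) * interval for i, bit in enumerate(bits) if bit == "1"]


def decode_on_off(arrivals, n_slots: int, interval: float,
                  start: float = 0.0, delay: float = 0.0) -> str:
    """Mark a slot as '1' if any packet arrived in it, otherwise '0'."""
    slots = ["0"] * n_slots
    for t in arrivals:
        index = int((t - delay - start) // interval)
        if 0 <= index < n_slots:
            slots[index] = "1"
    return "".join(slots)


send_times = encode_on_off("1001", interval=0.5)
arrivals = [t + 0.1 for t in send_times]          # assumed constant delay of 0.1 s
received = decode_on_off(arrivals, n_slots=4, interval=0.5, delay=0.1)
print(send_times, received)
```

Each slot carries at most one bit, which is the reason the on\off channel needs one full time interval per bit.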

Data compression
Data compression first appeared in the early 19th century. With the rise of computers, data compression gradually came to play an important role in the computer field. Compression algorithms are mainly divided into lossless compression and lossy compression. Lossless compression takes up more space and achieves less compression than lossy compression, but it retains all the original information without any data loss. Some image formats, such as PNG, use lossless compression. Lossless compression is suitable for situations where the decompressed data must be identical to the original data. Lossy compression discards nonessential information and sacrifices some quality to reduce the data volume and improve the compression ratio. It is mainly used in streaming media and Internet telephony. After years of development, a large number of compression algorithms have emerged. Typical lossless compression algorithms include Huffman coding [Najmabadi, Tran, Eissa et al. (2019)], LZ77 compression coding, and LZMA compression coding. A compression algorithm can reduce the storage space and transmission time of secret information and weaken its statistical probability. When only part of the secret information is intercepted, its content cannot be analyzed and predicted [Xiang, Wu, Li et al. (2018)]. The commonly used lossless compression codes can be divided into entropy coding, dictionary coding, and run-length encoding. When processing strings, entropy coding usually needs to know the distribution of the data in the whole string or file in advance. In other words, entropy coding must know how often each character appears before compressing. Dictionary coding, by contrast, can compress efficiently without prior knowledge of the file. Run-length encoding is often used to process binary images.
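
As a small illustration of the simplest of these families, the sketch below applies run-length encoding to one row of a binary image. The row contents and the function names are hypothetical examples.

```python
from itertools import groupby


def rle_encode(pixels: str) -> list[tuple[str, int]]:
    """Collapse each run of identical symbols into a (symbol, run_length) pair."""
    return [(symbol, len(list(run))) for symbol, run in groupby(pixels)]


def rle_decode(runs) -> str:
    """Expand (symbol, run_length) pairs back into the original string."""
    return "".join(symbol * count for symbol, count in runs)


row = "0000011100000000"            # one row of a binary image
runs = rle_encode(row)
print(runs)                         # [('0', 5), ('1', 3), ('0', 8)]
print(rle_decode(runs) == row)      # True
```

The encoding is lossless because decoding reproduces the row exactly, but it only helps when the input contains long runs.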

Distributed system
The distributed system is a loosely coupled system in which communication lines interconnect several processors. From the point of view of one processor in the system, the other processors and their resources are remote, and only its own resources are local. So far, there is no unified definition of a distributed system. Generally speaking, a distributed system should have the following four characteristics:
(a). Distribution. Distributed systems consist of multiple computers that are geographically dispersed and can be distributed across a unit, a city, a country, or even the globe. The function of the whole system is spread over the nodes, so data processing in a distributed system is itself distributed.
(b). Autonomy. Each node in the distributed system contains its own processor and memory, and each node can process data independently.
(c). Parallelism. A large task can be divided into subtasks that are executed on different hosts.
(d). Globality. There must be a single, global process communication mechanism in a distributed system, so that any process can communicate with any other process without distinguishing between local and remote communication.
Time synchronization is a critical issue in a distributed system, because it ensures that all nodes in the system work together [Lamport (1978)]. There are several classical time synchronization mechanisms, such as the receive-receive synchronization mechanism, the two-way synchronization mechanism based on send-receive, and the one-way synchronization mechanism based on send-receive. Based on these classical mechanisms [Jiang, Chen and Hu (2017); Zhang and Zhang (2012)], many time synchronization algorithms have been developed, such as RBS [Elson, Girod and Estrin (2002)], TPSN [Ganeriwal, Kumar and Srivastava (2003)], and FTSP [Maróti, Kusy, Simon et al. (2004)].
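
To make the two-way send-receive mechanism concrete, the sketch below computes the clock offset and one-way delay from the four timestamps of a single request and reply exchange. It assumes a symmetric link delay, as the standard two-way exchange does; the function name and the example timestamps are hypothetical and are not taken from any of the cited algorithms' code.

```python
def two_way_offset(t1: float, t2: float, t3: float, t4: float) -> tuple[float, float]:
    """Estimate clock offset and one-way delay from one two-way exchange.

    t1: request sent      (node clock)
    t2: request received  (reference clock)
    t3: reply sent        (reference clock)
    t4: reply received    (node clock)
    The offset is reference clock minus node clock, assuming symmetric delay.
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2
    delay = ((t4 - t1) - (t3 - t2)) / 2
    return offset, delay


# Node clock is 5 units behind the reference, one-way delay is 2 units,
# and the reference spends 1 unit processing the request.
offset, delay = two_way_offset(t1=100, t2=107, t3=108, t4=105)
print(offset, delay)   # 5.0 2.0
```

A node would add the estimated offset to its local clock before acting on a send schedule expressed in the reference clock.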

DCCPOEDC model
This paper proposes a distributed covert channel of the packet ordering enhancement model based on data compression, in which the covert channel of the packet ordering is applied to a distributed system. In this type of system, the sender is divided into a master and child nodes. The master communicates with each node, and each node communicates with the receiver to transmit secret information.
First, the data compression algorithm is used to compress the characters. The master defines the number of child nodes, determines, according to the number of nodes, the bits of information sent in each round and the time interval between transmissions, and establishes a relationship $rel_1$.
Second, the master looks up relationship $rel_1$ for multiple rounds of secret information to obtain multiple groups of node sequences. It then defines the sending times for these groups, which yields a time series for each node. The master sends each node its time series, and continues to communicate secret information once it has received the confirmation information from every node.
Third, after the nodes receive the time series sent by the master, they synchronize their clocks. After the synchronization is completed, each node sends its information to the receiver at the specified times. When its time series has been sent, each node sends confirmation information to the master.
Fourth, the receiver receives the information sent by the nodes and obtains multiple groups of node sequences. The receiver looks up relationship $rel_1$ to obtain the multiple rounds of secret information sent by the master, and combines these groups into one complete secret message. The receiver then decompresses this message to recover the original information.

DCCPOEDC sender algorithm
The model needs to determine the relationship $rel_1$ between the sending order of the m nodes in the distributed system and the n bits of hidden information. The specific encoding method is as follows: the m nodes are $N_1, N_2, \ldots, N_m$, and the hidden information sent in each round consists of n bits. It is therefore possible to establish the relationship $rel_1$ between the sending order of the nodes and the secret information, in which each n-bit value corresponds to a distinct sending order of the m nodes.
When the distributed master sends information, it uses the data compression algorithm to compress originmsg and obtain compmsg. According to the relationship $rel_1$, the master obtains the sending order of the child nodes that corresponds to the n bits of hidden information transmitted in each round. In other words, the hidden information content determines the sending order: after one child node finishes sending, the next child node sends its information within a specified time. When the master performs multiple rounds of lookups, it obtains multiple sending times for the different child nodes. The master assembles these sending times into a two-dimensional time series indexed by node, so that each child node has a complete time series representing its part of the hidden information. The master sends each time series to the corresponding child node. This way of communicating avoids frequent communication between the master and the child nodes and improves the privacy of the transmission channel. The encoding and hiding process at the sender is as follows:
(a). The master uses the data compression algorithm to convert the hidden information originmsg into the compressed hidden information compmsg.
(b). The master divides the hidden information compmsg into groups of n bits, and obtains the transmission sequence node[] for each group according to the relationship $rel_1$.
(c). After performing step (b) h times, the master obtains h transmission times for the child nodes.
(f). When a child node finishes sending, it sends confirmation information to the master. After receiving the confirmation information from all the child nodes, the master transmits the next round of hidden information.
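
The master's encoding steps can be sketched as follows. This is a minimal illustration under several assumptions that are not specified in the text: $rel_1$ is built by assigning n-bit values to node orderings in lexicographic order, LZMA is used for compression, times are integer milliseconds, and each round occupies a fixed window of `round_ms` in which consecutive nodes send `gap_ms` apart. All function and parameter names are hypothetical.

```python
import lzma
import math
from itertools import permutations


def build_rel1(m: int):
    """Assumed rel_1: n-bit strings mapped to orderings of nodes 0..m-1."""
    n = math.factorial(m).bit_length() - 1          # floor(log2(m!))
    orders = permutations(range(m))
    return n, {format(v, f"0{n}b"): next(orders) for v in range(2 ** n)}


def build_schedule(originmsg: bytes, m: int, gap_ms: int, round_ms: int):
    """Return {node: [send times]}: the two-dimensional time series of the master."""
    if round_ms < m * gap_ms:
        raise ValueError("a round must be long enough for all m nodes to send")
    n, rel1 = build_rel1(m)
    compmsg = lzma.compress(originmsg)               # step (a)
    bits = "".join(format(byte, "08b") for byte in compmsg)
    bits += "0" * (-len(bits) % n)                   # zero padding (assumption)
    schedule = {node: [] for node in range(m)}
    for h, start in enumerate(range(0, len(bits), n)):
        order = rel1[bits[start:start + n]]          # step (b): look up rel_1
        for position, node in enumerate(order):
            schedule[node].append(h * round_ms + position * gap_ms)
    return schedule


schedule = build_schedule(b"secret", m=4, gap_ms=10, round_ms=100)
print({node: times[:3] for node, times in schedule.items()})
```

Each child node receives only its own list of send times, so no node sees the hidden bits themselves.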

DCCPOEDC receiver algorithm
The receiver performs the reverse of the sender's process. After receiving the information from the child nodes, it forms the node sequence according to the times at which the data packets arrive, and looks up the relationship $rel_1$ to obtain the secret information. The specific decoding process at the receiver is as follows:
(a). The receiver composes a node sequence node[] according to the arrival times of the data packets, looks up the relationship $rel_1$ with the node sequence node[], and converts it into the n bits of secret information crecvmsg.
(b). After receiving the complete secret message, the receiver decompresses it with the corresponding decompression algorithm to obtain the hidden information originmsg.
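
The decoding steps can be sketched in the same spirit. The code assumes that arrivals are recorded as (arrival time, node id) pairs, that every round delivers exactly m packets, that $rel_1$ assigns n-bit values to node orderings in lexicographic order, and that LZMA is the compression algorithm; the names are hypothetical. The last lines simulate a loss-free sender only to exercise the decoder.

```python
import lzma
import math
from itertools import permutations


def build_inverse_rel1(m: int):
    """Assumed inverse of rel_1: node ordering -> n-bit string."""
    n = math.factorial(m).bit_length() - 1
    orders = permutations(range(m))
    return n, {next(orders): format(v, f"0{n}b") for v in range(2 ** n)}


def decode_arrivals(arrivals, m: int) -> bytes:
    """arrivals: iterable of (arrival_time, node_id), m packets per round."""
    n, inverse = build_inverse_rel1(m)
    ordered = sorted(arrivals)                       # order packets by arrival time
    bits = ""
    for start in range(0, len(ordered), m):
        node_seq = tuple(node for _, node in ordered[start:start + m])
        bits += inverse[node_seq]                    # step (a): look up rel_1
    usable = len(bits) - len(bits) % 8               # drop padding bits, if any
    crecvmsg = bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))
    return lzma.decompress(crecvmsg)                 # step (b)


# Simulated sender, used only to produce test arrivals.
n, inverse = build_inverse_rel1(4)
rel1 = {bits: order for order, bits in inverse.items()}
bitstream = "".join(format(b, "08b") for b in lzma.compress(b"covert"))
arrivals = [(r * 100 + pos * 10, node)
            for r, i in enumerate(range(0, len(bitstream), n))
            for pos, node in enumerate(rel1[bitstream[i:i + n]])]
print(decode_arrivals(arrivals, 4))
```

In a real deployment, jitter could reorder packets within a round, so the gap between nodes would have to exceed the expected delay variation.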

Experimental results and analysis

Selection of compression algorithm
To comprehensively compare how well various algorithms suit this model, the main performance indexes considered are compression rate, compression time, and system complexity. Let the original file size be $FS$ and the file size after compression be $FS'$. The compression ratio $CR$ of the compression algorithm is:

$$CR = \frac{FS'}{FS} \quad (2)$$

When the original file size $FS$ is unchanged, the smaller the file size $FS'$ after compression, the lower the compression rate $CR$ and the better the performance of the compression algorithm. In this paper, several common compression algorithms are selected for testing. To evaluate them on a common basis, we use Eq. (3) to measure all compression algorithms.
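
A comparison of this kind can be sketched with the compression modules in the Python standard library. The sample data, the set of codecs, and the function names are assumptions for illustration; they are not the paper's test harness, and the combined performance value of Eq. (3) is not computed here.

```python
import gzip
import lzma
import time
import zlib


def compression_ratio(original: bytes, compressed: bytes) -> float:
    """CR = FS' / FS: compressed size divided by original size (lower is better)."""
    return len(compressed) / len(original)


def benchmark(data: bytes) -> dict[str, tuple[float, float]]:
    """Return {codec name: (compression ratio, compression time in seconds)}."""
    codecs = {"zlib": zlib.compress, "gzip": gzip.compress, "lzma": lzma.compress}
    results = {}
    for name, compress in codecs.items():
        started = time.perf_counter()
        compressed = compress(data)
        elapsed = time.perf_counter() - started
        results[name] = (compression_ratio(data, compressed), elapsed)
    return results


sample = b"hidden information of a fixed size " * 200   # hypothetical test data
for name, (cr, seconds) in benchmark(sample).items():
    print(f"{name}: CR = {cr:.3f}, time = {seconds * 1000:.2f} ms")
```

Measured ratios and times depend strongly on the input data and the library versions, so figures from such a run are only indicative.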
The test data is hidden information of a fixed size, and the different algorithms are used to process the same data. The test program uses the functions in each algorithm's library to compress the data after it has been read into memory. The results are shown in Tab. 2. Because Huffman encoding and RLE encoding are relatively simple algorithms, their compression effect is poor, whereas Gzip, LZMA, and Zlib achieve a better compression effect. Taking the performance value of each compression algorithm into account, the LZMA compression algorithm achieves a good balance between compression time and compression rate, which meets the needs of this model. Igor Pavlov invented the LZMA compression algorithm in 1998. LZMA uses a dictionary encoding mechanism and is the default compression algorithm for the 7z format in the 7-Zip program, which also provides the LZMA software development kit. LZMA is a variant of the LZ77 compression algorithm. It combines the sliding window and dictionary compression of LZ77 with interval coding (range coding), and has the advantages of a high compression rate, a small space requirement for decompression, and fast decompression. The LZMA algorithm supports a dictionary size from 4 KB to hundreds of MB. A larger dictionary improves the compression effect but requires a larger search cache.
In the implementation of the LZMA algorithm, candidates for the longest match are stored in a hash list, and a hash chain or a binary search tree is used to find the matching data [Leavline and Singh (2013)]. This reduces the time required to find the longest matching string. The encoding process of LZMA is similar to that of DEFLATE. Both use the sliding window and dictionary compression of LZ77 encoding, but LZMA uses interval encoding to improve its compression performance [Hübbe, Wegener, Kunkel et al. (2013)].
The LZMA encoding process is as follows:
(a). Write compressed data to the cache;
(b). Read the dictionary into the cache and perform sliding matching. If it is successful, go to step (e), otherwise continue to the next step;
(c). Write '0' (character match failed) into the output stream;
(d). Update the sliding cache data and go to step (f);
(e). Output flag bits and match information;
(f). Interval coding the data;
(g). If there is uncompressed data, go to step (b);
(h). End data compression.

Analysis of the information transmitted by the average node
The distributed system contains m child nodes, and the receiver uses one receiving port. When transmitting hidden information, the m nodes are ordered according to the hidden information and then send data packets to the receiver. The receiver listens on the receiving port, receives the data packets sent by the m different nodes, and obtains the secret information from the arrival order of the packets from the different nodes.
In each round of information transmission, it is necessary to determine the n bits of secret information corresponding to the sequence of nodes. The relationship between the m nodes and the n bits of hidden information satisfies Eq. (1), that is, $n \leq \lfloor \log_2 (m!) \rfloor$. Because a compression algorithm is introduced, the sender compresses the hidden information before transmission. Therefore, the compression rate $CR$ of the compression algorithm needs to be taken into account when calculating the information transmitted by the average node. It can be obtained that the information transmitted by the average node of this model is:
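
The quantity discussed above can be explored numerically. The sketch below is one reading that is consistent with the surrounding definitions, not a formula confirmed by the text: it assumes $CR = FS'/FS$, so that each transmitted (compressed) bit stands for $1/CR$ bits of original information, and it divides the largest n allowed by Eq. (1) evenly over the m nodes. The function name is hypothetical.

```python
import math


def avg_info_per_node(m: int, cr: float = 1.0) -> float:
    """Original-information bits per node per round, under the stated assumptions.

    n = floor(log2(m!)) bits are carried per round by m nodes; with a
    compression ratio cr, each carried bit represents 1 / cr original bits.
    cr = 1.0 corresponds to the packet ordering channel without compression.
    """
    n = math.factorial(m).bit_length() - 1
    return n / (m * cr)


for m in (4, 8, 16):
    plain = avg_info_per_node(m)
    compressed = avg_info_per_node(m, cr=0.3)    # 30% compression rate
    print(f"m = {m:2d}: without compression {plain:.3f}, with CR = 0.3 {compressed:.3f}")
```

Under these assumptions the value grows with m, because $\log_2 (m!)$ grows faster than m.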

Figure 4: Relationship between information transmitted by average node and number of nodes
Because the compression ratio of a compression algorithm depends on the probability distribution of the file content, the compression rate is fixed at 30% in the simulation. Fig. 4 shows that, for the same number of nodes (ports), the total amount of hidden information transmitted by this model is much higher than that of the covert channel of the packet ordering and of the on\off covert channel. As the number of nodes (ports) increases, the average amount of hidden information that each node (port) can transmit also increases.

Analysis of concealment
The LZMA compression algorithm uses a variable dictionary encoding in the compression process. Compressing the same information with LZMA in different SDKs gives different results, because different dictionary encodings are used. Content compressed with one dictionary cannot be decompressed with a different dictionary. As long as the dictionary is not leaked, the original secret information is therefore protected to a certain extent, and the size of the hidden information sent is reduced as well. Fig. 5 shows the results before and after compressing text in the SevenZip library. Fig. 6 (2020)]. For example, the compression method in lzma.compress can be customized by setting the distance between the bytes compared by the delta filter to "5" and changing the compression preset to "7". This produces a different compression result from the default compression method. Fig. 7 shows the compression results of the default compression method and the custom compression method. The results show that different compression methods produce different compression results for the same information content.
Existing detection algorithms for covert timing channels usually rely on rules and statistics, using the regularity of the data observed at the receiver over a period of time as the basis for detection. In this model, data traffic is spread over multiple connections in a distributed system. Detecting whether the traffic on these multiple connections is correlated requires a large performance and time overhead. We used Wireshark to capture packets at the receiver and its IO Graphs tool to display the overall traffic in the capture file, which is useful for viewing peaks and troughs in traffic, together with filters to display specific information on the chart. Fig. 8 shows the distribution of UDP traffic at the specified network adapter during daily use. Fig. 9 shows the distribution of UDP traffic at the specified network adapter while DCCPOEDC is running. In Figs. 8 and 9, the abscissa is time and the ordinate is the total number of data packets transmitted at the current time. The traffic distribution while this model is running is similar to the traffic distribution during daily use.
In summary, the analysis of the LZMA compression algorithm shows that different function libraries and different compression methods lead to different LZMA compression results, which reduces the data size while protecting the hidden information. The analysis of DCCPOEDC shows that each node in the distributed system receives a time series instead of the secret information itself, so the secret information is not leaked during transmission. Because the covert channel used in this article does not modify the content of the data packets, the packets in DCCPOEDC look like normal network packets when examined with packet capture software, and their content can withstand the scrutiny of a security device during covert communication. Finally, the comparison of the traffic distribution diagrams shows that the traffic of this model is similar to the traffic of a computer during daily use, which gives the transmission channel a degree of concealment.
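
The customization described above can be sketched with Python's standard `lzma` module, using a delta filter with distance 5 followed by an LZMA2 filter with preset 7. The sample message is hypothetical, and the sketch does not reproduce the results of Fig. 7; it only shows that the two settings produce different byte streams for the same input.

```python
import lzma

message = b"secret information " * 20            # hypothetical sample text

# Default compression: .xz container with the default preset.
default_out = lzma.compress(message)

# Custom filter chain: delta filter (distance 5) followed by LZMA2 (preset 7).
custom_filters = [
    {"id": lzma.FILTER_DELTA, "dist": 5},
    {"id": lzma.FILTER_LZMA2, "preset": 7},
]
custom_out = lzma.compress(message, filters=custom_filters)

# Raw stream: no container header, so the filter chain is not stored in the
# output and must be supplied again when decompressing.
raw_out = lzma.compress(message, format=lzma.FORMAT_RAW, filters=custom_filters)
restored = lzma.decompress(raw_out, format=lzma.FORMAT_RAW, filters=custom_filters)

print(len(message), len(default_out), len(custom_out), len(raw_out))
print(default_out != custom_out, restored == message)
```

Note that the .xz container records its filter chain in the stream, so only the raw format requires the receiver to know the filter settings in advance.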

Transmission rate
Because a compression algorithm is used, the sender compresses the hidden information before transmission. Therefore, the compression ratio $CR$ of the compression algorithm needs to be considered when calculating the transmission rate. The transmission rate of this model is: The on\off covert channel model expresses the secret information through whether a data packet arrives at the receiver. It needs one full time interval for each bit, so its transmission rate is low. The covert channel of the packet ordering model and DCCPOEDC perform better than the on\off covert channel model because they do not use the interval between data packets to represent hidden information. Moreover, because the DCCPOEDC model uses a data compression algorithm, it needs less time than the covert channel of the packet ordering model to transmit the same information.

Conclusions
This paper addresses the problems of inadequate concealment, low channel capacity, and conspicuous data distribution in the covert channel of the packet ordering model. A distributed covert channel of the packet ordering enhancement model based on data compression is proposed. Through analysis of the model, a compression algorithm suitable for it is selected, and its anonymity, transmission efficiency, and transmission performance are analyzed. The results show that the model optimizes the covert channel of the packet ordering model, increases the total amount of data transmitted, and improves concealment.