Application of LTE-U Technology in Train-Ground Communication of Urban Rail Transit

With the development of intelligent Urban Rail Transit (URT), the data that must be transmitted between trains and the OCC (Operational Control Center) has increased significantly, and the licensed spectrum is gradually becoming unable to satisfy the data transfer needs of URT communication systems (URT-cs). This article therefore considers using LTE-U (Long-Term Evolution in unlicensed spectrum) technology to deploy this communication system in unlicensed spectrum. To address the interference this introduces to WLAN systems already deployed in unlicensed spectrum, a dynamic spectrum duty cycle adjustment scheme based on the Q-learning algorithm is proposed to achieve coexistence of the two systems. The scheme takes the spectrum range as the state, models spectrum allocation as a Markov Decision Process, and obtains the allocation strategy through the Q-learning algorithm. Simulation results show that, compared with fixed spectrum allocation, the proposed scheme achieves higher throughput and user satisfaction. It accomplishes the transmission task of URT-cs while meeting the user satisfaction of the WLAN system, and thus addresses the shortage of spectrum resources for URT-cs.


Introduction
The communication system of Urban Rail Transit (URT-cs) guarantees fast, efficient and safe operation. The technology used in this system has been updated as communication technology has developed, from the initial track circuit to the current cellular mobile communication. At present, the URT specification requires dual-network redundant communication, and URT-cs is deployed in the licensed spectrum of 1785-1805 MHz, with only 20 MHz of bandwidth. Network A carries all URT-cs services, including CBTC, IMS, PIS, etc., and network B carries the CBTC system to form a redundant network. With the digitalization of URT, communication data has increased significantly, and the current spectrum resources are gradually becoming unable to provide the required network capacity. However, unlicensed spectrum offers abundant resources, so deploying part of URT-cs in unlicensed spectrum is considered.
When URT-cs is introduced into unlicensed spectrum, the coexistence problem with WLAN systems must be considered first. At present, WLAN based on the IEEE 802.11 protocol is deployed in unlicensed spectrum. The two systems use different access mechanisms: a WLAN system uses the Listen Before Talk (LBT) mechanism [1] to sense the channel before transmitting and transmits only if the channel is idle, whereas LTE does not sense the channel before transmitting [2]. If URT-cs is introduced into the unlicensed spectrum directly, it will therefore affect WLAN transmission. Literature [3] proposed a joint adaptive duty cycle (DC) regulation scheme and dynamic … Literature [4] takes throughput as the optimization objective and proposes a game model between LTE and WLAN users to find the optimal solution for system resource allocation. Taking fairness and total throughput into account, literature [5] allocates appropriate idle time to the WLAN to ensure effective coexistence of the two systems.

LTE-U technology
Standard LTE technology is deployed in licensed spectrum. As data traffic in the licensed spectrum grows and the spectrum becomes increasingly saturated, Long Term Evolution in unlicensed spectrum (LTE-U) has been proposed to relieve the saturation. LTE-U uses the standard LTE air interface for communication. It does not require changes to the network architecture, only an upgrade of the base station (BS), which can greatly reduce operating costs.
The licensed spectrum serves as the main carrier. Unlicensed spectrum resources are obtained through carrier sensing and carrier aggregation (CA), which aggregates licensed and unlicensed frequency bands. There are three main networking modes for LTE in unlicensed frequency bands: Supplemental Downlink (SDL) mode, CA mode and Standalone (SA) mode. SDL mode applies carrier aggregation to the downlink only, so part or all of the downlink data is transmitted in the unlicensed band. CA mode applies carrier aggregation to the uplink and downlink of the unlicensed band separately, so both uplink and downlink data can be transmitted in the unlicensed band. In SA mode, all data is transmitted in the unlicensed band. The LTE-U network deployment modes are shown in Figure 1.
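The three networking modes can be summarized as a small lookup table. The following Python sketch is only one reading of the description above; the names `Mode`, `UNLICENSED_USE` and `can_use_unlicensed` are hypothetical, and the `licensed_anchor` flag is an assumption inferred from the statement that the licensed spectrum is the main carrier except in SA mode.

```python
from enum import Enum


class Mode(Enum):
    """LTE-U networking modes named in the text."""
    SDL = "Supplemental Downlink"
    CA = "Carrier Aggregation"
    SA = "Standalone"


# For each mode: may downlink / uplink data use the unlicensed band, and is
# a licensed anchor carrier still present? (The last flag is an assumption.)
UNLICENSED_USE = {
    Mode.SDL: {"downlink": True, "uplink": False, "licensed_anchor": True},
    Mode.CA: {"downlink": True, "uplink": True, "licensed_anchor": True},
    Mode.SA: {"downlink": True, "uplink": True, "licensed_anchor": False},
}


def can_use_unlicensed(mode: Mode, direction: str) -> bool:
    """Return True if data in the given direction may use the unlicensed band."""
    return UNLICENSED_USE[mode][direction]
```

For example, `can_use_unlicensed(Mode.SDL, "uplink")` is false because SDL mode aggregates the unlicensed carrier on the downlink only.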

Algorithm design
Reinforcement learning solves sequential decision problems by maximizing a reward function to obtain the optimal action-selection strategy. The agent selects an action according to the strategy it has learned and performs a state transition. The environment returns feedback on that transition, the agent learns from the feedback, and it then selects the next action according to the updated strategy and enters the next state. The agent refines its strategy through continuous interaction with the environment. Q-learning is a model-free reinforcement learning algorithm: it does not need interaction information or learning samples between systems. The algorithm acts on the environment so as to maximize the expected return, and its evaluation function is the expected return of all subsequent actions starting from the present state [6]. In this article, Q-learning is used to solve the coexistence problem of URT-cs and WLAN.
The algorithm consists of a behavioral strategy, a reward function and an evaluation function. In each update, the agent senses the environment state, chooses an action according to the current evaluation strategy, applies it to the environment and moves to the next state; at the same time, the environment feeds back an immediate reward. By continuously interacting with the environment, the agent learns the best action for each state. Decision-making is thus a process in which the agent repeatedly interacts with the environment to update the Q table; once the table has stabilized, the agent uses it to make decisions. The iterative process is given by formula (1), whose terms are the learning rate; $n$, which denotes the $n$-th learning step; $Q_{t+1}(s, a)$, the updated Q value; and $Q_t(s, a)$, the current Q value [7].
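As an illustration of the update loop described above, the following Python sketch implements the standard tabular Q-learning rule, $Q(s,a) \leftarrow (1-\alpha)Q(s,a) + \alpha\,[r + \gamma \max_{a'} Q(s',a')]$. It is not taken from the paper: the symbols $\alpha$ (learning rate) and $\gamma$ (discount factor), the epsilon-greedy behavioral strategy, and all function names are assumptions made for the example.

```python
import random
from collections import defaultdict


def make_q_table():
    """Q[state][action], defaulting to 0.0 for unseen pairs."""
    return defaultdict(lambda: defaultdict(float))


def choose_action(q, state, actions, epsilon, rng=random):
    """Epsilon-greedy (assumed strategy): explore with probability epsilon,
    otherwise pick the action with the largest Q value."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[state][a])


def q_update(q, state, action, reward, next_state, actions,
             alpha=0.1, gamma=0.9):
    """Standard tabular update:
    Q(s,a) <- (1 - alpha) * Q(s,a) + alpha * (r + gamma * max_a' Q(s',a'))."""
    best_next = max(q[next_state][a] for a in actions)
    target = reward + gamma * best_next
    q[state][action] = (1 - alpha) * q[state][action] + alpha * target
    return q[state][action]
```

Once the table has stabilized, calling `choose_action` with `epsilon=0.0` corresponds to the agent using the Q table directly to make decisions.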
In the SINR expression, $P_{A,n}$ and $P_{B,n}$ are the transmission powers of base stations A and B on resource block $n$; $G_{A,n}$ and $G_{B,n}$ are the link gains between the two BSs and the users on resource block $n$; and $\sigma^2$ is the noise power.

The track between two BSs is divided into multiple sub-spaces of equal length. The SINR of each sub-space is taken as the state, and the change in SINR characterizes the state transition; the evaluation function accounts for the accumulated future reward. Because the state reflects the rate of change of the SINR, and because the communication quality of the train changes greatly before and after it passes a BS, the system state also contains a component that represents the name of the BS with which the train is communicating at the current time.

Action: In this paper, efficient coexistence of the two systems is realized by dynamically adjusting the ratio of URT-cs to WLAN subframes within one radio frame. Handover of the train between BSs also affects the next state. The agent's action can therefore be expressed as formula (5), where $u$ and $w$ denote the numbers of subframes allocated to URT-cs and WLAN, respectively.
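The state and action definitions can be sketched in Python as follows. This is a hedged illustration only: the SINR form $P_{A,n}G_{A,n} / (P_{B,n}G_{B,n} + \sigma^2)$ is assumed from the symbol definitions (serving BS A, interfering BS B), the action set assumes that $u + w$ equals the 10 subframes of one LTE radio frame, and the function names and the `min_each` parameter are hypothetical.

```python
import math

SUBFRAMES_PER_FRAME = 10  # one LTE radio frame contains 10 subframes


def sinr_db(p_a, g_a, p_b, g_b, noise):
    """SINR in dB for a train served by BS A and interfered with by BS B.

    All inputs are linear quantities (not dB). The formula is an assumption
    based on the symbol definitions, not the paper's own equation.
    """
    return 10.0 * math.log10((p_a * g_a) / (p_b * g_b + noise))


def subspace_index(position_m, subspace_len_m):
    """Index of the equal-length sub-space that contains the train position."""
    return int(position_m // subspace_len_m)


def duty_cycle_actions(min_each=1):
    """All (u, w) subframe splits with u + w = 10 and at least `min_each`
    subframes for each of URT-cs (u) and WLAN (w)."""
    return [(u, SUBFRAMES_PER_FRAME - u)
            for u in range(min_each, SUBFRAMES_PER_FRAME - min_each + 1)]
```

With the default `min_each=1`, the action set contains nine splits, including the (6:4) and (4:6) duty cycles used as fixed baselines in the simulations.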

Simulation results and analysis

System throughput
Throughput is the number of bits transmitted by a communication system per unit time. It excludes the reference signals used to evaluate channel quality, the overhead of the physical channels, and the cyclic prefix used to avoid interference between adjacent carriers. Here $T$ denotes the sum of the uplink and downlink throughputs. If $T_u^{real}$ and $T_d^{real}$ represent the uplink and downlink throughput respectively, the total throughput of the coexistence system can be expressed as formula (11).
The throughput expressions include the factor $(1 - 4.76\% - 19.05\% - 6.67\%)$, which removes the overhead excluded from the throughput. In these expressions, $Bit_d$ is the number of bits transmitted in one transmission time period, which depends on the modulation and coding scheme and on the number of RBs; $N_d$ is the number of downlink subframes; and $N_d^w$ is the proportion of downlink pilot signals in the special subframes. The downlink pilot part of a special subframe can carry data, whereas the uplink pilot part cannot. The SRS signal is used to evaluate and schedule uplink signals.

The simulation results for throughput are shown in Figure 3. The label Q on the horizontal axis represents the scheme proposed in this paper; the other labels represent fixed-DC schemes. Figure 3(a) shows the total throughput of URT-cs and WLAN, where the proposed scheme has a clear advantage. Figure 3(b) lists the throughput of URT-cs and WLAN separately. It shows that, as the DC increases, the throughput of URT-cs increases significantly while that of WLAN decreases. With the proposed scheme, the two coexisting systems achieve higher throughput simultaneously: the throughput of URT-cs is higher than under a (6:4) DC, and the throughput of WLAN is higher than under a (4:6) DC.
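The effect of the overhead percentages quoted in this subsection can be illustrated with a short Python sketch. It is not the paper's formula (11): it only applies the three percentages as a combined reduction to a raw bit count, and the function names and the parameters `bits_per_subframe`, `n_subframes` and `period_s` are hypothetical.

```python
# The three overhead percentages quoted in the text, as fractions.
OVERHEADS = (0.0476, 0.1905, 0.0667)


def overhead_factor(overheads=OVERHEADS):
    """Fraction of transmitted bits that remains after removing overhead."""
    return 1.0 - sum(overheads)


def effective_throughput_bps(bits_per_subframe, n_subframes, period_s,
                             overheads=OVERHEADS):
    """Useful bits per second after overhead (illustrative assumption).

    raw bits = bits_per_subframe * n_subframes, sent within period_s seconds.
    """
    raw_bits = bits_per_subframe * n_subframes
    return raw_bits * overhead_factor(overheads) / period_s
```

With these values, `overhead_factor()` is about 0.6952, so roughly 70% of the raw bits count toward throughput in this simplified view.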

User satisfaction value
Adaptive Modulation and Coding (AMC) is used to calculate the packet loss probability [8]. Channel quality is divided into $N$ levels corresponding to the $N$ modulation and coding schemes of URT-cs, so the packet loss probability can be expressed as formula (16).

Figure 4. User satisfaction value

The BSs of URT-cs are distributed at equal intervals along the line. In this paper, the distance between two BSs is 320 m. Only the user satisfaction over the first 160 m is shown, because the channel quality (SINR) between two BSs is symmetric about the midpoint. It can be seen from Fig. 5 that user satisfaction is highest when the train is directly below the BS. As the train moves away from the BS, the interference from the neighboring BS gradually increases and user satisfaction decreases. When the train reaches the midpoint between two BSs, user satisfaction falls to its minimum. Compared with the fixed-DC scheme, the dynamic DC adjustment mechanism based on the Q-learning algorithm in this article achieves higher user satisfaction and better performance.
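Two ideas from this subsection lend themselves to a small Python sketch: mapping an SINR value to one of the channel-quality levels, and exploiting the symmetry about the midpoint between two BSs. The sketch is illustrative only; the threshold values passed to `amc_level` are hypothetical and are not taken from the paper, and the function names are invented for the example.

```python
from bisect import bisect_right

BS_SPACING_M = 320.0  # distance between two BSs stated in the text


def mirrored_distance(position_m, spacing_m=BS_SPACING_M):
    """Distance to the nearer BS, using the symmetry about the midpoint."""
    x = position_m % spacing_m
    return min(x, spacing_m - x)


def amc_level(sinr_db, thresholds_db):
    """Channel-quality level index: the number of thresholds that are less
    than or equal to sinr_db. `thresholds_db` must be sorted ascending."""
    return bisect_right(thresholds_db, sinr_db)
```

For instance, a train 200 m past one BS is 120 m from the next BS, so `mirrored_distance(200.0)` lets the results for the first 160 m be reused for the second half of the section.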

Conclusion
In view of increasing URT safety requirements and the shortage of licensed spectrum resources, LTE-U technology is considered for deploying part of the URT-cs services in unlicensed spectrum. Based on the Q-learning algorithm, this paper proposes a spectrum allocation scheme to address the interference that the introduction of the URT train-ground communication system causes to WLAN systems already deployed in the unlicensed band. By allocating spectrum to URT-cs and the WLAN system dynamically, the two systems obtain channel usage rights alternately and complete their respective data transmission tasks. Taking throughput and user satisfaction as reference indicators, the simulation results show that this scheme outperforms the fixed duty cycle schemes. It achieves friendly coexistence with WLAN while meeting the data transmission needs of URT-cs.