Cooperative-Aware Radio Resource Allocation Scheme for 5G Network Slicing in Cloud Radio Access Networks

The 5G network is designed to serve three main use cases: enhanced mobile broadband (eMBB), massive machine-type communications (mMTC), and ultra-reliable and low-latency communications (uRLLC). Several new technological enablers, including the cloud radio access network (C-RAN) and network slicing, can support 5G and meet its requirements. The C-RAN combines network virtualization with baseband unit (BBU) centralization. Using the network slicing concept, the C-RAN BBU pool can be virtually sliced into three different slices. 5G slices require a number of quality of service (QoS) metrics, such as average response time and resource utilization. In order to enhance C-RAN BBU utilization while protecting the minimum QoS of the three coexisting slices, a priority-based resource allocation scheme with a queuing model is proposed. uRLLC is given the highest priority, while eMBB has a higher priority than mMTC services. The proposed model allows eMBB and mMTC requests to be queued and interrupted mMTC services to be restored to their queue to increase their chance of reattempting the service later. The proposed model's performance measures are defined and derived using a continuous-time Markov chain (CTMC) model and are evaluated and compared using different methodologies. Based on the results, the proposed scheme can increase C-RAN resource utilization without degrading the QoS of the highest-priority uRLLC slice. Additionally, it can reduce the forced termination probability of interrupted mMTC services by allowing them to rejoin their queue. The comparison of the results therefore shows that the proposed scheme outperforms state-of-the-art schemes in improving C-RAN utilization and enhancing the QoS of the eMBB and mMTC slices without degrading the QoS of the highest-priority use case.


Introduction
Presently, there is a continually increasing demand for higher data rates, better energy efficiency, better connection reliability regardless of location, and lower network latency. The fifth wireless generation (5G), which is expected to offer higher data rates and lower latency as well as meet other technical expectations, represents a massive jump in quality over previous generations [1][2][3][4]. The International Telecommunication Union (ITU) has defined the vision, requirements, and use cases of 5G networks. International Mobile Telecommunication 2020 Standards (IMT-2020) is the ITU standard that is used as a name for the components, systems, and other related elements designed to enable network capabilities beyond those offered by previous mobile wireless generations. The main role of IMT-2020 is to set 5G research activities and define the overall objectives of 5G standardization.
In order to meet the diverse requirements and demands of future network usage, the ITU has defined the important 5G use cases and classified them into three categories: enhanced mobile broadband (eMBB), massive machine-type communications (mMTC), and ultra-reliable and low-latency communications (uRLLC). To serve all expected requests and applications, the 5G network relies on logical networks that share the same physical network; these logical networks are referred to as network slices [2-5,8,9].
The C-RAN is defined as one of the main components of the new 5G network architecture. Generally, the main radio resources located at a 5G base station (BS) include the radio antennas, the remote radio heads (RRHs), and the baseband units (BBUs). In order to improve the 5G architecture, the Next Generation Mobile Networks project moved the BBUs from the BS to the 5G C-RAN. Therefore, all the BBUs of the RRHs connected to the same C-RAN are transferred and grouped into a virtualized BBU pool located in the C-RAN. In this new architecture, the BS is connected to the C-RAN via a high-speed fronthaul connection and retains only the RRHs and the radio antennas [11-13]. With this new 5G C-RAN, resource allocation processing is centralized at the C-RAN, which can serve hundreds of distributed RRHs [3]. The BBUs are responsible for allocating the C-RAN resources to the RRHs based on the defined network needs.
The C-RAN has three main components: the BBUs, the remote radio units (RRUs), and the fronthaul. This architecture is used in 5G, where it can serve as a transport network and improve network performance. It also makes it possible to pool resources, simplify network management and operation, reduce energy consumption, and co-execute RAN functionalities along with other network functions in a data center. In contrast, the distributed RAN is a classical setup in which remote radio units and BBUs are collocated at every cell site, which is not well suited to enhancing network performance.
C-RAN combines both network virtualization and centralization of processing units (BBUs) in order to increase resource utilization and reduce the cost of deploying dense mobile networks. Using the network slicing concept, the C-RAN BBU pool can be virtually sliced into different slices, where each slice is allocated to a different type of 5G use case. These centralized and sliced resources are in critical need of efficient and flexible radio resource management [3]. Based on the 5G use cases, the BBU pool can be sliced into three slices, as shown in Figure 1. These 5G slices have different QoS requirements and different priority levels. The 5G network is the future of mobile networks, and C-RAN is the 5G network component that groups all BBUs in a centralized BBU pool. In order to enhance C-RAN resource utilization while protecting the minimum QoS required for each 5G slice, an efficient radio resource allocation method is necessary and critical to controlling the coexisting 5G slices.
This motivated us to propose a priority-based radio resource allocation method for a 5G C-RAN with different coexisting 5G network slices. This method differentiates between the slices based on their required QoS. A queuing model is used to store the delay-tolerant slices, which decreases their forced termination rate and increases C-RAN resource utilization without degrading the QoS of the higher-priority slices. The proposed model allows an interrupted service of the lowest-priority slice to be restored to its queue in order to provide a chance to reattempt the service later, thereby increasing its service completion rate without degrading the higher-priority slices. The performance measures of the proposed model are derived using a continuous-time Markov chain (CTMC) model. The proposed scheme distinguishes the 5G use cases (eMBB, mMTC, and uRLLC) based on their priority level and uses a queuing system with a feedback queue to accommodate arriving lower-priority slice requests (eMBB, mMTC) and interrupted mMTC services. The queuing system with feedback improves the service completion rates of eMBB and mMTC without degrading uRLLC services, thereby increasing the BBU pool utilization of the C-RAN. The contributions that distinguish this research from most previous works are as follows:
• A priority-based resource allocation scheme with a queuing model is proposed for 5G C-RAN with eMBB/mMTC/uRLLC coexistence.

• The priority queuing model is used to buffer the lower-priority slices, and interrupted non-delay-sensitive mMTC services are allowed to return to their queue to reattempt the service.

• Mathematical equations are derived for the main performance measures of the proposed scheme using a continuous-time Markov chain (CTMC) model.
• A simulation model is developed to compare with and validate the results of the proposed analytical model.
• The proposed system is compared with the traditional queuing system to verify its main objectives.
The remainder of this paper is organized as follows: In Section 2, related works are described. The system model and assumptions are explained in Section 3. In Section 4, the proposed work is explained. Furthermore, Section 5 provides the performance analysis of the proposed scheme. In Section 6, we provide the numerical results of the proposed scheme. Finally, concluding remarks are made in Section 7.

Related Works
Radio resource management and allocation methods in 5G networks with SDN/NFV-based functions have been presented and studied in many research papers [8,9,11]. Several research efforts have introduced and analyzed radio resource control and QoS provisioning for 5G network slices [10,12-16].
The resource allocation strategies and challenges for network slicing were introduced and discussed in some recent works [8,9,11]. A survey of principles and a model for network slicing resource management were discussed in [8]. In [9], the problem of sharing 5G resources between different network slices in a fair manner was studied. In [11], network slicing in virtualized wireless networks was studied. Furthermore, a resource allocation algorithm was proposed for orthogonal frequency division multiple access (OFDMA) virtual RAN that can enhance the spectral efficiency of eMBB and uRLLC slices and improve uRLLC reliability.
In [12], three scheduling schemes for SDN-based network slicing were proposed to enhance the joint resource management problem with different QoS requirements. In [13], in order to achieve resource fairness and high utilization in the 5G network, packet delay modeling was proposed for the traffic flow through the chain of virtual network functions (VNFs). In [14], a 5G network slicing framework was presented in order to enable efficient isolation between the coexistence of IoT and eMBB slices.
In [10], a radio resource allocation method for network slicing is discussed. Specifically, a dynamic network resource allocation model for various 5G slices across different 5G network domains was suggested. As there are different requirements for each 5G slice in C-RAN, the proposed resource allocation aims to increase the level of satisfaction for each traffic slice in C-RAN and in all 5G network segments. In [15], the authors defined a new slicing architecture for the 5G network, as they considered it the main task to make 5G easier to understand. In addition, they presented a model for managing the constraints due to spectrum sharing by different access networks.
In [16], the end-to-end behavior of slice isolation is examined, especially from a security perspective. Isolation was classified as traffic, bandwidth, processing, or storage isolation. A set of challenges for recent trends in 5G slice isolation was discussed. The authors argued that isolation is the most important property of network slicing.
The authors of [6,17] provided a model that can work with different types of data. From this point on, different data types require a traffic scheduling technique, as this would make the QoS in 5G mobile networks easier to manage. The techniques they have suggested are priority queuing, first-in-first-out (FIFO), and weighted fair queuing. In [17], a queue model was assumed to interact with two packet types (real-time and non-real-time). Each packet type has its own transmission time, during which a scheduler allocates the packet to a resource unit (BBU). The queue model assigns a higher priority to real-time packets, and the higher priority will pass to the queue model via frequency feedback for packet transmission.
In [18], an efficient and secure service-oriented authentication framework that could support both network slicing and fog computing for the 5G-enabled Internet of Things network is proposed. In [19], a dynamic network slicing framework is presented in which a regional orchestrator is introduced to coordinate workload distribution among local fog nodes. In this node, the number of resources allocated to each slice could be dynamically adjusted according to service requests and energy availability. In [20], the authors proposed a network slicing architecture that utilizes spectrum resources in both licensed and unlicensed bands. In [21], an optimization framework is proposed that could jointly allocate resources for slices in terms of both network bandwidth and cloud processing power at a fine-grained level.
In [22,23], queuing models were used to address the resource allocation problem for the coexistence of different services with different QoS requirements in 4G or 5G networks; specifically, the coexistence of machine-to-machine (M2M) and human-to-human (H2H) communications in 4G networks was studied. In [22], a queuing model was used to analyze the impact of coexistence between M2M and H2H communication on a 3GPP LTE system. In [23], a resource-sharing method was proposed for M2M and H2H traffic under a time-controlled scheduling scheme in LTE networks.
In [24], the resource allocation problem for 5G network use cases across its main domains is analyzed using a general queuing model. This proposed scheme consists of three sub-queuing models covering the three main domains of the 5G network, which are C-RAN, mobile edge computing (MEC), and cloud data centers (CDCs). More specifically, the model aims to estimate the required QoS for network use cases in each component of the 5G architecture. In [25], a radio resource allocation for 5G use cases at 5G C-RAN is proposed using polling system methods. The aim of this scheme is mainly to protect the required delay for each 5G slice (use case) [26,27].
In [28], the authors proposed a method that allocates the network cost to each deployed 5G network slice. This resource allocation method saves core network costs; however, the QoS of each 5G slice was not addressed. In [29], an energy-efficiency-based scheduling algorithm for eMBB and uRLLC is proposed. The scheduling problem is formulated as a mixed-integer non-linear program, and the numerical results show that it outperforms existing classical schemes. In [30], the impact of the SDN controller on packet delay is studied using an analytical model. The study focused on two services: eMBB and mMTC. The results show that the proposed model can be used to plan the resource allocation method in 5G edge clouds in order to meet the QoS requirements of all 5G slices [31-33].
To the best of our knowledge, the balance between providing higher resource utilization in the 5G C-RAN and protecting the QoS required by each of the coexisting slices has not been investigated. Therefore, we are motivated to investigate this issue by proposing a priority-based resource allocation scheme that can accommodate the coexistence of 5G slices at the C-RAN node. Protecting slice isolation and ensuring maximum C-RAN utilization are the main goals of the proposed scheme.

System Model and Assumptions
A single 5G C-RAN with a BBU pool is considered. Many RRHs are connected to the C-RAN through high-speed connections. In this paper, a single-cell area served by a single RRH through a 5G air interface is assumed. Each RRH serves many users, and each user's request belongs to one of the specified 5G slices. The C-RAN is assumed to have C BBUs and is virtually sliced into three slices, where each slice is isolated for a specific service type. The general components of the proposed system are shown in Figure 2. The first slice is dedicated to uRLLC services, the second to eMBB services, and the third to mMTC services. All RRH connection requests are sent to the BBU pool and classified based on their slices. The arrivals of each request type are assumed to follow a Poisson process with rates denoted λ1, λ2, and λ3 for uRLLC, eMBB, and mMTC, respectively. The service times of all connection requests are assumed to be exponentially distributed with rates µ1, µ2, and µ3 for uRLLC, eMBB, and mMTC users, respectively. We use the variable C to represent the number of BBUs in the C-RAN BBU pool. The value of C depends on the selected 5G bandwidth and is assumed to be fixed, with all BBUs identical. The connection is assumed to be a gigabit-speed millimeter-wave link.
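As a concrete illustration of these traffic assumptions, the sketch below draws Poisson inter-arrival times and exponential holding times for the three slices. All numeric rates are hypothetical examples chosen for illustration, not values from this paper.

```python
import random

# Illustrative parameters only (not from the paper): C BBUs and
# per-slice arrival rates (lambda) and service rates (mu).
C = 10
arrival_rate = {"uRLLC": 2.0, "eMBB": 1.5, "mMTC": 1.0}   # lambda_1..lambda_3
service_rate = {"uRLLC": 1.0, "eMBB": 0.8, "mMTC": 0.5}   # mu_1..mu_3

def interarrival(slice_name, rng=random):
    # Poisson arrivals imply exponentially distributed inter-arrival times.
    return rng.expovariate(arrival_rate[slice_name])

def holding_time(slice_name, rng=random):
    # Service (BBU holding) times are exponentially distributed.
    return rng.expovariate(service_rate[slice_name])

random.seed(1)
for s in ("uRLLC", "eMBB", "mMTC"):
    print(s, round(interarrival(s), 3), round(holding_time(s), 3))
```

The mean inter-arrival time of each slice is 1/λ, which is easy to verify empirically from many samples.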

The uRLLC service (e.g., smart driving) has the highest priority, and the mMTC service (e.g., smart city applications) has the lowest priority; eMBB (e.g., smartphones) has a lower priority than uRLLC. An arriving uRLLC request can interrupt any in-service lower-priority request if all BBUs are busy. In addition, an arriving eMBB request can interrupt any in-service mMTC request if all BBUs are busy. The proposed model has two finite-length queues for eMBB and mMTC requests. Queued requests wait until resources become available.

Proposed Resource Allocation Scheme
We propose a priority-based resource allocation method that resides in the C-RAN and has two queues: the first is allocated to eMBB services and the second to mMTC services. Figure 3 shows the proposed scheme with the two queues. An incoming eMBB request is stored in its queue, denoted 1QeMBB, when the system is fully occupied, whereas an incoming mMTC request is stored in its queue, called 2QmMTC, when all resources are busy. The sizes of 1QeMBB and 2QmMTC are assumed to be M and N, respectively. The C-RAN allocates resources to each queue according to the FIFO discipline. All BBUs are open to all incoming services, and 1QeMBB is served before 2QmMTC.
The main expected behaviors of the proposed scheme can be explained based on the occurrence of the following events:
• uRLLC arrival: when a uRLLC request arrives, it goes immediately to the BBU pool, and the BBUs are checked first. If BBUs are available, the request is allocated the required resources. If no BBUs are available, there are the following scenarios. If the same service type (uRLLC) occupies all BBUs of the C-RAN, the received request is blocked. If the BBUs are fully occupied and at least one eMBB or mMTC service is in progress, the ongoing service is forced to release its occupied BBUs. Since eMBB has a higher priority than mMTC, an ongoing mMTC service is interrupted first. An interrupted mMTC service is sent back to its queue if the queue is not full; otherwise, it is dropped and counted as a forced termination. If there is no ongoing mMTC service, an ongoing eMBB service is forced to release its occupied BBUs.
• eMBB or mMTC arrival: if the new request is of type eMBB or mMTC and at least one BBU is available, the request is allocated to the available resource. If no BBU resources are available, the request is queued in its corresponding queue; if its queue is full, the new request is blocked.
• uRLLC, eMBB, or mMTC departure: when any service completes and releases its BBU resources, the resources become available. The system then checks the 1QeMBB queue, and any waiting request is allocated to the available resource according to the FIFO discipline. If the 1QeMBB queue is empty, the 2QmMTC queue is checked and served if it has a queued request. If both queues are empty, the freed BBU is left idle in the BBU pool.
The pseudocode for the main arrival process of the proposed scheme is shown in Table 2.
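The event handling above can also be sketched in code. The following is a minimal, illustrative Python model of the arrival and departure rules; the class and method names are our own, and the actual pseudocode in Table 2 may differ.

```python
from collections import deque

# Example sizes (not the paper's): C BBUs, queue capacities M and N.
C, M, N = 4, 3, 3

class CRAN:
    def __init__(self):
        self.in_service = {"uRLLC": 0, "eMBB": 0, "mMTC": 0}
        self.q_embb = deque()   # 1QeMBB, capacity M
        self.q_mmtc = deque()   # 2QmMTC, capacity N

    def busy(self):
        return sum(self.in_service.values())

    def arrive(self, s):
        """Handle an arrival; returns 'served', 'queued', or 'blocked'."""
        if self.busy() < C:
            self.in_service[s] += 1
            return "served"
        if s == "uRLLC":
            # Preempt mMTC first, then eMBB; block only if all BBUs
            # already hold uRLLC services. A preempted mMTC rejoins its
            # queue when there is room; a preempted eMBB is dropped.
            for victim in ("mMTC", "eMBB"):
                if self.in_service[victim] > 0:
                    self.in_service[victim] -= 1
                    self.in_service["uRLLC"] += 1
                    if victim == "mMTC" and len(self.q_mmtc) < N:
                        self.q_mmtc.append("mMTC")
                    return "served"
            return "blocked"
        q, cap = (self.q_embb, M) if s == "eMBB" else (self.q_mmtc, N)
        if len(q) < cap:
            q.append(s)
            return "queued"
        return "blocked"

    def depart(self, s):
        """A service completes; hand the freed BBU to 1QeMBB, then 2QmMTC."""
        self.in_service[s] -= 1
        if self.q_embb:
            self.q_embb.popleft()
            self.in_service["eMBB"] += 1
        elif self.q_mmtc:
            self.q_mmtc.popleft()
            self.in_service["mMTC"] += 1
```

For example, with all four BBUs held by mMTC services, a uRLLC arrival preempts one mMTC, which rejoins 2QmMTC; a subsequent departure then serves 1QeMBB before 2QmMTC.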
In order to study the performance of the proposed scheme and to compare it with related previous models, we will study its behavior using different processing methodologies, as follows:

• Methodology I: the resource allocation scheme has no queues, similar to the behaviors studied in [22-24].
• Methodology II: the resource allocation scheme has two queues for eMBB and mMTC, without allowing interrupted mMTC services to rejoin their queue.
• Methodology III (the proposed scheme): similar to Methodology II, but interrupted mMTC services are allowed to rejoin their queue if it is not full. An arriving uRLLC request can preempt lower-priority services that are in service.

Performance Analysis
The performance analysis of the proposed resource allocation scheme is explained in this section by using continuous-time Markov chains (CTMC) with multiple dimensions.

Analytical Model
In order to analyze and derive the performance measures of the proposed system, we define an analytical model using the continuous-time Markov chain (CTMC). The CTMC is used as a tool to model the proposed scheme and evaluate the scenarios under discussion. The proposed scheme is assumed to operate inside the 5G C-RAN component and to manage the resource allocation for three different 5G network slices, namely the uRLLC, eMBB, and mMTC slices. The CTMC model has five dimensions with variables (i, j, k, m, n) and organizes the requests of each 5G slice on a priority basis. These five variables represent the state of the proposed system at any given instant, where i, j, and k represent the number of ongoing uRLLC, eMBB, and mMTC services, respectively. The variables m and n correspond to the current numbers of queued eMBB and mMTC requests, respectively. The number of queued mMTC services includes the interrupted mMTC requests that were forced to terminate their connections by the arrival of uRLLC requests.
The total number of current requests in the system (C-RAN) at any given time, including queued requests, is i + j + k + m + n. Let us denote this total by T, such that T = i + j + k + m + n and 0 ≤ T ≤ (C + M + N). Therefore, the state of the system can be represented by the five-dimensional tuple (i, j, k, m, n); when a new event occurs, the system moves to another state, and the values of the five variables change accordingly.
In order to derive the detailed balance equations of the proposed system and simplify their expression, we define the variables, notations, and indicators used in the analytical model as follows. The variable M represents the size of the buffer storing new delay-sensitive eMBB requests, and the variable N represents the size of the second buffer, which stores new and interrupted delay-tolerant mMTC requests. M and N thus correspond to the maximum occupancies of the 1QeMBB and 2QmMTC queues, respectively. The variables λ1, λ2, and λ3 indicate the arrival rates of uRLLC, eMBB, and mMTC requests, respectively, and µ1, µ2, and µ3 indicate the corresponding departure rates. The 5G C-RAN is assumed to have C resource units (BBUs). Table 3 summarizes the main notations and parameter descriptions used here. From the above discussion, we can represent the system state space, S, as:
S = {(i, j, k, m, n) : 0 ≤ i + j + k ≤ C, 0 ≤ m ≤ M, 0 ≤ n ≤ N}
In the proposed system, there are two queues, and based on the queues' status and the total number of free resources, we can classify the system state space S into different substates. Each substate defines a portion of the practical state space of the proposed system. Furthermore, we must study the behavior of each substate and define its balance equation.
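As a sanity check on the state-space definition above, the following sketch enumerates the feasible states for a small example. The rule that a queue may only be non-empty when all BBUs are busy is our reading of the scheme, and the sizes C, M, and N are illustrative, not the paper's.

```python
from itertools import product

def feasible_states(C, M, N):
    """Enumerate feasible (i, j, k, m, n) states: i + j + k <= C,
    0 <= m <= M, 0 <= n <= N, and queues may only be non-empty
    when all BBUs are busy (i + j + k == C)."""
    states = []
    for i, j, k in product(range(C + 1), repeat=3):
        if i + j + k > C:
            continue
        for m, n in product(range(M + 1), range(N + 1)):
            if (m > 0 or n > 0) and i + j + k < C:
                continue  # a non-empty queue implies a full BBU pool
            states.append((i, j, k, m, n))
    return states

# Small illustrative sizes: C = 3 BBUs, M = N = 2 queue slots.
S = feasible_states(3, 2, 2)
print(len(S))  # 100 feasible states for this example
```

Enumerating the state space this way also makes it straightforward to index the states when building the transition matrix Q.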
Hereafter, the detailed balance equations are derived, with several derived substates leading to the required balance equations. The queuing analysis methods presented in [26,27] were followed. In order to derive the proper balance equation for each substate, we define a number of Boolean indicators that control the transition rates between substates, as shown in Table 4.
The system state space represents the complete set of feasible states of the proposed system. In order to simplify the analysis, the state space S can be described by four substates Si (i = 1, 2, 3, 4), such that S = S1 ∪ S2 ∪ S3 ∪ S4. These substates can be defined as follows:
Substate 1 (S1): This substate represents the state in which the whole system is empty, i.e., T = 0. That is, all of the resource units (BBUs) of the C-RAN are free, with no service in progress. We call this the initial state. Therefore, the state space of S1 can be written as:
S1 = {(0, 0, 0, 0, 0)}
where all five variables are equal to zero, and its balance equation can be defined as:
(λ1 + λ2 + λ3)π0,0,0,0,0 = µ1 π1,0,0,0,0 + µ2 π0,1,0,0,0 + µ3 π0,0,1,0,0
Substate 2 (S2): These substates represent all possible states (0 < T < C and m = n = 0) in which the total number of requests in the system is greater than zero and less than C, without any queued requests. These substates therefore show only the number of occupied BBUs at the current moment; the number of available BBUs is greater than zero, and both queues are empty. Hence, the state space of S2 can be written as:
S2 = {(i, j, k, 0, 0) : 0 < i + j + k < C}
Considering the possible states of S2, the detailed balance equation for S2 can be represented as:
(iµ1 + λ1 + jµ2 + λ2 + kµ3 + λ3)πi,j,k,0,0 = λ1 πi−1,j,k,0,0 + (i + 1)µ1 πi+1,j,k,0,0 + λ2 πi,j−1,k,0,0 + (j + 1)µ2 πi,j+1,k,0,0 + λ3 πi,j,k−1,0,0 + (k + 1)µ3 πi,j,k+1,0,0
Substate 3 (S3): These substates represent the system states in which i + j + k = C and m = n = 0. In this case, all BBU resources are fully utilized, while both queues 1QeMBB and 2QmMTC are still empty. The balance equation for S3 can be derived similarly, accounting for arrivals that join the queues or preempt lower-priority services when all BBUs are busy.
Substate 4 (S4): These substates represent all the remaining feasible states of the system, where C < T ≤ C + M + N, 0 ≤ m ≤ M, 0 ≤ n ≤ N, and m + n > 0. In this case, all BBUs are occupied, and arriving eMBB and mMTC requests must wait in their own queues if the corresponding queue is not full.
Thus, the balance equation for S4 can be represented as:
[α6 iµ1 + α1 λ1 + α7(1 − α6)iµ1 + α6 jµ2 + α7(1 − α6)jµ2 + α8 kλ1 + α6 kµ3 + α7(1 − α6)kµ3 + α9 λ3] πi,j,k,m,n = α3 λ1 πi−1,j+1,k,m−1,n + α3 λ1 πi−1,j+1,k,m,n−1 + α3 λ1 πi,j,k,m−1,n + α3 λ1 πi,j,k,m,n−1
Using all the above substates, and assuming that all remaining states not covered by these substates have zero probability, we can derive the steady-state probabilities of the system [23,24]. We assume that the Markov chain is memoryless and has a finite number of states; it is also aperiodic, so it does not repeat itself periodically. The steady-state distribution can be described by a time-independent matrix πi,j,k,m,n, where πi,j,k,m,n is assumed to be zero for non-feasible states, i.e., πi,j,k,m,n = 0 whenever (i, j, k, m, n) ∉ S. Accordingly, given the system state space S, the transition matrix Q, and the vector π of steady-state probabilities, the system global balance equations and the normalization equation can be expressed by the following two equations:
πQ = 0
Σ(i,j,k,m,n)∈S πi,j,k,m,n = 1
The CTMC model used to analyze the proposed system is assumed to be irreducible with a finite state chain. Accordingly, the CTMC is ergodic, since it is irreducible and positive-recurrent [23]. Therefore, all the state probabilities πi,j,k,m,n can be derived. A developed MATLAB code is used to solve all the detailed balance equations and calculate the probability of being in any defined state. After finding the probability of each feasible state, the transition matrix Q is derived; this matrix contains the transition rates between every pair of system states.
With all the steady-state probabilities π_{i,j,k,m,n} and the transition rates between every pair of states, the steady-state probability vector π is obtained.
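To make the derivation concrete, the steady-state computation described above can be sketched on a miniature chain. The following Python snippet is an illustrative sketch only: instead of the paper's five-dimensional state space, it uses a single-class M/M/2/2 loss system (C = 2 BBUs, with assumed example rates `lam` and `mu`), builds the generator matrix Q, and solves πQ = 0 together with the normalization equation.

```python
import numpy as np

# Hypothetical miniature of the CTMC: one traffic class, C = 2 BBUs,
# arrival rate lam, service rate mu (an M/M/2/2 loss system). The
# values are example assumptions, not the paper's configuration.
lam, mu, C = 1.0, 0.5, 2

# Generator matrix Q: state s = number of busy BBUs, s = 0..C.
Q = np.zeros((C + 1, C + 1))
for s in range(C + 1):
    if s < C:
        Q[s, s + 1] = lam       # an arrival occupies one more BBU
    if s > 0:
        Q[s, s - 1] = s * mu    # one of the s busy BBUs completes
    Q[s, s] = -Q[s].sum()       # generator rows sum to zero

# Solve pi Q = 0 with sum(pi) = 1: replace one (redundant) balance
# equation with the normalization equation, then solve linearly.
A = np.vstack([Q.T[:-1], np.ones(C + 1)])
b = np.zeros(C + 1)
b[-1] = 1.0
pi = np.linalg.solve(A, b)
print(pi)  # steady-state probabilities of 0, 1, 2 busy BBUs
```

With offered load λ/µ = 2 Erlangs on 2 BBUs, the Erlang-loss recursion gives π = [0.2, 0.4, 0.4], which the linear solve reproduces; the same πQ = 0 plus normalization structure scales to the paper's multidimensional chain, only with a larger state enumeration.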

Performance Measurements
Using the derived steady-state probability matrix π, we can define and derive the performance measures used to compare and study the QoS metrics of the eMBB and mMTC services, as they have lower priority than uRLLC. The queuing analysis methods presented in [26,27] were followed. These performance measures include the forced termination probability (PF), the service completion rate (SCR), and the average queuing delay (D); in addition, the system resource utilization achieved by the eMBB and mMTC services is measured. The performance measures are defined and derived as follows:

1. eMBB PF: the forced termination probability of eMBB is the probability that a higher-priority uRLLC request arrives and an in-service eMBB connection is forced to terminate before its normal end. The in-service eMBB is forced to release its occupied BBUs when the system resources are fully utilized, no BBU resources are available, and there are no in-service mMTC connections; the released resources are then given to the newly arrived uRLLC request. In this case, the forced termination probability is the average rate of terminated eMBB services divided by the admitted rate of eMBB services. Denoting by PF_1 the forced termination probability of eMBB requests:

PF_1 = (average rate of terminated eMBB) / (eMBB admitted rate)    (6a)

Using the derived balance equations, the average rate of eMBB services forced to terminate is obtained by summing over the corresponding system states. The total eMBB admitted rate is defined as λ_2(1 − PB_2); hence, PF_1 can be rewritten accordingly.

2. mMTC PF: this probability is derived following the same procedure as for eMBB. An in-service mMTC is forced to terminate its connection by the arrival of a higher-priority request (uRLLC or eMBB) when all BBU resources are busy and its queue is full. Similar to (7), the forced termination probability PF_2 for mMTC is defined analogously.

3. The service completion rate (SCR): the rate of service completion of eMBB and mMTC, measured as the total rate of completed services of each type per unit time.
Using the derived balance equations, we can derive this rate. Let SCR_1 and SCR_2 denote the service completion rates of eMBB and mMTC, respectively; they are derived as shown in (9) and (10). Equation (9) gives the service completion rate for eMBB, computed by adding the service completion (departure) rates of all states that represent an eMBB departure:

SCR_1 = Σ_{∀S} j µ_2 π_{i,j,k,m,n}    (9)

where j is the number of in-service eMBB requests, µ_2 the service rate of eMBB, and π_{i,j,k,m,n} the probability of being in state (i,j,k,m,n). Using the steady-state characteristics, the completion rate is found by summing over all the states that have an eMBB departure; SCR_2 in (10) is obtained in the same way using k and µ_3.

4. C-RAN resource utilization (U) by the eMBB and mMTC services: this metric indicates the efficiency of the proposed system in terms of increasing BBU utilization by allowing both eMBB and mMTC services to be queued and reattempted. From the system state space S, at each state (i,j,k,m,n) the system has C resource units (BBUs), of which (i + j + k) are occupied by uRLLC, eMBB, or mMTC services. Therefore, the utilization percentage for the eMBB and mMTC requests can be defined as:

U = Σ_{∀S} ((j + k)/C) π_{i,j,k,m,n}

5. The average delay (D) for the eMBB and mMTC services: each service type has its own queue, and D is the average delay of a queued request until it is served. To calculate D, we need the queue lengths of both queues, 1QeMBB and 2QmMTC. Using Little's law, we divide the average queue length of each service by its arrival rate. Let L_1 and L_2 be the average queue lengths for eMBB and mMTC, respectively. The average delay for eMBB (D_1) can then be calculated as:

D_1 = L_1 / λ_2, where L_1 = Σ_{∀S, 0≤m≤M} m π_{i,j,k,m,n}

In the case of mMTC services, the interrupted connections must be included in the total arrival rate, since interrupted mMTC requests are fed back into the queue to be given another chance of service in the C-RAN. Hence, when calculating the average delay D_2, the interrupted-connection rate R_Interrupt, the rate at which in-service mMTC requests are interrupted, is added to the mMTC arrival rate. By Little's theorem, D_2 is obtained by dividing the average mMTC queue size L_2 by the total mMTC arrival rate:

D_2 = L_2 / (λ_3 + R_Interrupt), where L_2 = Σ_{∀S, 0≤n≤N} n π_{i,j,k,m,n}

The term L_2 is found by adding the average number of queued requests over all system states that have queued requests, and R_Interrupt is obtained by summing, over the states with i + j + k = C, the rates at which in-service mMTC requests are preempted.
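The performance measures above all reduce to weighted sums over the steady-state vector. As an illustrative sketch only, the snippet below evaluates utilization, completion rate, and Little's-law delay on the same single-class M/M/2/2 toy chain (C = 2, λ = 1.0, µ = 0.5, with π hard-coded to that chain's solution), not on the paper's five-dimensional model.

```python
import numpy as np

# Toy chain assumptions: C = 2 BBUs, lam = 1.0, mu = 0.5, and the
# steady-state vector pi of the corresponding M/M/2/2 loss system.
lam, mu, C = 1.0, 0.5, 2
pi = np.array([0.2, 0.4, 0.4])

states = np.arange(C + 1)       # number of busy BBUs in each state
U = (states / C) @ pi           # resource utilization: sum (s/C) pi_s
SCR = (states * mu) @ pi        # completion rate: sum s*mu*pi_s
blocked = pi[C]                 # arrival finds all BBUs busy
lam_eff = lam * (1 - blocked)   # admitted (effective) arrival rate
L = states @ pi                 # mean number of in-service requests
W = L / lam_eff                 # Little's law: mean time in system
print(U, SCR, lam_eff, W)
```

Two sanity checks fall out directly: SCR equals the admitted rate λ(1 − π_C), since in steady state the admitted and completed rates must balance, and W equals 1/µ because a pure loss system has no queuing delay; in the paper's model, where queued requests exist, W would exceed the mean service time.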

Results and Discussion
In this section, the performance measures of the proposed system are studied and explained. We compare the proposed system with other systems, with and without queuing. Examples of the three methodologies mentioned previously are then simulated with MATLAB and Simulink using Markov chain methods. For both eMBB and mMTC requests, performance is analyzed in terms of the service completion rate (SCR); other parameters of importance, such as the average delay of eMBB and mMTC requests and the forced termination probability, are also discussed and analyzed. We assume a single C-RAN with 25 available BBU resources (i.e., C = 25). The requests arriving at the C-RAN are classified into the three priority-based 5G use cases: uRLLC, eMBB, and mMTC. Unless otherwise specified, the arrival rates of uRLLC, eMBB, and mMTC are λ_1 = 1, λ_2 = 1, and λ_3 = 2, and their service rates are µ_1 = 0.5, µ_2 = 1.0, and µ_3 = 1.0, respectively. Additionally, we assume that the sizes of 1QeMBB and 2QmMTC are 4 (M = 4) and 2 (N = 2), respectively.
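The simulation side of such a comparison can be sketched with a Gillespie-style (exact event-driven) CTMC simulation. The snippet below is a simplified, hedged sketch, not the paper's Simulink model: it uses the arrival and service rates above but models only a queueless loss system in the spirit of Methodology I (blocked requests are dropped and preemption is not modelled), and the horizon and RNG seed are arbitrary choices.

```python
import random

# Simplified loss-model simulation (no queues, no preemption).
# Rates follow the paper's parameter setup; horizon/seed are assumed.
random.seed(7)
C = 25
lam = [1.0, 1.0, 2.0]            # uRLLC, eMBB, mMTC arrival rates
mu = [0.5, 1.0, 1.0]             # corresponding service rates

n = [0, 0, 0]                    # in-service requests per class
t, horizon, busy_time = 0.0, 10_000.0, 0.0
while t < horizon:
    # Competing exponential clocks: 3 arrival + 3 departure streams.
    rates = [lam[c] for c in range(3)] + [n[c] * mu[c] for c in range(3)]
    total = sum(rates)
    dt = random.expovariate(total)
    busy_time += sum(n) / C * dt  # time-weighted BBU utilization
    t += dt
    r = random.uniform(0, total)  # pick which event fired
    for i, rate in enumerate(rates):
        if r < rate:
            break
        r -= rate
    if i < 3:                     # arrival of class i
        if sum(n) < C:
            n[i] += 1             # admit only if a BBU is free
    else:                         # departure of class i - 3
        n[i - 3] -= 1
print(f"estimated utilization: {busy_time / t:.3f}")
```

With an offered load of Σ λ_c/µ_c = 5 Erlangs on 25 BBUs, blocking is negligible and the estimate settles near 5/25 = 0.2; adding the two queues and the preemption/feedback rules of Methodologies II and III would extend this skeleton toward the paper's full model.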
First, the analytical and simulation results are compared and validated using the above parameters. In Figure 4, we validated the results of the system using Methodology III. The figure plots the analytical and simulation results for the SCR of eMBB and mMTC as a function of the uRLLC arrival rate, and the two sets of results match well. Furthermore, we can see from Figure 4 that the analytical and simulated SCR results are closer for eMBB than for mMTC when the arrival rate of uRLLC is low. The reason for this variation is that 2QmMTC is fed by two processes, the arrival process and the interruption process, and the interruption rate is uncertain, whereas 1QeMBB is fed by a Poisson arrival process alone. Thus, the proposed scheme shows acceptable agreement between the simulation and analytical results.

Using all three methodologies, the combined resource utilization of the eMBB and mMTC services has been explored. The service rate of mMTC is varied, while those of uRLLC and eMBB are fixed. As depicted in Figure 5, the resource utilization of both eMBB and mMTC services decreases as the service rate of mMTC increases. This is because when the service rate of mMTC increases, in-service mMTC requests complete their service more quickly, so the probability of keeping the resources busy with mMTC requests decreases. Moreover, since the arrival rates of uRLLC and eMBB requests are fixed while the service rate of mMTC increases, the probability that a new arrival of the other two types finds and occupies an idle resource also decreases. Therefore, the C-RAN resources stay idle for relatively longer, which is expected when the system is not heavily utilized. Consequently, increasing the service rate of one type while fixing the others reduces resource utilization. Using the queuing concept, however, arriving eMBB and mMTC requests can be queued and interrupted mMTC requests can be queued again (as employed in Methodologies II and III). This increases resource utilization, since the queued requests (including the interrupted mMTC requests) are given more chances to reattempt service in the C-RAN.

Figure 6 shows the SCR of the mMTC service versus its arrival rate using all three methodologies, with the arrival rates of both uRLLC and eMBB fixed. As can be seen from the figure, the SCR of mMTC increases rapidly at first and then levels off at a peak value: as the arrival rate of mMTC grows, the SCR grows with it until all the C-RAN resources become fully utilized. Further, we can observe from the figure that, in addition to the queuing model itself, increasing the queue size improves the SCR of the mMTC service. When the queue size is increased from 4 to 8, the SCR curve shows a clear improvement, since a larger queue allows extra requests to be queued.
Since the uRLLC requests arrive at a fixed rate while the mMTC rate increases, more mMTC requests are queued and the probability that mMTC services complete their connections grows; hence, the SCR increases with queue size. Comparing the SCR of Methodologies II and III, Methodology III yields a higher SCR than Methodology II, which has no feedback link for interrupted mMTC requests. In Methodology II, all interrupted mMTC requests are forced to terminate even if 2QmMTC has free space, and interrupted requests are ignored in the SCR calculation. In Methodology III, by contrast, the in-service mMTC requests that have been interrupted are given a second chance to join the queue and reattempt accessing resources. With this feedback link, Methodology III outperforms the other methodologies in terms of SCR.

In Figure 7, the PF of mMTC requests is depicted against the arrival rate of uRLLC. The main goal of this figure is to show the effectiveness of Methodologies II and III as the arrival rate of the highest-priority uRLLC requests increases. As the figure shows, the PF of mMTC requests increases with the uRLLC arrival rate under all methodologies. The reason is that as the uRLLC arrival rate keeps increasing, the usage of C-RAN resources rises, and eMBB and mMTC requests are forced to terminate by the higher-priority slice: when the system accepts more uRLLC requests, many in-service eMBB and mMTC requests are interrupted, and most of the BBUs are taken by the arriving uRLLC requests.

In addition, the PF of the mMTC requests does not improve much whether or not the two queues are used (Methodology II). The reason is that when the eMBB and mMTC requests are allowed to queue, their interaction with the uRLLC requests is greater than without queues, and this interaction increases the calculated PF. Therefore, allowing the arriving mMTC requests to be queued is not enough, and an effective method is needed to efficiently minimize the PF of mMTC. Methodology III solves this issue by allowing all interrupted requests to rejoin the queue whenever it has free space. In addition, increasing the size of 2QmMTC can also reduce the PF. It is therefore clear that Methodology III is the better choice for achieving a lower PF.

As illustrated in Figure 8, the average queuing delay of mMTC requests under Methodologies II and III is studied and compared. As we can see, Methodology II outperforms Methodology III in terms of average queuing delay. The reason is that when interrupted mMTC requests are allowed to rejoin their queue, the average queue size grows, and a longer queue directly increases the average queuing delay, as derived in the analytical model. Methodology III thus reduces the PF of mMTC requests at the cost of increasing their average queuing time, so there is a tradeoff between the average delay and the PF for mMTC requests. Since mMTC is a delay-tolerant service, decreasing the PF brings more benefit in terms of SCR than decreasing the average queuing delay. The best resolution of this conflict is to find the queue size that balances these two conflicting factors.

To summarize, the main goals of the proposed system have been met. The results prove that using the queuing system with feedback can improve the overall utilization of C-RAN resources (BBUs) and the SCR of both lower-priority slice types. As the queue size increases, the forced termination probability improves further at the expense of a longer average queuing delay. The size of these queues can therefore be designed according to the maximum acceptable delay required by each traffic slice type.

Conclusions and Future Scope
In this research, we propose a priority-based resource allocation scheme for the 5G C-RAN with multiple traffic slices categorized by the slice's traffic priority and QoS. It includes two queues for the lower-priority traffic types, eMBB and mMTC, to increase their chances of completing their services, and hence increases the utilization of C-RAN resources without degrading the QoS of the highest-priority class. An analytical model using a CTMC with multidimensional states is developed to derive the system performance measures, and the analytical results of the proposed system are validated against simulation-based results. To investigate and compare our proposed system, it was studied using three methodologies. In Methodology I, the system operates without queues for the lower-priority slice types, similar to previous related models, and serves as the reference methodology. In Methodology II, we use two queues for eMBB and mMTC without allowing an interrupted mMTC to rejoin its queue for a possible service reattempt. Finally, Methodology III is similar to Methodology II except that it allows the interrupted mMTC to rejoin its queue for a possible service reconnect.
Based on the obtained results, Methodology II and Methodology III behave differently when an mMTC is interrupted. With Methodology II, when there are few uRLLC requests and many eMBB and mMTC requests, the model is very efficient in terms of eMBB and mMTC SCR and forced termination probabilities. Methodology III allows the interrupted mMTC to rejoin its queue and hence reduces its forced termination probability, whereas Methodology II leads to its increase. Therefore, Methodology III works well under high arrival rates of uRLLC, while Methodology II is better under low uRLLC loads and high mMTC loads. In other words, Methodology III is most beneficial when the uRLLC slice is very active.
In the future, these methodologies could be combined into a hybrid scheme by adding multiple constraints and utilizing multi-objective functions, which may further enhance the effectiveness of the proposed schemes.