Joint Admission Control and Resource Allocation with Parallel VNF Processing for Time-Constrained Chains of Virtual Network Functions

Network Function Virtualization (NFV) decouples network function deployment from dedicated hardware and reduces costs. Network services are structured as chains of Virtual Network Functions (VNFs). Each chain is a set of VNFs that should be executed in a predefined order. For some applications, VNF chains should be executed within time constraints to meet the application’s objectives. Most studies provide a solution to allocate substrate network resources to the chains without considering admission control. However, allocating resources to all chains may not be possible due to resource limitations, so efficient admission control is required to decide which chains are admitted. This paper proposes a joint admission control and resource allocation mechanism for VNF chains. We propose a resource allocation mechanism based on the idea of parallel VNF processing to meet tight time constraints. Because the assumptions used in deterministic modeling of the system do not hold in a wide range of network conditions, we propose a stochastic model in which VNF chain execution is modeled by a queue network. The queue network is analyzed to calculate the expected value of the probability of deadline meeting in chains, according to which the joint resource allocation and admission control problem is modeled as a non-linear optimization. The proposed optimization framework maximizes the profit of the network provider while keeping the confidence level of deadline meeting for the admitted chains at desired levels. To make power usage efficient, power consumption is also considered in the calculation of the network provider's profit. A heuristic for the joint resource allocation and admission control of VNF chains is proposed. The effectiveness of the proposed method is demonstrated through simulation.


I. INTRODUCTION
The advances in information and communication technology, as well as the high capacity and low latency requirements of new generations of communications, necessitate the use of emerging technologies such as Network Function Virtualization (NFV) [1] and Software-Defined Networks (SDNs) [2]. NFV enables network functions like Network Address Translators, Intrusion Detection Systems, Intrusion Prevention Systems, firewalls, and WAN optimizers to be executed on Virtual Machines (VMs) hosted on general-purpose hardware. This decouples network function deployment from dedicated hardware, thereby reducing Capital Expenditures (CAPEX) and Operational Expenditures (OPEX) while enhancing network flexibility [1]. Such virtualized network functions are referred to as VNFs. SDN technology is utilized to decouple network control logic from the underlying transport infrastructure, i.e., switches and routers, in the data plane. Therefore, network switches/routers function as forwarding devices operated by a logically centralized controller [2].
Network providers can offer services with different Quality of Service (QoS) requirements using a shared infrastructure. Each service can be considered as a chain of Virtual Network Functions (VNFs) that are designed to operate on the traffic flow from a source node to a destination node [1], [3], [4]. A major challenge in the management and orchestration of services in NFV is allocating substrate network resources to the VNF chains in such a way that application requirements are satisfied [1], [5]. To allocate resources to a VNF chain, the involved VNFs are assigned to computational resources, while the connections between the VNFs are mapped to physical routes [5]. Some applications require VNF chains to be executed within a predefined time constraint to address the application objectives. The constraint may be tight in a Tactile Internet application [6], or looser in a video streaming application [7]. For such applications, the resources should be allocated to the VNF chains such that the QoS, in terms of constraints on chain execution time, is met. Another resource management issue is admission control, defined as a mechanism that a network provider applies to determine the admission of the VNF chains to the system. Most studies in the literature provide resource allocation solutions without considering admission control [4], [8]-[31]. However, admitting all the requests in a system is not always possible due to system resource limitations [32]-[34]. An appropriate mechanism is required to control the admission of chains.
A few studies have shown that joint admission control and resource allocation can offer a significant improvement in the performance of NFV [35]-[43], among which only a few, e.g., [40]-[43], consider time constraint satisfaction for VNF chains. These studies apply an admission decision mechanism that selects a subset of the input VNF chains to be admitted to the system. The resource allocation is performed for the admitted chains. The VNF chain admission decision is based on the resource capacity available to accommodate chains, the chains' demands such as required bandwidth, and QoS constraints, e.g., the time bound for chain execution.
The resource allocation solutions and the joint admission control and resource allocation solutions available in the literature model systems as a deterministic process [3], [4], [8]-[31], [35]-[44]. Under such circumstances, the traffic arrival amount to the chains and the processing rate of computational and communication nodes (i.e., VMs, switches) are assumed to be constant values. However, these assumptions do not hold in a wide range of network conditions due to dynamicity in traffic arrival, workload variation in a node (i.e., VM, switch), and competition for the resources of a node [45], [46]. Furthermore, in the literature, the computational and transmission delays are calculated according to simplified deterministic models based on the constant traffic arrival amount and the constant processing/transmission rate, which can lead to low-precision estimations in the resource allocation procedure for a wide range of network conditions. This can lead to low-quality solutions and consequently to time constraint violations at run time. In contrast, stochastic modeling of a system gives a more realistic representation, in which the nonlinear and non-deterministic behaviour of the system is estimated by performance modeling analysis [47], [48]. This modeling shifts the time constraint satisfaction assessment from a binary space (satisfied or not satisfied) to a probabilistic space, i.e., determining the probability of the time constraint satisfaction. Such probability calculation, however, is not trivial.
In addition, an appropriate resource allocation is required to meet the tight deadlines of VNF chains. In our previous work [12], based on a deterministic modeling of the system, we suggested exploiting parallel VNF processing when allocating resources to the chains. In this approach, the processing of an individual flow is distributed among multiple VMs (performing the same VNF functionality) whenever the deadline is tight. This is in contrast to the existing studies, including [3], [4], [8]-[10], [13]-[29], and [36]-[44], where there is sequential traffic processing at the flow level, i.e., every flow is assigned to a single VM for a VNF functionality. In this paper, we extend the idea of parallel VNF processing in [12] by sharing a VM among multiple VNF chains when allocating resources. This avoids wasting the processing capacity of VMs (a result of dedicating each VM to a chain as in [12]), and it enhances the admission ratio in comparison with [12], as the results in Section VI show. Furthermore, we extend the idea in [12] by proposing a resource allocation mechanism based on a stochastic modeling of VNF chain execution, and by providing an admission control mechanism.
Although parallelism brings gains in terms of traffic processing speed, it makes the communication model more complex because of the multiple routes the traffic may traverse. Accordingly, stochastic analysis of time constraint satisfaction becomes difficult because of the complexity of the communication model. This paper provides a joint admission control and resource allocation of time-constrained VNF chains while tackling the above-mentioned issues.
In an NFV scenario, power consumption due to resource utilization brings electricity costs for network providers [49]. To provide an efficient resource allocation mechanism, it is important to meet the QoS in terms of time constraints for VNF chains while considering the power consumption. Indeed, an appropriate resource allocation to a chain (i.e., degree of VNF parallelism and allocation of VMs) is required that considers the power consumption characteristics of the physical nodes. To address this aim, similar to [44], [50], we define the profit for the network provider based on the revenue obtained from the admission of chains and the cost imposed by power consumption as a result of resource allocation for the admitted chains. Next, the admission control and resource allocation are done such that the profit of the network provider is maximized. Since power cost reduces the profit, maximizing the profit also makes the proposed method efficient in terms of power consumption.
This paper makes the following contributions:
1) Extending our idea of parallel VNF processing in [12] by sharing a VM among multiple VNF chains in resource allocation;
2) Stochastic modeling of VNF chain execution with a queue network and analyzing it to calculate the expected value of the probability of deadline meeting in VNF chains;
3) Modeling of joint admission control and resource allocation for time-constrained VNF chains as an optimization problem. In the optimization, the profit of the network provider is maximized while the Confidence Levels (CLs) for the deadline meeting of the admitted chains are met;
4) Proposing a heuristic for joint admission control and resource allocation for the VNF chains; and
5) Utilizing the stochastic analysis, and proposing a Tabu-based heuristic that exploits parallel VNF processing for allocating the substrate network resources to admitted chains.
This paper continues with the related work in Section II. The system model is explained in Section III. In Section IV, we model VNF chain execution with a queue network and provide its analysis. The optimization for joint VNF chain admission control and resource allocation is explained in Section V, followed by the proposed heuristic to solve the optimization. Section VI presents the performance evaluation, and finally, Section VII gives the conclusion and future work.

II. RELATED WORK
We explain the related work in three categories: 1) resource allocation for VNF chains; 2) admission control; and 3) joint admission control and resource allocation, and indicate our contribution in each category.
Resource Allocation for VNF Chains. Generally, the studies in this category allocate resources, i.e., computational resources and link bandwidth, to an input set of VNF chains without applying any admission control mechanism. Most studies model the problem as (Mixed) Integer Linear Programming (ILP) optimization; they optimize an objective function while considering some constraints, e.g., computational resource usage capacity, link usage capacity, constraints on chains' demands, and QoS constraints. They employ optimization tools/heuristics to solve the problem. We review these studies with a focus on objective functions.
There are studies that consider time constraints for chains while allocating substrate network resources. The works in [4], [18], [19] minimize the resource usage cost for VNF chain execution while meeting their deadlines. Resource utilization minimization is considered in [20]-[22]. The study in [23] minimizes network load cost. A resource allocation algorithm that minimizes the energy consumption in VNF chains while considering deadline satisfaction is presented in [24], [55]. The authors of [25] decide on the amount of resources needed for each VNF in the chain to satisfy the delay requirement while minimizing resource consumption. The authors of [28] consider resource allocation to VNF chains in a network consisting of hierarchical resources from edge to 5G core. They migrate VNFs among resources to avoid deadline violation, and minimize the migration frequency. The study in [29] focuses on the consolidation of VNF instances while allocating resources to the chains. The allocation of resources to 5G network slices has been considered in [8]-[10]. Resource usage cost minimization for slices is considered in [8], [9]. The work in [10] considers the satisfaction of availability, reliability and delay tolerance requirements of the slices in resource allocation.
There are three main differences between studies in this category and our work: 1) We provide an admission control mechanism along with a resource allocation solution. Admission decisions are required when a system cannot meet the requirements of all the chains. 2) We analyze the system using stochastic modeling, while the above-mentioned works employ a deterministic analysis. 3) All the studies mentioned above process the traffic flow sequentially. Indeed, for each flow, a single VM serves the entire traffic for a VNF functionality. In contrast, we apply parallel chain traffic processing to be able to meet tight time constraints. The works in [11], [12], [30], [31] use parallel traffic processing when allocating resources to the chains. However, they do not have an admission control mechanism. Furthermore, they are based on a deterministic system analysis which does not provide a precise representation of the system.
Admission Control. The studies in this category focus only on the admission control of VNF chains. The admission decision is based on the system's resource capacity and the constraints on chains' demands like required bandwidth or resources. In [32] the authors assume that the VNFs of the chain have already been deployed in the system. Their approach decides on the admission of flows passing through the VNFs. The problem is modeled as an ILP with the objective of revenue maximization. The study in [33] considers the admission of slices. Focusing on the uncertainty of the resource demands of slices, they model the admission problem as a Markov Decision Process to admit a maximum number of requests. The works in [32], [33] assume that the infrastructure can provide the QoS requirement of VNF chains, e.g., time constraints for chain execution. However, meeting this requirement depends on resource allocation decisions. This correlation makes the admission decision a complicated task, which is the focus of this paper.
The work in [34] decides on the packet admission at each VNF such that end-to-end latency is kept within a predefined deadline. However, it does not consider communication latency between VNFs. In comparison with the works in this category, we consider joint resource allocation and admission control.
Joint Admission Control and Resource Allocation. The studies in this category consider both admission control and resource allocation for VNF chains. The authors of [36] maximize the number of chains admitted to the system and allocate resources to the admitted requests. In [3], researchers consider the relation between Link resource usage and the VM reuse factor, called the LV relation. They propose a mechanism to obtain a chain LV relation such that the maximum number of chains can be admitted, admitting those chains for which the obtained LV relation holds. A game theoretic approach for admission and resource allocation to VNF chains has been presented in [56]. The works in [35], [37]-[39], [44] propose a joint optimization for admission control and resource allocation in an NFV environment. In [35] the focus is on maximizing the system revenue, while [44] maximizes the network profit. The study in [37] maximizes the number of chains admitted. In [38], a chain brings revenue when admitted and a penalty when rejected; the aim is to maximize the system utility. Admission mechanisms for the network services in mobile edge computing are proposed in [39]. The aim is to maximize the revenue by admitting as many requests as possible while meeting their reliability requirements. The works in [3], [35]-[39], [44], [56] do not consider time constraints, either in resource allocation or in the admission decision. Thus, deadline violation is probable in these works. The studies in [40]-[43], [57]-[59] consider time constraints. The authors of [43] utilize the migration of VNFs to serve the newly arrived requests and the already existing chains.
They require the end-to-end delay of the chains to be less than the required deadline. The problem is modeled as an ILP to maximize the number of admitted requests while minimizing the migration cost. VNF remapping and rescheduling in space-air-ground integrated networks has been considered in [57]. The migration or instantiation of the VNFs on the network nodes, in order to admit a newly arrived chain as well as to serve the already existing chains, has been modeled as an optimization problem. The objective is to perform the admission and resource allocation such that the service provider profit is maximized while the chains' deadlines are respected. An algorithm is proposed to obtain a suboptimal solution. The authors of [41] maximize an aggregation of the number of chains admitted to the system and the resource preference usage. The authors of [40] admit slices to the system such that system throughput is maximized. A dynamic VNF placement and routing mechanism that decides on the admission and resource allocation of VNF chains in an NFV-enabled SDN environment has been considered in [58]. The aim is to minimize resource consumption cost, while respecting QoS constraints including end-to-end delay, packet loss, and jitter. The study in [42] provides placement mechanisms for VNFs with the aim of increasing fault tolerance by exploiting back-up VNF instances. They focus on maximizing the number of admitted chains such that the chains' deadlines are met while the deployment cost remains within a budget. The deployment cost is defined as the costs of VNF processing and traffic transmission to back-up VNFs. The authors of [59] provide resource allocation to admit the maximum number of services from IoT or mobile devices to the edge computing resources while considering constraints on service response time. A linear response time model of edge resources, in which the response time depends on the number of admitted requests, has been utilized.
A scaling mechanism for edge resources is proposed to adapt to the workload. However, the proposed method in [59] is applicable only to services with a single VNF. The methods proposed in [3], [36]-[44], [56]-[59] are based on sequential traffic processing, while we advocate parallel chain traffic processing, which allows tight deadlines to be met. Furthermore, [3], [35]-[44], [56]-[58] consider a deterministic process, under the assumptions that the traffic arrival and the processing rate of computational resources are constant. This paper utilizes a stochastic modeling of the system, which provides a more realistic representation of the system.

III. SYSTEM MODEL
In this section, we explain the modeling of VNF chains, NFV infrastructure, communication, power consumption, decision variables, auxiliary notations, and assumptions. Table I lists the symbols used.
1) VNF types: Let = { , … } be the ordered set of VNF types, any subset of which can be included in various VNF chains. Each VNF instance runs on a VM in order to process the traffic. The traffic size may be expanded, shrunk, or remain the same after VNF application. We define > 0 as the traffic scaling ratio of VNF type . When < 1, shrinking will occur. When is 1, the traffic size will not change. Finally, when > 1, the traffic will be expanded.
2) VNF chains: Let be the set of VNF chains. We define binary variable , to be 1 when VNF type ∈ is used by chain . We also define ( , ) as the set of VNF predecessors of VNF type in chain . Like [30], [60], we consider a Poisson traffic arrival pattern to the chain. The traffic for chain is generated at source node according to a Poisson process with a mean rate of chunks per second. When the traffic has been processed by all the VNFs defined in the chain, it will be forwarded to the destination node . Let be the Confidence Level (CL) for meeting deadlines for the traffic chunks of chain . Indeed, chunks of traffic that come from the source should be processed through the VNFs and reach the destination according to the specified deadline and CL. For example, a CL of 0.98 means that the processing of at least 98% of the traffic chunks should be completed by the predefined deadline. In Tactile Internet applications such as remote orthopedic surgeries, which are already being conducted in 5G [61], artificial intelligence could be used to predict the contents of traffic chunks that do not reach their destination in time [62]. Thus, for the orthopedic surgery application, a value of less than 100% can be set for the CL, to impose less cost on the user. In contrast, the CL might be selected as 100% for applications like remote heart surgery, in which predicting the content of a delayed chunk is difficult and so all the chunks need to arrive within the specified deadline [62].
Meanwhile, in applications like VoIP data transmission, video conferencing, and streaming media (with looser deadlines than Tactile Internet), the CL might be selected as less than 100%, since the delayed traffic chunks can be dropped and the application can tolerate the loss of some traffic chunks, up to a threshold that will still assure an acceptable QoS [63]. The deadline profiles for all chains and their CL profiles are represented by vectors and , respectively; i.e., every chain has a predefined deadline and CL. We define ℵ as the revenue gain for chain when the chain is admitted to the system.
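The meaning of the CL can be illustrated on measured chunk completion times. The following is a minimal sketch, not part of the proposed method: the function name `meets_confidence_level` and its arguments are hypothetical, and it only checks an observed sample against a deadline and a CL as described above.

```python
def meets_confidence_level(completion_times, deadline, cl):
    """Check an observed sample of chunk completion times against a CL.

    completion_times: end-to-end times of individual chunks (seconds).
    deadline: the chain deadline (seconds).
    cl: required fraction of chunks finishing by the deadline, in [0, 1].
    All names are illustrative assumptions, not notation from the paper.
    """
    if not completion_times:
        raise ValueError("at least one chunk is required")
    if not 0.0 <= cl <= 1.0:
        raise ValueError("cl must lie in [0, 1]")
    on_time = sum(1 for t in completion_times if t <= deadline)
    return on_time / len(completion_times) >= cl


# Illustrative sample: 99 chunks finish in 0.8 s, one chunk takes 1.5 s.
sample = [0.8] * 99 + [1.5]
print(meets_confidence_level(sample, deadline=1.0, cl=0.98))
```

With a CL of 1.0 the same sample would fail, since a single late chunk is enough to violate a 100% requirement.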
3) NFV infrastructure: The NFV Infrastructure (NFVI) is modeled as a graph = ( , ). consists of all nodes, including physical servers represented by PS and |SW| software-defined switches represented by SW; i.e., = ∪ . Matrix is of size ( + | |) × ( + | |) and indicates the connectivity between nodes. Here, element ∈ is 1 when node ∈ is directly connected to node ∈ . Communications between physical servers are carried out by a network of software-defined switches (see Fig. 1). Each physical server is connected to one arbitrary switch. Thus, for server ∈ we have: ∑ , ∈ = 1. Note that this connectivity model allows a pool of servers to be connected to a single switch.
Like [7], [64], [65], we assume there are preinstalled VNF instances for each VNF type, where each VNF instance is running in a VM. Indeed, there is a pool of VMs for each VNF type. All VMs in a pool have the hardware/software required to execute the VNF. To utilize VMs efficiently, like [3], [66]-[68] we assume that different VNF chains that demand the same VNF type can share the same VM running that VNF type. Security mechanisms like threat detection by monitoring the status of the chain running over the VM [69] or applying security policies for the chains through VNFs connected to the VM [70] can be adopted to enhance the security of VNF chains under VM sharing. Also like [3], we assume that each VM can run at most one VNF type. Physical servers host the VMs. Note that the VMs in a single pool can be distributed over several servers. Pool consists of ≥ 1 VMs, capable of hosting ∈ in parallel. Pool is represented by set = , … . Here, ( ∈ {1, … }) is the VM with index inside . For VM , we define a 1 × vector with the element of 1 for the physical server hosting , while the other elements are 0. Like [46], [71] we model each VM as an M/M/1 FCFS (First Come First Served) queue. VM processes the arriving traffic with exponentially distributed service times, at a mean rate of chunks per second, which is equivalent to the VM processing capacity. In this regard, the VMs within a pool are heterogeneous in terms of traffic processing speed.
4) Communication: Switches are programmed by the SDN controller to distribute traffic among VMs. For simplicity, let us focus on a chain that uses all VNF types. There are two end-point communications: First, the traffic generated at source node arrives at a switch, , , which forwards the traffic towards the VMs in the first pool, i.e., . Second, the traffic processed by the VMs of the last pool is transmitted by a switch denoted by , towards the destination .
Like [45], we model switches as M/M/1 FCFS queues with exponential transmission with mean rates of , and , chunks per second, respectively. The communication between two adjacent pools of and (1 ≤ ≤ ) is done by switch , with a mean transmission rate of , chunks per second. We represent the switches providing connection between pools with the set . Each such switch operates according to a probabilistic transmission strategy: it sends each traffic chunk belonging to chain to the VM with probability , . The whole distribution policy is represented by vector . Direct transmission of traffic chunks among VMs hosted on the same server is possible. In this case the traffic does not need to go through a switch. To reduce latency, the switch crossed by the maximum number of shortest paths between pairs of VMs in pools and is labeled as , . We refer to the mean transmission rate of an arbitrary switch in the network ∈ with .
5) Power consumption: We use the power consumption model of [72], in which the power consumption of an electronic device depends on the Power Usage Effectiveness (PUE) as well as the static (leakage) and dynamic power consumption. Here, a device can be a physical server or a switch. Static power consumption is caused by current leakage and is unrelated to the device usage. Dynamic power consumption is caused by the activity of the device circuits and thus is determined by the device utilization. For a physical server, the utilization of the traffic processing capacity is used, while for a switch the utilization of the traffic transmission capacity is considered. Let be the PUE of device . The static power consumption of the device is represented by . Let be the dynamic power consumption of the device when the utilization is maximum (all the processing/transmission capacity of the physical server/switch is utilized). The power consumption of the device is calculated as (1). Here, is the utilization of device .
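The ingredients of the power model named above (PUE, static power, maximum dynamic power, utilization) can be combined as in the following minimal sketch. It assumes the common linear form in which dynamic power scales with utilization and the PUE multiplies the device total; this form is an assumption made for illustration and may differ in detail from (1). The function and parameter names are hypothetical.

```python
def device_power(pue, static_power, max_dynamic_power, utilization):
    """Power drawn by one device (physical server or switch), in watts.

    Assumed linear form (an illustration, not necessarily the exact
    equation of the paper):
        power = PUE * (static + utilization * max_dynamic)
    utilization is the used fraction of the processing capacity (server)
    or transmission capacity (switch), in [0, 1].
    """
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must lie in [0, 1]")
    if pue < 1.0:
        raise ValueError("PUE is at least 1 by definition")
    return pue * (static_power + utilization * max_dynamic_power)


# Illustrative values only: a server with 100 W leakage, 200 W peak
# dynamic power, half utilized, in a facility with PUE 1.5.
print(device_power(pue=1.5, static_power=100.0,
                   max_dynamic_power=200.0, utilization=0.5))
```

Under this form an idle device still draws its static power, which is why consolidating traffic on fewer devices can reduce total consumption.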

6) Decision variables:
Depending on the deadline and the CL, a subset of VMs inside a pool is allocated to each chain. The allocation profile is represented by vector with binary variables , ; it has the value of 1 when VM ∈ serves VNF type to chain . We also define as the decision variable defining if chain is admitted to the system (1 for the case of admission and 0 for the case of rejection).

7) Indices and auxiliary notations: We introduce several auxiliary notations in this paper. |X| is used for the size of set X. Notation 1. ( ) or 1. [ ] is 1 when boolean is true, otherwise it is 0. The symbol × is a vector of dimensions 1 × N with all elements of 0. Symbol is a vector of 1 × with all elements of 0 except the element, which is 1. We also use some indices in this paper. We use index for pool, index for VM, and for VNF chain. We also introduce to denote the immediate VNF predecessor of VNF type . Note that is used in the context of a chain. For example, for a chain that uses VNF types 2, 5, 7, when is 5, then is 2; and when is 7, then has the value of 5.
8) Assumptions: These are the assumptions used in this paper:
a) To enable parallel VNF processing, we assume there are at least K+1 switches, i.e., | | ≥ + 1. Considering that the number of VNF types is rather small, in real networks the number of switches is commonly more than + 1.
b) The traffic is distributed among VMs in the pool in proportion to their processing speed. In this regard, the probability that a traffic chunk of chain goes through is calculated using (2). Note that the proposed method is applicable to any other policy as well. Regarding the implementation of the traffic distribution, programmable data plane technology provides facilities through which the switches can be programmed to specifically process the packets of an application, i.e., a VNF chain. One of the most renowned architectures for programmable switches, the Protocol Independent Switch Architecture (PISA) [73], provides programmability of switches through match-action tables. The switches interconnecting the pools can be programmed such that the match stage recognizes the VNF chain packets, while the action stage performs the probabilistic routing in (2), which is a load balancing strategy. Note that implementing load balancing in programmable switches has been shown to be feasible in [74], [75]. As the hardware of programmable switches can process packets at line rate [73], splitting the traffic in switches is expected to have a negligible impact on VNF chain execution latency.
c) Similar to [30], for the sake of simplicity we adopt shortest path routing for its latency benefit, and we assume that traffic is transmitted from any physical server to any switch , ∈ , using the shortest path between them. Similarly, the traffic is transmitted from any switch , ∈ to any physical server via the shortest path between them. In Section V.C we explain how the proposed method can be generalized to decide about routing.
d) The probabilistic transmission used in the switches connecting the pools might alter the order of packets at the destination. Though there exist efficient reordering mechanisms that can be applied at the destination side to deliver the packets in order [76], [77], an appropriate value for the traffic chunk size will further diminish the reordering overhead. Indeed, as each traffic chunk can include multiple packets, the order of packets inside each chunk is kept and does not require reordering. In this paper, we assume that the chunk size has been chosen appropriately so that the reordering is done with tolerable overhead.
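The proportional distribution policy of assumption (b) can be sketched as follows. This is a minimal illustration under stated assumptions: each VM allocated to a chain receives a chunk with probability equal to its mean processing rate divided by the sum of the rates of all VMs allocated to that chain in the pool. The names `split_probabilities` and `pick_vm`, and the VM labels, are hypothetical; the exact expression used by the method is the one given in (2).

```python
import random


def split_probabilities(processing_rates):
    """Map each allocated VM to its share of a chain's traffic.

    processing_rates: dict of VM label -> mean processing rate
    (chunks per second) for the VMs allocated to one chain in one pool.
    """
    total = sum(processing_rates.values())
    if total <= 0:
        raise ValueError("at least one VM with a positive rate is required")
    return {vm: rate / total for vm, rate in processing_rates.items()}


def pick_vm(processing_rates, rng=random):
    """Choose the VM for one chunk, as a switch action might do."""
    probs = split_probabilities(processing_rates)
    vms = list(probs)
    return rng.choices(vms, weights=[probs[vm] for vm in vms], k=1)[0]


# Illustrative pool: two VMs allocated to the chain, one three times faster.
rates = {"vm_a": 30.0, "vm_b": 10.0}
print(split_probabilities(rates))
print(pick_vm(rates, random.Random(0)))
```

Because the choice is made independently per chunk, faster VMs receive proportionally more chunks on average, which is the load balancing effect the assumption relies on.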
Example. To clarify the VNF parallel processing, Fig. 1 shows an example. The network consists of four physical servers that are connected through six switches. There are three VNF types, and accordingly, three pools. Each pool has three VMs. A sample allocation is shown for two chains which need all VNF types. From the VNF1 pool, two and one VMs are allocated to the grey-shaded and the dotted chains, respectively. From the VNF2 (VNF3) pool, two and one VMs are allocated to the dotted and the grey-shaded chains, respectively. We focus on the dotted chain for the communication pattern. A possible routing has been illustrated. Here, VM on the second and VM on the last physical server are the origin and destination of the traffic, respectively, and the traffic will traverse the VNFs in the sequence of , , . Four switches have been labeled to provide connections among the pools. The whole route consists of four route segments. A VNF type is visited at each route segment. Route segments 1 and 2 are shown in Fig. 1(a) while route segments 3 and 4 are shown in Fig. 1(b). Traffic from the origin node comes to , , which sends the traffic to (route segment 1). The processed traffic is then forwarded to , , which forwards a fraction of the traffic to and the rest to (route segment 2). The processed traffic is next sent to , , which sends a fraction of the traffic to and the rest to (route segment 3). Finally, the traffic is routed towards , , from where it goes to the destination (route segment 4). The SDN controller programs the switches to provide such routing among the pools.

IV. VNF CHAIN EXECUTION ANALYSIS
In this section, we first explain how VNF chain execution can be modeled by a queue network. Next, we provide the analysis.

A. MODELING CHAIN EXECUTION WITH A QUEUE NETWORK
We can model the VNF chain execution as a queue network. Fig. 2 shows the queue network when the traffic traverses all VNF types. In this model, each queue inside a pool illustrates a VM, and the queues between pools illustrate the switches. The traffic chunks traverse the VNFs from source to destination. Each switch probabilistically transmits each chunk toward a VM inside the next pool to apply the VNF.
As traffic might traverse several switches before reaching a switch like , (See Fig. 1 is involved in the calculation. Note that the minimum transmission rate of the switches on the path is equivalent to the bandwidth capacity of the path connecting the two VMs. Similarly, the effective mean transmission rates of , and , are calculated by (4) and (5), respectively. For the , switch, the shortest path from the source nodes to the switch is considered. For switch , the shortest path from the switch to the destination nodes is considered.
As we see in Fig. 2, some XORs are appended just after a VM queue. This indicates the possible routes of chunks inside the queue network. After a chunk is processed by a VNF instance, it may go directly to the next VNF instance (in the succeeding pool); this happens when both VMs hosting the VNF instances are located on the same server. Otherwise, the traffic goes to the switch connecting the two pools.
The traffic exit rate from for chain is calculated using (6). Indeed, the original traffic rate of the chain is multiplied by the traffic scaling ratios of all the predecessor VNFs of .
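The scaling described above can be illustrated with a minimal Python sketch. The function name and arguments are hypothetical, and because (6) is not reproduced here, the caller decides which scaling ratios to pass in (for example, those of all VNFs the traffic has already traversed).

```python
from math import prod

def scaled_rate(original_rate, scaling_ratios):
    """Traffic rate after the chain's traffic has passed through some VNFs.

    original_rate:  traffic rate of the chain at its source.
    scaling_ratios: traffic scaling ratios of the VNFs already traversed
                    (an empty list leaves the rate unchanged).
    """
    return original_rate * prod(scaling_ratios)

# A 100 kbps chain after VNFs with ratios 0.5, 2.0 and 1.5
print(scaled_rate(100.0, [0.5, 2.0, 1.5]))
```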

B. QUEUE NETWORK ANALYSIS
In this subsection, we analyze the queue network. Before proceeding to our analysis, we define , as the arrival traffic rate to VM from chain , and define , , as the arrival traffic rate to switch , from chain . To calculate the rate , , we introduce the variable with the value of 1 only when has been co-located with any of the VMs in the immediate predecessor VNF pool; i.e., . Eq. (7-1) shows the calculation. Every individual is checked to see if it has the same host as or not. This will be verified by a logical statement − == × . In this regard, ∑ 1.
− == × counts the number of VMs in the immediate predecessor VNF pool that have been co-located with . In the case where the count is above zero, co-location has occurred and takes the value of 1.
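As a rough illustration of the co-location check, the sketch below counts the VMs of the immediate predecessor pool that share a host with a given VM and sets the indicator to 1 when the count is above zero. The data layout (a host name per VM and a flag marking whether the VM is allocated to the chain) is an assumption made for the example, not the notation of (7-1).

```python
def colocation_count(vm_host, predecessor_vms):
    """Number of allocated predecessor-pool VMs on the same host as the VM.

    predecessor_vms: list of (host, allocated_to_chain) pairs, an assumed layout.
    """
    return sum(1 for host, allocated in predecessor_vms
               if allocated and host == vm_host)

def colocation_indicator(vm_host, predecessor_vms):
    """1 if at least one allocated predecessor VM is co-located, else 0."""
    return 1 if colocation_count(vm_host, predecessor_vms) > 0 else 0

pool = [("server1", True), ("server2", True), ("server2", False)]
print(colocation_indicator("server2", pool))  # co-located with one allocated VM
print(colocation_indicator("server3", pool))  # no co-location
```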
The rate , is calculated as given in (7-2). The first relation is for the case when co-location has not occurred; i.e., = 0. In this case, the source of the traffic arrival is switch , , which sends a fraction of its arrival rate to , i.e., , , . , . The second relation in (7-2) is for the case where co-location has occurred. In this case, a fraction of the traffic comes to via the switch , , and a fraction of the traffic comes directly from the immediate predecessor pool. The first and second terms in the relation calculate the specified amounts of traffic. In the second term, the numerator is the total traffic that comes from the VMs in that have been co-located with . The denominator indicates the number of VMs allocated to the chain in that have been co-located with . Indeed, the traffic is divided among these VMs and .
The arrival traffic rate to VM is calculated as the summation of the traffic arrival rates from all chains. Eq. (8) shows the calculation. Here, we have added a traffic rate, , , which indicates any background traffic that might come into VM because of already-admitted chains in the system that have been assigned to the VM.
The rate , , is calculated as all of the traffic that leaves the predecessor pool in the chain, i.e., , minus the traffic that enters directly into the pool without passing through the switch , . Eq. (10) shows the calculation. Equations (11) and (12) [...] The arrival traffic rate to switch , is the summation of the traffic rates for the chains and the background traffic rate of the switch. The background traffic rate includes the rate of traffic imposed on the switch by the already-admitted chains in the system that utilize the switch.
Let ( ) be the probability density function for the response time of chain , i.e., ( ) = Pr ( == ). The Probability of Deadline Meeting (PDM) for the chain is calculated as below:
Now we explain the calculation of the expected value of the PDM. Without loss of generality, let us focus on a chain that uses all VNF types. A chunk of such a chain may go through a path of , , , , , , , , … , , to be processed by the VNFs. The probability of being routed through such a path is calculated as ∏ , . The path is a Tandem queue network. Let and , represent the sojourn time in VM and switch , , respectively. In a Tandem network, i.e., a series of M/M/1 queues with FCFS policy, the sojourn times of a given chunk of data in each queue are independent [71], [78]-[80].¹ Let be the sojourn time of the whole path. We have:
Relying on [71], [84], the distribution of is calculated as shown below:
The expected value of the PDM is calculated as (18). Here, .. = {( … )| ∈ {1 … }, … , ∈ (1 … )}. Note that (16), (17) and (18) can simply be adjusted for arbitrary paths of any length; all that is required is to include the VM/switch terms that are used in the paths into these equations.
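To make the analysis concrete, the following Python sketch computes a deadline-meeting probability for a tandem of M/M/1 queues. It relies on two standard queueing facts: the FCFS sojourn time in an M/M/1 queue with service rate mu and arrival rate lambda is exponential with rate mu - lambda, and the sum of independent exponentials with pairwise distinct rates is hypoexponential. Whether this closed form coincides term by term with (17) and (18) is an assumption, and all function names are hypothetical.

```python
import math

def sojourn_rate(service_rate, arrival_rate):
    """Rate of the exponential sojourn time of a stable M/M/1 FCFS queue."""
    if arrival_rate >= service_rate:
        raise ValueError("queue is not ergodic: arrival rate must be below service rate")
    return service_rate - arrival_rate

def path_deadline_probability(rates, deadline):
    """P(total sojourn time <= deadline) along one path.

    rates: sojourn rates (mu - lambda) of the queues on the path; they must be
           pairwise distinct for this closed form to apply.
    """
    tail = 0.0
    for i, r_i in enumerate(rates):
        coeff = 1.0
        for j, r_j in enumerate(rates):
            if j != i:
                coeff *= r_j / (r_j - r_i)
        tail += coeff * math.exp(-r_i * deadline)
    return 1.0 - tail

def expected_pdm(paths, deadline):
    """Average the per-path probability, weighted by each path's routing probability.

    paths: list of (routing_probability, rates) pairs whose probabilities sum to 1.
    """
    return sum(p * path_deadline_probability(rates, deadline) for p, rates in paths)

# Two equally likely paths, each made of two queues
paths = [(0.5, [sojourn_rate(3.0, 2.0), sojourn_rate(4.0, 2.0)]),
         (0.5, [sojourn_rate(5.0, 2.0), sojourn_rate(4.0, 2.0)])]
print(expected_pdm(paths, deadline=1.0))
```

The closed form breaks down when two rates on a path are equal; a practical implementation would need to handle that case separately, for example with an Erlang term.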

V. OPTIMIZATION FOR JOINT ADMISSION CONTROL AND RESOURCE ALLOCATION FOR VNF CHAINS
In this section, we first give the optimization model for the joint admission control and resource allocation for VNF chains. Next, we propose a heuristic to solve the problem, and then analyze its complexity.

A. OPTIMIZATION MODEL
The optimization objective is to admit chains of VNFs such that the profit of the network provider is maximized. Furthermore, the confidence level for meeting the deadlines of the admitted chains should be kept according to the Service Level Agreement (SLA) requirement. The optimization problem is defined as given below, where (19) defines maximizing the network provider profit as the objective function. Equations (20)-(24) are elements involved in the objective function calculation, and (25)-(32) are the constraints.
¹ For the sake of simplicity in analysis, like [45], [46], [71], we assume that traffic arrival/processing in VMs and switches follows the M/M/1 queueing model. For the case of general traffic arrival/processing in VMs and [...]

E(PDM_c) ≥ CL_c · x_c, ∀ c ∈ C    (32)
Following the general calculation of profit as utility minus cost, we consider system revenue as utility and power consumption as the cost, similar to [45] and [49]. The network provider profit in (19) is defined as the amount remaining after the cost that the network provider should pay for power consumption has been deducted from the revenue the provider receives for giving service to the VNF chains. Γ is the coefficient utilized to convert the power consumption to the monetary term.
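A minimal Python sketch of this profit definition follows. The function and argument names are hypothetical; only the structure (revenue of admitted chains minus Γ times the power consumption) is taken from the text.

```python
def provider_profit(revenues, admitted, total_power_watts, gamma):
    """Revenue from admitted chains minus the monetary cost of power.

    revenues:          revenue gain of each chain.
    admitted:          booleans, True where the chain is admitted.
    total_power_watts: power consumption of servers and switches.
    gamma:             coefficient converting power to monetary terms.
    """
    revenue = sum(r for r, a in zip(revenues, admitted) if a)
    return revenue - gamma * total_power_watts

# Three chains, the second one rejected, with gamma = 0.02 as in the evaluation setup
print(provider_profit([100, 50, 200], [True, False, True], 1000.0, 0.02))
```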
The power consumption of the system is calculated as the summation of the power consumption in the physical servers and switches, as shown in (20). Here, , , , , and , are the PUE, static power consumption, and dynamic power consumption, respectively, of switch , .
is the utilization of physical server , calculated by (21). Indeed, the server can be viewed as a composite computational resource composed of several VMs. Server utilization is calculated as the ratio of the total traffic arrival rate to the server to the total traffic processing rate. Similarly, (22) illustrates the utilization of switch , . Eq. (23) calculates the mean arrival traffic to other switches in the network. Here, , is the background traffic to the switch due to already admitted chains. Index is the immediate successor pool of . The notation , , is a binary variable with the value of 1 when switch ∈ \ is on the shortest path from to which traverses , i.e., ∈ ( , ). For every chain like , when the switch is on the shortest path from any allocated VM in pool (like ), to any allocated VM in the successor pool (like ), traffic amount of , . . , will pass through the switch. The co-location issue has also been considered in calculations of (23). Eq. (24) shows the switch utilization calculation.
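The utilization and power terms can be sketched as below. The utilization follows the ratio stated in the text. The power formula, however, is an assumption: it uses a common linear model in which the PUE scales the sum of the static power and the utilization-weighted dynamic power, and it may differ from the exact form of (20).

```python
def utilization(arrival_rates, processing_rates):
    """Total arrival rate divided by total processing (or transmission) rate."""
    return sum(arrival_rates) / sum(processing_rates)

def device_power(pue, static_watts, dynamic_watts, util):
    """Assumed linear power model for one server or switch.

    dynamic_watts is the dynamic power drawn at maximum utilization.
    """
    return pue * (static_watts + dynamic_watts * util)

# A server hosting two VMs that each process 50 Mbps and receive 10 and 20 Mbps
u = utilization([10.0, 20.0], [50.0, 50.0])
print(u, device_power(2.0, 50.0, 200.0, u))
```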
Constraint (25) ensures that only the admitted chains will be assigned to VMs. According to (26), when a chain does not use a VNF type, no instances of that VNF type are allocated to the chain. Constraint (27) ensures that at least one instance of a VNF type (one VM in the associated pool) is assigned to the chain when it needs that VNF type and is admitted to the system. Constraint (28) ensures that the number of VMs of a specific VNF type that are assigned to a VNF chain is bounded by the number of VMs inside the pool. Constraints (29), (30), and (31) ensure the ergodicity conditions in the queue network. Here, (29) and (30) ensure the admission of traffic within the capacity of the transmission rates of the switches, thereby avoiding congestion in the network (equivalent to a bandwidth capacity constraint). Constraint (31) is the VM processing capacity constraint, which indicates that the traffic arriving at a VM should be less than its processing capacity. Constraint (32) ensures the deadline-meeting confidence level for the admitted chains. Note that the expected PDM is calculated by (18).
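The ergodicity and confidence-level constraints lend themselves to a simple feasibility check. The sketch below is an illustration only: the function names and the flat-list data layout are assumptions, and the allocation constraints (25)-(28) are not covered.

```python
def is_ergodic(arrival_rates, service_rates):
    """True if every queue (VM or switch) receives strictly less than it can serve."""
    return all(lam < mu for lam, mu in zip(arrival_rates, service_rates))

def confidence_met(expected_pdms, required_cls, admitted):
    """True if every admitted chain meets its required confidence level."""
    return all(pdm >= cl
               for pdm, cl, a in zip(expected_pdms, required_cls, admitted)
               if a)

print(is_ergodic([5.0, 9.0], [10.0, 10.0]))
print(confidence_met([0.9, 0.4], [0.8, 0.7], [True, False]))
```

A rejected chain places no requirement on the confidence level, which is why the second chain in the usage example does not cause a violation.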

B. PROPOSED HEURISTIC FOR JOINT ADMISSION CONTROL AND RESOURCE ALLOCATION FOR VNF CHAINS
The optimization problem formulated in subsection V.A is a binary nonlinear optimization (see, for example, (17), which is nonlinear); given the complexity of nonlinear solvers, a heuristic is required to solve the problem efficiently. We solve the optimization problem, i.e., maximizing the network provider profit in (19), by proposing the heuristic ACRA, which stands for Admission Control and Resource Allocation. ACRA iteratively calls another heuristic, RA, which stands for Resource Allocation. RA does not decide on the admission of chains; instead, it allocates resources to the maximum number of chains with minimum power usage, a required step to optimize the objective function (19). ACRA utilizes RA to allocate resources, and furthermore it provides control over: 1) the admission of highly profitable chains to maximize network provider profit; 2) maintaining the confidence level of deadline meeting for the admitted chains; and 3) satisfying the ergodicity constraints (the VMs' processing capacity constraints and the switches' transmission rate capacity constraints, i.e., the equivalent of a bandwidth capacity constraint). Next, we explain ACRA and RA.
[Algorithm listing: Heuristic ACRA]
ACRA. ACRA is performed in two phases. In the first phase, resources are allocated to a maximum number of chains with minimum power consumption. This is performed by calling the RA (lines 1-2). The result of the RA includes an allocation profile and the queue network analysis for that allocation profile. The ergodicity conditions of the system, i.e., the VMs' processing capacity constraints and the switches' transmission rate constraints (i.e., bandwidth capacity) in (29)-(31), are checked. In the case of a violation, the least-profitable chain is added to the set of excluded chains. The resource allocation is then performed once more without the chains in that set. The process of excluding chains is repeated until the ergodicity conditions hold (lines 3-8).
In the case where there are chains whose deadlines have not been met with the required CL, the second phase is performed. The aim in the second phase is to select the subset of chains for admission such that the network provider profit is maximized. The resource allocation result of the first phase is considered as a base solution. The solution is investigated to calculate the amount of revenue loss due to not meeting the CLs of some chains. The chains with the least revenue gain, i.e., those whose accumulated gain is less than the revenue loss, are considered as belonging to the set (lines 9-15). This is the set of chains that are candidates to be excluded in order to obtain a new set of solutions. The chains in this set are excluded from the set of chains in an additive manner. Each time, the first phase is repeated and the result is added to the set of solutions. Note that by excluding chains with low revenue gains, there is a chance that resources can be allocated to other chains with high revenue gains. Therefore, the CL of more chains with high revenue gains could be met in the new calls of the RA, and the profit can thus be enhanced (lines 16-24). The solutions are evaluated by the objective function in (19) and the best solution is chosen. The chains that were omitted due to ergodicity violation are regarded as non-admitted. The low-revenue-gain chains that were excluded before obtaining the solution are also regarded as non-admitted. The remaining chains are admitted if they have met their CL; otherwise, they are not admitted. The resource allocation is done according to the best solution (lines 25-35).
RA. We now explain the RA operation, which ACRA calls in order to optimize the objective function (19). RA takes a set of chains as input and allocates resources to the maximum number of chains while minimizing the power consumption. We use the Tabu method to implement RA, as this meta-heuristic has proven quite promising for finding near-optimal solutions in resource allocation problems [68], [85]. Tabu performs the search through an iterative process. It starts from an initial solution as the current solution. At each iteration, it generates the neighbours of the current solution by applying some Tabu moves. The neighbours are evaluated according to a fitness function, and the search process continues from the best neighbour. The iteration continues until a stopping condition is met. A memory structure called the Tabu-List is used to avoid looping during the search process, thereby preventing the exploration of previously visited solutions.
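The iterative process described above can be sketched as a generic Tabu search loop in Python. This is an illustrative skeleton rather than the RA implementation: the `neighbours` and `fitness` callables, the fixed iteration budget, and the toy problem in the usage example are all assumptions, and a lower fitness is taken to be better. A tabu move is still accepted when it would improve on the best solution found so far, which is the usual aspiration rule.

```python
from collections import deque

def tabu_search(initial, neighbours, fitness, tenure=2, max_iters=100):
    """Generic Tabu search that minimizes `fitness`.

    neighbours(solution) must return an iterable of (move, new_solution) pairs.
    The most recent `tenure` accepted moves are kept in the Tabu-List.
    """
    current = best = initial
    best_fit = fitness(initial)
    tabu = deque(maxlen=tenure)  # the oldest move drops out automatically
    for _ in range(max_iters):
        candidates = []
        for move, solution in neighbours(current):
            fit = fitness(solution)
            # Skip tabu moves unless they beat the best solution (aspiration).
            if move in tabu and fit >= best_fit:
                continue
            candidates.append((fit, move, solution))
        if not candidates:
            break
        fit, move, current = min(candidates, key=lambda c: c[0])
        tabu.append(move)
        if fit < best_fit:
            best, best_fit = current, fit
    return best, best_fit

# Toy usage: walk along the integers to minimize (x - 3)^2, starting from 0
best, fit = tabu_search(
    0,
    lambda x: [(x - 1, x - 1), (x + 1, x + 1)],
    lambda x: (x - 3) ** 2,
    tenure=2,
    max_iters=10,
)
print(best, fit)
```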
The main elements involved in the Tabu search are:
1) Initial solution: For each chain, a single VM in each required pool (i.e., each pool of a VNF type that is required by the chain) is randomly chosen and allocated to the chain. Satisfaction of constraints (26)-(28) is considered in the selection.
2) Tabu moves: The moves are defined below:
M0 (VM allocation for bulk chains) - A random VM in a stochastically selected pool is allocated to all the chains that need that VNF type. The selection probability for pool is given in (33). The pools that are used by more chains are more likely to be selected.
M1 (VM allocation for a single chain) - A chain among those that do not meet the CL of their deadlines is stochastically selected in inverse proportion to its deviation from the CL. The logic behind the selection is that chains closer to their deadline CL are more likely to meet the CL when more VMs are allocated to them. Eq. (34) shows the selection probability for chain . An unassigned VM of a random pool that is required by the chain is assigned to that chain.
M2 (VM deallocation from a single chain) - A chain is stochastically selected in proportion to its probability of deadline meeting. An assigned VM of a random pool that is required by that chain is deallocated from that chain. Note that to avoid disconnectivity, the selected pool should have more than one VM allocated to that chain.
M3 (VM deallocation from bulk chains) - A random pool is selected. A VM that has been assigned to a minimum number of chains, i.e., ∑ , ∈ , is chosen and is deallocated from all hosted chains. To avoid chain disconnectivity, the hosted chains with a single assigned VM are randomly assigned to other VMs.
Note that M0 and M1 address meeting the CL of deadlines, while M2 and M3 reduce power consumption. Also, in all moves, constraints (26)-(28) are considered to be met. The other constraints are considered in the fitness function.
3) Tabu-list management: To avoid cycling in the search, the moves that lead to the best neighbour are marked as Tabu and stored in the Tabu-list, , for a specific number of iterations, . A move can be removed from the Tabu-list if it meets the aspiration criterion. The criterion is met when the move quality is better than the quality of the current best solution.
4) Resource allocation solution fitness: The fitness function is defined as the aggregation of the power consumption and the penalty imposed by the violation of constraints, as shown in (35). Here, , , , , and are normalization coefficients that ensure the deviation in the constraints and the power usage are on the same scale. Note that the optimization of (35) is consistent with the optimization model defined in Section V.A (see constraints (29)-(32)), and it is a required step to optimize the objective function (19).
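A penalty-based fitness of the kind described in element 4 can be sketched as follows. Since (35) is not reproduced here, the exact penalty terms are an assumption: the sketch adds, to the weighted power consumption, the weighted excess load on VMs and switches (constraints (29)-(31)) and the weighted shortfall of each chain's expected PDM below its CL (constraint (32)). All names and default weights are hypothetical.

```python
def ra_fitness(power, vm_loads, switch_loads, chain_pdms,
               w_power=1.0, w_vm=1.0, w_switch=1.0, w_cl=1.0):
    """Lower is better: power consumption plus penalties for constraint violations.

    vm_loads, switch_loads: lists of (arrival_rate, capacity) pairs.
    chain_pdms:             list of (expected_pdm, required_cl) pairs.
    The ergodicity constraints are strict inequalities, so a real implementation
    may also want to penalize a load that exactly equals the capacity.
    """
    vm_penalty = sum(max(0.0, lam - cap) for lam, cap in vm_loads)
    switch_penalty = sum(max(0.0, lam - cap) for lam, cap in switch_loads)
    cl_penalty = sum(max(0.0, cl - pdm) for pdm, cl in chain_pdms)
    return (w_power * power + w_vm * vm_penalty
            + w_switch * switch_penalty + w_cl * cl_penalty)

# One overloaded VM (by 2 units) and one chain 0.2 below its confidence level
print(ra_fitness(100.0, [(5, 10), (12, 10)], [(1, 2)], [(0.9, 0.8), (0.5, 0.7)]))
```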

C. DISCUSSION
For the sake of simplicity, like [30], we have advocated the benefit of using the shortest path routing (from the aspect of latency), and we have assumed that traffic is transmitted from physical servers to the switches connecting the pools (or from those switches to physical servers) through the shortest path. Here, we show that the shortest-path approach is not mandatory and that the proposed method can be adapted to decide about the routing within the resource allocation phase.
To this end, the following changes need to be applied to the optimization model: 1) Decision variables that map the virtual links in the chains to the physical paths should be introduced. Let us call them "link-allocation variables".
2) For a chain that uses the two consecutive VNFs and , for each VM in pool that has been allocated to the chain, the traffic should traverse through , . Thus, a path from the VM to the switch , should be allocated to the virtual link, using the link-allocation variables. Similarly, from switch , to every allocated VM in pool a path should be allocated using the link-allocation variables.
3) The link-allocation variables will be involved in calculating the effective transmission rate of the switches interconnecting the pools, which demands modifications in equations (3), (4), and (5). Similarly, the link-allocation variables will be involved in calculating the traffic arrival rate to the switches in \ , which demands a modification in (23). Considering (3) as an example, the term will be involved in (3) only when the switch is located on the physical route from to (passing through switch , ) that we have allocated to the virtual link between VNF types and in the chain.
4) For every virtual link between two consecutive VNFs like and in every chain, the connectivity of the allocated physical route should be ensured by defining constraints that assign appropriate values to the link-allocation variables. Furthermore, for each chain, the traffic entering every switch on the allocated physical path should be equal to the traffic exiting that switch. This equality will be checked through some new constraints.
5) To adapt RA to decide about the link-allocation variables, a random initial routing is required in the initialization. Furthermore, for exploration purposes, the routing of the chains' traffic needs to be changed, which can be performed by Tabu moves.
The size of the search space is 2 .| | × 2 . Considering that the number of VNF types is limited, we treat it as a negligible constant in the analysis. Complexity of RA: At each iteration of RA, the moves are performed, the queue network is analyzed, and the fitness function is calculated. The moves are performed at a complexity of ( + . ). The traffic arrival rates to the VMs and the switches in (8) and (13) are calculated with the complexity of ( . ).
The fitness function in (35) is calculated in ( + + | | + | |), which is less than the complexity of queue network analysis. Therefore, the complexity of RA is ( + . + . | |) (ignoring the constant number of iterations). Complexity of ACRA: In the best case, only the first phase is performed with complexity of ( + M. + . | |). In the worst case, the chains are sorted by their gain and RA is called times with the complexity of ( . + M . + . . | |).

VI. PERFORMANCE EVALUATION
In this section, we provide the simulation results to evaluate the efficiency of the proposed method in comparison with other algorithms in the literature.

A. SIMULATION SETUP AND BASELINES
The simulation was conducted with a Java program running on an Intel Core™ i7-6600U processor with 8 GB of memory. The NFVI consists of 15 physical servers (a scale similar to [35], [49]) and 9 software-defined switches. We used the Dragonfly topology, a common topology in data centers [86], [87]. According to this topology, switches are fully connected and provide connections among physical servers. Each physical server is connected randomly to a switch. We consider 8 VNF types with traffic scaling ratios chosen randomly in the range of [0.1, 2]. A pool of VMs is associated with each VNF type. Each pool contains 5 VMs that are randomly distributed on the physical servers. Thus, there are a total of 40 VMs. The mean transmission rate of the switches is chosen randomly between 1 Gbps and 10 Gbps according to real systems [88]. The mean traffic processing rate of the VMs is chosen randomly in the range of 10 to 100 Mbps to cover the required throughput of standard instances for VNF types including Firewall, WAN Optimization Controller, IDS, and IPS [89].
Similar to [89], [90], we set the PUE of the physical servers and switches randomly in the range of 1 to 3, the static power consumption in the range of 40 to 60 Watts, and the dynamic power consumption at maximum utilization of the physical servers/switches in the range of 100 to 300 Watts. Parameter Γ is 0.02 of the unit of currency.
Each VNF chain requires a random subset of VNF types. The source and destination nodes of each chain are randomly selected from the physical servers. The mean traffic arrival rate at each chain is set in the range of 50 Kbps to 500 Kbps according to the demand of real applications in Web service, VoIP, and online Gaming [89]. The revenue gain for each VNF chain is selected randomly in the range of 10 to 300 units of currency such that the chains that have more VNFs and a higher CL bring more revenue for the system. The search in Tabu is performed for at least 100 iterations and at most 300 iterations. After 100 iterations, the search process terminates if the quality of the best resource allocation solution does not change within the last 40 iterations. We found the value of 2 for appropriate. Finally, we assume the size of a chunk of data to be the same as the average size of a packet, 256 bytes [91].
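The stopping rule for the Tabu search stated above can be written as a small predicate. This is a sketch under one reading of the text: iterations are counted from 1, "after 100 iterations" is taken to mean that the check applies from iteration 100 onward, and the function and parameter names are hypothetical.

```python
def should_stop(iteration, last_improvement_iter,
                min_iters=100, max_iters=300, patience=40):
    """Decide whether the Tabu search should terminate.

    iteration:             number of iterations completed so far.
    last_improvement_iter: iteration at which the best solution last changed.
    """
    if iteration >= max_iters:
        return True   # hard upper bound on the search length
    if iteration < min_iters:
        return False  # always run the minimum number of iterations
    return iteration - last_improvement_iter >= patience

print(should_stop(50, 0))     # below the minimum, keep searching
print(should_stop(120, 80))   # no improvement for 40 iterations, stop
```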
We compared our proposed heuristic (ACRA) with the following four baselines: our proposed resource allocation heuristic RA, SP [12], ILP-AR [41], and a greedy algorithm called Greedy. RA and SP [12] utilize parallel VNF processing, while ILP-AR [41] and Greedy process the traffic sequentially. SP, ILP-AR, and Greedy employ deterministic modeling of the system. An overview of the baselines is presented below.
1) SP is a resource allocation method (without admission control) based on deterministic modeling of the system, which we proposed in [12]. SP utilizes the same pooling idea as this paper to enable parallel VNF processing; however, the resource allocation mechanism in SP dedicates each VM in a pool to a single chain. It allocates VMs to chains with the objective of cost minimization while respecting the chains' deadline constraints. We have defined power consumption as the cost criterion, so that the resource allocation mechanism of SP is power efficient.
2) ILP-AR is a joint admission control and resource allocation algorithm for VNF chains [41]. It decides on the admission of chains and allocates resources to them so as to maximize an aggregation of the revenue (obtained from chain admissions) and the resource usage preference. Deadline constraints for chain execution time are considered in the optimization. We have defined the preferences to prioritize servers with less power consumption. The coefficient for weighting the revenue in comparison to the power consumption is the same as the best selected value in [41]. This value gives priority to chain admission. ILP-AR models the problem as an ILP optimization. Like [41], we used CPLEX to implement ILP-AR.
3) Greedy gives priority in resource allocation to chains with high revenue gain. The chains are sorted in a list in descending order according to their revenue gain. For each VNF of a chain, the fastest VM in the pool with enough capacity to process the traffic, is allocated. When not enough VMs are found to host all the VNFs in the chain, the resources for that chain are released and the allocation process is performed for the next chain.
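The Greedy baseline in item 3 can be sketched in Python as below. The dictionary-based data layout is an assumption made for the example, and the sketch ignores traffic scaling between VNFs and all deadline considerations; it only illustrates the sort, allocate-fastest, and release-on-failure steps.

```python
def greedy_allocate(chains, pools):
    """Return {chain name: {vnf: vm id}} for the chains that could be fully placed.

    chains: list of dicts with keys "name", "revenue", "rate", "vnfs".
    pools:  {vnf type: list of VM dicts with keys "id", "speed", "free"};
            "free" is the remaining processing capacity and is updated in place.
    """
    allocation = {}
    # Chains with the highest revenue gain are served first.
    for chain in sorted(chains, key=lambda c: c["revenue"], reverse=True):
        reserved = []  # (vnf, vm) pairs taken for this chain so far
        for vnf in chain["vnfs"]:
            fits = [vm for vm in pools[vnf] if vm["free"] >= chain["rate"]]
            if not fits:
                # Release what this chain already holds and try the next chain.
                for _, vm in reserved:
                    vm["free"] += chain["rate"]
                break
            fastest = max(fits, key=lambda vm: vm["speed"])
            fastest["free"] -= chain["rate"]
            reserved.append((vnf, fastest))
        else:
            allocation[chain["name"]] = {vnf: vm["id"] for vnf, vm in reserved}
    return allocation

pools = {
    "fw": [{"id": "a", "speed": 100, "free": 100}, {"id": "b", "speed": 50, "free": 50}],
    "ids": [{"id": "c", "speed": 80, "free": 80}],
}
chains = [
    {"name": "c1", "revenue": 10, "rate": 60, "vnfs": ["fw", "ids"]},
    {"name": "c2", "revenue": 30, "rate": 60, "vnfs": ["fw", "ids"]},
    {"name": "c3", "revenue": 20, "rate": 40, "vnfs": ["fw"]},
]
print(greedy_allocate(chains, pools))
```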
To have a fair comparison and illustrate the effectiveness of our proposed method, as baseline, we have selected the aforementioned methods since they have utilized similar criteria to our proposed method in their resource allocation and/or admission. SP which allocates resources based on parallel VNF processing considers power efficiency, while ILP-AR considers both revenue and power consumption in joint admission and resource allocation. Greedy also considers revenue gain of the chains in resource allocation.

B. RESULTS
In this subsection we give the results of our simulation. Due to the randomness nature of initial solution and the Tabu moves in RA and ACRA the results for these methods are reported for the average of 40 runs. Fig. 3 presents the results when there are 10 VNF chains and the chains' deadlines are randomly selected in the range of 4 to 8 msec. Fig. 3(a) illustrates the admission ratios, showing that in all five methods, when the CL of deadline-meeting increases the admission ratio decreases. This occurs because when the CL increases, the size of the feasible domain is smaller and so solving the optimization becomes more difficult in all methods. Greedy has the lowest admission ratio, as it does not consider time constraints in resource allocation. Other methods consider time constraints, among which, SP has a poor admission ratio since it wastes resources by allocating each VM to a single chain. ACRA and RA performed better than ILP-AR because: 1) they exploit parallelism for chain execution, which speeds up the chains' executions and tends to fewer deadline violations; and 2) they are based on stochastic analysis, which is more precise than the deterministic analysis used in ILP-AR. ACRA and RA are competitive, with ACRA performing particularly well for high confidence levels. ACRA offers such a high performance because it explores combinations of chains to decide about the admission of chains, which lead to higher admission ratios. Fig. 3(b) illustrates the Cumulative Distribution Function (CDF) of the revenue obtained from VNF chain admission when the CL is 0.65. We show the CDF for the two methods that have the highest admission ratio: ACRA and RA. The slower the growth of a curve the greater the admission of chains with higher gain. The admitted chains in ACRA have a higher gain than those in the RA, as the ACRA gives resource usage priority to chains with higher gains when the admission of all chains is not possible. Fig. 
3(c) shows the network provider profit. The profit decreases in all methods for higher CLs, as the admission of chains becomes more difficult under higher CL conditions. While SP achieved higher admission ratios than Greedy, there was not a significant difference (See Fig. 3(a)). On the other hand, Greedy prioritizes the admission of chains with higher gains, and in comparison with SP, it gained higher profits. The ACRA that has the highest admission ratio also has the highest profit. Fig. 4 shows the results for higher-load of system; in this case, 60 VNF chains. As indicated in Fig. 4(a), the SP shows a markedly poor admission ratio, mainly because it wastes resources by under-utilizing them due to a dedication of a VM to a chain. This restriction does not allow the SP to allocate resources to a fraction of chains, which causes the admission ratio to be reduced. Note that such underutilization of resources does not have as much of an impact in lightly loaded systems, as indicated in Fig. 3(a). The difference between ACRA and RA is greater at 60 chains  than it is with 10 chains, which highlights the importance of admission control applications in higher-load systems where there is more competition for resource usage. The ACRA outperformed the other methods. Its admission ratio is up to 8% higher than that of the RA due to its application of admission control and, up to 13% higher than ILP-AR because it exploits parallel VNF execution and stochastic analysis. Fig. 4(b) shows the CDF for the revenue obtained from VNF chain admission for ACRA and RA when CL is 0.65. ACRA admitted chains with higher gains. For example, 51% of the chains admitted in ACRA have a gain of more than 90, while 47% of the chains admitted in RA have a gain of more than 90. Fig. 4(c) shows the network provider's profit variation. ACRA outperforms the other methods because of its higher admission ratio and its admission of chains with higher gain. Fig. 
5 shows the results when the number of chains is changed from 40 to 160. The CLs were chosen randomly with uniform distribution in the range of 0.55 to 0.95. Fig.  5(a) indicates the admission ratio variation. In all five methods the admission ratio decreases when the number of chains increases. This decrease is due to the increased competition for resources at higher system loads. The performance of SP decreases most notably, with its admission ratio at only 0.04 when there are 160 chains. This low admission ratio shows the detrimental side-effect of dedicating a VM to a single chain. The outperformance of ACRA over other methods increases when there are more chains in the system. At the 160-chain level, ACRA shows an admission ratio that is greater than those of the RA, ILP-AR, Greedy, and SP by differences of 0.09, 0.24, 0.32, and 0.83, respectively. The improvement is due to ACRA's consideration of admission decisions in the resource allocation and its utilization of parallel processing for chain execution. Fig. 5(b) indicates the power consumption. Generally, in all methods more power is consumed when the number of chains increases. Although the admission ratio goes down with the increase in the number of chains, (considerably) more chains will be admitted to the system; thus increasing the power consumption because the higher admissions of chains increases the processing/transmission loads at VMs/switches. The ACRA admits more chains and thus consumes more power than the other methods. However, the power usage increment is less than the admission ratio increase. For example, at 160 chains, ACRA consumes only 217 additional Watts to increase its admission ratio by 0.09 compared to that of RA. This is possible because ACRA maximizes the admission of chains with a minimum of power consumption in order to maximize the network provider's profit. Fig. 5(c) illustrates the network provider profit. 
When the number of chains increases, the profit increases for ACRA, RA, ILP-AR, and Greedy, as they admit more chains at higher loads. SP, however, cannot admit more chains and so gains no additional profit. ACRA achieves the highest profit: at 160 chains, its profit exceeds that of RA by 1562. ACRA performs best because it has the highest admission ratio and prioritizes admitting chains with high revenue. Fig. 6 illustrates the effect of pool size on admission ratio and profit when there are 50 VNF chains. The CLs were chosen randomly from a uniform distribution over the range 0.55 to 0.95. We varied the pool size from 2 (i.e., a total of 16 VMs) to 5 (i.e., a total of 40 VMs). Fig. 6(a) shows the admission ratio. As the number of VMs per pool increases, the admission ratio increases in ACRA, RA, and SP. The reason is that these methods exploit parallel VNF processing: with more VMs in a pool, the traffic processing of a chain can be split among more VMs, which helps satisfy deadlines at the required CLs and thereby increases the admission ratio. In contrast, sequential traffic processing has limited ability to meet deadlines at the required CLs, so increasing the number of VMs in the pools does not necessarily increase the admission ratio in the methods based on sequential processing, i.e., Greedy and ILP-AR; the admission ratio of these methods therefore fluctuates. Since a higher admission ratio yields higher profit, similar patterns can be seen for profit in Fig. 6(b). As shown in Fig. 6(a), ACRA achieves a higher admission ratio than the baselines as a result of stochastic modeling of the system and the use of parallel VNF processing in joint admission control and resource allocation. Accordingly, as shown in Fig. 6(b), it also gains higher profit.
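The effect of pool size on deadline satisfaction can be illustrated with a deliberately simplified model. The sketch below is not the Queue network analyzed in this paper: it assumes that each VM behaves as an independent M/M/1 queue, that the traffic of a chain is split evenly and randomly among the VMs of a pool, and that switch transmission delay is ignored. Under these assumptions the sojourn time at a VM is exponentially distributed with rate (service rate minus arrival rate), which gives the probability of meeting a deadline in closed form. The function names and the numerical values are hypothetical.

```python
import math


def deadline_prob_mm1(arrival_rate, service_rate, deadline):
    """P(sojourn time <= deadline) for a single M/M/1 queue (FCFS).

    Assumption: rates are in packets per millisecond and the deadline
    is in milliseconds. An unstable queue never meets the deadline
    in steady state, so 0.0 is returned in that case.
    """
    if arrival_rate >= service_rate:
        return 0.0
    return 1.0 - math.exp(-(service_rate - arrival_rate) * deadline)


def deadline_prob_split(total_rate, service_rate, deadline, num_vms):
    """Same probability when the traffic is split evenly over num_vms VMs.

    Assumption: random splitting of a Poisson stream, so each VM sees
    a Poisson stream of rate total_rate / num_vms.
    """
    return deadline_prob_mm1(total_rate / num_vms, service_rate, deadline)


if __name__ == "__main__":
    # Hypothetical values chosen only for illustration.
    total_rate = 0.9    # packets per ms offered by one chain
    service_rate = 1.0  # packets per ms served by one VM
    deadline = 6.0      # ms, inside the [4, 8] ms range used in the simulations
    for num_vms in range(1, 6):
        p = deadline_prob_split(total_rate, service_rate, deadline, num_vms)
        print(f"VMs used: {num_vms}  P(deadline met) = {p:.3f}")
```

In this toy model the probability of meeting the deadline grows monotonically with the number of VMs that share the traffic, so a chain with a high requested CL that cannot be served by one VM may become admissible when its traffic is split over several VMs.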
To assess the effect of traffic, Fig. 7 shows the admission ratio and network provider profit of ACRA with 5 VMs per pool and 60 VNF chains. The mean traffic arrival rate of each chain is varied from 200 to 650 kbps. The deadlines and CLs were chosen randomly in the ranges [4, 8] ms and [0.55, 0.95], respectively. The admission ratio decreases as the traffic rate increases, because higher traffic rates impose higher processing loads on the VMs and higher transmission loads on the switches. Meeting the deadlines at the requested CLs thus becomes more difficult, and fewer chains are admitted. Accordingly, less profit is gained at higher traffic arrival rates. Table II compares ACRA with the optimal solution found by exhaustive search. Note that the optimal solution can be computed in a reasonable amount of time only for small instances of the problem. There are 2 VNF types and 2 VMs per pool (a total of 4 VMs). The simulation was conducted for 5 and 6 chains, where all chains need both VNF types. The sizes of the search spaces for 5 and 6 chains are 33,554,432 and 1.1 × 10^9, respectively. Note that for the case of 6 chains, finding the optimal solution took an average of 7 hours and 12 minutes per run. For a CL of 0.6, the admission ratio and profit of ACRA are the same as those of the optimal solution. For a tighter CL of 0.8, where the feasible domain is smaller, ACRA does not attain the optimal admission ratio and profit but comes very close to the optimal values. This shows the effectiveness of ACRA in finding solutions.
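The growth of the exhaustive search can be checked with simple arithmetic. The reported size for 5 chains, 33,554,432, equals 32^5 = 2^25. The sketch below therefore assumes that each chain contributes a factor of 32 to the search space; this factor is inferred from the reported number and is not a formula stated in the text. The function names are hypothetical.

```python
def search_space_size(num_chains, options_per_chain=32):
    """Number of candidate solutions enumerated by exhaustive search.

    Assumption: every chain contributes the same multiplicative factor
    (32, inferred from the reported size for 5 chains).
    """
    return options_per_chain ** num_chains


def seconds_per_candidate(total_seconds, num_candidates):
    """Average evaluation time per candidate for a full enumeration."""
    return total_seconds / num_candidates


if __name__ == "__main__":
    for n in (5, 6, 7, 8):
        print(f"{n} chains: {search_space_size(n):,} candidates")

    # Reported average run time for 6 chains: 7 hours and 12 minutes.
    run_time = 7 * 3600 + 12 * 60
    per_candidate = seconds_per_candidate(run_time, search_space_size(6))
    print(f"average time per candidate: {per_candidate * 1e6:.1f} microseconds")
```

Under this assumption each additional chain multiplies the run time by 32, so a search that takes about 7 hours for 6 chains would take more than 9 days for 7 chains, which is why the comparison with the optimal solution is limited to small instances.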

VII. CONCLUSION
This paper provides a method for joint admission control and resource allocation for VNF chains in applications with time constraints on chain execution. Pools of VNFs that process the traffic in parallel are used to speed up traffic processing under tight time constraints. VNF chain execution is modeled by a Queue network, and queueing theory is applied to calculate the expected value of the probability of deadline-meeting in VNF chains. The problem is modeled as a joint optimization that decides on the admission of VNF chains and the resource allocation for the admitted chains. The objective is to maximize the profit of the network provider while keeping the confidence level of deadline-meeting for the admitted chains at desired levels. The power consumption of the physical servers and of the switches is considered in the profit calculation. A heuristic is proposed to solve the problem. Simulation results show that our method improves the admission ratio and the network provider profit when compared to three other methods. We have assumed that the size of the pools is given. Determining the optimal size of the VNF pools is left for future work.