Behavior Reconstruction Models for Large-scale Network Service Systems

In large-scale network service systems, the sudden gathering of a large number of users can cause system abnormality whenever the load imposed by the user behaviors does not match the load the system can bear. This paper proposes a behavior reconstruction model for large-scale network service systems, integrated with a Petri net reconstruction methodology, for the purpose of achieving load balancing in the system under an increasing number of users. Based on the features of the user interaction behavior sequence, the behavioral load balancing model defines a user behavior membership function. Then, a random fuzzy Petri net with delay is presented to control the user behavior reconstruction. Experiments that vary the number of user behaviors and their distribution in unit time demonstrate that the proposed methodology can effectively trigger the reconstructed model to balance the system load when the system load exceeds the defined warning point.


Introduction
In recent years, large-scale Internet-based service systems have evolved rapidly, with growth in the number of users, diversification of user requirements, and increasing openness of system services. Owing to the recent increase in the number of users of Internet services, a sharp gathering of users within a short time often leads to service unavailability. This is because the new user load suddenly overloads the system, and the load surge can grow to the point of paralyzing the system. For instance, demand on a ticket booking system is often seasonal, and the upsurge of user groups at peak times can paralyze the system.
Large-scale concurrent user processing systems are usually affected while resources are being expanded, and uncertain user behaviors can still overload the system after the computing resources have been expanded. To this end, software self-adaptation strategies have been put forward to cope with system overloading during resource expansion and to combat the growing complexity of Internet service systems. An Internet system cannot scale instantaneously to match the user behaviors. Therefore, special attention should be given to restructuring the system behaviors in accordance with the changes in user behaviors so as to balance the system load. Existing system load balancing methods [1][2][3] are mainly based on resource allocation and task scheduling strategies, and do not consider when and how to dynamically reconstruct the system behaviors for real-time load equilibrium.
Two important components should be considered for adaptive refactoring of system behavior: first, classifying users according to their behavioral characteristics and, second, constructing the behavior flow for each user group to dynamically control the system load.
The remainder of the paper is organized as follows: Section 2 reviews the related works and Section 3 presents the proposed system behavior reconstruction model. Section 4 details the proposed Petri net model and algorithm for implementing system behavior reconstruction. Section 5 covers the experiments and a discussion of the obtained results, and Section 6 concludes this paper.

Related work
Recently, several research works have focused on dynamic system load balancing based on adaptive reconstruction of system behavior. Doukha [4] proposed a load balancing method that distributes the beacon and fairly transmits the system load. Hwang [5] proposed the use of hardware indicators, CPU utilization and the number of online connections as load evaluation criteria. Duan [1] used the CPU utilization rate, disk utilization ratio, page error number, request number, request response time and other relevant indicators to calculate the real-time load of the server. Gang [2] proposed a method that classifies the services requested by users in order to allocate the system resources, so as to achieve dynamic load balancing. Shailesh [3] used a fuzzy dynamic load balancing algorithm to achieve load balancing through task scheduling. Liu [6] proposed a distributed load balancing algorithm using a defined protocol sequence, and developed a queuing model of a distributed asynchronous multi-server system. In order to achieve dynamic load balancing at the data stream level, Wang [7] proposed a cloud center dynamic load balancing method based on SDN (Software Defined Networks). However, these works have not given enough importance to the system behaviors and the behavior time, both of which should be considered essential criteria for achieving system load balancing in dynamic environments based on behavior reconstruction.
Adaptive reconstruction strategies have been the focus of a few research works. Slim [8] put forward adaptation as a key requirement for many software systems, whereby the system should be able to adapt its structure and behavior during runtime in order to respond to changes in the operating environment and in user needs. Zhang [9] proposed a new coordination method based on a reconfigurable network-event system. Pamela [10] proposed the implementation of a distributed persistence management model for reconfigurable multiprocessor systems on dynamically reconfigurable circuits. Rui [11] further proposed a dynamic adaptive wiping mechanism and Yang [12] proposed a reconfigurable architecture model based on a layered hypergraph. Mohamed [13] proposed a reconfigurable and replaceable system for embedded control systems, and modeled it using a Petri net. However, these works have not taken user requirements into account for system reconstruction. When a large number of users gather in a short time, system reconstruction may not be efficient unless the user needs are considered.
From the perspective of system behavior, Wang [14] pointed out that it is vitally important to understand user behaviors in online services and further proposed an unsupervised system based on click traffic to identify the modes of user behaviors. Luo [15] used fuzzy Petri nets to represent fuzzy production rules, and performed a state analysis of power systems by an iterative computation of matrices. Kotevski [16] jointly used queuing networks and Fluid Stochastic Petri Nets, and developed several performance models to analyze the behavior of complex systems. Lu [17] used a new hybrid model to explore the impact and guidance of user behaviors on mobile banking services. Matthew [18] highlighted the importance of discovering user behaviors and using them as a source of information by studying a long query log. Jose [19] proposed a genetic algorithm for user behavior modeling and classification from event sequences. In summary, the dynamic relationship between user behaviors and system service is vitally important and should be considered an essential criterion when attempting to improve the overall system performance in balancing the system load.
To sum up, although a number of works have focused on adaptive dynamic balancing of system load, system reconstruction has hardly been considered in the state-of-the-art works to date. Reconstructing the system behavior flow to dynamically balance the system load based on user behavior characteristics can achieve effective load balancing performance in large-scale network service systems. In this paper, the behavior reconstruction method is exploited to balance the system load when a large number of users gather in a short time, for the purpose of achieving real-time system load balancing that maximizes the processing capacity of the system. Users are classified based on their behavioral characteristics and the corresponding behavior processes are constructed. The proposed reconstruction model is triggered when the system load exceeds the warning point during runtime, and balances the system load by controlling the interaction time of the various types of users.

Model of system behavior reconstruction based on user behavior classification
Under normal conditions, large-scale network service systems can provide users with stable, good services. Sometimes, however, because of the rapid expansion of the user population within a short time, the system behaviors and the user behaviors may become incompatible, and the system becomes abnormal or even paralyzed. At present, many large-scale network service systems simply continue to provide the same services to the users as before. As a result, when the user population increases rapidly, the system load increases beyond the capacity of the system. In this scenario, the system is overloaded and the system resources are insufficient. To this end, this paper considers reducing the system load by dividing the user behaviors into different groups according to the features of the user interaction sequence, and by delaying the interaction behavior time of each user group.

Definition 1 User behaviors membership function $\mu$. It indicates the degree to which the user behavior $U_i$ belongs to each class of $S_j$. $\mu_{U_i}(j)$ is defined as follows: where $U_i$ represents the interaction behavior sequence with the characteristics of user behavior time, assuming that the user behaviors are divided into $p$ user groups based on the length of the interaction behavior time; $S_j = \{s_1, s_2, \ldots, s_p\}$ $(p \ge 1)$ represents the standard for each class of user groups.
Definition 2 User behaviors subordinate standard $d(u_i, s_j)$. It is the standard by which a user behavior is judged to belong to a specific user group: the behavior belongs to the group for which its membership function value is minimal. Suppose that $N_d$ classes attain this minimum. Then, when $N_d = 1$, the behavior $U_i$ belongs to user behavior group $j$; when $N_d > 1$, the behavior $U_i$ is randomly assigned to one of the $N_d$ classes.
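The assignment rule of Definition 2 can be illustrated with a short Python sketch. The exact form of the membership function is not reproduced here, so the sketch assumes, purely for illustration, that the membership value is the absolute difference between the mean interaction time of a behavior sequence and the class standard $s_j$; the function names and the numeric standards are hypothetical.

```python
import random

def membership(mean_time, standard):
    # ASSUMPTION: the membership value is the distance between the mean
    # interaction time and the class standard s_j (not the paper's formula).
    return abs(mean_time - standard)

def classify(behavior_times, standards, rng=random):
    """Return the index j of the class with the minimal membership value."""
    mean_time = sum(behavior_times) / len(behavior_times)
    values = [membership(mean_time, s) for s in standards]
    best = min(values)
    candidates = [j for j, v in enumerate(values) if v == best]
    # N_d = len(candidates); when N_d > 1 the class is chosen at random.
    if len(candidates) == 1:
        return candidates[0]
    return rng.choice(candidates)

# Hypothetical standards s_1..s_3 in seconds (fast, medium, slow users).
standards = [2.0, 5.0, 10.0]
print(classify([1.5, 2.5, 2.0], standards))  # mean 2.0, closest to s_1
print(classify([9.0, 12.0], standards))      # mean 10.5, closest to s_3
```

A sequence whose mean lies exactly halfway between two standards gives $N_d = 2$, in which case either class may be returned.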
Definition 3 At time $t$, the total number of user behaviors $B_t$ submitted in the system is equal to the number of users, that is, $B_t = U_t$, where $U_t$ represents the number of users in the system.

Definition 4 The real-time load $L_t$ at time $t$. It is the system load corresponding to the total number of behaviors submitted by the users at time $t$, $L_t = B_t \times l$, where $l$ $(l \ge 1)$ represents the system load required by a user to submit a request behavior.
Definition 5 System good service status. It is the service state in which the system can provide services normally. When $0 \le L_t \le L_{safe}$, the system is in a good service state, where $L_{safe}$ is the safe load, that is, the load value corresponding to the maximum service capacity that the system can withstand while remaining in a good service state.
Definition 6 System unstable service status. It is the service state in which the system can provide services, but abnormality may occur. That is, when $L_{safe} < L_t \le L_{max}$, the system is in an unstable service state, where $L_{max}$ represents the load value corresponding to the maximum service capacity that the system can withstand in the unstable service state, which is the maximum load that the system can resist.
Definition 7 System non-service status. It is the service state in which the system cannot provide services because the load is too large to handle. That is, when $L_t > L_{max}$, the system is in a non-service state or in a state of paralysis.
Definition 8 At time $t$, the system real-time load $L_t$ is the sum of the loads corresponding to the $p$ classes of user behaviors, namely $L_t = \sum_{i=1}^{p} L_i$, where $L_i$ represents the system load corresponding to the user behaviors of group $i$.
Definition 9 System processing capability $L_{HC}$. It is the system load corresponding to the user behaviors that can be processed by the system in unit time. If $L_{max} = L_{HC}$ and $L_t > L_{HC}$, then the system enters the non-service status.
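As a concrete reading of Definitions 5 to 8, the following Python sketch maps a real-time load to one of the three service states. The function names are hypothetical, and the threshold values 120000 and 150000 are borrowed from the experimental setup only as sample numbers.

```python
def real_time_load(class_loads):
    """Definition 8: L_t is the sum of the loads L_i of the p user classes."""
    return sum(class_loads)

def service_state(load, l_safe, l_max):
    """Classify a load according to Definitions 5 to 7."""
    if load < 0:
        raise ValueError("load must be non-negative")
    if load <= l_safe:
        return "good"          # 0 <= L_t <= L_safe
    if load <= l_max:
        return "unstable"      # L_safe < L_t <= L_max
    return "non-service"       # L_t > L_max

# Sample thresholds (illustrative values only).
L_SAFE, L_MAX = 120000, 150000
load = real_time_load([50000, 40000, 45000])
print(load, service_state(load, L_SAFE, L_MAX))
```

With the sample class loads above, the total is 135000, which falls between the two thresholds and is therefore reported as unstable.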
Definition 10 System load per unit time $L_{ut}$. It is the system load corresponding to the number of behaviors $B_{ut}$ in unit time. When the system real-time load is $L_t \ge L_{safe}$ at time $t$, the system load exceeds the processing capacity of the system in unit time. Set $L_{ut} = L_t / t_c$ with $L_{ut} < L_{safe}$, where $t_c$ is the time required to achieve load balancing in the system.
Definition 11 Reconstruction system delay time $\Delta t_d = \sum_{i=1}^{j} t_i$, where $t_i$ is defined as follows: The user interaction behaviors are divided into $p$ classes according to the time sequence characteristics, and $L_1, \ldots, L_p$ are the system loads of the $p$ classes. Suppose that the system is in an unstable state, i.e., the instantaneous system load is $L_t > L_{safe}$ at time $t$. After reconstruction, the instantaneous system load is $L_{t'} \le L_{safe}$ at any time $t'$ in the $\Delta t_d$ period, and the total system load processed during $\Delta t_d$ is equal to $L_t$.
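The condition in Definition 10 can be turned into a small worked calculation. From $L_{ut} = L_t / t_c < L_{safe}$ it follows that $t_c > L_t / L_{safe}$. The Python sketch below assumes, as a simplification not stated in the definitions, that $t_c$ is counted in whole unit-time steps; the function name is hypothetical.

```python
def min_balancing_time(l_t, l_safe):
    """Smallest whole number of time units t_c such that l_t / t_c < l_safe.

    ASSUMPTION: t_c is measured in whole unit-time steps.
    """
    if l_t <= 0 or l_safe <= 0:
        raise ValueError("loads must be positive")
    # t_c must be strictly greater than l_t / l_safe.
    return int(l_t // l_safe) + 1

# Sample loads (illustrative values only).
print(min_balancing_time(200000, 120000))  # 200000 / 2 = 100000 < 120000
print(min_balancing_time(240000, 120000))  # 240000 / 2 is not below 120000
```

The second call shows why the inequality is strict: spreading 240000 over two time units gives exactly the safe load, so a third unit is needed.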
Assumption 1 The large-scale network service system itself has a maximum system load $L_{max}$.
Assumption 2 When the number of user behaviors at a certain time increases sharply and leads to an abnormal system, i.e., $L_t > L_{safe}$, users can be classified according to the time sequence of the user interaction behaviors.

Theorem. Load balancing reconstruction
Under Assumptions 1 and 2, suppose that the system real-time load is $L_{t_1} > L_{safe}$ at time $t_1$. The user behaviors are classified accordingly, and the system behavior flows are reconstructed at the system interactions. Then the instantaneous load is $L_{t_2} \le L_{safe}$ at any time $t_2$ in the delay time $\Delta t_d$.
Proof According to Definition 4, $L_{t_1} > L_{safe}$ means $B_{t_1} \times l > L_{safe}$ at time $t_1$. The users are divided into $p$ classes, so, by Definition 8, $L_{t_1} = \sum_{i=1}^{p} L_i$.

If the system dealing with the load
According to Definition 11, the instantaneous load is $L_{t_2} \le L_{safe}$ at any time $t_2$ in the delay time $\Delta t_d$. Therefore, the system load $L_{t_1}$ can be balanced within $\Delta t_d$.

Petri net model and algorithm for implementing system behavior reconstruction
The system behavior reconstruction model provides theoretical support for the adaptive reconstruction process for load balancing in the large-scale network service system. This section elaborates how the model is implemented in the actual system behavior reconstruction process based on user classification.

Random fuzzy Petri nets with time delay
In delay Petri nets [20], the occurrence of a transition requires $a$ units of time to complete. Transitions can be divided into time transitions and immediate transitions. Li [21] used stochastic Petri nets to construct a social network system model; see also Milinkovic [22]. On this basis, in order to implement the system behavior reconstruction, this paper presents a timed stochastic fuzzy Petri net.

Fig. 4 Petri net model of a booking system
Definition 12 A random fuzzy Petri net with delay (DSFPN) is a seven-tuple $\Sigma = (P, T; F, C, DI, \tau, M)$, in which: (1) $P$ is a set of places, $P = \{p_1, p_2, \ldots, p_n\}$ $(n \ge 0)$, and the number of tokens in a place represents the number of user actions. The number of users arriving at the system over a period of time follows a Poisson distribution; (2) $T$ is a set of transitions, $T = T_t \cup T_i$, $T_t \cap T_i = \emptyset$, where the time transition set $T_t = \{T_1, T_2, \ldots, T_k\}$ includes the transitions of service behaviors, and $T_i$ is the instantaneous transition set.

Four basic structures of DSFPN model based on user classification
A system model based on Petri nets is composed of four basic structures: sequence, parallel, selection and circulation [23]. Therefore, the following four basic structures of large-scale network service systems are modeled by the DSFPN, as shown in Fig. 1. We take the sequential structure as an example, since the other three cases are similar. A Petri net is used to model the behavior of the large-scale network service system. If the key interactive behavior node is in a sequential structure and the system load exceeds the warning point before the node is executed, the system is reconstructed as a DSFPN model to classify the system behaviors.

Table 1 Transition description of $t_1$ ~ $t_{14}$ in Fig. 4
Transition tag    Description
$t_1$     Require users to enter the login information
$t_2$     Require users to enter the verification code
$t_3$     Get user information
$t_4$     Require users to enter the travel information
$t_5$     Query the system database
$t_6$     Whether the user is booking
$t_7$     Return a query page
$t_8$     Require users to enter the verification code
$t_9$     Connect to the database
$t_{10}$    The order generates the serial number
$t_{11}$    Require users to pay
$t_{12}$    The system judges whether the time is over 30 min
$t_{13}$    The system sends an e-mail and messages to inform the details
In Fig. 1, $p_1$ ~ $p_{10}$ is the place set, where each place represents a state. When the user behavior load submitted to the system exceeds the warning point of the system load, the behavior transition $t_2$ and the immediate transition $t_3$ are in conflict. Because the priority of the immediate transition $t_3$ is higher than that of the behavior transition $t_2$, the immediate transition $t_3$ is triggered. According to the user interaction speed, the user behaviors are divided into three groups: slow, medium and fast. The immediate transitions $t_4$, $t_6$ and $t_8$ respectively decide the group to which a behavior belongs. The immediate transitions $t_5$, $t_7$ and $t_9$ respectively judge the relationship between the load of the corresponding behavior group and the warning point. The control transitions $c_1$, $c_2$ and $c_3$ control the delay of the three behavior groups.
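The conflict resolution and routing described for the sequential structure can be sketched in Python as a function that returns the transitions fired for one behavior. This is only an illustration under stated assumptions: the pairing of transitions with the slow, medium and fast groups follows the order in which the text lists them, and the rule that a control transition fires only when the group load exceeds the warning point is an assumption of this sketch. The function and dictionary names are hypothetical.

```python
# ASSUMPTION: transitions are paired with groups in the order listed in the text.
GROUP_DECIDE = {"slow": "t4", "medium": "t6", "fast": "t8"}
GROUP_JUDGE = {"slow": "t5", "medium": "t7", "fast": "t9"}
GROUP_CONTROL = {"slow": "c1", "medium": "c2", "fast": "c3"}

def firing_sequence(system_load, warning_point, group, group_load):
    """Return the transitions fired at the key node for one behavior."""
    if system_load <= warning_point:
        # No conflict: the ordinary behavior transition fires.
        return ["t2"]
    # Above the warning point the immediate transition t3 has priority over t2.
    fired = ["t3", GROUP_DECIDE[group], GROUP_JUDGE[group]]
    # ASSUMPTION: the control transition delays the group only when the
    # group's own load exceeds the warning point.
    if group_load > warning_point:
        fired.append(GROUP_CONTROL[group])
    return fired

print(firing_sequence(100000, 120000, "fast", 30000))
print(firing_sequence(130000, 120000, "slow", 125000))
```

The first call stays on the normal path through $t_2$, while the second follows the reconstructed path and ends with the control transition of the slow group.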

DSFPN algorithm
In the above Petri net model, the reconstructed flow is activated when the tokens in the behavior places exceed a certain value, i.e., when the system load exceeds the safe load. According to Definitions 1 to 11, the DSFPN can obtain the system load, the load of each classified group and the required delay accordingly. The corresponding algorithm of system behavior reconstruction is described as follows.
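A minimal Python sketch of one way such a reconstruction step could work is given below. It is not the authors' algorithm: it assumes a greedy strategy that processes the class loads in order and spreads them over successive unit-time slots so that no slot carries more than $L_{safe}$, which is the property required by Definition 11. The function name and the sample loads are hypothetical.

```python
def reconstruct(class_loads, l_safe):
    """Spread class loads over unit-time slots, each carrying at most l_safe.

    Returns a list of slots; slot[k][i] is the load of class i handled in slot k.
    ASSUMPTION: greedy packing in class order (illustrative strategy only).
    """
    if l_safe <= 0:
        raise ValueError("l_safe must be positive")
    p = len(class_loads)
    if sum(class_loads) <= l_safe:
        return [list(class_loads)]      # good service state, nothing to delay
    slots = [[0] * p]
    room = l_safe                       # capacity left in the current slot
    for i, load in enumerate(class_loads):
        remaining = load
        while remaining > 0:
            if room == 0:               # current slot is full: delay to the next
                slots.append([0] * p)
                room = l_safe
            part = min(remaining, room)
            slots[-1][i] += part
            remaining -= part
            room -= part
    return slots

# Sample class loads L_1..L_3 with total 200000 > L_safe = 120000.
schedule = reconstruct([90000, 60000, 50000], 120000)
for k, slot in enumerate(schedule):
    print(k, slot, sum(slot))
```

In this sample the total load of 200000 is handled in two slots, the per-slot load never exceeds the safe load, and the sum over all slots equals the original load, mirroring the statement of the theorem.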

The DSFPN model of a booking system
Online shopping systems and ticket booking systems, such as Taobao and the 12306 ticket system, are typical representatives of large-scale network service systems. Online booking systems usually undergo rapid expansion of user groups during seasonal periods such as holidays (Fig. 2). The system can be overwhelmed in this state and may even be paralyzed. This situation demands modifications to the service process to accommodate the changes in the system load. The process flow in a ticket booking system is simulated (Fig. 3). An appropriate Petri net model is then constructed according to the process flow in the ticket booking system, as shown in Fig. 4. The notations of the behavior transitions in Fig. 4 are shown in Table 1. The sequential user interaction behaviors in this system are login, query, booking and paying. These interactive behaviors are reconstructed according to the defined DSFPN model, as shown in Fig. 5. The notations of the transitions $t_{15}$ ~ $t_{42}$ and $c_1$ ~ $c_{12}$ in Fig. 5 are shown in Table 2.

Experimental design
The experiment simulates the system process based on the booking system flow chart, as shown in Fig. 3. It detects and collects the number of user actions and records the time of each user interaction behavior. The tokens in the places represent the number of users. We use a data generator to continuously increase the number of user actions, which in turn increases the system load. The data for simulation is generated according to the flow chart of the 12306 train booking system, as shown in Fig. 2, so that the experiment replicates the actual traffic characteristics, namely gradually changing, slightly changing and suddenly changing user behavior.
According to the traffic characteristics, user behavior is simulated under three types of load service states: [0, 120000) as a good service state, [120000, 150000) as an unstable service state, and [150000, +∞) as an unavailable service state. It is assumed that the user behavior is divided into three categories. In this simulation environment, we feed the system data into the reconstruction algorithm. We implement the simulation system in C++ and use the drawing tool TeeChart to render the experimental results.

Experimental results analysis
The first set of simulated experimental data is applied to the load balancing algorithm of the system behavior reconstruction process; the experimental results are shown in Fig. 6, which illustrates how the real-time load changes with time. The real-time load exceeds the warning point only slightly; that is, the magnitude of the change is small. When the real-time load in the system exceeds the warning point, the reconstruction model is triggered and executed. The user behaviors are then divided into three categories and the system load is balanced. After the point of equalization, the load does not exceed the warning point at any time, and the system is in a good service state.
The second set of experimental results is shown in Fig. 7. The real-time load exceeds the warning point significantly, but it does not exceed the maximum load that the system can withstand. That is, the magnitude of the change is significant. When the real-time load exceeds the warning point, the reconstruction model is triggered and the system load is balanced.
The third set of experimental results is shown in Fig. 8. The real-time load exceeds the maximum load; that is, the magnitude of the change is huge. When the system real-time load exceeds the warning point, the reconstruction model is executed and the system remains in a good service state at any time.
In all three groups of experiments, when the system real-time load exceeds the warning point, the system enters the unstable service state and the system reconstruction model is triggered. At this time, the user behaviors are classified by the time features of the behavior sequence, and the system load is balanced by controlling the interaction time of each type of user group. As a result, the system load does not exceed the warning point at any time. The experimental results show that the system behavior reconstruction model based on the time features of the user behavior sequence can effectively keep the system in a good service state at any time.

Conclusions
This paper proposes a system behavior reconstruction model based on the time sequence characteristics of user interaction, with the aim of resolving the system overloading issue that results from the rapid growth of user behaviors in large-scale network service systems by delaying the user behavior time. Furthermore, a reconstruction algorithm has been developed based on a random fuzzy Petri net with imposed delay.
The user behaviors have been classified based on a membership function and a membership criterion, which provide the basis for constructing the system behavior reconstruction model. For actual service systems, the behavioral flow of the different user groups has been constructed by the system behavior reconstruction model, and an algorithm based on the random fuzzy Petri net with delay has been implemented. The proposed model for balancing the system load keeps the system in a good running state. As future work, we plan to study the potential of adaptive refactoring systems for effectively balancing the system load.

Fig. 1 Four basic structures of the timed stochastic fuzzy Petri net

Fig. 3 A booking system flow chart

In Definition 12, the instantaneous transition set $T_i$ includes the service transitions that are triggered when the system load goes beyond the warning point; (3) $C$ is the control service transition set, $C = \{c_1, c_2, \ldots, c_m\}$ $(m \ge 0)$; (4) $DI$ is the time function on the transition set, $DI: C \to R_0$. For $t \in C$, $DI(t) = a$ indicates that the occurrence of the transition $t$ requires $a$ units of time to complete; (5) $F$ is a directed arc set, where $F = F_T \cup F_C$, $F_T \subseteq (P \times T) \cup (T \times P)$, $F_C \subseteq (P \times C) \cup (C \times P)$; (6) $\tau$ is a function on the transition set, which represents the triggering threshold of the transition, and its range is $[0, \infty)$.

Fig. 5 Random fuzzy Petri net model of the booking system with the delay

Fig. 7 The second group of experimental results

Fig. 8 The third group of experimental results