A Novel Algorithm Combining Finite State Method and Genetic Algorithm for Solving Crude Oil Scheduling Problem

A hybrid optimization algorithm combining the finite state method (FSM) and the genetic algorithm (GA) is proposed to solve the crude oil scheduling problem. The two methods are combined so that each compensates for the deficiencies of the other. In the proposed algorithm, the FSM makes up for the weak local search ability of the GA: the heuristic returned by the FSM guides the GA towards good solutions. The underlying idea is that promising substructures, or partial solutions, can be generated with the FSM. Furthermore, the FSM can guarantee that the entire solution space is uniformly covered. The combination therefore achieves better global performance than either the GA or the FSM operated individually. Finally, a real-life crude oil scheduling problem from the literature is used for simulation. The experimental results show that the proposed method outperforms a state-of-the-art GA method.


Introduction
In recent years, refineries have had to explore all potential cost-saving strategies because of intense competition arising from fluctuating product demands and ever-changing crude prices. Scheduling of crude oil operations is a critical task in overall refinery operations [1][2][3]. Basically, the optimization of crude oil scheduling operations consists of three parts [4]. The first part involves crude oil unloading, mixing, transferring, and multilevel crude oil inventory control. The second part deals with fractionation, reaction scheduling, and the control of a variety of intermediate product tanks. The third part involves finished product blending and distribution. In this paper, we focus on the first part, as it is a critical component of refinery scheduling operations. The crude oil scheduling problem is often formulated as a mixed integer nonlinear programming (MINLP) model [2,5,6]. Solution approaches for MINLP can be roughly divided into two categories [7]: deterministic approaches and stochastic approaches. Some deterministic methods have been available for many years [8]. These methods first require identifying and eliminating nonconvexities; they then decompose the MINLP model into nonlinear programming (NLP) and mixed integer linear programming (MILP) subproblems, which have to be solved iteratively. The most common algorithms include branch and bound [9], outer approximation [10], and generalized Benders decomposition [11]. Some commercial MINLP solvers have also been developed to solve such problems to optimality [12]. However, commercial solvers can only handle MINLPs with special properties. The other stream of global optimization consists of stochastic algorithms, for example, simulated annealing (SA), GA, and their variants [7].
GAs, proposed by Holland [13], have developed rapidly because of their simple concept, easy implementation, and global search capability that is independent of gradient information. Considerable attention has been given to the development of GAs for MINLP. For instance, Yokota et al. developed a penalty function that is suitable for solving MINLP problems [14]. Costa and Oliveira implemented another type of penalty function to solve various MINLP problems, including industrial-scale problems [15]. They also noted that the evolutionary approach is efficient in terms of the number of function evaluations and is well suited to handling the difficulties of nonconvexity. Going one step further, several mixed coding methods have been proposed, including the mixed-coding genetic algorithm [15] and the information-guided genetic algorithm (IGA). Ponce-Ortega et al. [16] proposed a two-level approach based on GA to optimize heat exchanger networks (HENs). The outer level performs the structural optimization, for which a binary GA is used. Björk and Nordman [17] showed that the GA is well suited to solving large-scale heat exchanger network problems.
The two approaches discussed above have their own advantages and disadvantages. On the one hand, a deterministic approach usually requires considerable algebra and detailed analysis of the problem itself, whereas the evolutionary approach does not. On the other hand, some deterministic approaches, such as mathematical programming, usually cannot provide practical solutions in reasonable time, whereas the evolutionary approach can generate satisfactory solutions. In this work, a novel genetic algorithm that combines the finite state method and GA is proposed to solve the crude oil scheduling problem. An MINLP model is formulated based on the single-operation sequencing (SOS) time representation. A deterministic finite automaton (DFA) model that captures the valid possible schedule sequences is constructed from the sequencing rules. The initialization and mutation operations of the GA are based on this model, which builds legal schedules complying with the sequencing rules and operating conditions. Thus, the search space of the algorithm is substantially reduced, as only legal sequences are explored. The rest of the paper is organized as follows. The MINLP model is specified in Section 2. Section 3 reviews the background of finite state theory. In Section 4, the novel genetic algorithm combining the finite state method and GA is proposed to solve the MINLP model. A test problem is studied to verify our approach in Section 5. Concluding remarks are given in the last section.

Mathematical Model
In this section, the MINLP model of the refinery crude oil scheduling problem is described [18]. This problem has been widely studied from the optimization viewpoint since the work of Lee et al. [19]. It consists of crude oil unloading from marine vessels to storage tanks, transfer and blending between tanks, and distillation of crude mixtures. The goal is to maximize profit and meet distillation demands for each type of crude blend (e.g., low sulfur or high sulfur blends), while satisfying unloading and transfer logistics constraints, inventory capacity limitations, and property specifications for each blend. The logistics constraints involve nonoverlapping constraints between crude oil transfer operations.

Sets.
The following sets will be used in the model, where $W$ denotes the set of all operations and $T$ the set of priority slots:
(iii) $W_U \subset W$ is the set of unloading operations;
(iv) $W_T \subset W$ is the set of tank-to-tank transfer operations;
(v) $W_D \subset W$ is the set of distillation operations;
(vi) $R$ is the set of all resources: $R = R_V \cup R_S \cup R_C \cup R_D$;
(vii) $R_V \subset R$ is the set of vessels;
(viii) $R_S \subset R$ is the set of storage tanks;
(ix) $R_C \subset R$ is the set of charging tanks;
(x) $R_D \subset R$ is the set of distillation units;
(xi) $I_r \subset W$ is the set of inlet transfer operations on resource $r$;
(xii) $O_r \subset W$ is the set of outlet transfer operations on resource $r$;
(xiii) $C$ is the set of products (i.e., crudes);
(xiv) $K$ is the set of product properties (e.g., crude sulfur concentration).
Among the parameters, $G_c$ is the gross margin of crude $c$.

Assignment Variables
The Scientific World Journal 3

Time Variables
$S_{iv}$ is the start time of operation $v$ if it is assigned to priority slot $i$; $S_{iv} = 0$ otherwise. $D_{iv}$ is the duration of operation $v$ if it is assigned to priority slot $i$; $D_{iv} = 0$ otherwise.

Operation Variables
$V^t_{iv}$ is the total volume of crude transferred during operation $v$ if it is assigned to priority slot $i$; $V^t_{iv} = 0$ otherwise. $V_{ivc}$ is the volume of crude $c$ transferred during operation $v$ if it is assigned to priority slot $i$; $V_{ivc} = 0$ otherwise.
$L^t_{ir}$ is the total accumulated level of crude in tank $r \in R_S \cup R_C$ (storage or charging tank) before the operation assigned to priority slot $i$.
$L_{irc}$ is the accumulated level of crude $c$ in tank $r \in R_S \cup R_C$ before the operation assigned to priority slot $i$.

Objective Function.
The objective is to maximize the gross margins of the distilled crude blends. Let $G_c$ be the individual gross margin of crude $c$.
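Written out, the objective can be sketched as follows. The symbols are assumptions taken from the usual notation of the SOS model in [18] rather than from the text itself: $T$ is the set of priority slots, $R_D$ the set of distillation units, $I_r$ the set of inlet operations of unit $r$, $C$ the set of crudes, and $V_{ivc}$ the volume of crude $c$ transferred by operation $v$ in slot $i$.

```latex
\max \; \sum_{i \in T} \; \sum_{r \in R_D} \; \sum_{v \in I_r} \; \sum_{c \in C} G_c \, V_{ivc}
```

Only volumes entering a distillation unit contribute, which is why the inner sums run over the inlet operations of the units in $R_D$.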

General Constraints.
It should be noted that the crude composition of blends in tanks is tracked instead of their properties. The distillation specifications are later enforced by calculating a posteriori the properties of the blend in terms of its composition. For instance, in the problem, a blend composed of 50% of crude A and 50% of crude B has a sulfur concentration of 0.035, which meets neither the specification for crude mix X nor that for crude mix Y.
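The a posteriori property check can be illustrated with a short sketch. It assumes a linear, volume-weighted mixing rule. The per-crude sulfur values (0.01 for A, 0.06 for B) and the specification ranges for mixes X and Y are illustrative assumptions chosen to reproduce the 0.035 figure; they are not data stated in the text.

```python
def blend_property(volumes, crude_property):
    """Volume-weighted (linear) property of a blend of crudes."""
    total = sum(volumes.values())
    return sum(volumes[c] * crude_property[c] for c in volumes) / total


def meets_spec(value, spec, tol=1e-9):
    """True if value lies inside the closed interval spec = (low, high)."""
    low, high = spec
    return low - tol <= value <= high + tol


# Assumed illustrative data (not taken from the paper's tables).
sulfur = {"A": 0.01, "B": 0.06}
specs = {"X": (0.015, 0.025), "Y": (0.045, 0.055)}

blend = {"A": 50.0, "B": 50.0}  # 50% crude A, 50% crude B
s = blend_property(blend, sulfur)
for mix, spec in specs.items():
    print(mix, "ok" if meets_spec(s, spec) else "violated")
```

Because only the composition is tracked, the same function can be applied to any transferred blend once the crude volumes are known.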

Assignment Constraints.
In the SOS model, exactly one operation has to be assigned to each priority slot.
Crude volume variables are positive variables whose sum equals the corresponding total volume variable.
Total and crude level variables are defined by adding to the initial level in the tank the inlet transfer volumes, and subtracting the outlet transfer volumes, of all operations of higher priority than the considered priority slot.
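These three statements can be sketched in equation form. The notation is assumed from the SOS model of [18] and is not shown in the text: $Z_{iv} \in \{0,1\}$ equals 1 if operation $v$ is assigned to priority slot $i$, $W$ is the set of operations, $T$ the set of priority slots, $I_r$ and $O_r$ the inlet and outlet operations of tank $r$, and $L^t_{0r}$, $L_{0rc}$ the initial total and crude levels.

```latex
\sum_{v \in W} Z_{iv} = 1, \qquad i \in T

V^t_{iv} = \sum_{c \in C} V_{ivc}, \qquad i \in T,\; v \in W

L^t_{ir} = L^t_{0r} + \sum_{j \in T,\, j < i} \Big( \sum_{v \in I_r} V^t_{jv} - \sum_{v \in O_r} V^t_{jv} \Big), \qquad i \in T,\; r \in R_S \cup R_C

L_{irc} = L_{0rc} + \sum_{j \in T,\, j < i} \Big( \sum_{v \in I_r} V_{jvc} - \sum_{v \in O_r} V_{jvc} \Big), \qquad i \in T,\; r \in R_S \cup R_C,\; c \in C
```

Here "higher priority" is read as a smaller slot index, $j < i$.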

Sequencing Constraints.
Sequencing constraints restrict the set of possible sequences of operations. Cardinality and unloading sequence constraints are specific cases of sequencing constraints. More complex sequencing constraints will also be discussed later.

Cardinality Constraint.
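Each crude oil marine vessel has to unload its content exactly once: $\sum_{i \in T} \sum_{v \in O_r} Z_{iv} = 1$, $r \in R_V$, where $Z_{iv} = 1$ if operation $v$ is assigned to priority slot $i$, $T$ is the set of priority slots, $O_r$ is the set of outlet operations of vessel $r$, and $R_V$ is the set of vessels. The total number of distillation operations is bounded by $\underline{N_D}$ and $\overline{N_D}$ in order to reduce the cost of CDU switches.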

Unloading Sequence Constraint.
Marine vessels have to unload in order of arrival to the refinery. Considering two vessels $r_1, r_2 \in R_V$, $r_1 < r_2$ signifies that $r_1$ unloads before $r_2$.

Scheduling Constraints.
Scheduling constraints restrict the values taken by time variables according to logistics rules.

Nonoverlapping Constraint.
A nonoverlapping constraint between two sets of operations $W_1 \subset W$ and $W_2 \subset W$ states that any pair of operations $(v_1, v_2) \in W_1 \times W_2$ must not be executed simultaneously.
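The meaning of this constraint can be checked on a candidate schedule with a small sketch. It assumes each operation is represented by a hypothetical `(start, duration)` pair and that the two sets are disjoint; it verifies a given schedule and is not the constraint formulation used inside the MINLP model.

```python
from itertools import product


def overlaps(op_a, op_b):
    """True if two (start, duration) operations run at the same time."""
    start_a, dur_a = op_a
    start_b, dur_b = op_b
    return start_a < start_b + dur_b and start_b < start_a + dur_a


def nonoverlapping(set_1, set_2):
    """True if no pair (v1, v2) in set_1 x set_2 is executed simultaneously."""
    return not any(overlaps(a, b) for a, b in product(set_1, set_2))


# Hypothetical timings: one inlet operation and two outlet operations of a tank.
inlets = [(0.0, 2.0)]
outlets = [(2.0, 3.0), (5.0, 1.0)]
print(nonoverlapping(inlets, outlets))            # back-to-back operations are allowed
print(nonoverlapping([(0.0, 2.0)], [(1.0, 3.0)]))  # these two run simultaneously
```

Operations that touch only at an endpoint are treated as nonoverlapping, which matches the strict inequalities in `overlaps`.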
Unloading operations must not overlap. Inlet and outlet transfer operations on a tank must not overlap. Although we do not consider crude settling in storage tanks after vessel unloading, it could be included in the model with a modified version of constraint (14) taking into account transition times. We define $TR_v$ as the transition time after unloading operation $v \in W_U$ and $TR$ as the maximum transition time, $TR = \max_{v \in W_U} TR_v$. Constraint (15) is valid in the four possible cases. A tank may charge only one CDU at a time. A CDU may be charged by only one tank at a time. To avoid schedules in which a transfer is performed twice at a time, thus possibly violating the flow rate limitations, constraint (19) is included in the model.

Continuous Distillation Constraint.
It is required that CDUs operate without interruption. As CDUs perform only one operation at a time, the continuous operation constraint is defined by equating the sum of the durations of distillations to the time horizon.

Resource Availability Constraint.
Unloading of crude oil vessels may start only after arrival to the refinery. Let $S_r$ be the arrival time of vessel $r$.

Property Constraint.
The property $k$ of the blended products transferred during operation $v$ is bounded by $\underline{x_{vk}}$ and $\overline{x_{vk}}$. The property $k$ of the blend is calculated from the property $x_{ck}$ of crude $c$ assuming that the mixing rule is linear.

Composition Constraint.
It has been shown that processes including both mixing and splitting of streams cannot be expressed as a linear model. Mixing occurs when two streams are used to fill a tank and is expressed linearly in constraint (10). Splitting occurs when partially discharging a tank, resulting in two parts: the remaining content of the tank and the transferred products. This constraint is nonlinear.
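The nonlinearity introduced by splitting can be made concrete with a sketch. The symbols are assumed from the SOS model of [18]: $V_{ivc}$ and $V^t_{iv}$ are the crude and total volumes transferred by operation $v$ in slot $i$, and $L_{irc}$ and $L^t_{ir}$ are the crude and total levels in the origin tank $r$ before slot $i$. Requiring the transferred blend to have the composition of the origin tank gives a ratio condition, and multiplying out the denominators gives the bilinear form:

```latex
\frac{V_{ivc}}{V^t_{iv}} = \frac{L_{irc}}{L^t_{ir}}, \qquad i \in T,\; r \in R_S \cup R_C,\; v \in O_r,\; c \in C

V_{ivc} \, L^t_{ir} = L_{irc} \, V^t_{iv}
```

The second form stays valid when the operation is not assigned to the slot, since both sides are then zero, whereas the ratio form would be undefined.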
The composition of the products transferred during a transfer operation must be identical to the composition of the origin tank. Constraint (24) is reformulated as an equation involving bilinear terms. Note that constraint (25) is correct even when operation $v$ is not assigned to priority slot $i$, as both volume variables are then zero.

Resource Constraints.
Resource constraints restrict the use of resources throughout the scheduling horizon.

Tank Capacity Constraint.
The level of materials in the tank must remain between minimum and maximum capacity limits $\underline{L^t_r}$ and $\overline{L^t_r}$, respectively. Let $L^t_{0r}$ be the initial total level and let $L_{0rc}$ be the initial level of crude $c$ in the tank $r$. As simultaneous charging and discharging of tanks is forbidden, it is sufficient to enforce these limits on the tank level variables.

Demand Constraint.
Demand constraints define lower and upper limits, $\underline{D_r}$ and $\overline{D_r}$, on the total volume of products transferred out of each charging tank $r$ during the scheduling horizon.

Finite State Theory
This section presents, in a somewhat informal way, the basic notions and definitions from formal language and finite state theory that are relevant for the sections to follow. Related definitions are taken from the literature [20,21]. Readers unfamiliar with formal language theory are advised to consult these sources whenever necessary.

Finite State Automata.
A DFA is a 5-tuple $(Q, \Sigma, q_0, F, \delta)$, where $Q$ is a set of states, $\Sigma$ is an alphabet, $q_0$ is the initial state, $F \subseteq Q$ is a set of final states, and $\delta$ is a transition function mapping $Q \times \Sigma$ to $Q$. That is, for each state $q$ and symbol $a$, there is at most one state that can be reached from $q$ by "following" $a$ (Figure 2).
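The definition translates almost directly into code. The following is a minimal sketch in which the transition function is stored as a partial mapping, so a missing entry means that no state can be reached. The class name and the toy sequencing rule (operation "7" must be followed immediately by "8") are hypothetical and only serve as an illustration.

```python
class DFA:
    """Deterministic finite automaton with a partial transition function."""

    def __init__(self, states, alphabet, start, finals, delta):
        self.states = set(states)
        self.alphabet = set(alphabet)
        self.start = start
        self.finals = set(finals)
        self.delta = dict(delta)  # maps (state, symbol) -> next state

    def accepts(self, word):
        """Follow the symbols of word from the initial state."""
        state = self.start
        for symbol in word:
            key = (state, symbol)
            if key not in self.delta:
                return False  # no transition defined: the word is rejected
            state = self.delta[key]
        return state in self.finals


# Hypothetical toy rule: every "7" is followed immediately by an "8".
toy = DFA(
    states={0, 1},
    alphabet={"7", "8"},
    start=0,
    finals={0},
    delta={(0, "7"): 1, (1, "8"): 0},
)
print(toy.accepts("7878"))
print(toy.accepts("77"))
```

A sequence of operations is then "legal" exactly when the automaton ends in a final state after reading it.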

Finite State Transducers.
A finite state transducer (FST) is a 6-tuple $(\Sigma_1, \Sigma_2, Q, q_0, F, \delta)$, where $Q$, $q_0$, and $F$ are the same as for a DFA, $\Sigma_1$ is the input alphabet, $\Sigma_2$ is the output alphabet, and $\delta$ is a function mapping $Q \times (\Sigma_1 \cup \{\epsilon\}) \times (\Sigma_2 \cup \{\epsilon\})$ to a subset of the power set of $Q$ (Figure 3). Intuitively, an FST is much like a nondeterministic finite automaton (NFA) except that transitions are made on strings instead of symbols and, in addition, they have outputs.

Finite State Calculus.
As argued in Karttunen [22][23][24][25], many of the rules used can be analyzed as special cases of regular expressions. They extend the basic regular expression with new operators. These extensions make finite state automata and finite state transducers more suitable for particular applications. The system described below was implemented using FSA Utilities [26], a package for implementing and manipulating finite state automata, which provides possibilities for defining new regular expression operators. The part of the built-in regular expression syntax of FSA relevant to this paper is listed in Table 4.
One particularly useful extension of the basic syntax of regular expressions is the replace operator. Karttunen [22][23][24][25] argues that many phonological and morphological rules can be interpreted as rules which replace a certain portion of the input string. Although several implementations of the replace operator have been proposed, the most relevant case for our purposes is the so-called "leftmost longest-match" replacement. In case of overlapping rule targets in the input, this operator replaces the leftmost target, and in cases where a rule target contains a prefix which is also a potential target, the longer sequence is replaced. Gerdemann and van Noord [27] implement leftmost longest-match replacement in FSA as an operator with three arguments, where Target is a transducer defining the actual replacement and LeftContext and RightContext are regular expressions defining the left and right context of the rule, respectively. The segmentation task discussed in the mutation procedure makes crucial use of longest-match replacement.

The Hybrid Algorithm
With optimization efficiency and robustness in mind, a novel two-level optimization framework based on the finite state method and GA is proposed for the MINLP model in this section.

Two-Level Optimization Structure.
As the foundation of the framework, a two-level optimization structure is introduced. Once all binary variables are fixed, the original problem becomes a relatively simpler model with only continuous variables. Following this idea, we rewrite (5) in a two-level form in which $x$ and $y$ represent continuous and binary variables, respectively. Equation (30) shows that when $y$ is fixed as $\bar{y}$, the submodel $P(x, \bar{y})$ can be solved optimally by continuous optimization solvers in the inner level; we then update $\bar{y}$ towards the best binary solution $y^*$ in the outer level. We use the example in Figure 4 to show how a binary solution can be mapped to a scheduling sequence. In the schedule [7 6 8 3 5 1 3 7 6 2], the leading 7 means that operation 7 is assigned to position 1, corresponding to the binary decision $Z_{1,7} = 1$.
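The mapping between a scheduling sequence and the binary decisions can be sketched as follows. The function names are hypothetical, and the matrix layout (one row per priority slot, one column per operation, both shifted to 0-based indices) is an assumption made for the illustration.

```python
def schedule_to_binary(schedule, n_operations):
    """Return z with z[i][v] = 1 iff operation v+1 occupies priority slot i+1."""
    z = [[0] * n_operations for _ in schedule]
    for slot, operation in enumerate(schedule):
        z[slot][operation - 1] = 1
    return z


def binary_to_schedule(z):
    """Inverse mapping: read off the single operation assigned to each slot."""
    return [row.index(1) + 1 for row in z]


schedule = [7, 6, 8, 3, 5, 1, 3, 7, 6, 2]
z = schedule_to_binary(schedule, n_operations=8)

print(z[0][6])                          # slot 1 holds operation 7
print(all(sum(row) == 1 for row in z))  # exactly one operation per slot
```

Searching over sequences in the outer level is therefore equivalent to searching over binary assignments that already satisfy the one-operation-per-slot rule.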

Initial Population.
Based on the sequencing rules [18] and the extension to the regular expression calculus [22][23][24][25], a DFA model which builds legal schedules complying with the sequencing rules and operating conditions is constructed. The whole set of possible schedules is too large to be processed at once. The DFA model of the schedule constitutes a reasonable framework, capturing all possible schedules and removing many redundant sequences of operations. Initial values of decision variables must satisfy the equality constraints and operating conditions and therefore represent a feasible operating point.
Here, we again use the instance with 8 operations from Mouret et al. [18] to describe an efficient sequencing rule. However, the resulting automaton suffers from a serious problem of overgeneration. For example, a sequence that is too short may lead to infeasibility, while a sequence that is too long may result in an unsolvable model. It is an interesting challenge for finite state syntactic description to specify a sublanguage that contains all and only the sequences of valid length.
Our solution is to construct a suitable constraint for the sequences of valid length. The constraint expression denotes a language that admits sequences of valid length but excludes all others. We obtain the desired effect by intersecting the constraint language with the original language of sequence expressions. The intersection of the two languages contains all and only the valid sequences. The ValidLength constraint is a language that includes all sequences of length $n$: ValidLength = $(1 + 2 + 3 + 4 + 5 + 6 + 7 + 8)^n$. (33) We have now completed the task of describing the language of valid sequences from the set of possible sequence expressions. It is also possible to create an automaton on the basis of the regular expression ValidSequence and then generate all possible sequences $v_1 \cdots v_i \cdots v_n$ accepted by the automaton. These processes are implemented using FSA Utilities [26], a package for implementing and manipulating DFAs and finite state transducers. Once all possible sequences accepted by the automaton have been generated, the corresponding population of possible binary decisions is generated. In the initialization stage of the GA, the population size is the number of individuals. Given the number of individuals, a population of candidate solutions is generated by randomly selecting from the set of all possible binary decisions.
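The effect of intersecting the sequence language with the length constraint, followed by random selection of the initial population, can be sketched without FSA Utilities. Walking exactly `n` transitions of a DFA is equivalent to intersecting its language with all words of length `n`. The toy automaton and function names are hypothetical, and sampling without replacement is an assumption, since the text does not specify it.

```python
import random


def sequences_of_length(delta, start, finals, n):
    """All words of exactly n symbols accepted by the DFA.

    delta maps (state, symbol) -> next state; the walk is depth-first.
    """
    results = []

    def walk(state, prefix):
        if len(prefix) == n:
            if state in finals:
                results.append(prefix)
            return
        for (source, symbol), target in sorted(delta.items()):
            if source == state:
                walk(target, prefix + symbol)

    walk(start, "")
    return results


def initial_population(candidates, size, seed=None):
    """Randomly select individuals from the set of legal sequences."""
    rng = random.Random(seed)
    return rng.sample(candidates, min(size, len(candidates)))


# Hypothetical toy rule: "1" may occur freely, "7" must be followed by "8".
delta = {(0, "1"): 0, (0, "7"): 1, (1, "8"): 0}
legal = sequences_of_length(delta, start=0, finals={0}, n=3)
print(legal)
population = initial_population(legal, size=2, seed=0)
```

Every individual drawn this way already satisfies the sequencing rule, so the GA never has to repair or discard an initial candidate.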

Rule-Based Mutation Approach.
In the mutation stage, a finite state transducer is used for the rule-based mutation process. The rule-based mutation strategy must obey the sequencing rules. The proposed mutation approach is a two-step procedure.
Step 1. Segmentation of the input sequence into a set of subsequences (i.e., subsequences that belong to the regular language L7 or L8).
Step 2. Mutation of the subsequences into others.
Formally, the rule-based mutation procedure is implemented as the composition of three transducers (see Algorithm 1).
An example of mutation including the intermediate steps is given for the sequence "7681325712" as shown in Figure 5.

Segmentation Transducer.
The segmentation transducer splits an input sequence into subsequences. The goal of segmentation is to provide a convenient representation level for the next mutation step.
Segmentation is defined as shown in Algorithm 2. The macro "SSequence" defines the set of subsequences. The subsequences which belong to the regular languages L7 and L8 are displayed in Tables 1 and 2. Segmentation attaches the marker "-" to each subsequence. The targets are identified using leftmost longest-match, and thus at each point in the input, only the longest valid segment is marked.
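The leftmost longest-match behavior of segmentation can be sketched in plain code. The sketch uses only the L8 subsequences of Table 2 as targets, appends the marker after each matched subsequence, and copies all other symbols unchanged. Leaving out L7 and the treatment of unmatched symbols are simplifying assumptions, so the output is not meant to reproduce Figure 5.

```python
# Subsequences of L8 as listed in Table 2.
L8 = {
    "8", "81", "82", "83", "85",
    "812", "813", "825", "831", "832", "835", "851", "852",
    "8125", "8132", "8312", "8313", "8325", "8351", "8352",
    "8512", "8513", "8525",
    "81325", "83125", "83132", "83512", "83513", "83525",
    "85132", "85125",
    "831325", "835125", "835132", "851325",
    "8351325",
}


def segment(sequence, targets, marker="-"):
    """Mark each leftmost, longest target found while scanning left to right."""
    out = []
    i = 0
    max_len = max(len(t) for t in targets)
    while i < len(sequence):
        match = None
        # Try the longest candidate first so that the longest match wins.
        for length in range(min(max_len, len(sequence) - i), 0, -1):
            candidate = sequence[i:i + length]
            if candidate in targets:
                match = candidate
                break
        if match is None:
            out.append(sequence[i])  # not part of any target: copy unchanged
            i += 1
        else:
            out.append(match + marker)
            i += len(match)
    return "".join(out)


print(segment("7681325712", L8))
```

At the position of the symbol 8, the shorter candidates "8", "81", "813", and "8132" are all valid, but only the longest one, "81325", receives the marker.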

The Mutation Rules.
In the GA process, the mutation rules are made by carefully considering the nonoverlapping constraints between operations. A concrete instance for partially illustrating the mutation rules is given in Algorithm 3. Note that the final element of the left context must be a marker and the target itself ends in "-". This ensures that mutation rules cannot apply to the same subsequence.
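How a mutation can be built as a composition of stages may be sketched as follows. Everything specific here is hypothetical: the three stages (segment, rewrite marked segments, strip markers) are one plausible reading of the composition, the target list is a small excerpt of Table 2, and the single rewrite rule is invented for illustration and is not taken from Algorithm 3. Sorting the alternatives longest-first makes the regular expression engine pick the longest target at the leftmost position.

```python
import re
from functools import reduce

TARGETS = ["8", "81", "813", "8132", "81325", "83125"]  # excerpt of Table 2
RULES = {"81325": "83125"}  # invented example rule: segment -> replacement


def segment(sequence):
    """Append "-" to each leftmost, longest target."""
    alternatives = sorted((re.escape(t) for t in TARGETS), key=len, reverse=True)
    return re.sub("|".join(alternatives), lambda m: m.group(0) + "-", sequence)


def mutate(marked):
    """Rewrite every marked segment that has a rule; keep the others."""
    def rewrite(match):
        seg = match.group(1)
        return RULES.get(seg, seg) + "-"
    return re.sub(r"(8\d*)-", rewrite, marked)


def strip_markers(marked):
    """Remove the segmentation markers again."""
    return marked.replace("-", "")


def compose(*stages):
    """Apply the stages from left to right."""
    return lambda value: reduce(lambda acc, stage: stage(acc), stages, value)


mutation = compose(segment, mutate, strip_markers)
print(segment("7681325712"))
print(mutation("7681325712"))
```

Because each rule maps a marked subsequence to another subsequence of the same length, the mutated sequence keeps its overall length; whether it is a legal schedule depends on the actual rule set, which this sketch does not model.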

Experimental Study
In this section, the same problem from the literature [18] is used for computational experiments. The proposed methodology is compared with an existing promising algorithm, the mixed-coding GA [15,28]. Figure 1 depicts the refinery configuration for the problem. The data involved in the problem are given in Table 3. The performance is compared at different computing times, namely 350 s, 500 s, ..., 2400 s. The objective value is used to statistically analyze the optimization results. The performance comparison between the two methodologies is illustrated in Figure 6, which shows that the hybrid optimization algorithm combining the finite state method and GA statistically outperforms its mixed-coding counterpart. The hybrid algorithm finds feasible solutions very fast and is able to find better solutions in reasonable time.
In Figure 7, we compare the objective variance at each iteration of the evolution processes of the two methodologies. By tracking the evolution process, we find that the mixed-coding GA easily gets trapped in a locally optimal sequence solution. This situation can only be improved by increasing the mutation scaling factor. However, this may result in slow convergence unless sufficient iterations are performed. In the hybrid optimization algorithm, the optimization processes of the binary variables and the continuous variables are separated. The performance of the whole methodology mainly depends on the FSM, which captures the most promising schedules and removes many redundant sequences of operations, so that the user can use a small population of the corresponding discrete variables to obtain suboptimal solutions. From Figure 7, we see that the proposed method has converged at 350 iterations, as opposed to 2400 iterations for the mixed-coding GA.
The success of the proposed algorithm lies in a comprehensive analysis of the region of the search space and its capacity to focus the search on the regions with the partial solution. One of the merits of the hybrid algorithm is that each solution involved in the GA is guaranteed to be feasible by using the mutation rules generated by the FSM, whereas in existing GA algorithms the procedure to generate feasible solutions under complex process constraints is very time consuming. Deterministic finite automata (DFA) can easily represent this kind of structure. Furthermore, the complex process constraints can be very difficult to express with mixed integer programming. Consequently, it is infeasible to solve the industrial problem by using an MIP solver.

Table 2: Subsequences belonging to L8.
Length | Sequences belonging to L8
1 | 8
2 | 81, 82, 83, 85
3 | 812, 813, 825, 831, 832, 835, 851, 852
4 | 8125, 8132, 8312, 8313, 8325, 8351, 8352, 8512, 8513, 8525
5 | 81325, 83125, 83132, 83512, 83513, 83525, 85132, 85125
6 | 831325, 835125, 835132, 851325
7 | 8351325

Table 4 (excerpt): composition of transducers; macro, which introduces Term as an abbreviation for another expression.

Conclusion
In this paper, a novel hybrid optimization algorithm that combines the finite state method and GA is proposed. The proposed algorithm constitutes a reasonable framework, capturing both the operating conditions and the sequencing rules of the schedule. The solution captures all possible schedules and removes many redundant sequences of operations. The algorithm is equivalent to introducing new structural information into the optimization process, which helps reduce the risk of being trapped in a locally optimal sequence solution. The hybrid optimization algorithm is an effective and robust tool for solving the crude oil scheduling problem in terms of efficiency and reliability. Only algorithms with these two properties are suitable for practical engineering applications.