Determination Method of Nonlinear Membership Function Based on the Density Function of the Square Error

In this study, membership functions are used to express the uncertainty in petroleum drilling, and a fuzzy density clustering method based on the density function of the square error is put forward. The method avoids the drawbacks of manually chosen clustering parameters and initial centroids. With this fast algorithm, the cluster centers, the number of cluster centers and the parameters describing the nonlinear membership function can be obtained. In the simulation, the nonlinear membership function of the change percentage of the flow-out rate EB_Da36 is determined, which addresses the difficulty of defining the membership function.


INTRODUCTION
The study of uncertain phenomena is of great significance (Li et al., 1995, 2011; Li, 2000; Lu et al., 2008; Luo et al., 2007). Petroleum drilling is a risky and costly systems engineering task (Zhao et al., 2009; Fred et al., 2005; Li et al., 2009). The drilling process involves considerable uncertainty, such as fuzziness and randomness. All kinds of engineering accidents may happen during drilling, and dealing with troublesome conditions and drilling accidents costs a great deal of manpower and money. Petroleum drilling accidents have been studied extensively at home and abroad, with rich results, and fuzzy inference is frequently applied. When fuzzy inference is used to forecast accidents in petroleum drilling, one difficulty is how to express the uncertain data detected by the sensors.
Zadeh (1965) proposed using membership to represent uncertainty. Membership functions take two forms: piecewise linear and nonlinear. However, the question of how to determine the membership function has not been fundamentally resolved. In applications, determining membership values subjectively is unreliable, while the statistical method is costly and sometimes cannot be carried out at all.
Fuzzy clustering is the foundation of classification and system modeling (Ni et al., 2005;Zhou and Zhou et al., 2000;Ma et al., 2003;Zheng et al., 2011). The purpose of clustering is to extract the inherent characteristics from large amounts of data and obtain a representation of the system behavior. Fuzzy C-means clustering (Pedrycz and Reformat, 2005; Liu et al., 2009; Wang et al., 2012; Liu and Liu, 2012), put forward by Bezdek, is a mature method, but it has a disadvantage: its clustering result depends heavily on the initial cluster centers. For massive data, there are two classic clustering methods: density clustering and grid clustering. Grid clustering requires little computation and is fast, but its results are coarse, and only rectangular and square clusters can be found. Density clustering avoids the drawbacks of manually chosen clustering parameters and initial centroids and can find round clusters, but it is slow. In this study, a clustering method based on the density function of the square error is proposed, which can obtain the membership function quickly and objectively express the uncertainty of the conditions in petroleum drilling.

FUZZY DENSITY CLUSTER METHOD BASED ON THE DENSITY FUNCTION OF SQUARE ERROR
In this study, a fuzzy density clustering method based on the density function of the square error is proposed. First, the initial cluster centers are obtained from the distances between the data. Then the cluster centers are modified by using the density function. Finally, the final centers and the membership degrees are obtained. This method avoids falling into a local minimum and is fast.
Density function of the square error: The traditional density clustering method has a defect in how it evaluates the standard parameters in the connected domain. In this study, a new density evaluation criterion is proposed, namely the density function of the square error, shown as Eq. (3).
In Eq. (3):
d: The radius of the similarity range
n: The number of nodes in a cluster
In the adjacency domain with radius d, E_ρ is the density of the square error, which reflects the dispersion of the samples about the center particle: when the distances are small, the density of the particle is large. f(λ, d) is an exponential penalty function with factor λ (λ > 1), which represents the degree of compactness of a cluster and can be set by the user. As d increases, f(λ, d) grows exponentially, so the function curve becomes steeper. Meanwhile, the larger the proportion of the area, the sparser the cluster, and vice versa. In this study, Eq. (4) is used as the regulatory factor.
Clustering method: For the data set X = {x_1, x_2, …, x_n} ⊂ R, the new clustering method proceeds as follows:
Step 1: Set a range value θ of the square error density, the mean of the neighborhood radius of the core object and the parameter D of the neighborhood radius of the extended node. The calculation is shown in Eq. (5) and (6).
In Eq. (5) and (6):
E: A neighborhood radius of the centroid; the number of samples in this region gives the sample density
E_i: The radius of the core object
E_p: The radius of the object waiting to be extended
Step 2: Calculate the distance between every two data points. Set the two nearest data points as a class and select their midpoint as the initial cluster center w_k(1), where k = 1.
Step 3: Delete from the data set the region whose center is w_k(1) and whose radius is α, and calculate the distances among the remaining data. Then define the two nearest data points as a cluster and select their midpoint as a new initial cluster center w_k(1).
Step 4: Repeat Step 3 until q cluster centers are obtained; these are defined as the initial centers w_k(l), where l = 1 and k = 1, 2, …, q.
Step 7: According to Eq. (3), calculate the density function of the square error. If the stopping condition is satisfied, stop the iteration and output the clustering result; otherwise, set l = l + 1 and return to Step 5.
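The initialization in Steps 2 to 4 can be illustrated with a minimal Python sketch for one-dimensional data. The function name initial_centers is hypothetical, and two details are assumptions rather than part of the method as stated: the two nearest points are always removed together with the α-neighborhood of their midpoint, and the loop stops when fewer than two points remain. The density-based correction of the centers (Step 7) is not covered, because it depends on Eq. (3).

```python
def initial_centers(data, alpha):
    """Sketch of Steps 2-4: pick the midpoint of the closest pair as a
    center, delete the alpha-neighborhood of that center, and repeat."""
    remaining = sorted(data)
    centers = []
    while len(remaining) >= 2:
        # In sorted 1-D data the two nearest points are always adjacent.
        gaps = [remaining[i + 1] - remaining[i]
                for i in range(len(remaining) - 1)]
        i = gaps.index(min(gaps))
        center = (remaining[i] + remaining[i + 1]) / 2.0
        centers.append(center)
        # Assumption: drop the pair itself plus every point within alpha
        # of the new center, so the loop always makes progress.
        remaining = [x for j, x in enumerate(remaining)
                     if j not in (i, i + 1) and abs(x - center) > alpha]
    return centers


# Synthetic example data (not taken from the drilling data set).
sample = [1.0, 1.25, 1.5, 5.0, 5.125, 5.5, 9.0, 9.5]
print(initial_centers(sample, alpha=1.0))  # [5.0625, 1.125, 9.25]
```

The number of centers q is simply the length of the returned list, so it does not have to be chosen in advance; it is controlled indirectly by the radius α.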

SIMULATION
With the fuzzy density clustering method, the cluster centers, the number of centers and the membership degrees can be obtained, and from these parameters the nonlinear membership function can be determined. In this simulation, the flow-out rate data EB-Da36 are taken from a petroleum drilling site in northern Hubei (Wang et al., 2007, 2010). Because the change percentage of the flow-out rate is a main parameter for kick and lost circulation, its nonlinear membership function is defined here. The distribution of 600 data points from the full EB-Da36 data set is shown in Fig. 3.
From the number of cluster centers q, the cluster centers and the membership degrees, the nonlinear membership function can be obtained as follows:
Step 1: Convert the sample data to obtain the change percentage of the flow-out rate, then map it to the unified universe of discourse.
Step 2: By using the fuzzy density clustering algorithm, determine the cluster centers w_k, the number of cluster centers q and the membership degrees u_ik. Fig. 4 shows that the middle linguistic value basically follows a normal distribution, which can be represented by a Gaussian membership function, but there is edge distortion in the two extreme linguistic values, which can be avoided by using a sigmoid membership function.
Step 3: Over the universe of the middle linguistic value, fit a Gaussian membership function to the curve formed by the membership degrees u_ik of the 600 sample data; that is, minimize the objective function in Eq. (11). The resulting parameters are shown in Table 1.
Step 4: Over the universe of the two extreme linguistic values, fit a sigmoid membership function to the curve formed by the membership degrees u_ik of the 600 sample data; that is, minimize the objective function in Eq. (12). The resulting parameters are shown in Table 1.
Step 5: According to the parameters in Table 1, define the membership function, which is shown in Fig. 5.
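The function approximation in Steps 3 and 4 can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure used in the study: the Gaussian and sigmoid forms are the common textbook parameterizations, the objective is taken to be the sum of squared errors between the membership function and the degrees u_ik, and a coarse grid search stands in for whatever optimizer was actually used. All function names and the synthetic data are hypothetical.

```python
import math


def gaussian_mf(x, c, sigma):
    """Gaussian membership function with center c and width sigma."""
    return math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))


def sigmoid_mf(x, a, c):
    """Sigmoid membership function with slope a and crossover point c."""
    return 1.0 / (1.0 + math.exp(-a * (x - c)))


def sse(mf, params, xs, us):
    """Assumed objective: sum of squared errors against the degrees us."""
    return sum((mf(x, *params) - u) ** 2 for x, u in zip(xs, us))


def grid_fit(mf, xs, us, grid_p, grid_q):
    """Return the (p, q) pair on the grid that minimizes the objective."""
    best = None
    for p in grid_p:
        for q in grid_q:
            err = sse(mf, (p, q), xs, us)
            if best is None or err < best[0]:
                best = (err, p, q)
    return best[1], best[2]


# Synthetic membership degrees generated from a known Gaussian.
xs = [i / 10 for i in range(-20, 21)]
us = [gaussian_mf(x, 0.0, 0.5) for x in xs]
print(grid_fit(gaussian_mf, xs, us,
               [-1.0, -0.5, 0.0, 0.5, 1.0],   # candidate centers c
               [0.25, 0.5, 0.75, 1.0]))       # candidate widths sigma
# (0.0, 0.5)
```

The same grid_fit call works for the extreme linguistic values by passing sigmoid_mf with candidate slopes and crossover points. In practice a finer grid or a gradient-based least-squares routine would replace the coarse grid.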

CONCLUSION
The case study shows that the fuzzy clustering method based on the density function of the square error proposed in this study avoids the drawbacks of manually chosen clustering parameters and initial centroids. With this method, the cluster centers, the number of clusters and the membership degrees can be obtained. By function approximation, the Gaussian and sigmoid membership functions of the flow-out rate in petroleum drilling are determined, which addresses the difficulty of obtaining a nonlinear membership function.