A Resolution-Free Parallel Algorithm for Image Edge Detection within the Framework of Enzymatic Numerical P Systems

Image edge detection is a fundamental problem in image processing and computer vision, particularly in the area of feature extraction. However, the time complexity increases squarely with the increase of image resolution in conventional serial computing mode. This results in being unbearably time consuming when dealing with a large amount of image data. In this paper, a novel resolution free parallel implementation algorithm for gradient based edge detection, namely EDENP, is proposed. The key point of our method is the introduction of an enzymatic numerical P system (ENPS) to design the parallel computing algorithm for image processing for the first time. The proposed algorithm is based on a cell-like P system with a nested membrane structure containing four membranes. The start and stop of the system is controlled by the variables in the skin membrane. The calculation of edge detection is performed in the inner three membranes in a parallel way. The performance and efficiency of this algorithm are evaluated on the CUDA platform. The main advantage of EDENP is that the time complexity of O(1) can be achieved regardless of image resolution theoretically.


Introduction
In recent decades, image processing technology has experienced dramatic growth and widespread applications. Nearly no area escapes impact in some way by digital image processing. Normally, digital image processing includes three main levels, i.e., low-level, mid-level and high-level processing [1]. As one of the most basic operators in low-level image processing, edge detection can preserve the important structural properties of an image while significantly reducing the amount of data. This excellent property makes it a basic tool for many high-level image processing algorithms and is extensively applied in target tracking [2], image compression [3], and object recognition [4]. An edge can be defined as points in a digital image at which the image brightness changes sharply or has discontinuities. This phenomenon may be caused by depth discontinuous, illumination changes, or intrinsic texture properties of objects. In various edge detection algorithms, the gradient based method is a type of classic edge detection approach with the merit of simple theory and good performance. However, as convolution calculation (i.e., a classic neighborhood computing in image processing) [5] is involved in this kind of algorithm, the time complexity increases squarely with the increase of image resolution. So it is difficult to deal with images with large resolution, such as remote sensing images, medical images, etc., in real time processing.
In order to achieve real-time calculation of high resolution images, many researchers have put much effort into this problem and several methods have been proposed. Generally, there are two main categories of resolutions. The first type of resolution concerns computational algorithms. In this kind of method, an elaboratively computational algorithm is usually designed to reduce the computational complexity. For the template matching problem, integral image [6] and dual-bound algorithm [7] are two classical approaches to speed up the computation. In [8], Fast LDA feature extraction is present, where steepest descent and conjugate direction methods are combined to optimize the step size in each iteration. In [9], common orthogonal basis extraction is proposed to extract a common basis of collection of matrices. The second category is based on hardware with parallel architecture, such as Graphics Processing Unit (GPU) [10][11][12] and Field Programmable Gate Array (FPGA) [13,14]. GPU uses hundreds of parallel processor cores executing tens of thousands of parallel threads to rapidly solve large problems having substantial inherent parallelism. However, with the shrinking volume of chips, semiconductor technology begins to reach its physical limits, which means the performance of conventional computing technique based on silicon chip integrated circuit microprocessors will be difficult to improve further [15]. Under this background, some scholars have turned their attention to non-traditional computing, such as quantum computing [16], DNA computing [17] and membrane computing (MC) [18]. MC is a new active branch of natural computing that simulates the function and structure of living cells and tissues, abstracting their biochemical reactions and material exchanges [19]. One of the most prominent features of MC is its capability of generating exponential growth space over a polynomial time, which makes it a promising method to resolve the conflict between the ever-increasing amount of data in the image processing field and the backward computing power of conventional computer [20]. In recent years, image edge detection and image segmentation [21][22][23][24], image smoothing [25], obtaining homology groups of 2D images [26,27], counting cells [28], Enzymatic numerical P systems image thinning [29] and corner detection [30] in MC framework have been vividly studied. In the previous literature about MC and image processing, much work is based on tissue-like P systems. However, when designing a parallel implementation program of an existing image processing algorithm, it is difficult to realize the mathematical formula in "tissue-like P systems language". The reasons for this are as follows. First, the data type of an image is an integer between 0 and 255. When design image processing algorithm uses tissue-like P systems, the image data should be coded to symbolic variables and those symbolic variables need to be decoded to integer for display as the algorithm finished. Second, most image processing algorithms are composed of several steps in determined logical order, which means variables in the membrane system need to be calculated in a deterministic way, rather than in a random manner. Since the rules in tissue-like P systems are implemented randomly, it is difficult to control the execution orders of different rules.
In order to overcome the above shortcomings when tissue-like P systems are combined to image processing, we make the first attempt to introduce enzymatic numerical P system (ENPS) to image processing. Concretely, a parallel algorithm for gradient based edge detection algorithm is designed and tested in the framework of ENPS. Besides the features described in [25], ENPS has another two good properties which make it particularly appropriate for image processing. One is that numerical variables and numerical expressions can be used directly in ENPS. Thus, image data can be directly operated without the additional encoding and decoding process. Another important characteristic is that enzymatic variables can control the execution orders of multiple rules in ENPS, i.e., the algorithms with complex logical steps can be designed easily.
The main contribution of this paper is that a parallel algorithm for image edge detection in the framework of ENPS, namely EDENP, is designed. The significant advantage of EDENP is that it can achieve the time complexity of O(1) theoretically, no matter how large the image resolution is. Moreover, the performance is equivalent to the performance run on the serial computing platform. This is very important for real projects, because most of the classical image processing algorithms have been widely proven to be effective in practical engineering, so the designed parallel implementation algorithm can be directly applied to the real image processing project without the need to perform large-scale testing. To the best of our knowledge, it is the first time to bridge problems from image processing with ENPS.
The rest of this paper is structured as follows. Section 2 introduces the definition, characteristics and applications of MC and ENPS. The problem statement is elaborated in Section 3. Section 4 discusses the EDENP algorithm in detail. The experiments and results are presented in Section 5. Conclusions are drawn in Section 6.

MC and ENPS
MC is a young biocomputing model proposed by Gh.Pȃun in 2000 [19]. The computational devices in MC are called P systems. Generally, a P system includes three ingredients: (i) the membrane structure; (ii) multisets of objects; (iii) rules of a bio-chemical inspiration. The multisets of objects are placed in the membrane, and evolved according to given rules which are usually applied in a synchronous non-deterministic maximally parallel manner. Since being proposed, MC has received great attention from scientists in many fields [31][32][33]. In the past 20 years, both the theory [32,[34][35][36][37] and application [31,[38][39][40][41] of MC have been greatly developed, and many different classes of P systems have been investigated. According to the way in which membranes are structured, there are three major types of P systems, i.e., cell-like [19], tissue-like [42] and spiking neural P systems [43,44]. Enzymatic numerical P system comes from numerical P system (NPS). NPS is a new special research branch of cell-like P systems, proposed by the founder of MC, Gh.Pȃun in 2006 [45]. In NPS, multisets of objects associated to membranes are sets of numerical variables, and the evolutionary rules are composed of a production function and a repartition protocol [46][47][48]. The most common widely application area of NPS is robot controller design [49][50][51][52]. Although NPS can deal with numerical variables, it can only execute one production function per membrane at a time. When there are multiple production functions per membrane, one is selected randomly. This limits its application in some situations where the rules should be executed deterministically. In order to solve this problem and expand the application of NPS, ENPS is put forward [24]. It is extended from NPS by introducing enzyme-like variables which can make rules run deterministically [53]. The standard form of ENPS is defined as follows: where: 1. m is the number of membranes used.

4.
Var i is the set of variables from membrane i and Var i (0) are the initial values for these variables.

5.
Pr i is the set of rules in membrane i, composed of a production function and a repartition protocol. A typical rule is as follows.
where e j,i is a variable from Var i different from y 1,i , . . . , y k l ,i and v 1 , v 2 , . . . , v n i . The rule can be executed at a time t only if e j,i > min {y 1,i (t), y 2,i (t), . . . y k l ,i (t)}. From the definition of ENPS, it is clear that with enzymes-like variables, the system can control multiple production functions to run in parallel in the same membrane deterministically [54]. Hence, it can overcome the disadvantages of traditional NPS that only run one rule nondeterministically at a time in a membrane. The ENPS with deterministic, parallel execution model has already been proved to be Turing universal [55,56]. In [57], it is shown that any ENPS working in all-parallel mode or one parallel model can be simulated by an equivalent one-membrane ENPS working in the same mode. Since the proposal of ENPS, this model has been successfully applied in a wide range of domains, such as robot control [58], big data field [59], and sequential minimal optimization [60] fields. In this paper, ENPS is used to solve the problem of gradient based image edge detection.

Problem Statement
Edges generally occur in areas where the brightness of the image changes dramatically. These changes can be described by image gradients. Usually, a pair of convolution masks are used to estimate the gradients in the x and y directions, respectively, as shown in Equations (1)-(3), where (Sobel x , Sobel y ), (Prew x , Prew y ), (Rob x , Rob y ) are three classic pairs of convolution masks. In this paper, we take Sobel operator as an example of gradient based edge detection (GBED). When the masks are sliding over the image, a square of pixels are operated. Then both directional gradients and absolute gradient magnitudes of image are computed, as shown in Equations (4) and (5), where I is the image, (g x , g y ) are gradients in x and y direction respectively, g i,j is the absolute gradient magnitude of a pixel with coordinate (i, j), 2 ≤ i, j ≤ n − 1 for image with resolution of n × n.
When the gradient magnitude g i,j is computed, the difference between it and a predefined threshold θ is used to judge whether this pixel is an edge pixel or not, as presented in Equation (6), where d i,j is the difference. More concretely, if d i,j is greater than or equal to 0, then the pixel is assumed as an edge point, otherwise, it is a background point, as shown in Equation (7). It is worth noting that in real application, before thresholding, the gradient image should be filtered by "non-maximum suppression" for getting more real edges. In this paper, in order to simplify the algorithm, this step is ignored.
The program pseudo code of GBED run on conventional serial computer is illustrated in Algorithm 1, where the initial value of edg i,j is set to 0. From Algorithm 1, it can be deduced that the computational complexity is O(n 2 ) because two loops are involved. When n becomes large, the calculations are very time-consuming under the serial computing platform.

4:
Computing g y i,j

end for 9: end for
In order to reduce the calculation time complexity, we attempt to introduce an enzymatic numerical P system to design a high parallel computing algorithm for edge detection. The details of how to design the algorithm will be given in the next section.

The EDENP Algorithm
This section starts with the mathematical model of EDENP followed by the detailed description of EDENP. The execution process and resources needed are discussed lastly.

Mathematical Model of EDENP
From Section 3, we know that the GBED algorithm contains four steps for a certain pixel in an image. In EDENP, the four steps will be executed in a cell-like P system under the control of enzyme variables, as illustrated in Figure 1. The initialization of variables, start and stop of the system will be controlled in the skin membrane. The directional gradients estimation will be completed in membrane 1. The absolute gradient magnitude estimation will take place in membrane 2. Membrane 3 is responsible for computing the image edge. The corresponding membrane structure is illustrated in Figure 2.
The mathematical expression of EDENP is as follows, and , are the gray value of pixel with coordinate of (i, j) on the source image plane.
are the corresponding edge points of the source image with initial value 0.

θ[threshold]
, is a numerical variable which is used as the threshold value for edge detection, and the value of threshold should be predefined.
, are the horizontal derivative approximations at each pixel.
g y i,j (1 ≤ i, j ≤ n), are the vertical derivative approximations at each pixel.
, are the gradient magnitude approximations at each pixel.
, is a numerical variable with initial value 0, which is used as the background value of the edge image.
, is a numerical variable with initial value 1, which is used as the edge point value of the edge image.
, is a numerical variable with initial value −256, which is used as a intermediate variable.

5.
E k is a set of enzyme variables from membrane k, i.e., Pr k is the set of programs (rules) in membrane k, composed of a production function and a repartition protocol.
Pr 4,CE n,i : 0| e 1,1 → |g x n,i ; (1 ≤ i ≤ n), Pr 8,CE n,i : 0| e 1,1 → |g y n,i ; (1 ≤ i ≤ n), Pr 9,CE i,1 : 0| e 1,1 → |g y i,1 ; (2 ≤ i ≤ n − 1), Pr 10,CE i,n : 0| e 1,1 → |g y i,n ; (2 ≤ i ≤ n − 1). Those rules are used to execute Formula (1). The enzyme in Pr 1,CE i,j ∼Pr 10,CE i,j must exist in enough amount so that the rules can be activated. Specifically, if the value of the enzyme e 1,1 is greater than variable x i,j 1 ≤ i, j ≤ n , then rules Pr 1,CE i,j ∼Pr 10,CE i,j are effective. Since variable x i,j is the gray value of image, the maximum value is 255. So, the initial value of e 1,1 is set to 256, such that the condition modeled by rule Pr 1,CE i,j ∼Pr 10,CE i,j are satisfied. It is important to note that the number of rules are n × n, and all the rules are executed in parallel.

7.
Pr 21,CE i,j : ( g 2 x i,j + g 2 y i,j )|e 1,1 → 1|g i,j ; 1 ≤ i, j ≤ n Pr 21,CE i,j are the rules which are executed by Formula (5). Hence, after executing Pr 1,CE i,j ∼Pr 10,CE i,j , the value of the variables g x i,j , g y i,j are obtained. The maximum value of g x i,j and g y i,j is 255, and the enzyme e 1,1 is 256. So the condition of execution for rules Pr 21,CE i,j is satisfied. Hence, all n × n rules are executed concurrently. 8.
Pr 32,CE i,j and Pr 33,CE i,j are rules for computing edge value as Formula (7). If E i,j is greater than or equal to 0, then Pr 32,CE i,j and Pr 33,CE i,j are executed. Because ed 1 is 0, and ed 3 is -256, so E i,j ≥ min(ed 1 , ed 2 ) and E i,j ≥ min(ed 1 , ed 3 ). The execution condition of Pr 32,CE i,j and Pr 33,CE i,j is satisfied. If d i,j < 0, only Pr 33,CE i,j will be executed. Because E i,j ≥ min(ed 1 , ed 3 ) and E i,j < min(ed 1 , ed 2 ), only the execution condition of Pr 33,CE i,j can be satisfied. After executing Pr 32,CE i,j and Pr 33,CE i,j , variables edg i,j will be set to 1 if d i,j ≥ 0 and every variable E D i,j will be assigned. 10. Pr main : (0 * E D 1,1 + 0 * E D 1,2 + . . . + 0 * E D m,n + 1)| → 1|E D Pr main is a rule contained in membrane 4, which controls the stop condition of the P system. For pixel (i, j), if all the enzyme variables E D i,j are assigned, the condition for Pr main is meet.
Enzyme variable E D is set to 1 by rule Pr main , and the system stops running.

The Structure and Execution Processes of EDENP
As shown in Figure 2, the structure of EDENP includes four membranes. The system begins to start when the input variables x i,j representing the gray value of source image at location (i, j) appear in the skin membrane. The whole process includes five steps.
Step 1: Horizontal and vertical derivative approximations of every pixel are computed in membrane 1 by using rules of Pr 1,CE i,j ∼Pr 10,CE i,j in a parallel manner. When the directional gradients are computed, membrane 2 will be activated.
Step 2: The gradient magnitude of all the pixels are obtained at the same time with rules of Pr 21,CE i,j in membrane 2.
Step 3: The comparisons between the gradient magnitudes of all pixels and the predefined threshold are executed by rules of Pr 31,CE i,j in membrane 3.
Step 4: The edge pixels are detected and marked with 1, while the background pixels are marked with 0 by rules of Pr 32,CE i,j and Pr 33,CE i,j in membrane 3.
Step 5: The system stop condition is satisfied and the system stops working by rules of Pr main in membrane 4.
So as described above, only five steps are needed in the proposed algorithm for images with arbitrary resolution. Since we do not change the mathematical model of Sobel based edge detection, the detection result by our proposed method is the same as if run on a serial computing platform.

Complexity and Resources Analysis
Taking into account that the size of the input data is n × n, and the image is a gray image. The amount of resources needed is illustrated in Table 1. From Table 1, we can see that there are (7n 2 + 6) variables, including (2n 2 + 2) enzymatic variables and (5n 2 + 4) numerical variables. (6n 2 + 4) rules are involved in this system. The total storage space is 1 cell with (13n 2 + 10) molecules. So the space complexity is O(n 2 ) theoretically. The time complexity is O(1) because the number of execution steps is 5, which implies the computational efficiency is constant for images under arbitrary resolutions.
From the above analysis, we can see that the core of the proposed algorithm is to use space to replace time to obtain high-performance parallel computing, which is exactly the prominent characteristic of MC. Since molecules are used as storage units in a real biological computer, huge storage space can be utilised when this algorithm is implemented on it. So we think the proposed parallel algorithm is effective for images with high resolutions, at least at a theoretical level.

Experiments and Results
In this section, both the performance and efficiency of our proposed EDENP algorithm are evaluated. Since there is no hardware implementation of MC systems at present, the only way to test the behaviors of the designed P systems is to simulate them in conventional computers. In this paper, a parallel computing architecture, Compute Unified Device Architecture (CUDA), is used as the simulating platform, as it has been reported in literature [24,61]. The parameters of the platform on which our experiments are carried out are illustrated in Table 2. The threshold θ for all the experiments is set to 0.2.

Performance Evaluation
Two case studies are considered to evaluate the performance of the proposed method for different types of images. Since the proposed algorithm is in the framework of MC, edge detection methods based on tissue-like P systems [21,24] are chosen as comparison methods. Algorithms in the literature [21,24] are sketched and implemented on a CPU platform using the MATLAB program.

Qualitative Evaluation
Case study 1 is considered to evaluate the performance of the three algorithms for images with rich textures. Four images named rice, cameraman, mri, and AT3_lm4_01 randomly collected from the MATLAB Image Tool Box are used as testing samples in this experiment, as shown in Figure 3a,e,i,m. Figure 3b-d,f-h,j-l,n-p show the detailed qualitative edge detection results of the three algorithms for the four images. It can be clearly observed from Figure 3b,f,j,n, that the contours of the objects can be detected, but meanwhile the noise in the background is also detected, which will make the following image processing, such as object recognition, more difficult to deal with. The results by reference [21] are shown in Figure 3c,g,k,o. It can be seen that there are too many small edges, and the main outlines of the targets can hardly be found even by human eyes. The results of EDENP are illustrated in Figure 3d,h,l,p, from which we can see that not only the main contours of objects can be detected successfully, but also the noise is well suppressed.
Case study 2 is used to test the performance of the three methods for images with less texture, in which images named toyobjects, circbw, text, testpart1 randomly selected from MATLAB Image Tool Box are used as testing image samples. In image toyobjects, each object has a constant gray value, while the other three images are binary images. Like in Case 1, the detected edge results by the three approaches are shown in Figure 4. Figure 4b,f,j,n clearly show that there are many discontinuous edges when using algorithm in reference [24], while the other two methods can detect the edges completely. When comparing the thickness of the edges, it is obvious to see that the method in reference [21] can achieve the thinnest edges, then the EDENP method, and the edges detected by [24] is the thickest. Although the method in [21] can obtain the finest edges, those edges often have burrs, as shown in Figure 5. Figure 5a,e are the whole edge image of toyobjects and circbw. Figure 5b-d,f-h are the local enlargement of areas in pink rectangles in Figure 5a,e. Areas marked in green in Figure 5b,f are some examples of discontinuous edges by [24]. When comparing Figure 5c,g with Figure 5d,h, it is clear that edges by EDENP are much smoother than by algorithm [21].

Quantitative Evaluation
The confidence degree of the edge image is one of the most used indexes for evaluating the authenticity of the edge pixels. In general, the greater the edge confidence degree is, the more reliable the edges are. In this paper, we use this index to evaluate the performance of the edge detection algorithm quantitatively, whose mathematical definition is presented in reference [62]. Table 3 provides the comparison results of the three methods in terms of edge confidence degree. It can be seen from Table 3 that the EDENP method has the highest edge confidence degree for images with both high and low texture, which means edges detected by EDENP have less false edges.
Through the above quantitative and qualitative results, it can be deduced that the method in reference [21] is nearly invalid for grayscale images with rich texture. For images with less textures, this method can get the fine edges of the objects. However, the edges are not smooth in some cases because of the false burr edge points. The approach in [24] cannot get the whole contours of the objects due to the discontinous edges detected for images with both rich and less rich textures. The EDENP algorithm has the highest performance and can obtain clear, continuous, and authentic edges of images with both rich and less rich textures. In this paper, only edge detection methods in the framework of MC are chosen for a comparison. From the above experimental results, we can see that the proposed algorithm has better performance compared with the existing tissue-like based edge detection methods. The fundamental reason for this is that with the help of "enzyme variables" in ENPS, the rules can be controlled flexibly, thus the existing Sobel edge detection algorithm can be programmed in "membrane computing language" easily.

Efficiency Evaluation
To better describe the computation efficiency of EDENP, a speedup ratio is defined as the elapsed time of algorithm on CPU platform divided by running time on GPU platform. The running times of images with different resolutions under GPU and CPU platform and corresponding speedup ratios for one image (camera) are illustrated in Table 4. From Table 4, we can see, although the computation times of EDENP are independent of resolutions theoretically, it takes different times to execute the EDENP algorithm for the same image at different resolutions. The reason for this is that the programs do not run on real bio-computers. Table 5 gives the speedup ratios results of the other seven images. It can be found that the lowest speedup is 53, and the maximum speedup can reach up to 262. It is obvious that the computing power of the proposed algorithm is much superior compared with the traditional algorithm implemented on CPU platform.

Conclusions
Membrane computing is a new branch of natural computing, and its amazing storage space and high parallel computing characteristics are very suitable for big data processing. Among various membrane systems, the ENPS can directly deal with numeric variables, and the enzyme variables can flexibly control the execution orders of different rules. In this paper, we attempt to apply ENPS to image processing, and take Sobel edge detection as an example. Compared with the previous works which are based on tissue-like P systems, the advantage of the proposed method is that it does not need to encode and decode the image data, and it is easy to write the program for algorithms with complex execution orders in "membrane computing language". The limitation of the proposed algorithm mainly has two aspects. One is that the execution of the algorithm is based on real biological computers. However, there are no universal biological computers at present, so it is difficult to evaluate the real computing efficiency of the proposed algorithm. The other shortage is that the space complexity is O(n 2 ), which means large storage space is needed for the proposed algorithm. In future research, we will simulate the algorithm on FPGA hardware and try to combine the ENPS with other, more complex image processing algorithms.