A Fusion Algorithm for GFP Image and Phase Contrast Image of Arabidopsis Cell Based on SFL-Contourlet Transform

A hybrid multiscale and multilevel image fusion algorithm for green fluorescent protein (GFP) images and phase contrast images of Arabidopsis cells is proposed in this paper. Combining the intensity-hue-saturation (IHS) transform and the sharp frequency localization Contourlet transform (SFL-CT), the algorithm uses different fusion strategies for different detailed subbands, including a neighborhood consistency measurement (NCM) that adaptively balances color background and gray structure. Two kinds of neighborhood classes based on an empirical model are also taken into consideration. Visual information fidelity (VIF) is introduced as an objective criterion to evaluate the fused image. Experimental results on 117 groups of Arabidopsis cell images from John Innes Center show that the new algorithm not only preserves the details of the original images but also improves the visibility of the fused image, which demonstrates its superiority over traditional methods.


Introduction
The purpose of image fusion is to integrate complementary and redundant information from multiple images of the same scene to create a single composite that contains all the important features of the original images [1]. The resulting fused image is thus more suitable for human and machine perception and for further image processing tasks in many fields, such as remote sensing, disease diagnosis, and biomedical research. In molecular biology, fluorescence imaging and phase contrast imaging are two common imaging systems [2]. Green fluorescent protein (GFP) imaging provides function information related to the molecular distribution in living cells; phase contrast imaging provides high-resolution structural information by transforming phase differences, which are hard to observe, into amplitude differences. The combination of a GFP image and a phase contrast image is valuable for function analysis of proteins and for accurate localization of subcellular structures. Figure 1 shows one group of registered GFP and phase contrast images of an Arabidopsis cell; the two images clearly differ greatly. Because of this low similarity between the originals, fusion methods that have been widely used in remote sensing image fusion [3][4][5], such as the Wavelet/Contourlet-based ARSIS fusion method [6], result in spectral and color distortion, a dark and nonuniform background, and poor detail preservation. Recently, Li and Wang proposed SWT-based (stationary wavelet transform) [7] and NSCT-based (nonsubsampled Contourlet transform) [8] fusion algorithms, which use the translation invariance of the two transforms to reduce artifacts in the fused image, but a complicated procedure, high time consumption, and low robustness limit their fusion capability.
In order to overcome these disadvantages, we bring the sharp frequency localization Contourlet transform (SFL-CT) [9] into the fusion of GFP images and phase contrast images, exploiting its excellent edge expression ability, multiscale and directional characteristics, and anisotropy. We propose a new hybrid multiscale and multilevel image fusion method combining the intensity-hue-saturation (IHS) transform and SFL-CT. Different fusion strategies are used for the coefficients of different subbands in order to keep the localization information in the GFP image and the high-resolution detail in the phase contrast image. The fusion test is conducted on 117 groups of Arabidopsis cell images from the GFP database of John Innes Center [10]. Visual information fidelity (VIF) [11] is also introduced to quantify the similarity between the fused image and the original ones inside and outside the fluorescent area.
The outline of this paper is as follows. In Section 2, the SFL-CT and IHS transforms are introduced in detail. Section 3 concretely describes our proposed fusion algorithm based on the neighborhood consistency measurement. Experimental results and performance analysis are presented and discussed in Section 4. Section 5 gives the conclusion of this paper.

SFL-Contourlet Transform and IHS Transform
2.1. Traditional Contourlet Transform. In 2005, Do and Vetterli [12] proposed the Contourlet transform as a directional multiresolution image representation that can efficiently capture and represent smooth object boundaries in natural images. The Contourlet transform is constructed as a combination of the Laplacian pyramid transform (LPT) [13] and the directional filter banks (DFB) [14], where the LPT iteratively decomposes a 2D image into low-pass and high-pass subbands, and the DFB are applied to the high-pass subbands to further decompose the frequency spectrum into directional subbands. The block diagram of the Contourlet transform with two levels of multiscale decomposition, each followed by angular decomposition, is shown in Figure 2(a). Note that the Laplacian pyramid shown in the diagram is a simplified version of its actual implementation. Nevertheless, this simplification serves our explanation purposes satisfactorily. By using the multirate identities, we can rewrite the filter bank into its equivalent parallel form, as shown in Figure 2(b), where $H_i(\omega)$, $i = 1, 2, 3$, is the equivalent filter of the LPT for each decomposition level [15]. With ideal filters, the Contourlet transform decomposes the 2D frequency spectrum into trapezoid-shaped regions, as shown in Figure 2(c).
Because the 2D frequency spectra of discrete signals are periodic and critical sampling intrinsically conflicts with perfect reconstruction in the DFB, a critically sampled filter bank with the frequency partitioning of the DFB cannot achieve perfect reconstruction and frequency-domain localization simultaneously. When the DFB is combined with a multiscale decomposition, as in the Contourlet transform, the aliasing problem becomes a serious issue. For instance, Figure 3(a) shows the frequency support of an equivalent directional filter of the second channel in Figure 2(b). Contourlets are not localized in the frequency domain: a substantial amount of aliasing components lies outside the desired trapezoid-shaped support, as shown in Figure 3(b).

Sharp Frequency Localization Contourlet Transform.
In order to overcome the aliasing disadvantage of the Contourlet transform, Lu proposed a new construction scheme that employs a new pyramidal structure for the multiscale decomposition in place of the LPT [15]. This new construction is named the sharp frequency localization Contourlet transform (SFL-CT) [9], and its block diagram is shown in Figure 4.
In the diagram, $D_i(\omega)$ represents the high-pass filter and $L_i(\omega)$ represents the low-pass filter in the multiscale decomposition, with $\omega = (\omega_0, \omega_1)$. The DFB, which is the same as in the Contourlet transform (CT), is attached to the high-pass channel at the finest scale and to the bandpass channels at all coarser scales. The output of the low-pass filter $L_i(\omega)$ at each level is downsampled by a matrix $M$, normally set to the diagonal matrix $\mathrm{diag}(2, 2)$. To have more levels of decomposition, we can iteratively insert at point $a_{n+1}$ a copy of the diagram contents enclosed by the dashed rectangle. As an important difference from the LPT shown in Figure 2, the new multiscale pyramid can employ a different set of low-pass and high-pass filters for the first level and for all other levels, and this is a crucial step in reducing the frequency-domain aliasing of the traditional Contourlet transform. We leave the detailed explanation of this issue, as well as the specification of the filters $L_i(\omega)$ and $D_i(\omega)$, to [9]. Figure 5 shows one Contourlet basis image and its SFL-Contourlet counterpart in the frequency and spatial domains. As Figure 5(a) shows, the original Contourlet transform suffers from the frequency nonlocalization problem. In sharp contrast, SFL-Contourlet produces a basis image that is well localized in the frequency domain, as shown in Figure 5(b). The improvement in frequency localization is also reflected in the spatial domain: as shown in Figures 5(c) and 5(d), the spatial regularity of SFL-Contourlet is clearly superior to that of Contourlet.

IHS Transform.
The IHS transform is widely used in the fusion of gray and color images [1] and defines three color attributes based on the human visual mechanism, that is, intensity ($I$), hue ($H$), and saturation ($S$). $I$ stands for the information of the source image, $H$ stands for the spectrum and color attributes, and $S$ stands for the purity of a color relative to gray.
In IHS space, components $H$ and $S$ are closely tied to the way people perceive color, while component $I$ has almost nothing to do with the color content of the image.
There are various algorithms that can transform an image from RGB to IHS space; common transformation models include the sphere transformation, the cylinder transformation, the triangle transformation, and the single hexcone model [16]. We use the triangle transformation here, in both its forward and inverse forms.
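As a minimal sketch of the triangle transformation, the following Python functions implement one commonly used triangular IHS model for a single pixel. The exact formulas adopted by the authors may differ in detail; the function names, the [0, 1] value range, and the hue convention (0 to 3, one unit per sector) are assumptions made for illustration only.

```python
def rgb_to_ihs(r, g, b):
    """Forward triangular IHS transform for one pixel, r, g, b in [0, 1]."""
    i = (r + g + b) / 3.0
    m = min(r, g, b)
    if i - m <= 1e-12:
        # Black or gray pixel: hue is undefined, use 0 by convention.
        return i, 0.0, 0.0
    s = 1.0 - m / i
    if m == b:      # blue is the minimum, hue in [0, 1]
        h = (g - b) / (3.0 * (i - b))
    elif m == r:    # red is the minimum, hue in [1, 2]
        h = (b - r) / (3.0 * (i - r)) + 1.0
    else:           # green is the minimum, hue in [2, 3]
        h = (r - g) / (3.0 * (i - g)) + 2.0
    return i, h, s


def ihs_to_rgb(i, h, s):
    """Inverse triangular IHS transform, matching rgb_to_ihs above."""
    if h <= 1.0:
        r = i * (1.0 + 2.0 * s - 3.0 * s * h)
        g = i * (1.0 - s + 3.0 * s * h)
        b = i * (1.0 - s)
    elif h <= 2.0:
        r = i * (1.0 - s)
        g = i * (1.0 + 5.0 * s - 3.0 * s * h)
        b = i * (1.0 - 4.0 * s + 3.0 * s * h)
    else:
        r = i * (1.0 - 7.0 * s + 3.0 * s * h)
        g = i * (1.0 - s)
        b = i * (1.0 + 8.0 * s - 3.0 * s * h)
    return r, g, b
```

In a fusion pipeline of the kind described here, the intensity returned by `rgb_to_ihs` would be replaced by the fused intensity while hue and saturation are kept, and `ihs_to_rgb` would then produce the color result.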

The Proposed Fusion Rule
From Figure 1(a), we can see that the background of the GFP image is partially dark. To avoid low contrast after fusion, the intensity component of the original GFP image is extracted by the IHS transform, which not only keeps most of the information of the original image but also improves the overall brightness of the fused image. On this basis, we build a hybrid multiscale and multilevel fusion algorithm for biological cell images. We use SFL-CT to decompose the intensity components of the GFP image and the phase contrast image, and apply different fusion schemes to different subband coefficients in order to balance the localization information in the GFP image against the high-frequency detail in the phase contrast image. To retain the protein distribution information of the GFP image, the approximation (coarsest) subband coefficients of the fused image are obtained with the maximum region energy (MRE) rule [17]. To retain the structural information of the phase contrast image, the coefficients of the finest detailed subband of the fused image are selected with the maximum absolute value (MAV) rule [17]. To balance the structural information and the color molecular distribution information from the originals, a locally adaptive coefficient fusion rule named neighborhood consistency measurement (NCM) is applied to the coefficients of the other detailed subbands. The schematic diagram is shown in Figure 6.
Under the MRE rule, the fused approximation coefficient at each location is taken from the source image whose regional energy at that location is larger. Here $E_A(m, n)$, $E_B(m, n)$, and $E_F(m, n)$ denote the regional energy of original image $A$, original image $B$, and fused image $F$ at the coarsest scale and location $(m, n)$. $\Omega(m, n)$ represents a square region of size 3 × 3 centered at position $(m, n)$. $C_X(m, n)$ denotes the coefficient of image $X = A$, $B$, or $F$ within the region $\Omega(m, n)$ in the coarsest subband at location $(m, n)$, and $\bar{C}_X(m, n)$ is the average value of the coefficients within $\Omega(m, n)$; the regional energy is computed from these two quantities.
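The MRE selection can be sketched as follows. This is an illustrative sketch only: it assumes that the regional energy is the sum of squared deviations from the window mean, that windows are clipped at the image border, and that ties go to the first image; the names `region_energy` and `fuse_mre` are hypothetical.

```python
def region_energy(coeffs, m, n, radius=1):
    """Assumed regional energy: sum of squared deviations from the
    mean of the (2*radius+1)-square window centered at (m, n)."""
    rows, cols = len(coeffs), len(coeffs[0])
    window = [
        coeffs[i][j]
        for i in range(max(0, m - radius), min(rows, m + radius + 1))
        for j in range(max(0, n - radius), min(cols, n + radius + 1))
    ]
    mean = sum(window) / len(window)
    return sum((c - mean) ** 2 for c in window)


def fuse_mre(coeffs_a, coeffs_b):
    """Pick, per location, the coefficient whose 3 x 3 region has more energy."""
    rows, cols = len(coeffs_a), len(coeffs_a[0])
    fused = [[0.0] * cols for _ in range(rows)]
    for m in range(rows):
        for n in range(cols):
            e_a = region_energy(coeffs_a, m, n)
            e_b = region_energy(coeffs_b, m, n)
            fused[m][n] = coeffs_a[m][n] if e_a >= e_b else coeffs_b[m][n]
    return fused
```

Both inputs are approximation subbands of equal size given as lists of rows; the default `radius=1` corresponds to the 3 × 3 region used in the text.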

Maximum Absolute Value (MAV) Rule.
After the input images are decomposed using SFL-CT, the image details are contained in the directional subbands in the SFL-CT domain. Directional subband coefficients with larger absolute values, especially at the finest scale, generally correspond to pixels with sharper brightness changes in the image and thus to salient features such as edges, lines, and region boundaries. Therefore, we use the maximum absolute value (MAV) scheme to select the coefficients of the finest detailed subbands.
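A minimal sketch of the MAV selection for one pair of directional subbands is given below. The function name `fuse_mav` and the tie-breaking choice (keep the coefficient of the first image) are assumptions for illustration.

```python
def fuse_mav(coeffs_a, coeffs_b):
    """Keep, per location, the coefficient with the larger absolute value."""
    return [
        [a if abs(a) >= abs(b) else b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(coeffs_a, coeffs_b)
    ]
```

The rule is applied independently to every directional subband at the finest scale; the sign of the selected coefficient is preserved.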

Neighborhood Consistency Measurement (NCM).
Let $\Omega_{j,k}(m, n)$ denote a region centered at coefficient $C^X_{j,k}(m, n)$ in the $j$th level and $k$th directional subband of image $X$, and let $E^X_{j,k}(m, n)$ denote the energy of this region. The NCM, denoted $\Psi_{j,k}(m, n)$, is computed over this region from the coefficients and regional energies of the two source images and serves as a threshold measure for the directional coefficients. It is not hard to see that the NCM is smaller than 1. In fact, the NCM indicates whether the neighborhood is homogeneous: a bigger NCM means a more homogeneous neighborhood.
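To make the idea concrete, the sketch below uses a widely used match-measure form, $\Psi = 2\sum_{\Omega}|C^A||C^B| / (E^A + E^B)$ with $E^X = \sum_{\Omega}(C^X)^2$, which never exceeds 1. This form, the select-or-average decision against a threshold, the square window in place of the Nhd I and Nhd II shapes, and all function names are assumptions for illustration rather than the authors' exact definition.

```python
def window(coeffs, m, n, radius):
    """Coefficients in the square window around (m, n), clipped at borders."""
    rows, cols = len(coeffs), len(coeffs[0])
    return [
        coeffs[i][j]
        for i in range(max(0, m - radius), min(rows, m + radius + 1))
        for j in range(max(0, n - radius), min(cols, n + radius + 1))
    ]


def ncm(coeffs_a, coeffs_b, m, n, radius=2):
    """Return (psi, e_a, e_b) for the neighborhood centered at (m, n)."""
    wa = window(coeffs_a, m, n, radius)
    wb = window(coeffs_b, m, n, radius)
    e_a = sum(c * c for c in wa)
    e_b = sum(c * c for c in wb)
    if e_a + e_b == 0.0:
        return 1.0, e_a, e_b  # two all-zero neighborhoods are identical
    cross = sum(abs(x) * abs(y) for x, y in zip(wa, wb))
    return 2.0 * cross / (e_a + e_b), e_a, e_b


def fuse_ncm(coeffs_a, coeffs_b, threshold=0.75, radius=2):
    """Select when neighborhoods disagree, average when they agree."""
    rows, cols = len(coeffs_a), len(coeffs_a[0])
    fused = [[0.0] * cols for _ in range(rows)]
    for m in range(rows):
        for n in range(cols):
            psi, e_a, e_b = ncm(coeffs_a, coeffs_b, m, n, radius)
            a, b = coeffs_a[m][n], coeffs_b[m][n]
            if psi < threshold:
                # Dissimilar neighborhoods: keep the more energetic source.
                fused[m][n] = a if e_a >= e_b else b
            else:
                # Similar neighborhoods: weighted average of both sources.
                w_small = 0.5 - 0.5 * (1.0 - psi) / (1.0 - threshold)
                w_large = 1.0 - w_small
                if e_a >= e_b:
                    fused[m][n] = w_large * a + w_small * b
                else:
                    fused[m][n] = w_small * a + w_large * b
    return fused
```

The default `radius=2` gives a 5 × 5 window, and `threshold` should lie strictly between 0.5 and 1.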
Taking the number of directions in each detailed subband into consideration, we classify neighborhoods into two classes, Nhd I and Nhd II, which are shown in Figures 7(a) and 7(b). Nhd I is mainly used in horizontal and vertical subbands, and Nhd II is used in the other subbands. For instance, if the number of directions is 8 or 16, we can use the empirical distribution model shown in Figure 8.
We define a threshold $T$, which normally satisfies $0.5 < T < 1$.
(1) Define the registered original images: GFP image as image $A$, phase contrast image as image $B$, and fused image as image $F$.
(2) Apply the IHS transform to image $A$, and calculate the corresponding intensity component $I_A$, hue component $H_A$, and saturation component $S_A$.

The experiments use 117 groups of GFP images and their corresponding phase contrast images (8-bit grey scale) of Arabidopsis from the GFP database of John Innes Center [10]. The former reveal the distribution of the labeled protein, and the latter present cell structure information.

Parameters Selection.
For the proposed method, the practical windows ($\Omega$) in the NCM rule are usually chosen to be of size 3 × 3, 5 × 5, or 7 × 7. We have investigated these windows and found that size 5 × 5 provides good results considering fusion clarity and time consumption. Apart from the window size, the frequency parameters of SFL-CT must also be chosen to improve the fusion performance. A large number of experimental results demonstrate that a passband frequency of $4\pi/21$ and a stopband frequency of $10\pi/21$ not only provide pleasing fusion performance in most cases but also keep a good balance between fusion quality and computational complexity.
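To illustrate what the passband and stopband frequencies control, the sketch below evaluates a low-pass frequency response with a raised-cosine transition between them, together with a power-complementary high-pass response. The raised-cosine shape, the separable 2D construction, and the condition $|L|^2 + |D|^2 = 1$ are assumptions made for illustration; the actual filter specification is given in [9].

```python
import math

W_PASS = 4.0 * math.pi / 21.0   # passband frequency from the text
W_STOP = 10.0 * math.pi / 21.0  # stopband frequency from the text


def lowpass_1d(w, w_pass=W_PASS, w_stop=W_STOP):
    """1D low-pass response: 1 in the passband, 0 in the stopband,
    raised-cosine roll-off in between (assumed shape)."""
    w = abs(w)
    if w <= w_pass:
        return 1.0
    if w >= w_stop:
        return 0.0
    return 0.5 + 0.5 * math.cos(math.pi * (w - w_pass) / (w_stop - w_pass))


def lowpass_2d(w0, w1, w_pass=W_PASS, w_stop=W_STOP):
    """Separable 2D low-pass response at frequency (w0, w1)."""
    return lowpass_1d(w0, w_pass, w_stop) * lowpass_1d(w1, w_pass, w_stop)


def highpass_2d(w0, w1, w_pass=W_PASS, w_stop=W_STOP):
    """High-pass response chosen so that |L|^2 + |D|^2 = 1 (assumed)."""
    low = lowpass_2d(w0, w1, w_pass, w_stop)
    return math.sqrt(1.0 - low * low)
```

A narrower transition band sharpens frequency localization but lengthens the spatial support of the filters, which is one way to read the trade-off between fusion result and computation described above.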

Results Comparison.
We compare the proposed fusion rule with traditional methods and rules: the traditional IHS fusion method (T-IHS) [18], the MRE and MAV fusion method in IHS space (IHS + MRE and MAV) [1], the PCNN-based fusion method [19] in which all the images are decomposed by the nonsubsampled Contourlet transform (NSCT + PCNN), and our method (Hybrid NCM). Here MAV stands for the maximum absolute value rule and MRE for the maximum region energy rule; "MRE and MAV" denotes the MRE rule for the approximation subband and the MAV rule for the detailed subbands. The parameters of the above methods are set as follows. For the MRE rule, the neighborhood window is 3 × 3 pixels. SFL-CT uses a 4-level decomposition with (4, 8, 16, 16) directions per level; the DFB filter is the "pkva" filter; the settings of the NSCT + PCNN fusion algorithm are the same as those in [20].
The fusion results obtained by the four methods, shown in Figure 9, differ visually. The fused image produced by the T-IHS method is clearly unsatisfactory: the foreground and background are significantly nonuniform, especially along the cell outlines, where blurred dark regions make the inner information hard to distinguish. In contrast, the brightness in Figures 9(d)-9(f) is largely improved, and the image details are clearer. Overall, the location information of the cell structure in the phase contrast image and the distribution information of the protein are largely retained. Nevertheless, it is not easy to judge the quality of these three methods visually. To judge the fusion results better, a quantitative measure, visual information fidelity (VIF) [11], is taken into consideration. In recent studies, large-scale subjective experiments have assessed VIF, a novel image similarity criterion, and shown it to be a good substitute for subjective assessment. Traditional evaluation is either subjective or objective. The former depends on human visual perception, and different people perceive differently. The latter depends little on subjective factors, but it does not measure well the difference between the fused image and the original image. Because GFP fluorescence images and phase contrast images have little similarity, VIF, which combines the human visual system (HVS) with image statistics, is introduced in this paper to measure the quality of the fused image. This method quantifies the similarity between different regions of the fused image and the original image. The VIF value ranges from 0 to 1; the closer it is to 1, the more similar the fused image is to the original image.
A number of experimental results have shown that VIF agrees with human subjective evaluation of image quality better than traditional measures such as root mean square error (RMSE), correlation coefficients (CCs), and mutual information (MI). Considering the different functional roles of the two kinds of images, especially the correspondence between the fluorescence area in GFP images and the protein distribution in cells, the fluorescence area is first extracted from the two original images, then the VIF between the fused image and the phase contrast image is calculated, and finally the VIF between the fused image and the fluorescence image is calculated. The fused image should keep high similarity with both the phase contrast image and the fluorescence image in the fluorescence area. In the remaining area, however, only its similarity with the phase contrast image is considered. Therefore, this paper first segments both the fused image and the source images into fluorescence and nonfluorescence areas with the Otsu method [20] and then calculates the VIF between the fused image and the source images in each area separately. The calculation procedure is shown in Figure 10. Table 1 displays the VIF values of the fused images in Figure 9.
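The segmentation step can be sketched with a plain Otsu threshold, as below. The sketch assumes that the mask is derived from an 8-bit GFP intensity image and then applied to any other registered image; the VIF computation itself is not shown, and the names `otsu_threshold` and `split_by_fluorescence` are hypothetical.

```python
def otsu_threshold(pixels, levels=256):
    """Otsu threshold for integer pixel values in [0, levels - 1].
    Values above the returned threshold form the foreground class."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(v * c for v, c in enumerate(hist))
    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0
    for t in range(levels):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue
        w_fg = total - w_bg
        if w_fg == 0:
            break
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance, up to a constant factor.
        between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


def split_by_fluorescence(gfp_intensity, image):
    """Split the pixels of `image` into fluorescent and nonfluorescent
    lists, using an Otsu mask computed on the GFP intensity image."""
    flat = [p for row in gfp_intensity for p in row]
    t = otsu_threshold(flat)
    fluorescent, nonfluorescent = [], []
    for row_g, row_x in zip(gfp_intensity, image):
        for g, x in zip(row_g, row_x):
            (fluorescent if g > t else nonfluorescent).append(x)
    return fluorescent, nonfluorescent
```

The same mask would be applied to the fused image and to each source image so that a similarity measure can be evaluated on matching regions.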
In the table, superscript fl refers to the fluorescent area while nfl refers to the nonfluorescent area, $A$ represents the GFP fluorescence image, and $B$ represents the phase contrast image. $\mathrm{VIF}_A^{fl}$ refers to the similarity between the fused image and the GFP fluorescence image in the fluorescent area, $\mathrm{VIF}_B^{fl}$ refers to the similarity between the fused image and the phase contrast image in the fluorescent area, and $\mathrm{VIF}_B^{nfl}$ refers to the similarity between the fused image and the phase contrast image in the nonfluorescent area.
From Table 1, $\mathrm{VIF}_B^{fl}$ and $\mathrm{VIF}_B^{nfl}$ are almost the same for the three fusion methods other than T-IHS; this indicates that the detailed information of the fused image comes from the phase contrast image in both the fluorescent and nonfluorescent areas. With luminance improved, the structural information is well embedded in the fused image, which contributes to the increase of $\mathrm{VIF}_B^{nfl}$. The proposed method still obtains higher $\mathrm{VIF}_A^{fl}$ and $\mathrm{VIF}_B^{fl}$, which indicates that the function information in the GFP image and the phase contrast image is well preserved in the fluorescent area; the highest $\mathrm{VIF}_B^{fl}$ also shows that SFL-CT captures the structural information of the phase contrast image effectively. The VIF distribution histogram of the 117 groups of fused Arabidopsis cell images is shown in Figure 11, where the red squared line and the blue dotted line plot the two VIF values in the fluorescent area. $\mathrm{VIF}_B^{fl}$ is clearly higher than $\mathrm{VIF}_A^{fl}$, which coincides with the objective of using SFL-CT to highlight the inner structural information of the phase contrast image. As the VIF within the fluorescent area increases, the VIF in the nonfluorescent area also tends to improve. This indicates the following: when the intensity in the fluorescent area is strengthened, VIF increases because the function information is fully reflected; once the brightness increases, the high-resolution structural information of the image can be fully shown, and the corresponding $\mathrm{VIF}_B^{fl}$ increases; the phase image is also affected by the intensity, and where the intensity in the fluorescent area is low, the structural information cannot be reflected well, which reduces the VIF value.

Conclusion
A hybrid multiscale and multilevel image fusion algorithm is proposed in this paper, in which neighborhood consistency measurement (NCM) is used to balance the gray structural information and the molecular distribution information in the fusion of GFP images and phase contrast images.
Exploiting the directionality and excellent detail expression ability of SFL-CT, we use SFL-CT to decompose the intensity components of both the GFP image and the phase contrast image, and different fusion rules are applied to the coefficients of different subbands in order to keep the localization information in the GFP image and the detailed high-resolution information in the phase contrast image. Visual information fidelity (VIF) is introduced to assess the fusion result objectively; it quantifies the similarity between the fused image and the original images inside and outside the fluorescent area. The fusion results for 117 groups of Arabidopsis cell images from John Innes Center demonstrate that the new algorithm both preserves the details of the original images and improves the visibility of the fused image, and that it is superior to traditional methods. Although the results of the proposed method and NSCT + PCNN look similar, the former scores considerably better on the similarity measure of the fused image, which means that the algorithm makes full use of the advantages of SFL-CT to keep the structural information of the phase contrast image effectively. Its complexity is also clearly lower than that of NSCT + PCNN, which makes it more suitable for practical application. It should also be pointed out that, in the experiments, $\mathrm{VIF}_B^{fl}$ is no longer equal to $\mathrm{VIF}_B^{nfl}$ when the intensity of the fluorescent image is increased before reconstructing a new fused image; this is partially due to the nonlinear relationship between similarity and intensity within the fluorescent and nonfluorescent areas of the fused image. The Otsu segmentation method can also disturb the calculation of VIF to some extent. No single evaluation method is perfect for all kinds of images, and a suitable fusion and evaluation method for biological cell images remains an open problem.