Fuzzy based iterative matting technique for underwater images

The paper presents an iterative matting technique for extraction of underwater objects from images. The technique adopts histogram division and stretching to obtain multiple images at different contrast levels that together exhibit all image details. For each contrast image, an alpha matte is produced and further refined with every iteration. Finally, fuzzy weights are assigned to the alpha mattes obtained at the different contrast levels, which are then combined using a weighted average. The resultant alpha matte thus includes the more accurate pixels from the multiple alpha mattes and yields a much more refined matte image. The proposed technique is tested, visually and quantitatively, on a manually constructed dataset containing 50 images. The lower mean square error (MSE) shows that the proposed technique achieves noticeably higher accuracy than contemporary image matting techniques.


INTRODUCTION
The technical advances in recent years have made it possible to acquire deep underwater images. Accurate extraction and analysis of underwater objects is important for identification, tracking and marine life exploration. However, processing underwater images is complicated by several factors: the lack of visibility due to attenuation of light in water, non-uniform illumination that causes green or blue colour to dominate while red almost disappears, and low contrast and resolution. Accurate extraction of underwater objects has therefore become more complicated. Image matting extracts the foreground object from the image by decomposing the input image $I$ into foreground $I_f$ and background $I_b$ images, that is,
$$I = \alpha I_f + (1 - \alpha) I_b,$$
where $\alpha$ represents the opacity value of the pixel, which combines the foreground and background components of the image. The range of $\alpha$ lies in [0, 1], where $\alpha = 0$ represents a background pixel and $\alpha = 1$ represents a foreground pixel. If $\alpha$ is a fractional value in (0, 1), the pixel is a mixture of foreground and background. Since matting is an ill-posed problem and is not fully automated, the user is required to provide extra information in the form of a trimap [1], which divides the image into three regions: a region that is known to be entirely foreground, a region that is known to be entirely background and an unknown region, as shown in Figure 1(b). The algorithm is constrained to set $\alpha = 0$ in the background region and $\alpha = 1$ in the foreground region.
This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited. © 2020 The Authors. IET Image Processing published by John Wiley & Sons Ltd on behalf of The Institution of Engineering and Technology
The aim of image matting is to determine $\alpha$ for each unknown pixel, considering the spatial and photometric knowledge about the known foreground and background pixels. Many matting techniques have been introduced in the literature to extract foreground objects from images [1]. However, they provide limited accuracy on underwater scenes because of poor contrast and non-uniform intensity.
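The compositing model described above can be illustrated with a minimal Python sketch that blends a single foreground and background colour for a given opacity. The function name `composite` and the sample colours are illustrative assumptions and are not part of the proposed technique.

```python
def composite(alpha, fg, bg):
    """Blend one RGB pixel as alpha * foreground + (1 - alpha) * background."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return tuple(alpha * f + (1.0 - alpha) * b for f, b in zip(fg, bg))

fish = (0.8, 0.6, 0.2)   # hypothetical foreground colour
water = (0.0, 0.4, 0.6)  # hypothetical background colour

pure_bg = composite(0.0, fish, water)  # alpha = 0: background pixel
pure_fg = composite(1.0, fish, water)  # alpha = 1: foreground pixel
mixed = composite(0.5, fish, water)    # fractional alpha: mixed pixel
```

Matting solves the inverse of this operation: given only the blended colour, it estimates the opacity, which is why additional constraints such as a trimap are needed.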
In this paper, an iterative image matting technique for accurate extraction of underwater objects is proposed. Underwater images are generally not clear and tend to lose important information due to attenuation of light in water, which compromises the accuracy of the extracted objects. The proposed technique is iterative and utilises the information from different contrast levels of the image, obtained using histogram division and stretching. Alpha mattes are produced by applying an image matting algorithm and are further refined at each iteration. Finally, weights are assigned to the different alpha mattes using a fuzzy inference system to generate an accurate alpha matte. Simulation results evaluated using mean square error (MSE) are used to verify the proposed technique.
The remainder of the paper is organised as follows. Section 2 presents the related work on various image matting techniques. Section 3 describes the proposed matting technique for underwater images. The simulation results, along with a detailed visual and quantitative comparison, are shown in Section 4. Finally, the conclusion is presented in Section 5.

RELATED WORK
Previously proposed matting techniques can generally be grouped into four categories, that is, sampling based techniques [2-21], propagation based techniques [22-30], optimisation techniques [31-33] and machine learning matting techniques [34-40]. Sampling based techniques estimate the alpha values of unknown pixels by collecting known foreground and background samples. Yan et al. [2] presented a sampling based matting technique that partitions the unknown region into subregions. The alpha matte is estimated by extrapolating the definite foreground/background samples to unknown regions using a spatial metric. The technique performs well around the edges of the image, but fails to generate good results in textured images. Wang and Cohen [3] estimate unknown pixels using known foreground/background samples (as a candidate set) to generate the alpha matte. The technique generates erroneous alpha mattes when the spatial distance between the true samples and the unknown pixels is large. The technique of Gastal and Oliveira [4] selects samples by casting rays in different directions; the samples are then shared among neighbouring pixels. Since neighbouring pixels have similar characteristics, sharing samples reduces the computational complexity. Karacan et al. [5] proposed a sparse subset sample selection process to collect foreground and background samples. Since these techniques consider a very limited number of samples, the true samples may be missed, which results in inaccurate alpha mattes. He et al. [6] take all the boundary samples as candidates and thus avoid missing true samples. However, processing all the samples one by one is computationally expensive. Shahrian and Rajan [7,8] proposed a matting method that introduces a texture feature along with colour to select the candidate samples.
The texture feature is helpful in images where the foreground and background share the same colour, but the technique does not perform well in highly textured images. To make the sample selection process more efficient, Shahrian et al. [9] presented a technique that builds a comprehensive set of samples covering the wide variation of colour distributions present in the image. However, the technique fails to generate accurate alpha mattes for images with overlapping colour distributions. Liang et al. [10] and Cheng et al. [11] proposed a particle swarm optimisation (PSO) based search technique and a random search technique, respectively, to improve the sampling criteria. These techniques discover the optimal sample pairs among all the pixels in the sample selection process. The PSO based search is more effective and robust than random search based sample selection, but its performance degrades when the distribution of the unknown region is more discrete. Wu et al. [12] proposed random search based sampling criteria to select valid candidate samples, which are combined with a weighted least squares filter to sharpen the foreground/background boundaries. However, the technique produces inaccurate alpha mattes for low resolution images. Feng et al. [13] and Johnson et al. [14] used sparse coding to better select the foreground and background samples, which allows better estimation of alpha mattes. However, these methods do not work well for images in which the foreground and background have the same colours. Li et al. [15] and Lu and Li [16] used depth information along with colour, since depth maps provide strong edge information. These techniques overcome a significant amount of the error caused by overlap between the colour distributions of foreground and background pixels. However, since these methods rely mainly on the accuracy of the depth maps, they fail to perform well when the depth maps are inaccurate or the image has multiple depth layers. Sun et al. [17] proposed a saliency based technique that computes the colour saliency map of the image to improve the sampling process. However, it needs a lot of manual intervention. Tan et al. [18] proposed an unsupervised method of image matting that incorporates saliency information to obtain an accurate matte with less computation. Ruzon and Tomasi [19] proposed a parametric sampling method that uses various isotropic Gaussian models to estimate alpha mattes. As the alpha values are estimated independently for each unknown pixel, the resulting alpha matte may have discontinuities. Chuang et al. [20] presented a well defined Bayesian framework that uses the maximum a posteriori (MAP) technique to solve the matting problem. However, in complex scenarios (i.e. overlapping foreground/background distributions), Bayesian matting fails to produce accurate results. Sun et al. [21] presented a matting algorithm that uses colour as well as texture features for foreground and background estimation and then applies the Bayesian framework to better estimate the alpha mattes.
Propagation based matting techniques compute the alpha values of unknown pixels by defining affinities between neighbouring pixels. Sun et al. [22] proposed a propagation based method that estimates the gradient of the matte by solving the Poisson equation. Since the method chooses the nearest foreground/background samples for unknown pixels, it generates inaccurate alpha mattes for complex images where the chosen samples are not estimated with precision. Grady et al. [23] estimate the alpha matte using random walk probabilities; however, because of the probability function of the random walker, the estimated alpha matte becomes over-smooth. Lee and Wu [24] used the nonlocal principle to obtain the alpha matte from sparse user input; however, the method does not perform well for complex images. Levin et al. [25] proposed a cost function based on the colour line model to find affinities between pixels. The technique works well for certain images but does not perform well in highly textured images. Similar to the random walker, the technique in [26] estimates a weighted geodesic distance to generate fast and high quality alpha mattes with low memory requirements. However, the weights are allocated in a rather simple way, which does not work well for images with overlapping foreground and background colour distributions. Information-flow matting [27] defines pixel-to-pixel relations from the known to the unknown opacity region as well as within the unknown region. However, the known-to-unknown flow fails to perform well for images with transparent regions. The technique also requires a dense trimap; otherwise it fails to find good foreground and background neighbours, which affects alpha propagation. The KNN matting algorithm proposed in [28] and [29] is based on the nonlocal principle and uses the closed-form solution [25] to find the affinities between pixels. KNN matting produces relatively good alpha mattes for images with high texture and complex backgrounds. However, it does not generate smooth results for images with thin regions. Levin et al. [30] proposed an unsupervised matting approach that extracts the alpha matte in a completely automatic fashion by deriving an analogy between the matting Laplacian and node affinities. The technique is limited by its memory requirements, which restricts its practical use.
FIGURE 2 Overview of proposed framework
Many algorithms have been introduced that combine sampling and propagation based techniques to optimise the matting process. Chen et al. [31] optimised the matting algorithm by combining local and nonlocal smooth priors. The algorithm works well for semi-transparent images; however, it gives erroneous results around the boundaries of the images. Wang and Cohen [32] and Tierney et al. [33] proposed an optimisation approach that combines the segmentation and matting problems and iteratively computes the alpha value of each unknown pixel using sparse user input. Owing to the iterative behaviour of the algorithm, the technique requires fewer known pixels at the beginning, which reduces the user effort. However, the computational cost of the algorithm increases significantly.
In recent years, machine learning based matting methods have also been introduced. Zheng and Kambhamettu [34] proposed a technique that computes the alpha matte from the neighbourhood pixels. The approach works well when the unknown region is thin, but the matting results vary greatly with the quality of the trimap. Non-linear and support vector regressions use spatially diverse relationships between pixel features and alpha values to generate high quality mattes [35,36]. However, inappropriate selection of the learning parameters may cause under-fitting or over-fitting problems. In [37,38], transductive inference frameworks are proposed to model the matting problem. Hu et al. [39] and Cho et al. [40] estimate the alpha mattes using deep learning and a convolutional neural network, respectively. These methods are very effective for feature learning and improve the accuracy of the alpha mattes; however, high quality training samples are required to generate accurate results.

PROPOSED METHODOLOGY
The proposed matting framework is illustrated in Figure 2.
Let $I$ be the RGB input image. The first step is to convert the RGB image into the HSV colour model and decompose it into its respective channels. In the HSV colour model, hue $H$ controls the actual colour of the image, saturation $S$ determines how widely separated the RGB values are, and value $V$ determines the brightness of the image. It has been observed that underwater images are affected by non-uniform illumination, where blue and green are the dominant colour channels and red is the inferior colour channel, exhibiting a low percentage of intensity values. Thus, in the image histogram, the pixels of the blue and green bands are positioned towards the right side, that is, at high intensity levels, while the pixels of the red band are shifted towards the left side, that is, at low intensity levels. Moreover, the poor contrast of underwater images leads to loss of image details. Based on these observations, the proposed technique divides the histogram into three regions, as illustrated in Figure 3, and performs histogram stretching on the $S$ component of each region to produce three images of different contrast levels. The rationale behind dividing the histogram and stretching each region is that underwater images have both dark and bright areas. Global stretching shifts the bright pixels towards the high intensity end of the dynamic range, which further increases the brightness of the bright areas, that is, over-enhanced bright areas. Similarly, when the low-intensity pixels shift towards lower intensity levels, the brightness of the dark areas decreases, that is, under-enhanced dark areas. These over/under-enhanced areas result in loss of image details. Thus, in order to retain all the image details, the image histogram is divided into three regions and stretching is performed on the $S$ component of the image to produce low, medium and high contrast images.
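The channel decomposition described above can be sketched in Python with the standard `colorsys` module. This is a minimal illustration operating on a short list of pixels with values in [0, 1]; the helper names `split_hsv` and `merge_hsv` and the sample pixels are assumptions introduced here, not part of the original implementation.

```python
import colorsys

def split_hsv(rgb_pixels):
    """Convert RGB pixels (floats in [0, 1]) into separate H, S and V channels."""
    hsv = [colorsys.rgb_to_hsv(r, g, b) for r, g, b in rgb_pixels]
    h = [p[0] for p in hsv]
    s = [p[1] for p in hsv]
    v = [p[2] for p in hsv]
    return h, s, v

def merge_hsv(h, s, v):
    """Recombine H, S and V channels and convert back to RGB pixels."""
    return [colorsys.hsv_to_rgb(hh, ss, vv) for hh, ss, vv in zip(h, s, v)]

# Hypothetical blue-green pixels with a weak red channel, as in underwater scenes.
pixels = [(0.1, 0.5, 0.6), (0.2, 0.7, 0.4)]
h, s, v = split_hsv(pixels)
restored = merge_hsv(h, s, v)
```

Only the $S$ channel is modified in the subsequent stretching step, so keeping $H$ and $V$ as separate lists makes it straightforward to recombine them with each stretched saturation channel.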
The histogram of the $S$ component is divided into three regions, named the lower $S_l$, middle $S_m$ and upper $S_u$ regions, where $S_{\min}$ and $S_{\max}$ refer to the minimum and maximum saturation values in the image histogram, respectively. The lower region ranges from the minimum saturation value $S_{\min}$ to the saturation value $S_a$, the middle region from $S_a$ to $S_b$, and the upper region from $S_b$ to the maximum saturation value $S_{\max}$, that is, $S_l = [S_{\min}, S_a]$, $S_m = [S_a, S_b]$ and $S_u = [S_b, S_{\max}]$.
Histogram stretching of each region of $S$ is performed using
$$S_{out} = \left(S_{in} - S_{\min}\right) \frac{\acute{S}_{\max} - \acute{S}_{\min}}{S_{\max} - S_{\min}} + \acute{S}_{\min}, \qquad (4)$$
where $S_{in}$ and $S_{out}$ are the input and output pixel values, $S_{\min}$ and $S_{\max}$ are the minimum and maximum saturation values of the input region, and $\acute{S}_{\min}$ and $\acute{S}_{\max}$ are the minimum and maximum saturation values of the output image, respectively. The lower region $S_l$ is stretched from 5% to 100% of the output histogram, towards the higher values. The middle region $S_m$ is stretched over the entire dynamic range, that is, 0-100%. The upper region $S_u$ is stretched from 0% to 95% of the output histogram, towards the lower values. The minimum of 5% for the lower region and the maximum of 95% for the upper region are set to reduce the effect of under-enhancement and over-enhancement, respectively. The $H$, modified $\hat{S}_i$ and $V$ channels are concatenated and converted back to the RGB colour model to produce three images of different contrast levels $X_i$, $i \in \{l, m, h\}$, that is, the low contrast image $X_l$, the medium contrast image $X_m$ and the high contrast image $X_h$, as shown in Figure 4. Each $X_i$ is thus the result of applying the HSV-to-RGB conversion operator to $(H, \hat{S}_i, V)$. Each image, along with the trimap $T$, is taken as input to the matting process [5] to generate the alpha mattes $\hat{\alpha}_i$. The technique selects a sparse set of samples from the known foreground and background regions (specified by the trimap) that best represent the unknown pixels. The representative samples are determined by computing a distance based on the KL-divergence [41] between the samples obtained via super-pixels. The best foreground/background pair is then selected by optimising the objective function to determine the alpha value of each unknown pixel. Each $\hat{\alpha}_i$ is thus the output of the image matting operation applied to $X_i$ and $T$ with super-pixel size $s$. We used $s = 15$ in our experiments.
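The histogram division and stretching steps can be sketched as follows. This is a minimal illustration under stated assumptions: the split points between the lower, middle and upper regions are placed at equal thirds of the saturation range, and values outside a region are clipped to its bounds before stretching; neither choice is specified in the text above. The output ranges (5-100%, 0-100% and 0-95%) are taken from the description. The function names are hypothetical.

```python
def stretch(s_in, in_min, in_max, out_min, out_max):
    """Linearly stretch one value from [in_min, in_max] to [out_min, out_max]."""
    if in_max <= in_min:
        return out_min  # degenerate input range: nothing to stretch
    s_in = min(max(s_in, in_min), in_max)  # assumption: clip to the region bounds
    return (s_in - in_min) * (out_max - out_min) / (in_max - in_min) + out_min

def three_contrast_levels(s_channel):
    """Return one stretched copy of the saturation channel per histogram region."""
    s_min, s_max = min(s_channel), max(s_channel)
    # Assumption: equal-width regions; the actual split points may differ.
    s_a = s_min + (s_max - s_min) / 3.0
    s_b = s_min + 2.0 * (s_max - s_min) / 3.0
    return {
        "lower": [stretch(s, s_min, s_a, 0.05, 1.00) for s in s_channel],
        "middle": [stretch(s, s_a, s_b, 0.00, 1.00) for s in s_channel],
        "upper": [stretch(s, s_b, s_max, 0.00, 0.95) for s in s_channel],
    }

# Hypothetical saturation values of a small image, in [0, 1].
levels = three_contrast_levels([0.0, 0.15, 0.3, 0.45, 0.6, 0.9])
```

Each of the three stretched channels would then be recombined with the unchanged hue and value channels and converted back to RGB to form one contrast image.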
As the accuracy of the resultant matte image relies heavily on the quality of the trimap, the unknown pixels in the trimap should be as few as possible to attain the best matting results. To improve the accuracy of the alpha matte and to lessen the uncertainty of pixels, the proposed technique uses an iterative approach to determine the alpha value of each unknown pixel. In each iteration, the trimap is refined using the information provided by the alpha mattes generated in the previous iteration. The refined trimap is generated from these alpha mattes using two thresholds, 0.7 and 0.1, which are empirically selected constants. In the next iteration, the refined trimap is used along with image $X_i$. The alpha mattes obtained at the different contrast levels are combined using a weighted average to obtain the final alpha matte $\alpha$, that is,
$$\alpha = \frac{\sum_{i} w_i \hat{\alpha}_i}{\sum_{i} w_i}, \quad i \in \{l, m, h\},$$
where $w_i$ are the weights assigned to each alpha matte $\hat{\alpha}_i$. The proposed technique uses a fuzzy inference system for automatic weight assignment. Let $w_l$, $w_m$ and $w_h$ be the weights assigned to alpha mattes $\hat{\alpha}_l$, $\hat{\alpha}_m$ and $\hat{\alpha}_h$, respectively. Weights are assigned on the basis of the intensity differences between alpha mattes, such that a larger difference in intensities corresponds to a smaller weight and vice versa. Three trapezoidal membership functions (MFs) $\mu_x$ are defined for the intensity differences, where $x \in \{\text{Low}, \text{Med}, \text{High}\}$, in the range [0, 1]. Similarly, three output MFs $\mu_y$ are defined for weight estimation, where $y \in \{\text{Low}, \text{Med}, \text{High}\}$, in the range [0, 5], as shown in Figure 5.
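The trimap refinement and weighted averaging steps can be sketched in Python as follows. This is an illustrative sketch under several assumptions that are not stated in the text: the trimap is encoded as 1.0 (foreground), 0.0 (background) and 0.5 (unknown); an unknown pixel becomes foreground when its alpha is at least 0.7 and background when it is at most 0.1; and pixels that are already known are left unchanged. The function names are hypothetical.

```python
FG, BG, UNKNOWN = 1.0, 0.0, 0.5  # assumed trimap encoding

def refine_trimap(trimap, alpha, t_fg=0.7, t_bg=0.1):
    """Shrink the unknown region using the alpha matte of the previous iteration."""
    refined = []
    for t, a in zip(trimap, alpha):
        if t != UNKNOWN:
            refined.append(t)        # assumption: known pixels stay fixed
        elif a >= t_fg:
            refined.append(FG)       # confident foreground
        elif a <= t_bg:
            refined.append(BG)       # confident background
        else:
            refined.append(UNKNOWN)  # still uncertain
    return refined

def weighted_average(alphas, weights):
    """Weighted average of the alpha values of one pixel; weights need not sum to 1."""
    total = sum(weights)
    if total == 0:
        return sum(alphas) / len(alphas)  # fall back to the plain mean
    return sum(w * a for w, a in zip(weights, alphas)) / total

trimap = [1.0, 0.5, 0.5, 0.5, 0.0]
alpha = [1.0, 0.9, 0.4, 0.05, 0.0]
refined = refine_trimap(trimap, alpha)

# One pixel seen at three contrast levels, with hypothetical weights in [0, 5].
final_alpha = weighted_average([0.9, 0.8, 0.2], [4.0, 4.0, 1.0])
```

In this example two of the three unknown pixels are resolved after one refinement pass, and the outlying alpha value contributes little to the final matte because of its small weight.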
Fuzzy IF-THEN rules for weight assignment are defined as follows:
1. IF $\Delta_{lm}$ is High and $\Delta_{lh}$ is High THEN $w_l$ is Low
2. IF $\Delta_{lm}$ is Med and $\Delta_{lh}$ is High THEN $w_l$ is Med
3. IF $\Delta_{lm}$ is Low and $\Delta_{lh}$ is High THEN $w_l$ is High
4. IF $\Delta_{lm}$ is High and $\Delta_{lh}$ is Med THEN $w_l$ is Med
5. IF $\Delta_{lm}$ is Med and $\Delta_{lh}$ is Med THEN $w_l$ is Med
6. IF $\Delta_{lm}$ is Low and $\Delta_{lh}$ is Med THEN $w_l$ is High
7. IF $\Delta_{lm}$ is High and $\Delta_{lh}$ is Low THEN $w_l$ is High
8. IF $\Delta_{lm}$ is Med and $\Delta_{lh}$ is Low THEN $w_l$ is High
9. IF $\Delta_{lm}$ is Low and $\Delta_{lh}$ is Low THEN $w_l$ is High
Similar rules are applied to assign weights $w_m$ and $w_h$ to alpha mattes $\hat{\alpha}_m$ (based on $\Delta_{ml}$ and $\Delta_{mh}$) and $\hat{\alpha}_h$ (based on $\Delta_{hl}$ and $\Delta_{hm}$), respectively.
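The rule base above can be sketched as a small Mamdani-style inference routine in Python. The nine rules are taken from the list above; everything else is an assumption made for illustration: the breakpoints of the trapezoidal MFs (the actual shapes are those of Figure 5), the use of min for AND, max for aggregation, and centroid defuzzification over [0, 5]. The names `trapmf` and `fuzzy_weight` are hypothetical.

```python
def trapmf(x, a, b, c, d):
    """Trapezoidal membership: rises on [a, b], flat on [b, c], falls on [c, d]."""
    if x < a or x > d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)

# Assumed MF breakpoints; inputs lie in [0, 1] and outputs in [0, 5].
IN_MF = {"Low": (0.0, 0.0, 0.2, 0.4), "Med": (0.2, 0.4, 0.6, 0.8),
         "High": (0.6, 0.8, 1.0, 1.0)}
OUT_MF = {"Low": (0.0, 0.0, 1.0, 2.0), "Med": (1.0, 2.0, 3.0, 4.0),
          "High": (3.0, 4.0, 5.0, 5.0)}

# (first difference, second difference) -> weight label, rules 1 to 9.
RULES = {
    ("High", "High"): "Low", ("Med", "High"): "Med", ("Low", "High"): "High",
    ("High", "Med"): "Med", ("Med", "Med"): "Med", ("Low", "Med"): "High",
    ("High", "Low"): "High", ("Med", "Low"): "High", ("Low", "Low"): "High",
}

def fuzzy_weight(d1, d2, steps=500):
    """Map two intensity differences in [0, 1] to a crisp weight in [0, 5]."""
    strength = {"Low": 0.0, "Med": 0.0, "High": 0.0}
    for (label1, label2), out in RULES.items():
        fire = min(trapmf(d1, *IN_MF[label1]), trapmf(d2, *IN_MF[label2]))
        strength[out] = max(strength[out], fire)
    # Centroid of the clipped and aggregated output MFs, sampled on [0, 5].
    num = den = 0.0
    for k in range(steps + 1):
        y = 5.0 * k / steps
        mu = max(min(strength[o], trapmf(y, *OUT_MF[o])) for o in OUT_MF)
        num += y * mu
        den += mu
    return num / den if den > 0 else 0.0

w_agree = fuzzy_weight(0.0, 0.0)     # matte agrees with both other mattes
w_middle = fuzzy_weight(0.5, 0.5)    # moderate disagreement with both
w_disagree = fuzzy_weight(1.0, 1.0)  # matte differs strongly from both
```

With these assumed MFs, a matte that agrees with the other two receives a larger weight than one that differs from both, which matches the intent stated for the rule base.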

Dataset
The proposed technique is tested on different underwater images and simulation results are presented. A manually constructed dataset consisting of 50 underwater images is used. For each image, the trimap and ground truth are generated manually using the procedure described in [3]. The dataset contains various images of marine life with green/blue colour dominance, overlapping foreground and background colour distributions, complex textured backgrounds and contrast variations.

Visual and quantitative comparison
Visual and quantitative comparison is performed with different state-of-the-art matting techniques, including KL Sparse Matting [5], Weighted Colour and Texture Matting (WCTM) [7], Information-flow Matting [27] and KNN Matting [28]. The quantitative comparison is performed using the MSE computed according to [3]. Figure 9 analyses the effect of the proposed histogram division and stretching technique. Three images of different contrast levels are produced by dividing the image histogram and stretching each region. Each image gives different contrast information, which is used to generate three different alpha mattes (as shown in Figure 9). For example, the tail area of the low contrast image better discriminates the background from the foreground; therefore, the alpha matte generated from the low contrast image has better accuracy near the tail region. The upper boundary between the mouth and fins of the fish is clearer in the medium contrast image, and the effect can be seen in the resultant alpha matte (Figure 9(e)). The high contrast image has considerable foreground and background colour similarity near the tail area of the fish, which affects the matting results, as seen in Figure 9(f). However, the lower boundary between the mouth and fins is more distinctive in the high contrast image than in the other images, so the alpha matte in this region is more accurate than in the other alpha mattes. Figure 9(g) shows the resultant alpha matte after combining the three alpha mattes. The result clearly shows an improved alpha matte compared with the baseline KL-matting (Figure 9(h)). Figure 10 shows the performance of the proposed iterative refinement matting technique. It can be observed in Figure 10(b-d) that each iteration improves the accuracy of the alpha matte by reducing the number of unknown pixels. In this way, the uncertainty of the pixels is reduced and more pixels are assigned as either foreground or background.
A fuzzy based weighted averaging scheme is proposed to combine the alpha mattes generated from the different contrast images. The weights are assigned on the basis of the intensity differences between the alpha mattes. Hence, larger weights are assigned to the pixels of alpha mattes that have smaller intensity differences with each other. In this way, the erroneous pixels of the alpha mattes are given less weight, which reduces the overall amount of error. Figure 11 shows the comparison of the proposed fuzzy based weighted averaging scheme with the simple average. Figure 11(a-c) show the three alpha mattes generated after iterative refinement. It can be seen that Figure 11(c) gives more erroneous results near the tail region of the fish and that its intensity difference is greater than that of the other images. Therefore, less weight is assigned to those pixels and their contribution to the final alpha matte is smaller than that of the other alpha mattes. The visual comparison of the proposed technique with existing techniques is presented in Figures 6 and 7. The results clearly demonstrate that the proposed technique better discriminates the foreground and background pixels than the existing techniques. For example, the image in Figure 7, row 1 has non-uniform illumination, and the foreground object, that is, the fish, overlaps with another fish in the background. The image in Figure 6, row 4 has low contrast. The images in Figure 7, rows 2, 3 and 6 have green colour dominance; hence the foreground and background have the same colours, and the background is highly textured. Similarly, the images in Figure 6, rows 2, 3 and 6 and Figure 7, rows 4 and 5 have blue colour dominance, and the foreground objects have blurred boundaries. Thus, the foreground is hard to extract from the background. The alpha mattes generated by the proposed technique are much more accurate than those of the existing competitive techniques, as shown in the visual results. Table 1 presents the quantitative evaluation of the existing and proposed techniques using MSE.
The MSE curve of all 50 images, computed against the ground truths, is presented in Figure 8. The MSE of the proposed technique is lower than that of the existing techniques, which demonstrates that the alpha mattes extracted by the proposed technique are more accurate than those of the existing techniques.
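The MSE used in the comparison above can be sketched in Python as follows. This is a minimal illustration only: the exact evaluation protocol of [3] is not reproduced here, and the optional `mask` argument, which restricts the error to selected pixels such as the unknown region of the trimap, is an assumption. Mattes are represented as flat lists of values in [0, 1].

```python
def mse(alpha, ground_truth, mask=None):
    """Mean squared error between an estimated alpha matte and its ground truth."""
    if len(alpha) != len(ground_truth):
        raise ValueError("mattes must have the same number of pixels")
    if mask is None:
        mask = [True] * len(alpha)  # evaluate every pixel by default
    errors = [(a - g) ** 2 for a, g, m in zip(alpha, ground_truth, mask) if m]
    if not errors:
        raise ValueError("mask selects no pixels")
    return sum(errors) / len(errors)

estimated = [1.0, 0.5, 0.0]
truth = [1.0, 1.0, 0.0]
error_all = mse(estimated, truth)
error_unknown = mse(estimated, truth, mask=[False, True, False])
```

Restricting the error to the unknown region gives a larger value in this example, because the known pixels, which are trivially correct, no longer dilute the average.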

CONCLUSION
The paper proposed an iterative image matting technique for accurate extraction of objects from underwater images. In order to retain all image details, three images of different contrast levels are produced using histogram division and stretching. An alpha matte is generated for each image and is further improved using an iterative refinement process. Finally, fuzzy weights are assigned to the alpha mattes, which are combined using a weighted average to obtain the resultant alpha matte. On the tested dataset, the proposed technique yields alpha mattes of underwater images with lower MSE than the state-of-the-art matting techniques it was compared with.