Visual Perception Based Objective Stereo Image Quality Assessment for 3D Video Communication



INTRODUCTION
With the rapid development of internet, communication and multimedia technologies, applications of 3D video communication, such as three-dimensional television, stereo video conferencing systems and consumer electronics, are becoming more widespread (Fan et al., 2011; Shao et al., 2012; Gotchev et al., 2011). However, because of the huge data volume and the limited capacity of transmission channels, stereo images often suffer distortion during compression, transmission, reconstruction, etc. Thus, Stereo Image Quality Assessment (SIQA) is a very important issue.
SIQA measures are generally divided into two categories: subjective and objective. Subjective tests have been standardized by the ITU (ITU-R Recommendation BT.500-11, 2002; ITU-T Recommendation P.910, 2008). As human eyes are the ultimate receivers of any visual information, the results of subjective approaches are reasonable and reliable. Some researchers have used subjective results to analyze how the camera parameters, the display duration, quality-asymmetric coding and other factors affect the perceived quality of stereoscopic images (Ijsselsteijn et al., 2000; Lambooij et al., 2009; Tam, 2007; Wang et al., 2009, 2012). However, subjective methods are time-consuming, expensive and impractical in real systems.
Up to now, research on objective methods, which can be implemented in real time and used to benchmark stereo image coding, has been relatively scarce. Yang et al. (2008) proposed a SIQA metric based on Peak Signal to Noise Ratio (PSNR). Because different signals produce different visual effects, it was not well consistent with the Human Visual System (HVS). Shen et al. (2009) proposed a SIQA metric based on Structural Similarity (SSIM). Because few image pairs were used for validation, the reported results did not reflect the metric performance well. Horita et al. (2000) proposed a color SIQA metric. Being based on statistical characteristics, it is vulnerable to the impact of image content. Benoit et al. (2008a, b) proposed SIQA metrics based on the single view image quality assessment metrics SSIM and C4, and mainly discussed the correlation between subjective perception and the proposed SIQA under different combinations of Left-Right view Image Quality Assessment (LR-IQA) and Depth Perception Image Quality Assessment (DP-IQA). In addition, Benoit et al. (2008b) examined how disparity maps computed by different algorithms affect SIQA. Their experimental results show that SIQA is enhanced when the disparity distortion contribution is added. The performances of SIQA based on SSIM and on C4 differ only subtly and are influenced to some extent by the disparity algorithm. Because accurate disparity is hard to obtain, DP-IQA can hardly assess depth and realism accurately. Nevertheless, the combination of LR-IQA and DP-IQA is worth considering. Sazzad et al. (2009) and Akhter et al. (2010) proposed SIQA metrics that combine LR-IQA and DP-IQA, under the hypothesis that both depend on local features such as edge, texture and flat regions. In these metrics the disparity is estimated within a certain range. Akhter et al. (2010) compared the performance obtained with the estimated disparity and with the absolute difference map; the experimental results showed that the choice has little influence, while disparity estimation takes more time and increases the computational complexity significantly. Moreover, these metrics have many parameters, which were derived from statistical characteristics; this reduces their reliability and restricts their applicability. It is also hard to reflect depth perception accurately through disparity estimation.
Considering the inconsistency between human judgments (Mean Opinion Score) and the metrics above, and in order to build an assessment model that agrees with the human visual system, we propose a new Objective Stereo Image Quality Assessment (OSIQA) method in this study. The proposed OSIQA method includes two parts, new LR-IQA and DP-IQA metrics. The former is comprised of Left view Image Quality Assessment (L-IQA) and Right view Image Quality Assessment (R-IQA) and takes into account the Watson model, a classic and comprehensive human visual model covering contrast sensitivity, masking and error, as well as HVS features such as contrast sensitivity. The latter takes into account the structural distortion between the original and distorted absolute difference maps.

THE PROPOSED OBJECTIVE STEREO IMAGE QUALITY ASSESSMENT (OSIQA) METHOD
The main difference between stereo images and single view images is depth perception, which gives the viewer a sense of realism and involvement. Ignoring depth information can lead to a discrepancy between plain image and stereo quality measures. For example, when the left or right view of a stereo pair is blurred, the viewer can still have a relatively good stereo experience, whereas a single view image quality assessment metric does not correlate with this stereo perception quality. Therefore, in this study, we take this fact into account and propose an objective stereo image quality assessment method comprised of new LR-IQA and DP-IQA metrics, as shown in Fig. 1. Here, the LR-IQA metric is mainly used to assess image quality distortion, while the DP-IQA metric is mainly used to assess stereo perception. In Fig. 1, the LR-IQA value $Q_s$ decreases with increasing distortion degree, whereas the DP-IQA value $Q_d$ decreases with decreasing distortion degree. Therefore, with λ a positive constant, the OSIQA value $Q$ is defined from $Q_s$, $Q_d$ and λ. The proposed LR-IQA and DP-IQA metrics are described in detail in the following sections.
Left-Right view Image Quality Assessment (LR-IQA) metric based on visual perception: HVS is an extremely complex information processing system (Campisi et al., 2007), and many factors limit how deeply it is understood. An OSIQA metric that incorporates visual perception characteristics such as human visual sensitivity, multi-channel characteristics, masking and stereo perception can improve the agreement between the metric and subjective perception. The wavelet transform provides a multi-resolution decomposition of the image and can describe the multi-channel characteristics of HVS well. The Watson model can reflect the minimum visual error of a stereo image. Based on these considerations, we propose a new LR-IQA metric based on visual perception, which is comprised of Left view Quality Assessment (L-IQA) and Right view Quality Assessment (R-IQA) metrics. Its implementation includes three stages. In the first stage, the L-IQA value is obtained. In the second stage, the R-IQA value is obtained by the same process as the L-IQA value. In the third stage, a pooling approach is employed to combine the L-IQA and R-IQA values into a single quality score.
L-IQA metric: Figure 2 shows the diagram of the L-IQA metric. The Watson model is first adopted to compute the minimum visual error in each sub-band after the wavelet transform. Then, the proportion of perceived coefficients in each sub-band is counted based on the minimum visual error. Finally, the assessment value is obtained as a weighted sum using visual sensitivity. The YUV color space is used since it is closer to visual perception and its components are less redundant than those of the RGB color space. The Watson model (Daly, 1992) describes a perception model that estimates the just noticeable difference of an image. Thus, it can estimate the perceptual impact of distortion added to an image. The Watson model gives a threshold $Q_{l,f}$ for each sub-band, where $l$ denotes the wavelet decomposition level, $f$ the frequency direction, $A_{l,f}$ the basis function amplitude of each sub-band shown in Table 1, $r$ the display visual resolution, $a$ the minimum threshold and $g_f$ the direction function. When $f$ is 0, 1, 2 or 3, the frequency direction is low-frequency, horizontal, vertical and diagonal, respectively. Here, $a$ = 0.495 and $f_0$ = 0.401, and $g_f$ is a function of the direction $f$. According to the above discussion, the $Q_{l,f}$ values of the wavelet coefficients in the Y component are shown in Table 2. As is well known, the eyes are more sensitive to the horizontal and vertical directions than to the diagonal direction. So, the $Q_{l,f}$ values of the horizontal and vertical directions are smaller than that of the diagonal direction at the same wavelet decomposition level. Moreover, the sensitivity differs between wavelet decomposition levels. The smaller the $Q_{l,f}$ value, the higher the sensitivity.
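As a rough illustration of how such thresholds can be tabulated, the sketch below evaluates the wavelet threshold formula published by Watson and co-workers for the luminance channel, $Q_{l,f} = 2a \cdot 10^{k (\log_{10}(2^l f_0 g_f / r))^2} / A_{l,f}$. The constant k = 0.466 and the direction factors 1.501, 1 and 0.534 come from that published model rather than from the text above, so they are assumptions here. The function name, the display resolution of 32 pixels per degree and the placeholder amplitude of 1.0 are also hypothetical; the real amplitudes belong to Table 1.

```python
import math

A_MIN = 0.495  # minimum threshold a (value given in the text)
F0 = 0.401     # f0 (value given in the text)
K = 0.466      # assumption: constant from the published Watson wavelet model
# assumption: direction factors g_f for f = 0 (low-frequency), 1 (horizontal),
# 2 (vertical) and 3 (diagonal), also taken from the published model
G = {0: 1.501, 1: 1.0, 2: 1.0, 3: 0.534}


def watson_threshold(level, direction, amplitude, r=32.0):
    """Threshold Q_{l,f} for one sub-band.

    level     -- wavelet decomposition level l (1 = finest)
    direction -- frequency direction f in {0, 1, 2, 3}
    amplitude -- basis function amplitude A_{l,f} (Table 1 in the paper)
    r         -- display visual resolution in pixels per degree (hypothetical default)
    """
    x = math.log10((2 ** level) * F0 * G[direction] / r)
    return 2.0 * A_MIN * 10 ** (K * x * x) / amplitude


if __name__ == "__main__":
    for lvl in range(1, 5):
        row = [round(watson_threshold(lvl, d, 1.0), 3) for d in range(4)]
        print(lvl, row)
```

In this configuration, and with equal amplitudes, the diagonal threshold comes out larger than the horizontal and vertical ones, which matches the ordering described for Table 2.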
The Contrast Sensitivity Function (CSF) describes the relationship between HVS sensitivity and spatial frequency. As a function of frequency, the CSF behaves like a band-pass filter: it falls off sharply when the frequency is too high or too low. In addition, weighting by the CSF can give the human eye the same visual sensitivity at different frequencies (Wandell, 1995). Mannos and Sakrison (1974) obtained an approximate CSF curve through experiments and expressed it in closed form. Most of the image energy is concentrated in the low-frequency sub-band, which includes the direct current component, so the weight of the low-frequency sub-band is set to 1. The remaining weights are obtained by integrating the CSF over the frequency range of the corresponding sub-band, where a and b denote the lower and upper bounds of that frequency range, respectively.
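The weighting step can be sketched as follows, assuming the commonly cited form of the Mannos and Sakrison approximation, $H(f) = 2.6\,(0.0192 + 0.114 f)\,e^{-(0.114 f)^{1.1}}$ with $f$ in cycles per degree. Dividing the integral by the bandwidth $b - a$ is an assumption made here so that the weights are comparable across sub-bands; the paper's exact normalization is not recoverable from the text, and the function names are hypothetical.

```python
import math


def csf(f):
    """Mannos-Sakrison CSF approximation; f is in cycles per degree."""
    return 2.6 * (0.0192 + 0.114 * f) * math.exp(-((0.114 * f) ** 1.1))


def band_weight(a, b, steps=1000):
    """Mean CSF value over the frequency range [a, b].

    The integral is approximated with the midpoint rule. Dividing by
    (b - a) is an assumed normalization, not taken from the paper.
    """
    h = (b - a) / steps
    integral = sum(csf(a + (i + 0.5) * h) for i in range(steps)) * h
    return integral / (b - a)


if __name__ == "__main__":
    # Hypothetical sub-band edges in cycles per degree.
    for lo, hi in [(2.0, 4.0), (4.0, 8.0), (8.0, 16.0), (16.0, 32.0)]:
        print((lo, hi), round(band_weight(lo, hi), 3))
```

The curve peaks near 8 cycles per degree, so sub-bands covering the mid frequencies receive larger weights than those at the highest frequencies.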
Afterwards, the absolute difference of the left view image in the wavelet domain is given by:
$$A_{left}(l, f) = |R_{left}(l, f) - D_{left}(l, f)|$$
where $R_{left}(l, f)$ and $D_{left}(l, f)$ denote the original and distorted wavelet coefficients of the left view in the l-th level and the f-th direction, respectively, and $A_{left}(l, f)$ denotes the absolute difference in the sub-band of the l-th level and the f-th direction.
Then the proportion of visually perceived coefficients of the absolute difference in the sub-band of the l-th level and the f-th direction is calculated as the ratio of the number of coefficients of $A_{left}(l, f)$ that are larger than the $Q_{l,f}$ value to $N_{left}(l, f)$, the total number of coefficients in the corresponding sub-band. In order to make the assessment metric more consistent with HVS, the L-IQA metric is expressed by Eq. (8), which weights these proportions by visual sensitivity.
LR-IQA metric: The assessment value $Q_r$ of the R-IQA metric is obtained in the same way as the assessment value $Q_l$ of the L-IQA metric described above. According to the importance of $Q_l$ and $Q_r$, the assessment value $Q_s$ of the LR-IQA metric is given by:
$$Q_s = \omega_l Q_l + \omega_r Q_r$$
where $\omega_l$ and $\omega_r$ denote the importance of the L-IQA and R-IQA metrics, respectively, and $\omega_l + \omega_r = 1$. The larger $\omega_l$ or $\omega_r$ is, the more important the L-IQA or R-IQA metric is, and vice versa.
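A minimal sketch of the counting and pooling steps just described, assuming the wavelet coefficients of one sub-band are already available as flat lists. The function names are hypothetical, the linear pooling is one reading of the weighting with the two weights summing to 1, and the default weight of 0.6 is the value the paper reports in its analysis of weights.

```python
def perceived_proportion(ref_coeffs, dis_coeffs, threshold):
    """Share of coefficients whose absolute difference exceeds the threshold.

    ref_coeffs, dis_coeffs -- original and distorted wavelet coefficients
                              of one sub-band (same length)
    threshold              -- visibility threshold Q_{l,f} of that sub-band
    """
    diffs = [abs(r - d) for r, d in zip(ref_coeffs, dis_coeffs)]
    visible = sum(1 for a in diffs if a > threshold)
    return visible / len(diffs)


def pool_views(q_left, q_right, w_left=0.6):
    """Weighted combination of the left and right view scores.

    Assumption: linear pooling with weights w_left and 1 - w_left.
    """
    return w_left * q_left + (1.0 - w_left) * q_right


if __name__ == "__main__":
    ref = [10.0, 20.0, 30.0, 40.0]
    dis = [10.0, 25.0, 30.0, 50.0]
    print(perceived_proportion(ref, dis, threshold=4.0))
    print(pool_views(0.8, 0.6))
```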

DEPTH PERCEPTION IMAGE QUALITY ASSESSMENT (DP-IQA) METRIC
Stereo perception is the capability of perceiving depth, and it is directly related to stereo image quality. Akhter et al. (2010) verified that replacing the disparity obtained through block matching, whose matching algorithms are time-consuming, with the absolute difference map, i.e., the absolute difference between the left and right view images, has little effect on stereo image quality assessment. Therefore, the absolute difference map is adopted in the DP-IQA metric in this study. Besides, considering that human eyes are very sensitive to edge regions, the proposed DP-IQA metric is designed as shown in Fig. 3. Its implementation includes four stages. In the first stage, the absolute difference maps are obtained. In the second stage, region segmentation maps are obtained for the original and distorted difference maps. In the third stage, the assessment maps of the different regions are computed. In the fourth stage, a pooling approach is employed to combine the assessment maps into a single quality score.

Absolute difference map: In order to reduce redundancy, the original and distorted stereo images are converted into the YUV space. Here, only the Y component is considered. The original and distorted absolute difference maps are given in Eq. (9) and (10), respectively:
$$X_{org} = |I_{l\_org} - I_{r\_org}| \quad (9)$$
$$X_{dis} = |I_{l\_dis} - I_{r\_dis}| \quad (10)$$
where $I_l$ and $I_r$ denote the left and right view images and the subscripts org and dis mark the original and distorted stereo pairs. The difference maps are then segmented with the Sobel operator, which has a horizontal and a vertical template, as follows.
Firstly, convolve the original and distorted absolute difference maps with each Sobel template individually to obtain two gradient fields, which have the same size as the original absolute difference map.
Secondly, compute the gradient magnitudes of the two gradient fields.
Thirdly, determine the threshold $TH_1 = 0.12\,G_{max}$, where $G_{max}$ is the maximal gradient magnitude of the original absolute difference map.
Finally, determine the pixels of the strong and weak edge regions. Assume that the gradient of the pixel at (i, j) of the original absolute difference map is $P_r(i, j)$ and that of the distorted absolute difference map is $P_d(i, j)$. The pixel classification is carried out according to the following rules.
R1: If $P_r(i, j) > TH_1$ and $P_d(i, j) > TH_1$, then the pixel is considered a strong edge pixel.
R2: If $P_r(i, j) > TH_1$ and $P_d(i, j) < TH_1$, or $P_r(i, j) < TH_1$ and $P_d(i, j) > TH_1$, then the pixel is considered a weak edge pixel.
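The segmentation procedure can be sketched in plain Python as below, assuming the Y components are given as lists of rows. The function names and the replicated-border handling are assumptions made for this illustration; the 3x3 Sobel templates are the standard ones, and the 0.12 ratio and rules R1 and R2 follow the text.

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # responds to horizontal changes
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # responds to vertical changes


def abs_difference_map(left, right):
    """Pixel-wise absolute difference between left and right view Y images."""
    return [[abs(a - b) for a, b in zip(row_l, row_r)]
            for row_l, row_r in zip(left, right)]


def gradient_magnitude(img):
    """Sobel gradient magnitude with replicated borders (border rule assumed).

    The templates are applied as a correlation; flipping them only changes
    the sign of the responses, so the magnitude is unaffected.
    """
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            gx = gy = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ii = min(max(i + di, 0), h - 1)
                    jj = min(max(j + dj, 0), w - 1)
                    gx += SOBEL_X[di + 1][dj + 1] * img[ii][jj]
                    gy += SOBEL_Y[di + 1][dj + 1] * img[ii][jj]
            mag[i][j] = math.hypot(gx, gy)
    return mag


def classify_edges(x_org, x_dis, ratio=0.12):
    """Label each pixel 'strong', 'weak' or 'none' following rules R1 and R2."""
    p_r = gradient_magnitude(x_org)
    p_d = gradient_magnitude(x_dis)
    th1 = ratio * max(max(row) for row in p_r)
    labels = []
    for row_r, row_d in zip(p_r, p_d):
        out = []
        for r, d in zip(row_r, row_d):
            if r > th1 and d > th1:
                out.append("strong")                      # R1
            elif (r > th1 and d < th1) or (r < th1 and d > th1):
                out.append("weak")                        # R2
            else:
                out.append("none")
        labels.append(out)
    return labels
```

Pixels whose gradient equals the threshold exactly are not covered by R1 or R2 as stated, so this sketch leaves them unlabeled.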
DP-IQA assessment map: HVS is highly adapted to extract structural information from the viewing field. SSIM, proposed by Zhou et al. (2004), does not conform well to subjective perception because it lacks direction information. Therefore, to improve it, a new DP-IQA metric is presented by combining SSIM and a direction similarity index. SSIM includes three parts, luminance comparison $L(x, y)$, contrast comparison $C(x, y)$ and structure comparison $S(x, y)$, and is given by:
$$SSIM(x, y) = L(x, y)\,C(x, y)\,S(x, y) = \frac{(2 u_x u_y + C_1)(2 \sigma_{xy} + C_2)}{(u_x^2 + u_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)}$$
where,
$u_x$ and $u_y$: The (local) sample means of signal x and signal y, respectively
$\sigma_x$ and $\sigma_y$: The (local) standard deviations of x and y
$\sigma_{xy}$: The (local) sample covariance between x and y
$C_1$ and $C_2$: Constants
Next, in order to obtain the direction similarity function D, the direction map Ref of the original absolute difference map and the direction map Dis of the distorted absolute difference map are extracted. Figure 4 shows the simple templates used to extract the direction information of the absolute difference map. The direction similarity method is implemented as follows.
Firstly, convolve the original and distorted absolute difference maps with each direction template individually to obtain four gradient fields.
R5: If max {$P_{d1}(i, j)$, $P_{d2}(i, j)$, $P_{d3}(i, j)$, $P_{d4}(i, j)$} = $P_{d1}(i, j)$, then Dis(i, j) = 1
R6: If max {$P_{d1}(i, j)$, $P_{d2}(i, j)$, $P_{d3}(i, j)$, $P_{d4}(i, j)$} = $P_{d2}(i, j)$, then Dis(i, j) = 2
R7: If max {$P_{d1}(i, j)$, $P_{d2}(i, j)$, $P_{d3}(i, j)$, $P_{d4}(i, j)$} = $P_{d3}(i, j)$, then Dis(i, j) = 3
R8: If max {$P_{d1}(i, j)$, $P_{d2}(i, j)$, $P_{d3}(i, j)$, $P_{d4}(i, j)$} = $P_{d4}(i, j)$, then Dis(i, j) = 4
According to Fig. 3, the strong edge assessment value $Q_q$ and the weak edge assessment value $Q_w$ are obtained from the information mentioned above, computed over the strong and weak edges, respectively, where,
$N_s$ = The number of strong edges in the original or distorted absolute difference map
s = The strong edges
$N_w$ = The number of weak edges in the original or distorted absolute difference map
w = The weak edges
Weighted pooling: The DP-IQA metric value $Q_d$ is obtained by a weighted combination of $Q_q$ and $Q_w$.

The Double Stimulus Continuous Quality Scale (DSCQS) (ITU-R Recommendation BT.500-11, 2002) test methodology was adopted. The subjects scored their judgments of quality, and raw quality scores were generated. The raw quality scores were processed with the Kurtosis method, in which the raw data of four subjects were rejected, and the subjective quality scores are given in terms of Difference Mean Opinion Score (DMOS) on a scale of 0 to 100, where a larger value indicates lower visual quality. Figure 6 shows the dependence of DMOS on the distortion parameters for the different distortion types. The subjective quality of the stereo images drops as the strength of each distortion type increases, which is consistent with the tendency of the "ground truth" quality. These results illustrate the rationality of the experiments.
The performance criteria (Chou and Li, 1995) adopted in our tests include the linear Correlation Coefficient (CC), the Spearman Rank-Order Correlation Coefficient (SROCC) and the Root Mean Squared Error (RMSE). All the performance criteria are computed from DMOS and the OSIQA scores. CC and RMSE are calculated after compensation by nonlinear mapping, while SROCC is independent of the nonlinear mapping. The nonlinear mapping, which adapts the objective quality scores to the subjective quality scores through a logistic curve, follows Eq. (16) (Zhang and Le, 2010) for all the quality measures under test, where $DMOS_p(Q)$ denotes the predicted DMOS and $\beta_k$ (k = 1, 2, 3 and 4) are fitting parameters. The fitting is done by nonlinear regression over the dataset. Compared with CC, SROCC is generally considered a less sensitive correlation measure because it operates only on the data ranks and ignores the distance between data points. Nevertheless, since SROCC is a widely used performance criterion, it is included in the following tests for completeness.
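For readers who want to reproduce the three criteria, a minimal pure-Python sketch follows. The function names are hypothetical, and the logistic mapping of Eq. (16) is deliberately left out, so CC and RMSE here would have to be applied to scores that have already been mapped to predicted DMOS.

```python
import math


def pearson_cc(x, y):
    """Linear (Pearson) correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def srocc(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return pearson_cc(ranks(x), ranks(y))


def rmse(predicted, target):
    """Root mean squared error between predicted and subjective scores."""
    n = len(predicted)
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(predicted, target)) / n)
```

Because SROCC depends only on ranks, any monotonic mapping of the objective scores leaves it unchanged, which is why it does not require the logistic fit.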

Analysis of weights: In the proposed method, the bior4.4 wavelet basis is adopted and the number of decomposition levels l is four. Firstly, the weights ω1 and ω2 of the L-IQA and R-IQA metrics are determined from DMOS, the L-IQA value $Q_l$ and the R-IQA value $Q_r$. The importance of $Q_l$ and $Q_r$ differs for different distortion types (Tam, 2007). Generally speaking, under blur distortion the stereo image quality depends largely on the better of the left and right view images, while under blocking artifacts it is roughly the average of the left and right view image qualities. Figure 7 shows the relation between the ω1 value and DMOS. CC and RMSE are computed from all 312 L-IQA values $Q_l$, R-IQA values $Q_r$ and $DMOS_p$. From Fig. 7, both CC and RMSE attain their best values when ω1 is 0.6. Therefore, ω1 is set to 0.6 and ω2 to 0.4. Secondly, ω3 and ω4 are determined from DMOS, the strong edge assessment value $Q_q$ and the weak edge assessment value $Q_w$. Figure 8 shows the relation between the ω3 value and DMOS. Both CC and RMSE attain their best values when ω3 is 0.6. Therefore, ω3 is set to 0.6 and ω4 to 0.4.
Finally, the λ value is determined from the LR-IQA value $Q_s$, the DP-IQA value $Q_d$ and DMOS. Figure 9 shows the relation between the λ value and DMOS. Both CC and RMSE attain their best values when λ is 0.2. Therefore, λ is set to 0.2.
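The weight selection can be pictured as a simple one-dimensional search. The sketch below scans a weight in steps of 0.1 and keeps the value that maximizes the absolute linear correlation with DMOS. It assumes a linear fusion of two scores with weights w and 1 - w and uses the raw correlation instead of the correlation after logistic mapping, so it is a simplified stand-in for the procedure behind Fig. 7 to 9; all names are hypothetical.

```python
import math


def pearson_cc(x, y):
    """Linear correlation coefficient (inputs must not be constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def best_weight(q_a, q_b, dmos, n_steps=10):
    """Scan w in {0, 1/n_steps, ..., 1} and return (w, |CC|) of the best fusion.

    Assumption: the two scores are fused as w * q_a + (1 - w) * q_b.
    """
    best = None
    for k in range(n_steps + 1):
        w = k / n_steps
        fused = [w * a + (1.0 - w) * b for a, b in zip(q_a, q_b)]
        score = abs(pearson_cc(fused, dmos))
        if best is None or score > best[1]:
            best = (w, score)
    return best


if __name__ == "__main__":
    # Toy scores, not data from the paper.
    q_left = [1.0, 2.0, 3.0, 4.0]
    q_right = [4.0, 1.0, 3.0, 2.0]
    dmos = [10.0, 20.0, 30.0, 40.0]
    print(best_weight(q_left, q_right, dmos))
```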

The performance of the new LR-IQA and DP-IQA metrics: The results of the performance criteria for the LR-IQA and DP-IQA metrics are provided in Tables 3 and 4, respectively. As shown in Tables 3 and 4, neither the LR-IQA metric nor the DP-IQA metric alone is well in line with DMOS. In particular, both CC and SROCC are lower than 0.9 under JP2K distortion in Table 3 and under JPEG distortion in Table 4. The main difference between a single view image and a stereo image is depth perception. Therefore, the OSIQA value combines the LR-IQA and DP-IQA values rather than using either alone; neither LR-IQA nor DP-IQA can be neglected in the OSIQA method.
The performance of the proposed OSIQA method: Table 5 shows the performance of the proposed OSIQA method against DMOS. Both CC and SROCC are higher than 0.92 under all distortion types, and the RMSE values are lower than 7. The scatter plot of DMOS versus the OSIQA scores computed by the proposed measure is shown in Fig. 10. With most points close to the fitted logistic curve in the scatter plot, the proposed measure provides a satisfactory prediction of DMOS for most images. It means that the judgment

CONCLUSION
In this study, we have proposed an Objective Stereo Image Quality Assessment (OSIQA) metric based on the Watson model, which considers depth perception and some characteristics of the Human Visual System (HVS). Experimental results show that the proposed OSIQA measure is effective and consistent with human subjective judgment. Both CC and SROCC are higher than 0.92 under five distortion types and RMSE is smaller than 6.7. However, the weighting coefficients were obtained only empirically through a large number of experiments, and how they are determined needs to be improved in future work.

Fig. 1: The proposed Objective Stereo Image Quality Assessment (OSIQA) method, comprised of the new LR-IQA and DP-IQA metrics

Fig. 2: Diagram of the Left view Quality Assessment (L-IQA) model

$I_{l\_org}$ and $I_{r\_org}$: The original images of the left and right views, respectively
$I_{l\_dis}$ and $I_{r\_dis}$: The distorted images of the left and right views, respectively
$X_{org}$ and $X_{dis}$: The original and distorted absolute difference maps, respectively

Region segmentation: Extensive experiments show that the DP-IQA value of a stereo image mainly depends on that of the strong and weak edge regions. Edges are an important feature of an image because they represent the visual separation between objects and are the most easily perceived. The importance of strong edges, weak edges and non-edges to human perception decreases in that order, which corresponds to the decline of their gradient magnitudes in the image gradient field. A simple edge detection operator, the Sobel operator, is used to segment the images. It has symmetrical templates, one horizontal and one vertical. Image segmentation is implemented in the steps listed after Eq. (9) and (10).

Finally, compute the direction similarity of each local region. Assume that $D(x, y)$ denotes the direction similarity between Ref and the associated Dis in the M×N block with center coordinate (x, y). $A(x, y)$ denotes the number of positions in the block at which Ref and Dis have the same direction, and $B(x, y)$ is the total number of positions of Ref or Dis in the block. $D(x, y)$ is given by $D(x, y) = A(x, y) / B(x, y)$.
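The direction labeling of rules R5 to R8 and the ratio $D = A / B$ can be sketched as below. The four direction templates of Fig. 4 are not reproduced in the text, so the sketch starts from their four response maps; the function names and the tie-breaking rule (the lowest direction index wins) are assumptions.

```python
def direction_map(responses):
    """Assign each pixel the label 1..4 of the template with the largest response.

    responses -- list of four equally sized 2-D lists, one per direction template.
    Ties go to the lowest index (an assumption; the rules do not cover ties).
    """
    h, w = len(responses[0]), len(responses[0][0])
    return [[max(range(4), key=lambda k: responses[k][i][j]) + 1
             for j in range(w)]
            for i in range(h)]


def direction_similarity(ref_block, dis_block):
    """D = A / B for one block of the direction maps Ref and Dis.

    A counts positions where both maps carry the same direction label,
    B is the number of positions in the block.
    """
    same = 0
    total = 0
    for row_r, row_d in zip(ref_block, dis_block):
        for r, d in zip(row_r, row_d):
            total += 1
            if r == d:
                same += 1
    return same / total


if __name__ == "__main__":
    # Toy 2x2 responses: each template dominates exactly one pixel.
    resp = [
        [[1, 0], [0, 0]],
        [[0, 2], [0, 0]],
        [[0, 0], [3, 0]],
        [[0, 0], [0, 4]],
    ]
    ref = direction_map(resp)
    dis = [[1, 2], [4, 4]]
    print(ref, direction_similarity(ref, dis))
```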

Fig. 8: The relation between the weighting of ω3 and DMOS
Fig. 10: Scatter plot of DMOS versus OSIQA scores computed by the proposed measure

Table 1: The coefficient values of $A_{l,f}$

Table 2: $Q_{l,f}$ values of the Y component

Table 3: Performance of LR-IQA and DMOS