Estimation of Jones matrix, birefringence and entropy using Cloude-Pottier decomposition in polarization-sensitive optical coherence tomography.

Estimation of polarimetric parameters has been a fundamental issue to assess biological tissues that have form birefringence or polarization scrambling in polarization-sensitive optical coherence tomography (PS-OCT). We present a mathematical framework to provide a maximum likelihood estimation of the target covariance matrix and its incoherent target decomposition to estimate a Jones matrix of a dominant scattering mechanism, called Cloude-Pottier decomposition, thereby deriving the phase retardation and the optic axis of the sample. In addition, we introduce entropy that shows the randomness of the polarization property. Underestimation of the entropy at a low sampling number is mitigated by asymptotic quasi maximum likelihood estimator. A bias of the entropy from random noises is corrected to show only the polarization property inherent in the sample. The theory is validated with experimental measurements of a glass plate and waveplates, and applied to the imaging of a healthy human eye anterior segment as an image filter.


Introduction
Polarization-sensitive optical coherence tomography (PS-OCT) has been developed as an extension to conventional OCT [1] to observe the depth-resolved polarization property of a sample by measuring the state of a polarized electromagnetic (EM) field [2-6] or the linear map of the polarized EM fields [7][8][9]. Since everything about PS-OCT is more complicated than conventional OCT, it still has plenty of room left for improvement of PS-OCT in aspects of both the hardware and the software. Among them, Makita et al. raised one of the fundamental issues in PS-OCT that the mean of the phase retardation is significantly biased from the true value because of asymmetric distribution [10]. Various methods have been developed to apply averaging to such polarimetric parameters [11][12][13][14][15][16][17][18][19]. In another approach, an estimator of the phase retardation based on the likelihood of the distribution was also developed [20]. However, these approaches are limited to estimating only a partial aspect of the polarization property, because they have focused on one or some of parameters, for example, the phase retardation, the optic axis, Stokes parameters, Jones matrix or degree of polarization uniformity [21]. Lack of a sophisticated and unified method of the estimation is our motivation for this study to develop a new approach that resolves the estimation issue in PS-OCT.
Although the Jones matrix is an operator to characterize the sample's coherent property that changes the state of polarized light [22], it is also a subject of the measurement that inherently involves a stochastic process. In the measurement, the observed quantities need to be evaluated statistically. The most common approach to this may be Stokes-Mueller formalism [23]. Although Stokes-Mueller formalism has advantages consisting of only observable quantities and a capability of geometric treatment as an orthogonal group, it does not provide a clear perspective on statistical correlation between the field variables.
It has been known that the Jones and Stokes-Mueller formalisms can be understood abstractly using group theory [24,25]. In 1986, Cloude extended the polarisation algebra to the special unitary group of degree 4, SU(4), and showed that the positive semidefinite Hermitian matrix in SU(4) was useful in estimating the Jones matrix in SU(2) [26]. This theory has been advanced in the field of polarimetric radar imaging to detect and classify natural and artificial terrain objects from airborne and satellite radar systems [27]. Although there are some variations in how to call this incoherent decomposition method, we call it Cloude-Pottier decomposition after the book in [28]. There are only a few reports that used it for biomedical applications [29][30][31][32], and no reports in OCT to the best of our knowledge.
In this study, we develop a novel method to estimate a 4 × 4 covariance matrix that is converted from the measured Jones matrix. Applying Cloude-Pottier decomposition to the 4 × 4 covariance matrix, we estimate the Jones matrix and entropy that shows a randomness of the Jones matrix, and obtain the phase retardation and the optic axis of the sample from the estimated Jones matrix. To the best of our knowledge, this is the first time to apply the entropy for PS-OCT to show the randomness of the polarization properties. In addition, an underestimation of the entropy at a low sampling number in the ensemble average is mitigated by an estimator for the eigenvalues of the 4 × 4 covariance matrix [33]. Moreover, a bias in the entropy by random noises is estimated and corrected. Specifically, we derive equations that show relationships between the bias and noises for both an accumulated Jones matrix along the depth and a local Jones matrix that cancels the axial accumulation. Section 2 describes the details of the theory, and Section 3 shows experimental results to validate the theory and demonstrates image filtering by Cloude-Pottier decomposition for the anterior eye segment. Section 4 describes discussions about some topics that are relevant to our results, including other methods for parameterizing the depolarization.

Statistical description of the Jones matrix using the target covariance matrix
We write measured Jones matrix of the sample as where the first and second columns are measured Jones vectors in response to orthogonally polarized incident electromagnetic fields with each other. As the first step for the statistical analysis of the Jones matrix, it is rearranged to 1 × 4 target vector that is defined as where Tr(X) denotes a trace of the matrix X, and Ψ is an arbitrary complete basis set. All bases of the Ψ are used to create the target vector κ in Eq.
(2). The symbol κ is used here instead of k that is commonly used in the literature of polarimetric radar imaging to avoid confusion with a wavenumber. A Pauli basis set and a lexicographic ordering basis set are often used as the Ψ [27]. In this paper, we use the lexicographic ordering basis set to simplify the calculation and the representation. The lexicographic ordering basis set is defined as Using Eqs.
(1)-(3), the target vector is given as where the superscripted T denotes a transpose operation to the vector. Assuming that κ can be modeled as multivariate complex Gaussian distribution, the probability density function (pdf) of κ is described as [27,[33][34][35][36] where det(X) denotes a determinant of X, m is the dimension of κ. In the case of Eq. (4), m = 4. The superscripted dagger denotes complex transpose. C is a positive semidefinite covariance matrix that is defined as C = E{κκ † }, where E{X} denotes the expectation of X. C can be estimated by † 1 1 , which is known as the maximum likelihood estimator of C [34], and where n is a sampled number, κ i denotes the i-th target vector in the n averaged samples. It is also known that Z has a complex Wishart distribution as [27,[33][34][35][36] ( ) ( 1)/2 where Γ(X) is a gamma function.
In the case of {Ψ L }, we use a new symbol T instead of Z for clarity, and define T as [27] 2 where the overline denotes ensemble average, assuming that the sample is wide-sense stationary and ergodic in the region of interest. By definition, T is a positive semidefinite Hermitian matrix as well as C and Z.

Cloude-Pottier decomposition
In Cloude-Pottier decomposition [26][27][28], T is rearranged by matrix diagonalization as †  (10) where A is a diagonal matrix with elements of four eigenvalues, and U is a unitary matrix that consists of four 1 × 4 eigenvectors. The inequalities of the eigenvalues is given because T is a positive semidefinite Hermitian matrix. The eigenvalues in Eq. (10) are normalized as 4 , 1, 2,3, 4, so that 0 ≤ λ i ʹ ≤ 1. The eigenvalue λ i ʹ can be regarded as a probability that the corresponding eigenvector e i occurs in a sense of multinomial process [37]. In the case of {Ψ L }, the eigenvector e i is rearranged to a Jones matrix L i as As described above, this incoherent approach to derive L i by Cloude-Pottier decomposition is applicable for the measured Jones matrix in Eq. (1). However, it does not show the local polarization property of the sample, because all influence before and after the backscattering at the sample is accumulated in Eq. (1) along the depth. The cumulative attribute can be eliminated by calculating a local Jones matrix [10,38,39]. Instead of the cumulative Jones matrix in Eq. (1), we use Eq. (19) in [40] as the local Jones matrix, where we also processed the data by k-dependence correction. Since a product of two multivariate Gaussian distributions is also a multivariate Gaussian distribution [41], the statistical description in the above also holds for the local Jones matrix. In the case of the local Jones matrix, it was weighed as where the same notations in [40] are used in Eq. (13); sample ′ E  is a measured Jones matrix of the sample with k-dependence correction, and two Jones matrices that are axially separated for δz centered at a depth z are used. 2 X denotes an operator norm. The Cloude-Pottier decomposition is then used to estimate the local Jones matrix. A local phase retardation R i of L i is derived as where ( ) Assuming that the sample has a dominant scattering mechanism, L 1 and R 1 that are paired up with the largest eigenvalue λ 1 can be regarded as the Jones matrix and the local phase retardation of the dominant scattering mechanism, respectively [26]. In addition, when the sample is modeled as multinomial process [37], an expected value of the phase retardation can be defined as In addition, optic axis of L i is derived from an eigenvector that is paired to ( ) 1 i ξ . The eigenvector is converted to Stokes vector, and the angle relative to a reference can be calculated.
To show a degree of randomness of the Jones matrix in a region of interest, an entropy, which was originally developed for quantum mechanics [42] and has been applied to statistical optics [23,43,44], is derived in the von Neumann sense as [26,37,45] 4 log , where the logarithmic base is set to be m = 4 so that 0 ≤ H ≤ 1. When H = 0, the sample has completely homogeneous Jones matrix. When H = 1, the sample has a completely random Jones matrix. In practice, the measured entropy is between them.

Asymptotic quasi maximum likelihood estimator (AQ-MLE) of the eigenvalues
It is known that the entropy is underestimated if the number of samples for ensemble average of the covariance matrix is low [33,46,47]. To mitigate the underestimation of the entropy, we use AQ-MLE of the eigenvalues [33]. Since the probability density function of the eigenvalues is too complicated and is not an analytical expression, López-Martínez et al. introduced some approximations to derive the estimator in [33].
The eigenvalues of the ensemble averaged covariance matrix are estimated by the AQ-MLE as where Ĥ is the estimated entropy.

Correction of the bias in the entropy induced by noise for the non-localized cumulative Jones matrix
Since the entropy shows randomness of the measured Jones matrix, it includes both polarization property of the subject and influence of the noise. When the signal-to-noise ratio (SNR) is low, the bias to the entropy would be high, resulting in unreliable quantitative evaluation of the polarization property of the subject. Compared to polarimetric radar imaging, this influence may be more prominent in PS-OCT, where we are often interested in an axially deep position whose signal fades exponentially. We therefore develop a correction method to remove the bias of the entropy induced by the noise. As the first step, we develop the bias correction for the non-localized cumulative Jones matrix shown in Eq. (1). Since the random noise and the polarization property of the subject are independent, the following additivity is satisfied; measured subject noise , where H measured , H subject and H noise are the entropies of the measured data, the subject and the noise, respectively. In this section, we focus on the estimation of H noise to remove the bias in H measured . The 4 × 4 covariance matrix can be divided to two subspaces that are related to E 1 and E 2 . The entropy satisfies the following inequality called subadditivity [48]; where H noise (E 1 ) and H noise (E 2 ) are the subspace entropies, and H noise (E 1 ,E 2 ) is the explicit expression as a joint entropy of the noise in the covariance matrix. Assuming that the noises in the subspaces of E 1 and E 2 are independent, Eq. (20) holds equality as Equation (21) shows that the entropy of the noise in 4 × 4 covariance matrix can be derived from the summation of the entropies of the noises in the subspaces, namely, 2 × 2 covariance matrices. Nevertheless, it may be still difficult to utilize the pdfs of the covariance matrix and the eigenvalue [27,33,46] to derive H noise . Hence, we suggest yet another approach in the following.
H noise (E 1 ) and H noise (E 2 ) can be derived from the eigenvalues as where ( ) i j ζ is the j-th eigenvalue of the subspace composed of E i . The eigenvalue has the following relationship [23] with a degree of polarization (DOP) or a degree of polarization uniformity (DOPU) in the case of OCT [21]; where P (i) is the DOPU of the subspace composed of E i . Consequently, it turns out that H noise (E 1 ,E 2 ) can be derived if a relationship between the DOPU and the noise is derived. Assuming that the measured Jones matrix has an independent complex additive white Gaussian noise in each matrix element, the Jones matrix can be described as [49] 1H Note that Eq. (24) is a cumulative Jones matrix along the depth that is characteristic to PS-OCT as described above. Stokes parameters of [g 1H g 1V ] T are defined as and Stokes parameters of [g 2H g 2V ] T are also defined similarly. In [49], biases in s 0 (1) and s 1 (1) by the noise variance were canceled, and ensemble average of the Stokes parameters were used to calculate the DOPU. In contrast, since we would like to derive only the bias component in DOPU, we take a different approach in the following. The Stokes parameters without the noise components are derived as where δ = arg[E 1H E 1V *]. δ is influenced by polarization property of the subject but not by the signal intensity. In addition, it is eventually cancelled in P (i) as seen later. We then define other Stokes parameters using ensemble averaged signal intensities as where we implicitly assume that the signal amplitude is much higher than the noise amplitude for each element of the Jones matrix so that distributions of the measured amplitudes are Gaussian [50] and that the polarization property of the subject has a symmetric pdf of the signal intensity so that it does not bias the ensemble average in Eq. (27). DOPU that shows the bias by the relative relationship between the signal intensity and noise power is derived as P (2) is also derived similarly. Using Eqs. (21)-(23), (27) and (28), H noise is calculated from the measured signal intensities and noise powers. Note that Eq. (28) used ensemble averaged signal intensities but not ensemble averaged Stokes parameters, which enabled us to derive the DOPU of the bias rather than a bias-corrected DOPU of the sample. The additivity of the entropy shown in Eqs. (19) and (21) ensures our approach to separate the individual contributions.

Correction of the bias in the entropy induced by noise for local Jones matrix
Although the equations in Section 2.4 are effective for a certain axial depth, we have not considered cumulative attributes of the Jones matrix along the axial depth in PS-OCT. If a region of interest is set spatially including the axial direction, the resultant DOPU and entropy may have an artifact by high birefringence that changes the Jones matrices rapidly in the axial direction. Hence, we derive H noise of the local Jones matrix in this section.
To calculate the local Jones matrix, two Jones matrices that are axially separated for δz centered at a depth z are used as described in [40] and Eq. (13). Using the same notations in [40] and Eq. (13), we write all matrix elements down as The local Jones matrix is derived as Since the determinant in Eq. (30) has the effect of distorting the noise floor of the matrix, it complicates the estimation of the entropy. We therefore use the following equation instead; To calculate H noise of the local Jones matrix, we use Eq. (31) instead of Eq. (24) and follow the same procedure to calculate P (1) and P (2) . Unlike Eq. (24) where the third term in the right side is mutual information of ε 1 and ε 2 [48]. It can be rearranged using a conditional entropy as where Var(X) and E(X) denote variance and an expected value of X, respectively. Conditional variances that are necessary to calculate Eq. (34) are derived as Equation (34)  and similarly for all other elements.

Experimental results
To validate the theory experimentally, we show measurement results of three static samples; a glass plate (WG11010, Thorlabs, Newton, New Jersey, USA), quarter waveplates (QWP) at 633 nm (WPQ10ME-633, Thorlabs) and at 1310 nm (WPQ10ME-1310, Thorlabs). The QWP at 633 nm has a single-pass phase retardation of 141 nm at 1310 nm wavelength according to the manufacturer, which is converted to a double-pass phase retardation of 1.35 rad. Although it is not exactly a double-pass phase retardation of one eighth waveplate (EWP), we call it EWP for expedience. The PS-OCT system used for the experiment, which had a center wavelength of 1297 nm, was described in [40]. In the case of the measurements of the static samples, the axial separation δz that corresponds to a thickness of the sample is too long to ignore nonlinear frequency response of the signals between the OCT channels (cf [51].). To create correction parameters, we measured 2% intralipid as an axially-distributed non-birefringent scattering sample. Equations (9) and (10) of [40] were calculated for the axial depth domain, namely, q HV (z) and q 12 (z), but not for k-dependence in this case. They were averaged for 800 × 128 Ascans, and the depth-dependent signal amplitudes and phases were smoothed by 11th-order polynomial fit. Inverse matrices of the resultant Q HV (z) and Q 12 (z) were multiplied to the Jones matrix as shown in Eq. (15) of [40] to the axial depth domain.
For the analysis of the static samples, we extracted each single pixel of the front and back surfaces, respectively, to apply the ensemble averaging. We note that we did not use axially neighboring pixels around the front and back surfaces for the ensemble averaging in the case of the static samples.

Estimation of the phase retardation
To evaluate the estimation performance of Cloude-Pottier decomposition at high-entropy condition, the local Jones matrix between the front and back surfaces of the static samples were measured, and the phase retardation was estimated by mean, E(R) in Eq. (15) and R 1 in Eq. (14). The results of the glass plate, the EWP and the QWP are plotted in Fig. 1(a), 1(b) and 1(c), respectively. In Fig. 1, the plot of each A-scan shows the phase retardation calculated by each single A-scan. The plots of mean, E(R) and R 1 were calculated using all 0 to n-th A-scans for the estimation. The entropies of (a), (b) and (c) were 0.74, 0.62 and 0.66, respectively. The effective SNRs (ESNRs) [10,52] of (a), (b) and (c) were 3.50, 5.27 and 4.88 dB, respectively. The mean and E(R) were biased when the true value is close to the sides of the measurable range as shown in Fig. 1(a) and 1(c). The biases of the mean and E(R) were because of the asymmetric distribution and noise components in R i (i ≥ 2), respectively. The biases of the mean and E(R) in Fig. 1(b) were smaller than (a) and (c), because the distribution was nearly symmetric. In contrast, R 1 converged on the true value in all of Fig. 1(a), 1(b) and 1(c) with less than ~32 A-scans. Hence, R 1 would be the most effective way of unbiased estimation even if the SNR is low.

Estimation of the optic axis
Since the optic axis is wrapped when the phase retardation is exactly π [38], the EWP was measured to evaluate the estimation of the optic axis. The orientation was set from 0 to 180 degrees with a 10-degree step, and the orientation was estimated by the eigenvector of L 1 in Cloude-Pottier decomposition. The eigenvector was converted to the Stokes parameters, and a relative orientation to 0 degree that used 800 A-scans for the ensemble average of T was calculated. A wrapping over 90-degree orientation was unwrapped manually. The estimated optic axis orientation is shown in Fig. 2. To see the influence of the number of samples, the ensemble average of T was calculated using 8, 16,32,64 or 128 A-scans. This calculation was repeated in 800 A-scans to see the distributions of the estimation results. The entropy and the ESNR were 0.25 and 12.42 dB in average, respectively.    Fig. 1 and Fig. 2 showed that the Jones matrix could be estimated by L 1 in Cloude-Pottier decomposition without the bias.

Estimation of the entropy by AQ-MLE
To evaluate the AQ-MLE shown in Eqs. (17) and (18), we measured the glass plate, the EWP and the QWP with settings of balanced and unbalanced SNRs. The settings are summarized in Table 1, where signal intensities relative to the maximum Jones-matrix element are shown. Note that the terminologies of the balanced-and unbalanced-SNR settings indicate relative relationships of the signal intensities between the Jones-matrix elements, but these are not related to balanced photoreceivers of the photodetection system. Since it is impossible to evaluate whole of SU(2) space experimentally, we only use these settings in Table 1 to confirm that the entropies H, Ĥ, and H noise follow our theory experimentally in Sections 3.3, 3.4 and 3.5. In addition, the signals were attenuated arbitrarily to see the convergence of the entropy in various conditions. The results are shown in Fig. 3. In this evaluation of AQ-MLE, only the local Jones matrix were used to calculate the entropies.  Fig. 3, the entropy without AQ-MLE, H, was generally underestimated at small numbers of A-scans, n ≤ ~20. The signals with high H had high variance and required more n till they converged. The underestimation was mitigated in the entropy with AQ-MLE, Ĥ, at the small n, although the Ĥ still had remaining variance before the convergence. Both H and Ĥ converged to the same entropy at high n as expected in Eq. (17). As the signal attenuation was lower, H was lower and the difference between H and Ĥ was small even at small n. These observations were common in all of the samples and settings in Table 1, indicating that AQ-MLE was a robust and effective method to mitigate the underestimation of the entropy.  Table 1.

Estimation of H noise at a single depth
Since H subject of a static sample is 0 temporally, H measured of the static sample should equal H noise in Eq. (19). To validate the theory in Section 2.4 under this simple condition, H noise of the measured cumulative Jones matrix at a certain depth is calculated from the measured data with the same settings shown in Table 1. Using the equations in Section 2.4, H noise is estimated and plotted with the entropy Ĥ in Fig. 4. In Fig. 4, H noise mostly corresponded to Ĥ with no dependence on the samples and the signal settings. Looking at the plots closely, however, H noise slightly overestimated Ĥ. Although the reason of the overestimation is unclear, we speculate that a portion of the noise in the OCT channels might be correlated because of the depth-encoded configuration. When the entropy Ĥ is higher than 0.4, some plots showed the underestimation of Ĥ by H noise . It would be because the assumption of strong constant phasor plus a weak random phasor sum [50] was not valid.
The study protocol adhered to the tenets of the Declaration of Helsinki and was approved by the institutional review board of Tohoku University Graduate School of Medicine. First, a coherent Jones-matrix filter [40] was applied to the measured cumulative Jones matrix with a kernel size of 3 × 1 pixels (axial × lateral), which corresponds to 20.6 × 15 µm 2 in tissue. The local Jones matrix was then calculated with the axial separation of 2 pixels (13.6 µm in tissue). The spatially moving filter of Cloude-Pottier decomposition was applied to the local Jones matrix with a kernel size of 7 × 11 pixels (axial × lateral), which corresponds to 48.1 × 165 µm 2 in tissue. These spatial kernel sizes were determined by taking a trade-off between the sampling number and the loss of spatial resolution into consideration. A threshold was applied to the pixels by polarization-insensitive signal intensity [40] to exclude the noise area from the ensemble average of T. Figure 6 shows the images of the polarization-insensitive intensity (a), the local retardation R 1 (c), the entropy H without any correction (e) and Ĥ with the AQ-MLE and the correction of the bias H noise (g), respectively. An intensity threshold was applied to Fig. 6(c), 6(e) and 6(g) to color the noise area black. An attenuation coefficient image [53] that was created using the raw data of Fig. 6(a) is shown in Fig. 6(b). Composite images of Fig. 6(c), 6(e) and 6(g) with 6(b) were created by multiplying normalized Fig. 6(b) to individual RGB values of Fig. 6(c), 6(e) and 6(g), respectively.
In Fig. 6(c), trabecular meshwork and the peripheral sclera showed high local retardation as indicated by a white arrow. The contrast of the local retardation between conjunctiva and sclera was visible in Fig. 6(c) and 6(d). The sclera had more organized fibrous tissue at the anterior stroma around the equatorial region as indicated by a light-blue arrow in Fig. 6(c). Although pigmented tissues, such as the posterior region of the iris and ciliary body indicated by light-green arrows, showed high local retardation, these seems to be artificial as discussed in the results of the entropy.
In Fig. 6(e)-6(h), the entropy bias by H noise can be identified at low signal intensity regions. Although the posterior region of the iris and the ciliary body showed high entropy close to 1 in Fig. 6(e), the entropy with the bias correction in Fig. 6(g) showed moderately high entropy as indicated by the light-green arrows. This result is because of melanin in the pigmented tissues, which is common in the retinal pigment epithelium (RPE) [21,[54][55][56][57][58][59]. High entropy of the cornea indicated by red arrows in Fig. 6(e) was corrected in Fig. 6(g) except a part of the anterior corneal stroma, where polarization scrambling effect is seen if the probing beam illuminates the corneal lamellae in a nearly perpendicular direction [40]. The sclera showed inhomogeneous and moderately high entropy in all of the imaged area as shown in Fig. 6(g) and 6(h).
Another interesting region was indicated by a blue arrow in Fig. 6(e) and 6(g), which showed high entropy but low local retardation. Since this region corresponds to the corneoscleral limbus, where the corneal and scleral laminar tissues change transitionally with fibrovascular ridges [60,61], the anisotropic fibrous structure might have anisotropic optic axes in the kernel window and have resulted in the high entropy. It is known that such anisotropic fibrous structure can be found at various biological tissues [62][63][64]. Since the entropy is sensitive to both the phase retardation and the optic axis, the entropy is effective to find any type of anisotropic birefringence except for a special case with a combination of diattenuation and depolarization [65]. As shown in the corneoscleral limbus, the visualization of both the local retardation and the entropy was important to characterize the tissue. In addition, the optic axis imaging will provide more comprehensive understanding of the tissue birefringence [62][63][64], and remains for future studies.

Discussion
The experimental results in Section 3 validated the theory in Section 2. The phase retardation and the optic axis were estimated correctly without the bias. The underestimation of the entropy was mitigated by AQ-MLE. The influence of the noise to the entropy was estimated as H noise from the signal intensities and noise powers, and removed from the images in Fig.  6(g) and 6(h) to show only H subject . Although we created the composite images in Fig. 6(d), 6(f) and 6(h) by linearly multiplying the logarithmic attenuation coefficient image to preserve the original contrast in Fig. 6(b), this operation resulted in decreased brightness in most regions of the images. When subjective viewability matters more than the linearity of the operation, it may be effective to apply nonlinear image processing for contrast enhancement such as adjustment of dynamic range, gamma correction and adaptive methods [67,68].
Since our previous method of Jones-matrix averaging, coherent Gaussian spatial filter [40], was a coherent averaging process, it could not show the statistical property of the signals. In contrast, Cloude-Pottier decomposition enabled to show the statistical property using the entropy. Although these methods have different attributes that make a comparison in such aspect unavailable, both can calculate the local retardation. Using the same raw data of Fig. 6, the local retardation images of the anterior eye segment were calculated and shown in Fig. 7 to compare these processing methods. The kernel sizes of these filtering methods in Fig. 7(a) and 7(b) followed Ref [40]. for the coherent Gaussian spatial filter and Section 3.6 for Cloude-Pottier decomposition, respectively. Although both methods were effective to reduce the speckle noise, some differences were found qualitatively. First, the scleral local retardation in Fig. 7(a) was higher than Fig. 7(b). The coherent averaging in Fig. 7(a) might make the anisotropic birefringence look as high birefringence with speckle noise remained, while the incoherent averaging of Cloude-Pottier decomposition in Fig. 7(b) might make it depolarized and result in lower birefringence relatively. Second, the influence of the polarization scrambling at the ciliary body and the posterior region of the iris was high in Fig.  7(a) but was low in Fig. 7(b). This result indicated that the statistical treatment of the data by Cloude-Pottier decomposition could reduce the artifact in the local retardation caused by the polarization scrambling.  Cloude-Pottier decomposition relies on the mathematical theorems that the covariance matrix C follows the complex Wishart distribution and that Z and T are the MLE of C [34]. Together with various parameters calculated from L i and λ i , this multivariate distribution has also been utilized for the segmentation and classification of geographic objects [27,37,[69][70][71][72][73][74].
These successful applications in radar polarimetry encourage us to adapt their approaches for PS-OCT in future.
To visualize the polarization scrambling, DOPU was developed and applied to the identification of the RPE [21,[55][56][57][58][59]75,76]. In addition, the bias correction of DOPU was suggested [49]. DOPU and entropy are associated through Eq. (23) [23], which we used to derive H noise . Although DOPU is a parameter to characterize the uniformity of the state of EM fields, the entropy is a naturally scalable parameter to characterize the randomness not only of the state of EM field but also of the linear map of EM fields. Another parameter, depolarization index, was also suggested by Lippok et al. [77]. Since it is known that the depolarization index and the entropy have a restricted relationship [78], they would show similar properties. In addition, differential depolarization index was recently suggested by Ortega-Quijano et al., and was shown to be effective for enhancing the depolarization contrast [79] and insensitive to an artifact by the diattenuation [80] like Lorentz depolarization indices [65]. It would be interesting to compare these parameters, and remains for future studies.
Uniqueness of the Cloude-Pottier decomposition has been discussed [81,82]. On the one hand, however, there could be arbitrary ways of the matrix decomposition, as various approaches of the decomposition have been suggested in Mueller formalism [83,84]. On the other hand, a controversial argument was raised about the physical meaning of the decomposed components in the Cloude-Pottier decomposition [85], partly because the relationship between the coherence and the polarization is not fully understood yet. In another perspective, it is known that the concept of the target covariance matrix is mathematically related to the density matrix developed in quantum mechanics [86]. As such, understanding of PS-OCT may progress along with the related research areas.

Summary
In summary, we developed the theory on the unbiased estimation of Jones matrix, birefringence and entropy using Cloude-Pottier decomposition for PS-OCT. The experimental results of the glass plate, the waveplates and the anterior segment of the healthy eye proved the effectiveness of the estimation. Despite the complexity of the theory at a glance, the basic procedure of the calculation is simple; ensemble average of the target covariance matrix, eigendecomposition and some rearrangement. Our approach would be useful in assessing the statistical properties of the Jones matrix measured from biological subjects by PS-OCT and to evaluate the polarization properties of the subjects quantitatively, and additionally could have potential to help clinical examinations with further development.

Funding
This study was supported in part by Tomey Corporation. The funder had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.