Bayesian maximum likelihood estimator of phase retardation for quantitative polarization-sensitive optical coherence tomography

This paper presents the theory and numerical implementation of a maximum likelihood estimator for local phase retardation (i.e., birefringence) measured using Jones-matrix-based polarization sensitive optical coherence tomography. Previous studies have shown conventional mean estimations of phase retardation and birefringence are significantly biased in the presence of system noise. Our estimator design is based on a Bayes’ rule that relates the distributions of the measured birefringence under a particular true birefringence and the true birefringence under a particular measured birefringence. We used a Monte-Carlo method to calculate the likelihood function that describes the relationship between the distributions and numerically implement the estimator. Our numerical and experimental results show that the proposed estimator was asymptotically unbiased even with low signal-to-noise ratio and/or for the true phase retardations close to the edge of the measurement range. The estimator revealed detailed clinical features when applied to the in vivo anterior human eye. © 2014 Optical Society of America OCIS codes: (170.4500) Optical coherence tomography; (110.5405) Polarimetric imaging; (170.4460) Ophthalmic optics and devices; (170.4470) Ophthalmology; (170.3890) Medical optics instrumentation; (110.4500) Optical coherence tomography. References and links 1. M. R. Hee, D. Huang, E. A. Swanson, and J. G. Fujimoto, “Polarization-sensitive low-coherence reflectometer for birefringence characterization and ranging,” J. Opt. Soc. Am. B 9, 903–908 (1992). 2. J. F. de Boer, T. E. Milner, M. J. C. van Gemert, and J. S. Nelson, “Two-dimensional birefringence imaging in biological tissue by polarization-sensitive optical coherence tomography,” Opt. Lett. 22, 934–936 (1997). 3. M. Miura, M. Yamanari, T. Iwasaki, A. E. Elsner, S. Makita, T. Yatagai, and Y. Yasuno, “Imaging polarimetry in age-related macular degeneration,” Invest. Ophthalmol. Vis. Sci. 49, 2661–2667 (2008). 4. M. Yamanari, M. Miura, S. Makita, T. Yatagai, and Y. Yasuno, “Phase retardation measurement of retinal nerve fiber layer by polarization-sensitive spectral-domain optical coherence tomography and scanning laser polarimetry,” J. Biomed. Opt. 13, 014013 (2008). #210480 $15.00 USD Received 21 Apr 2014; revised 10 Jun 2014; accepted 11 Jun 2014; published 26 Jun 2014 (C) 2014 OSA 30 June 2014 | Vol. 22, No. 13 | DOI:10.1364/OE.22.016472 | OPTICS EXPRESS 16472 5. M. Pircher, C. K. Hitzenberger, and U. Schmidt-Erfurth, “Polarization sensitive optical coherence tomography in the human eye,” Prog. Retin. Eye Res. 30, 431–451 (2011). 6. B. Elmaanaoui, B. Wang, J. C. Dwelle, A. B. McElroy, S. S. Liu, H. G. Rylander III, and T. E. Milner, “Birefringence measurement of the retinal nerve fiber layer by swept source polarization sensitive optical coherence tomography,” Opt. Express 19, 10252–10268 (2011). 7. S. Zotter, M. Pircher, E. Götzinger, T. Torzicky, H. Yoshida, F. Hirose, S. Holzer, J. Kroisamer, C. Vass, U. Schmidt-Erfurth, and C. K. Hitzenberger, “Measuring retinal nerve fiber layer birefringence, retardation, and thickness using wide-field, high-speed polarization sensitive spectral domain OCT,” Invest. Ophthalmol. Vis. Sci. 54, 72–84 (2012). 8. S. Sakai, M. Yamanari, Y. Lim, N. Nakagawa, and Y. Yasuno, “In vivo evaluation of human skin anisotropy by polarization-sensitive optical coherence tomography,” Biomed. Opt. Express 2, 2623 (2011). 9. B. H. Park, C. Saxer, S. M. Srinivas, and J. S. Nelson, “In vivo burn depth determination by high-speed fiber-based polarization sensitive optical coherence tomography,” J. Biomed. Opt. 6, 474–479 (2001). 10. W.-C. Kuo, M.-W. Hsiung, J.-J. Shyu, N.-K. Chou, and P.-N. Yang, “Assessment of arterial characteristics in human artherosclerosis by extracting optical properties from polarization-sensitive optical coherence tomography,” Opt. Express 16, 8117–8125 (2008). 11. S. K. Nadkarni, “Optical measurement of arterial mechanical properties: from atherosclerotic plaque initiation to rupture,” J. Biomed. Opt. 18, 121507 (2013). 12. M. Villiger, E. Z. Zhang, S. K. Nadkarni, W.-Y. Oh, B. J. Vakoc, and B. E. Bouma, “Spectral binning for mitigation of polarization mode dispersion artifacts in catheter-based optical frequency domain imaging,” Opt. Express 21, 16353–16369 (2013). 13. D. K. Kasaragod, Z. Lu, J. Jacobs, and S. J. Matcher, “Experimental validation of an extended jones matrix calculus model to study the 3D structural orientation of the collagen fibers in articular cartilage using polarization-sensitive optical coherence tomography,” Biomed. Opt. Exp. 3, 378 (2012). 14. P. O. Bagnaninchi, Y. Yang, M. Bonesi, G. Maffulli, C. Phelan, I. Meglinski, A. El Haj, and N. Maffulli, “In-depth imaging and quantification of degenerative changes associated with achilles ruptured tendons by polarization-sensitive optical coherence tomography,” Phys. Med. Bio. 55, 3777–3787 (2010). 15. Y. Yang, A. Rupani, P. Bagnaninchi, I. Wimpenny, and A. Weightman, “Study of optical properties and proteoglycan content of tendons by polarization sensitive optical coherence tomography,” J. Biomed. Opt 17, 081417 (2012). 16. D. Fried, J. Xie, S. Shafi, J. D. B. Featherstone, T. M. Breunig, and C. Le, “Imaging caries lesions and lesion progression with polarization sensitive optical coherence tomography,” J. Biomed. Opt. 7, 618–627 (2002). 17. Y. Chen, L. Otis, D. Piao, and Q. Zhu, “Characterization of dentin, enamel, and carious lesions by a polarization-sensitive optical coherence tomography system,” Appl. Opt. 44, 2041–2048 (2005). 18. R. S. Jones, C. L. Darling, J. D. B. Featherstone, and D. Fried, “Remineralization of in vitro dental caries assessed with polarization-sensitive optical coherence tomography,” J. Biomed. Opt. 11, 014016 (2006). 19. B. Cense, T. C. Chen, B. H. Park, M. C. Pierce, and J. F. de Boer, “In vivo depth-resolved birefringence measurements of the human retinal nerve fiber layer by polarization-sensitive optical coherence tomography,” Opt. Lett. 27, 1610–1612 (2002). 20. B. Cense, T. C. Chen, B. H. Park, M. C. Pierce, and J. F. de Boer, “In vivo birefringence and thickness measurements of the human retinal nerve fiber layer using polarization-sensitive optical coherence tomography,” J. Biomed. Opt. 9, 121–125 (2004). 21. S. Guo, J. Zhang, L. Wang, J. S. Nelson, and Z. Chen, “Depth-resolved birefringence and differential optical axis orientation measurements with fiber-based polarization-sensitive optical coherence tomography,” Opt. Lett. 29, 2025–2027 (2004). 22. S. Makita, M. Yamanari, and Y. Yasuno, “Generalized jones matrix optical coherence tomography: performance and local birefringence imaging,” Opt. Express 18, 854–876 (2010). 23. C. Fan and G. Yao, “Mapping local retardance in birefringent samples using polarization sensitive optical coherence tomography,” Opt. Lett. 37, 1415–1417 (2012). 24. B. Cense, T. C. Chen, B. H. Park, M. C. Pierce, and J. F. de Boer, “Thickness and birefringence of healthy retinal nerve fiber layer tissue measured with polarization-sensitive optical coherence tomography,” Invest. Ophthalmol. Vis. Sci. 45, 2606–2612 (2004). 25. M. Yamanari, K. Ishii, S. Fukuda, Y. Lim, L. Duan, S. Makita, M. Miura, T. Oshika, and Y. Yasuno, “Optical rheology of porcine sclera by birefringence imaging,” PLoS ONE 7, e44026 (2012). 26. S. Nagase, M. Yamanari, R. Tanaka, T. Yasui, M. Miura, T. Iwasaki, H. Goto, and Y. Yasuno, “Anisotropic alteration of scleral birefringence to uniaxial mechanical strain,” PLoS ONE 8, e58716 (2013). 27. S. Sakai, M. Yamanari, A. Miyazawa, M. Matsumoto, N. Nakagawa, T. Sugawara, K. Kawabata, T. Yatagai, and Y. Yasuno, “In vivo three-dimensional birefringence analysis shows collagen differences between young and old photo-aged human skin,” J. Invest. Dermatol. 128, 1641–1647 (2008). 28. S. Sakai, N. Nakagawa, M. Yamanari, A. Miyazawa, Y. Yasuno, and M. Matsumoto, “Relationship between dermal birefringence and the skin surface roughness of photoaged human skin,” J. Biomed. Opt. 14, 044032 (2009). #210480 $15.00 USD Received 21 Apr 2014; revised 10 Jun 2014; accepted 11 Jun 2014; published 26 Jun 2014 (C) 2014 OSA 30 June 2014 | Vol. 22, No. 13 | DOI:10.1364/OE.22.016472 | OPTICS EXPRESS 16473 29. C. Fan and G. Yao, “Imaging myocardial fiber orientation using polarization sensitive optical coherence tomography,” Biomed. Opt. Exp. 4, 460 (2013). 30. M. J. Everett, K. Schoenenberger, B. W. Colston Jr, and L. B. Da Silva, “Birefringence characterization of biological tissue by use of optical coherence tomography,” Opt. Lett. 23, 228–230 (1998). 31. L. Duan, S. Makita, M. Yamanari, Y. Lim, and Y. Yasuno, “Monte-carlo-based phase retardation estimator for polarization sensitive optical coherence tomography,” Opt. Express 19, 16330–16345 (2011). 32. E. Götzinger, M. Pircher, B. Baumann, T. Schmoll, H. Sattmann, R. A. Leitgeb, and C. K. Hitzenberger, “Speckle noise reduction in high speed polarization sensitive spectral domain optical coherence tomography,” Opt. Express 19, 14568–14585 (2011). 33. C. Hitzenberger, E. Goetzinger, M. Sticker, M. Pircher, and A. Fercher, “Measurement and imaging of birefringence and optic axis orientation by phase resolved polarization sensitive optical coherence tomography,” Opt. Express 9, 780–790 (2001). 34. M. Yamanari, S. Makita, V. D. Madjarova, T. Yatagai, and Y. Yasuno, “Fiber-based polarization-sensitive fourier domain optical coherence tomography using b-scan-oriented polarization modulation method,” Opt. Express 14, 6502–6515 (2006). 35. M. Yamanari, S. Makita, and Y. Yasuno, “Polarization-sensitive swept-source optical coherence tomography with continuous source polarization modulation,” Opt. Express 16, 5892–5906 (2008). 36. M. Yamanari, S. Makita, Y. Lim, and Y. Yasuno, “Full-range polarization-sensitive swept-source optical coherence tomography by simultaneous transversal and spectral modulation,” Opt. Express 18, 13964–13980 (2010). 37. B. Baumann, W. Choi, B. Potsaid, D. Huang, J. S. Duker, and J. G. Fujimoto, “Swept source / fourier domain polarization sensitive optical coherence tomography with a passive polarization delay unit,” Opt. Express 20, 10218–10230 (2012). 38. Y. Lim, Y.-J. Hong, L. Duan, M. Yamanari, and Y. Yasuno, “Passive component based multifunctional jones matrix swept source optical coherence tomography for doppler and polarization imaging,” Opt. Lett. 37, 1958–1960 (2012). 39. M. J. Ju, Y.-J. Hong, S. Makita, Y. Lim, K. Kurokawa, L. Duan, M. Miura, S. Tang, and Y. Yasuno, “Advanced multi-contrast jones matrix optical coherence tomography for doppler and polarization sensitive imaging,” Opt. Express 21, 19412–19436 (2013). 40. J. F. De Boer and T. E. Milner, “Review of polarization sensitive optical coherence tomography and stokes vector determination,” J. Biomed. Opt. 7, 359–371 (2002). 41. S. Jiao, M. Todorovic, G. Stoica, and L. V. Wang, “Fiber-based polarization-sensitive mueller matrix optical coherence tomography with continuous source polarization modulation,” Appl. Opt. 44, 5463–5467 (2005). 42. K. H. Kim, B. H. Park, Y. Tu, T. Hasan, B. Lee, J. Li, and J. F. de Boer, “Polarization-sensitive optical frequency domain imaging based on unpolarized light,” Opt. Express 19, 552–561 (2011). 43. B. H. Park, M. C. Pierce, B. Cense, and J. F. de Boer, “Jones matrix analysis for a polarization-sensitive optical coherence tomography system using fiber-optic components,” Opt. Lett. 29, 2512–2514 (2004). 44. D. W. Scott, Multivariate density estimation: theory, practice, and visualization (Wiley, New York, NY [u.a., 1992). 45. D. Freedman and P. Diaconis, “On the histogram as a density estimator:L 2 theory,” Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete 57, 453–476 (1981). 46. Y. Lim, M. Yamanari, S. Fukuda, Y. Kaji, T. Kiuchi, M. Miura, T. Oshika, and Y. Yasuno, “Birefringence measurement of cornea and anterior segment by office-based polarization-sensitive optical coherence tomography,” Biomed. Opt. Express 2, 2392–2402 (2011). 47. Y. Yasuno, M. Yamanari, K. Kawana, M. Miura, S. Fukuda, S. Makita, S. Sakai, and T. Oshika, “Visibility of trabecular meshwork by standard and polarization-sensitive optical coherence tomography,” J. Biomed. Opt. 15, 061705 (2010). 48. A. Miyazawa, M. Yamanari, S. Makita, M. Miura, K. Kawana, K. Iwaya, H. Goto, and Y. Yasuno, “Tissue discrimination in anterior eye using three optical parameters obtained by polarization sensitive optical coherence tomography,” Opt. Express 17, 17426–17440 (2009). 49. K. Kawana, T. Kiuchi, Y. Yasuno, and T. Oshika, “Evaluation of trabeculectomy blebs using 3-dimensional cornea and anterior segment optical coherence tomography,” Ophthalmology 116, 848–855 (2009). 50. Y. Yasuno, M. Yamanari, K. Kawana, T. Oshika, and M. Miura, “Investigation of post-glaucoma-surgery structures by three-dimensional and polarization sensitive anterior eye segment optical coherence tomography,” Opt. Express 17, 3980–3995 (2009).


Introduction
Polarization-sensitive optical coherence tomography (PS-OCT) [1,2] is an adaption of optical coherence tomography (OCT), which introduces additional contrast by including birefringence information of the tissue samples. It is particularly useful when investigating birefringent tissues such as those that consist of collagen, muscle, and nerve fibers. PS-OCT has been extensively applied to ophthalmology [3][4][5][6][7], dermatology [8], burn depth imaging [9], plaque characterization in cardiology [10][11][12], articular cartilage imaging [13], tendonopathy [14,15], and dentistry [16][17][18]. Phase retardation is the primary polarization quantity obtained by PS-OCT. The phase retardation image derived by PS-OCT has a similar depth resolution as that of the standard OCT. However, phase retardation is a cumulative quantity and its depth resolution is not obvious. This issue can be solved by calculating a secondary quantity; local phase retardation, which is also denoted as local birefringence. Local birefringence can be obtained by a depth-oriented linear regression of the phase retardation [19,20], Stokes parameter analysis [21], or Jones matrix analysis [22,23]. Birefringence is a non-cumulative quantity, so it enables depth-resolved tissue-discriminatory imaging. Birefringence imaging and investigation have been applied to ophthalmology [24][25][26], dermatology [8,27,28], and cardiology [29].
Despite the high-utility of birefringence and phase retardation imaging, it is difficult to accurately quantify these values. The noise in PS-OCT has a nonlinear influence on the phase retardation, and it adds an intractable systemic bias [22,30]. It significantly hinders the quantification of birefringence and generates image artifacts.
The systemic bias depends on the signal-to-noise ratio (SNR) of the OCT signal [22]. Typically, a SNR higher than 25 dB is required to obtain reasonably reliable phase retardation. The quality of the phase retardation is commonly improved by using a mean average of the phase retardation over multiple observations. Although this removes speckles, it does not reduce the systemic bias [31]. This is mainly due to the following facts. First, the nonlinear computations that derive the phase retardation from complex OCT signals result in an asymmetric noise distribution of the phase retardation. Second, the measurement range of PS-OCT is confined, typically to [0, π]. This results in aliasing and, consequently, the distribution of the measured phase retardation under noise is skewed. This effect is more prominent if the true phase retardation is close to the edge of the range.
An improved averaging method was proposed [32] for Hee-Hitzenberger's PS-OCT scheme [1,33]. Although some accuracy improvements were expected, this method was mainly designed to reduce speckle and does not specifically tackle the noise-induced systemic bias. In addition, this method relies on an intensity averaging of OCT signals. Therefore, it cannot be applied to phase-sensitive PS-OCT schemes such as the general Stokes-parameter-based PS-OCT and Jones-matrix-based PS-OCT.
In this paper, we propose an inductive method for obtaining a bias-free estimation of birefringence (i.e., phase retardation) from multiple measured PS-OCT values. This method is based on the statistical properties of the measured local birefringence under noise. The distribution of the measured local birefringence is defined using a Monte-Carlo method in which additive Gaussian noise is applied to simulated raw complex OCT signals. We thus numerically determine the nonlinear noise properties of the measured local birefringence. We apply a Bayesian statistical approach to calculate the probability distribution of the true birefringence under specific measured birefringence values. The final estimation of the true birefringence is calculated by applying a maximum likelihood estimator (MLE) to the obtained probability distribution of the true birefringence. We validated this Bayesian maximum likelihood estimator (Bayesian MLE) using numerical methods and experiments. We have also demonstrated that the Bayesian MLE reveals birefringence properties of in vivo normal and pathologic anterior eyes that are more consistent with anatomical expectations than the conventional mean averaging technique.

Local birefringence measurement using Jones matrix OCT
Jones matrix OCT (JM-OCT) [34-39] was utilized in this study among several other schemes for PS-OCT [40][41][42]. In a JM-OCT image, each pixel represents a Jones matrix. JM-OCT uses the Jones matrix to provide the polarization properties of tissues such as the round trip phase retardation, diattenuation, relative optic axis orientation, and scattering intensity. Local birefringence can also be obtained using a further depth resolved analysis of the phase retardation [22,23].
Details of JM-OCT theory have been previously published [22,34,35,39]. In this section, we summarize the theory and process for localized birefringence measurements using JM-OCT for a better understanding of our approach.
JM-OCT requires at least two different input polarization states and polarization diversity detections of interference signals to determine the Jones matrix of the sample. From the detected interference signals, we can obtain four OCT signals that correspond to four combinations of the two incident and two detection polarizations. These four OCT signals are associated with the round trip Jones matrices of the sample, J s (z), the illumination optics, J in , and the collection optics, J out . Their relationships can be described as [34,35,39] where superscripts (1) and (2) denote the two orthogonal input polarization states, subscripts out H and out V denote the horizontal and vertical components of the polarization diversity detection, and E(z) represents complex OCT signals along the depth z. J all (z) is the Jones matrix corresponding to each pixel of a JM-OCT image. In this and the following equations, the transversal positions are omitted for simplicity, without losing generality. The round trip Jones matrix of the sample (J s (z)) cannot be determined from this equation because J in and J out are unknown. We can alternatively derive the similar matrix J s (z) [43], where z 0 represents the depth location of the sample surface. The eigenvalues of this similar matrix are obtained using matrix diagonalization [22,35,39]. Because the similarity transform preserves the eigenvalues of the matrix, the eigenvalues of this similar matrix are identical to those of J s (z). Hence, the round-trip phase retardation of the sample can be obtained using the phase difference between the eigenvalues. The phase retardation obtained by Eq. (2) is a cumulative quantity of the depth. Therefore, it does not give a correct quantification of the birefringence at a local point. The local polarization properties of the sample around z are obtained by modifying Eq. (2) to where J s (z 2 )J s (z 1 ) −1 represents the localized polarization property of the sample between two depth points z 1 ≡ z − Z d /2 and z 2 ≡ z + Z d /2, with Z d a small depth separation. It is clear that the obtained Jones matrix J l (z; Z d ), which is denoted as the local Jones matrix, is a similarity transform of J s (z 2 )J s (z 1 ) −1 . Therefore, its eigenvalues are identical to those of J s (z 2 )J s (z 1 ) −1 . Assuming homogeneous birefringence with uniform optic axis orientation in the region between z 1 and z 2 , a local phase retardationδ (z; Z d ) can be obtained from the phase difference between two eigenvalues of J l (z; Z d ) [22]. The local birefringence around z is then derived from the linear relationship of the local phase retardation and the depth separation, where k represents the wavenumber corresponding to the center wavelength of the OCT probe beam. Thus, JM-OCT can perform local birefringence imaging.

Noise modeling for the Jones matrix OCT
The purpose of this paper is to present an estimator that corrects the measurement error in the phase retardation obtained by JM-OCT, or equivalently the local birefringence. Therefore, it is important to properly model the effect of the noise in JM-OCT. As described in Section 2.1, JM-OCT derives the phase retardation from four OCT signals. In these four OCT signals, the noise can be modeled as an additive noise with a complex Gaussian distribution. In our matrix formulation, it is written as with where N(z) is a noise matrix, and n 1···4 are random complex noises. The complex noises are independent from each other, and their real and imaginary parts obey zero-mean Gaussian distributions with the same standard deviation σ n . If we consider this noise, the equation of local Jones matrix [Eq. (3)] is modified to where J l (z; Z d ) is a local Jones matrix affected by the noise. In a practical JM-OCT measurement, the phase retardation obtained from this local Jones matrix is affected by noise. The process of deriving the phase retardation from the Jones matrix is nonlinear, so its distribution is no longer Gaussian or symmetric if noise is present. In addition, as shown in previously published numerical analyses, the mean and mode of the measured phase retardation are not representative of the true phase retardation [22,31].
Although the distribution of the phase retardation is too complex to be fully analytically described, Makita et al. showed that it can be defined by three independent parameters; the angle of two incident polarization states on a Poincaré sphere ζ , the true phase retardation, and a parameter called the effective signal-to-noise ratio (ESNR) γ, as depicted by Eq. (38) of Ref. [22]. The ESNR is the harmonic mean of the signal-to-noise ratios (SNRs) of the OCT signals that are used to obtain J l (z), and is defined as where SNR (1,2) (z) is the SNR of the OCT signals obtained with the first or second incident polarization states and is defined as where 2σ 2 n is the power of the complex noise defined by the standard deviation of the distribution of the real and imaginary parts of the additive noise σ n . In practical implementations of JM-OCT, ζ becomes a fixed parameter (i.e., a constant), which is defined by the system design. Hence, the distribution of the measured phase retardation under noise can be fully characterized by only two independent parameters; the true phase retardation and ESNR, γ.
As previously discussed, the distribution of the measured phase retardation under noise is neither symmetric nor Gaussian. Therefore, its mean does not approach the true phase retardation, even if we have a large number of measurements. If the ESNR is high, the distribution becomes narrow and the mean provides a rational estimation of the true phase retardation. However, the required ESNR for a rational estimation is around 25 dB [22,31], which is not always achievable in clinical measurements. This limitation of the mean estimation of phase retardation (and similarly of local birefringence) has motivated us to develop an alternative estimator that is capable of properly estimating the local birefringence at a low ESNR.

Distributions of measured and true birefringence
The estimator proposed in this paper is based on the association between two distributions.
The first distribution is the probability distribution of the measured local birefringence b at a specific true birefringence value β and a specific ESNR γ. And it is represented by a distribution function of f (b; β , γ). Since this is a probability distribution, therefore it is normalized so that its integral on b is unity. Note that this distribution is characterized by two parameters, β and γ, as discussed in Section 2.2. In practice, b is a function of z, as shown in Eq. (4). However, we have omitted it for simplicity without sacrificing generality. Practically, this distribution function can be determined using a histogram of the measured birefringence obtained by multiple measurements at a single point.
The second distribution is the probability distribution of the true birefringence when a specific birefringence and ESNR has been measured. And it is expressed by a distribution function p(β ; b, γ). This distribution is also referred to as a posterior distribution, because it is a probability distribution determined only after the measurement was performed. The purpose of a birefringence measurement is to know the true birefringence, so this probability distribution is our main interest.
When this distribution is known, we can use several techniques to estimate the true birefringence. For example, the mean estimation of the true birefringence of a measured birefringence of b with ESNR γ is and similarly, the maximum likelihood estimation (mle) is Although we are actually interested in finding p(β ; b, γ), it is an inverse problem. And hence, it cannot be directly obtained from measurements or numerical simulations. Note that, in this paper, the maximum likelihood estimation is denoted as mle (in lowercase), while the maximum likelihood estimator is denoted as MLE (in uppercase). In our estimation scheme, we obtain p(β ; b, γ) from f (b; β , γ) using Bayes' rule. Bayes' rule for the probability distributions leads to where π(β ) represents a priori knowledge of the probability distribution of the true birefringence, which is called as the prior distribution. This equation also can be presented in the following form The parameter β and variable b of f (b; β , γ) in Eq. (12) are treated as a variable and a parameter in Eq. (13), respectively, without a loss of generality. In this form, f (β ; b, γ) is referred to as a likelihood function of β .

Bayesian maximum likelihood estimator of the true birefringence
The basic strategy for our estimator is to first estimate the probability distribution of the true birefringence for a measured birefringence and ESNR. This estimation uses Eq. (13) with a numerically predefined f (β ; b, γ). Then, the mle of the true birefringence is derived from the posterior distribution.
We require at least one measurement for this estimation, but a larger number of measurements increases the accuracy of the estimation. In this section, we first describe the estimation algorithm for a single measurement and then extend this to the multiple measurements case. Finally, we also define a metric that quantifies the reliability of the estimation.

Bayesian maximum likelihood estimator for a single measurement
Assume that we have measured a birefringence of b 1 and an ESNR of γ 1 . For this measurement, Eq. (13) says that the posterior distribution of the true birefringence p 1 (β ) is where the subscripts to the right of vertical line indicate that the observed measurement set was (b 1 , γ 1 ). Note that π(β ) in Eq. (13) is assumed to be a uniform distribution to derive Eq. (14), because we do not have any a priori knowledge on the true birefringence before taking the measurement. p 1 (β ) is the probability distribution of the true birefringence given the measured birefringence of b 1 and ESNR of γ 1 . Here the ESNR is determined by the measured OCT signal energy and the predefined noise energy. When implementing this estimator, f (β ; b 1 , γ 1 ) (or more generally f (β ; b, γ)) is obtained from numerical simulations, as described in Section 3.1. The mle of the true birefringence is then obtained by substituting Eq. (14) into Eq. (11) to get β

Bayesian maximum likelihood estimator for multiple measurements
The mle for multiple measurements is an extension of the estimation process for a single measurement. Assume that we have measured two sets of birefringence and ESNRs for (b 1 , γ 1 ) and (b 2 , γ 2 ). The first estimation of the posterior distribution of the true birefringence p 1 (β ; b 1 , γ 1 ) is obtained from the first measurement set, as in Eq. (14). This distribution is then updated using the second set of measurements by substituting (b 2 , γ 2 ) into Eq. (13) and using p 1 (β ; b 1 , γ 1 ) as the prior distribution π(β ). The posterior distribution of the true birefringence after two measurements is This equation can be generalized for N measurements by repeating the process to obtain Eq. (16) and The mle after N measurements is calculated by substituting It is clear that the mle with a single measurement [Eq. (15)] is a specific case of Eq. (18) with N = 1.

Reliability measure of the estimation
One additional benefit of the Bayesian MLE is that it provides not only the true birefringence estimation, but also the full probability distribution. We can use this distribution to derive a quantitative metric for the reliability of the estimation.
In this study, we have used the locally integrated likelihood value around the estimated true birefringence as the reliability measure. Here Δβ is a small predefined interval of true birefringence, which is typically the resolution of a numerically generated likelihood function (see Section 3.1). Note that the posterior distribution p N (β ) is normalized because it is a probability density function, i.e., S p N (β )dβ = 1. S represents the range of the true birefringence, which is typically predefined when numerically computing f (β ; b, γ). This locally integrated likelihood L is used to form a pseudo-color image of the estimated birefringence, as described in Section 3.4.

Numerical implementation of the Bayesian maximum likelihood estimator
To numerically implement a Bayesian MLE based on Eq. (18), we need to predefine the likelihood function f (β ; b, γ). In our implementation, the likelihood function is numerically defined by a series of Monte-Carlo calculations, which are based on the noise model described in Section 2.2.
A single set of Monte-Carlo calculations consists of 65,536 trials. For each trial, two Jones matrices (J all (z 1 ) and J all (z 2 )) are generated such that the birefringence between them is set to a particular value depending on the true birefringence β . Gaussian noises are added to each entry of the Jones matrices, where the standard deviation of the noise (σ n ) is set to ensure a ESNR value of γ. The measured birefringences b of each trial are calculated using Eqs. (4) and (7). An averaged shifted histogram [44] is derived from the distribution of b, where the bin size is defined by the Freedman-Diaconis rule [45], and the number of bins is 1024. This histogram is used as a distribution function. Namely, a single set of Monte-Carlo calculations provides the distribution function of the simulated b under a particular true birefringence β and ESNR γ, This Monte-Carlo calculation is repeated for 201 true birefringence values ranging from 0 to 0.0088 with a resolution of 0.000044, and for ESNR values from 0 to 40 dB with a resolution of 1 dB. The true birefringence range corresponds to the phase retardation range of 0 to π rad under a chosen depth separation (Z d ), which is 37 μm (6 pixels) in this paper. This series of Monte-Carlo calculations provides a set of distribution functions f (b; β , γ) with several values of β and γ. And hence, in the implementation, f (b; β , γ) can be represented as a three-dimensional (3-D) numerical array. By transposing this 3-D array, it can be regarded as a set of likelihood functions with several b and γ values, f (β ; b, γ). Lanczos interpolation is then used to obtain likelihood for 1024 true birefringence values from f (β ; b, γ), which ensures a birefringence estimation resolution of 0.0000086. The Lanczos interpolation has a kernel size of 40 sampling points (Lanczos' parameter of 20). To avoid ringing artifacts, the data is extrapolated to 10 data points at both edges before interpolation. Note that this Monte-Carlo calculation is only performed once, and the resulting 3-D numerical array is stored for the estimation process.
In the estimation process, the 3-D array is indexed by each set of measured values of (b i , γ i ). A 1-D array of f (β ; b i , γ i ) is obtained for each measurement. When using multiple measurements, the posterior distribution [Eq. (17)] is defined as an entry-wise product of the 1-D arrays for each measurement. Finally, a maximum search is performed on the product to implement Eq. (18), and an mle is obtained.

Jones matrix OCT system
A swept-source Jones matrix OCT was utilized to experimentally validate the method. The light source is a polygon-based wavelength swept laser (HSL-2000, Santec, Aichi, Japan), which has a sweeping frequency of 30 KHz and a center wavelength of 1.3 μm. The depth resolution was measured to be 9.2 μm in tissue.
Two orthogonal polarization states of the incident light were multiplexed by polarization dependent phase modulation, using a fiber-inline electro-optic modulator (PC-B3-00-SFAP-SFA-130-UL; EOspace, Redmond, Washington). The probe arm of the interferometer was attached to a scanning head designed for clinical anterior-eye measurements. The spectral interference signals were detected by a polarization diversity detection scheme, where the interference light was split into horizontal and vertical polarization components and independently detected using two balanced photodetectors. The combination of the modulation based multiplexing of the incident beam and the polarization diversity detection simultaneously provided four OCT signals, which correspond to the four entries of J all [Eq. (5)].
More details of this system can be found in Ref. [46].

Data acquisition protocol
The Bayesian MLE requires multiple sets of measurements at each position in a sample. We used two different protocols to obtain the data sets in this study. Protocol-1 is repetitive B-scan measurement, which scans the same location on a sample multiple times. In this protocol, we obtain multiple birefringence values at exactly the same location of the sample.
Protocol-2 is based on a single B-scan. The mle is obtained from the pixels within a small spatial region, which is called a kernel. With this protocol, the MLE uses the birefringence from pixels at slightly different positions. This protocol relies on an assumption that the true birefringence does not significantly vary within the kernel. Despite this additional assumption, this protocol is particularly useful for volumetric in vivo measurements, where it is hard to obtain multiple B-scans at a single location.

Image formation
A pseudo-color birefringence image enhances the clinical understandability of a birefringence tomography. In this map, the brightness of each pixel is defined by the scattering intensity The scattering intensity is defined as an average of the intensity of two OCT images from two JM-OCT detectors with a single incident polarization state. We use the incident polarization state whose images appear closer to the reference depth (and hence have higher SNR). This scattering intensity is insensitive to sample birefringence.

Numerical validations
We first numerically validated the Bayesian MLE using simulated data sets generated by the Monte-Carlo method.
For the first numerical validation, a data set was generated with ESNR values from 5 to 40 dB at1-dB steps, with seven set true birefringence values. The depth separation for the local birefringence calculation was Z d = 37 μm. For each combination of the ESNR and true birefringence, we simulated 1024 measurements and calculated the Bayesian mle and mean birefringence. The results were then converted into phase retardation for easy interpretation. Fig. 1 shows the Bayesian mle (blue curve) and mean phase retardation (red curve) over the ESNR range, where each curve corresponds to a true birefringence value, indicated by a dashed horizontal line. It is shown that the Bayesian MLE provides rational estimation through the ESNR range except at ESNR less than 6 dB with the true phase retardation of π/6 rad. On the other hand, the mean phase retardation significantly departs from the true birefringence even at relatively high ESNR of around 15 to 20 dB. Especially for true phase retardations of 0 and π rad, the mean phase retardations are significantly biased from the true phase retardations even at 25 dB. It is clear that Bayesian MLE provides significantly more rational estimation than the mean phase retardation, especially for low ESNR.
For the second validation, 25 measurements were simulated for 10-and 20-dB ESNR. The true birefringence was set to β = 0.002, and the depth separation for the local birefringence calculation was Z d = 37 μm. The phase retardation values for each simulated measurement are shown in Figs. 2(a) (10-dB ESNR) and 2(b) (20-dB ESNR) represented by cross (×) symbols, and the true birefringence is represented by horizontal dashed lines. The Bayesian mle and the mean measured birefringence are also plotted in Figs. 2(a) and 2(b) respectively as blue and red curves with circles. In these plots, the horizontal axis indicates the number of measurements used by the estimation. It is clear that as the number of measurements increases, the mle asymptotically approaches the true birefringence. On the other hand, the mean erroneously converges to a value that has a significant bias when compared with the true birefringence for the case of 10-dB ESNR [ Fig. 2(a)].
The likelihood functions f (β ; b, γ) corresponding to the 1st, 11th, and 21st simulated measurements are shown in Figs. 2(c) (for 10-dB ESNR), and 2(e) (20-dB ESNR), respectively with red, green and blue curves. Similarly, the posterior distributions p 1 (β ), p 11 (β ), and p 21 (β ), generated using the 1st (red), 1st to 11th (green), and 1st to 21st (blue) measurements are shown in Fig. 2(d) (10-dB ESNR), and Fig. 2(f)  . It is clear that the posterior distributions become sharper as more measurements are used, as expected. This sharpening can be regarded as an increase in the locally integrated likelihood as the number of measurements increases, as shown in Fig. 2(a) (10-dB ESNR) and Fig. 2(b) (20-dB ESNR). It is noteworthy that in the case of 10-dB ESNR, the mle approaches the set birefringence, although the sharpness of the posterior distribution is relatively low.
We repeated this validation 50 times to more quantitatively examine the unbiasedness and repeatability of the Bayesian estimation. For 10-dB ESNR, the mean and standard deviation of the estimated birefringence over 50 examinations were 0.00197 ± 0.00039 (mean ± standard deviation) for mle and 0.00235 ± 0.00029 for the mean estimation, while the true birefringence was set to be 0.00200. For 20-dB ESNR, it was 0.00200 ± 0.00009 for mle and 0.00201 ± 0.00009 for the mean estimation. These results indicate that the Bayesian mle appears with lower systemic bias than mean birefringence.

Experimental validations with waveplates
We quantitatively validated the method using samples with known phase retardations, including a non-birefringent glass plate, a one-eighth waveplate, and a quarter waveplate, which had round-trip phase retardations of 0, π/2 and π rad, respectively. The double-pass phase retardation and separation (Z d ) between the front and back surfaces was measured using JM-OCT, and the birefringence was obtained using Eq. (4). The estimated birefringence values were then reconverted into double-pass phase retardations so that the results were easy to understand. Different ESNR settings were obtained by changing the probe power using a neutral density filter in front of the sample . 1024 A-lines were obtained for a single position on a sample, and then used in the estimation.
It should be noted that the birefringences of the one-eighth and quarter waveplates do not fit the set true birefringence range of the numerically obtained likelihood function described in Section 3.1. And hence, we numerically generated tailored likelihood functions for the waveplate and glass plate validations so that the true birefringence values were between 0 and π/2kZ d , which corresponds to a 0 to π phase retardation range. Figure 3(a) shows the Bayesian mles (blue crosses) and means of the measured phase retardation (red circles) at each set ESNR. The data points plotted in the upper, middle, and lower regions respectively correspond to the estimations of the quarter waveplate, one-eighth waveplate, and glass plate. The purple dashed lines represent the expected retardations of π, π/2, and 0 rad.
For the one-eighth waveplate (middle plots), both the Bayesian MLE and mean provide good estimations around π/2 rad for all ESNR settings, as expected from the error properties of JM-OCT [22]. expected retardation (π rad), while the mean underestimates the value. This underestimation is especially significant for low ESNRs. This estimation bias of the mean is more significant for the glass plate (lower plots), where the phase retardation is overestimated for low ESNRs.
On the other hand, the Bayesian mle agrees well with the expected retardation (0 rad) over the whole ESNR range. Figures 3(b)-3(d) show the Bayesian mles (blue curves) and mean phase retardations (red curves) of the quarter waveplate, the one-eighth plate, and the glass plate. The purple horizontal dashed lines indicate the expected true phase retardations, and the horizontal axes represent the measurement index and number of measurements used in the estimations. The black dots indicate the measured phase retardations, where the horizontal axis represents the measurement index, e.g. the indexes of A-lines. The estimations were obtained using the measured phase retardations plotted to the left or at the position where the estimation is plotted. The locally integrated likelihood L of the posterior distribution is also plotted (black curve) as a reliability measure of the Bayesian mles. The ESNRs were 9.0 ± 4.8 dB (mean ± standard deviation) for the quarter waveplate [ Fig. 3(b)], 9.1 ± 4.3 dB for the one-eighth wave plate [ Fig. 3(c)], and 8.6 ± 5.5 dB for the glass plate [ Fig. 3(d)].
The mean phase retardation converges to incorrect values for the quarter waveplate (red curve in Fig. 3(b)) and the glass plate (red curve in Fig. 3(c)). However, the Bayesian mle approaches the expected true birefringence in all cases (blue curves). The insets of Figs. 3(b)-(d) show enlarged plots at the region with small number of measurements. It is evident that the Bayesian mles converge on the expected true phase retardation values even with small number of measurements.

In vivo birefringence measurements
To demonstrate the applicability of the Bayesian MLE for in vivo measurements, we used images of a normal anterior eye chamber and a bleb after glaucoma surgery (trabeculectomy). The former was examined using the repetitive B-scan protocol (Protocol-1 of Section 3.3) and the latter was examined using a spatial kernel (Protocol-2). The research protocol adhered to the tenets of the Declaration of Helsinki and was approved by the institutional review board of University of Tsukuba.

Birefringence estimation by repeated B-scans
A normal anterior eye chamber close to the iridocorneal angle was scanned using JM-OCT in vivo. This measurement was performed with Protocol-1, i.e., 12 B-scans covering 6 mm transversal range were obtained at the same location on the eye. An adaptive Jones matrix averaging [39, 46] with a kernel size of 3 (depth) × 5 (transversal) pixels was performed on the measured Jones matrix J all (z) to reduce noise. Local birefringence was obtained using a depth separation (Z d ) of 37 μm (6 depth pixels). The mean birefringence and Bayesian mle at each position were obtained using the 12 measurement values from the 12 B-scans. Figure 4 shows the average scattering intensity of 12 B-scans (a), a representative cumulative phase retardation (b), the mean local birefringence (c), the Bayesian mle of local birefringence (d), the locally integrated likelihood map of the mle, and (e) a pseudo-color birefringence map created by combining the mean intensity, Bayesian mle, and the locally integrated likelihood.
The cumulative phase retardation image [ Fig. 4(b)] clearly reveals a banded appearance around the trabecular meshwork (arrows). This distinctive banded appearance is a characteristic feature that highlights the trabecular meshwork [47]. However, it is hard to segregate the particular trabecular meshwork tissue from this cumulative phase retardation image. This difficulty can be overcome using local birefringence imaging based on either the mean birefringence [ Fig. 4(c)] or Bayesian MLE [ Fig. 4(d)]. In both images, the trabecular meshwork is delineated as a low birefringence region within highly birefringent sclera, as indicated by the white arrowheads in Figs. 4(c) and 4(d). It is clear that a higher contrast and clearer delineation were obtained when using Bayesian MLE. Although the birefringence properties of trabecular meshwork have only been investigated in some PS-OCT studies based on cumulative phase retardation [47], the birefringence properties obtained by the Bayesian MLE are consistent with previously reported cumulative phase retardation images.
Sclera is expected to be highly birefringent because it consists of densely packed collagen. This high birefringence appears around the region indicated by arrowheads in Fig. 4(b), which shows rapid changes in the phase retardation along the depth. The high birefringence is visible in both the mean local birefringence image [ Fig. 4(c)] and the Bayesian mle image [ Fig. 4(d)] as depicted as green regions. The Bayesian mle image clearly revealed fine birefringent structures within the sclera, which are hard to see in the mean birefringence image.
The region above the sclera consists of episclera and conjunctiva. Our previous study demonstrated that this region has low birefringence by comparing the phase retardation image of a porcine eye with its Masson's trichrome histology [48]. This region can be clearly seen as a low birefringence layer, as indicated by the yellow arrowheads in both the mean birefringence [ Fig. 4(c)] and Bayesian mle images [ Fig. 4(d)]. Note that the random appearance at the surface of the sample, which is because the surface of the sample was included in the small depth region for calculating the local birefringence.
The phase retardation appears relatively high at the limbus (green arrowheads in Fig. 4(a)). Since this region does not consist of highly birefringent tissues, this appearance would be an artifact caused by low scattering intensity. The mean birefringence image [ Fig. 4(c)] exhibit the same artifact (moderately green appearance indicated by green arrowheads). In contrast, this region appeares with nearly no birefringence in the Bayesian mle image [ Fig. 4(d)].
A notable contrast difference between the mean birefringence and Bayesian mle images can be seen at the uvea, as indicated by the red arrowheads in Figs. 4(c) and 4(d). Although, the uvea has high values in the mean birefringence image, this is an artifact caused by low signal intensity and low ESNR. The Bayesian mle image has lower birefringence values at the uvea when compared with the sclera, as expected.
The locally integrated likelihood map [ Fig. 4(e)] is a measure of the reliability of the estimation. According to this map, the Bayesian mle at the sclera, episclera, and conjunctiva is highly reliable. The Bayesian mle at the uvea is moderately reliable, but the estimated birefringence is consistent with anatomical prediction. The reliabilities and birefringence estimations can be combined into a single pseudo-color image using the method described in Section 3.4, as shown in Fig. 4(f). This image reduces the risk of misinterpreting the birefringence image and is more suitable for clinical diagnoses. For example, the high-birefringence-artifact at the surface in Bayesian mle image [ Fig. 4(d)] is successfully bleached.

Birefringence estimation from a spatially localized kernel
The performance of the Bayesian MLE with a spatial kernel (Protocol-2) is demonstrated by measuring a trabeculectomy bleb after glaucoma surgery.
The subject is 71 year-old Japanese woman. A trabeculectomy was performed on her left eye two years before the JM-OCT examination. The intraocular pressure was measured to be 15 mmHg with topical application of an ocular hypotensive agent, which indicates that the bleb is not functioning. So it is rational to expect fibrosis in the bleb region.
A volumetric JM-OCT scan was performed on a 12-mm × 12-mm transversal region with a horizontal-fast raster scanning protocol. The JM-OCT volume comprises 128 B-scans, each of which consist of 512 A-scans. An adaptive Jones matrix averaging [39, 46] with a kernel size of 3 (depth) × 5 (transversal) pixels was performed on the measured Jones matrix J all (z) to reduce noise. The local birefringence was obtained with a depth separation (Z d ) of 37 μm (6 pixels). The Bayesian mles of each pixel were obtained by using the measured birefringence and ESNRs within a 3 × 3-pixel spatial kernel. Figure 5 shows a representative horizontal cross section of the scattering OCT (a), cumulative phase retardation (b), mean birefringence within the kernel (c), Bayesian mle (d), locally integrated likelihood map (e), and pseudo-color map (f). A fluid pool can be seen in the scattering image, as indicated by an asterisk (*) in Fig. 5(a). Moderate scattering regions can be seen at the right of the fluid pool (arrowheads), which are known to be associated with bleb function [49].
The cumulative phase retardation image [ Fig. 5(b)] shows large changes of retardation along the depth at the moderate scattering regions (arrowheads), which is an indicator of fibrosis. However, the cumulative nature of the phase retardation obscures the detailed birefringence structure. Both the mean birefringence image some marked differences between the mean birefringence and the Bayesian mle images. The region surrounding the fluid pool has low birefringence values in the Bayesian mle image, but has high to moderate values in the mean birefringence image. These high birefringence values in the mean image would be an artifact of the low scattering intensity and low ESNR over this region.
As was the case for the anterior eye chamber, the pseudo-color image [ Fig. 5(f)] provides a less-ambiguous understanding of the birefringence properties of the bleb. In particular, we can clearly see high birefringence values in the moderate scattering region as indicated by arrowheads. This indicates that there is fibrosis in this region, and it suggests that the functionality of this bleb would be degraded [50].

Computational speed
The entire Bayesian mle process takes a significant amount of time, but the most computationally intensive part of the method is only performed once for each PS-OCT design. Therefore, the estimation can be performed within a practical computational time. The Bayesian mle process consists of two steps: generating the likelihood function using a Monte-Carlo method and estimating each pixel using Eq. (18).
The second step takes 30.5 s for a B-scan of 512 × 614 pixels and 9 measurements, when using a custom-made program written in LabVIEW 2011 on a Windows 7 PC with Intel Core i7 Q720 CPU (4 cores, at 1.6 GHz clock frequency) and 8 GB RAM. We have not parallelized the process in our implementation, but this would be easy to do because the estimation process for each pixel is independent. We can easily reduce the time taken by the second step using parallel computing methods such as multi-core CPU processing or GPU processing.
The first step of generating the likelihood function takes significantly longer than the second step. In our implementation using LabVIEW 2011, it takes approximately 23 hours for a likelihood function with 201 true birefringence configurations and 41 ESNR configurations. 65,536 Monte-Carlo incidences are calculated for each configuration, which results in a total of 540,082,176 incidences. This calculation does take a long time, but it is not prohibitive to practical applications because it only needs to be performed once for each system design. The generated likelihood function can be stored and repeatedly used for all subsequent estimations. In addition, it can easily be parallelized because the Monte-Carlo incidences are independent.

Comparison with Monte-Carlo based non-Bayesian estimator
The method presented in this paper has a similar framework with our previously published method [31], so called Monte-Carlo-based phase retardation estimator. Both of the methods use Monte-Carlo method to characterize the effect of noise in the phase retardation (or birefringence) measurement, and estimate the true phase retardation (or birefringence) from multiple measurements by using the characterized noise property. In practice, the Bayesian MLE presented in the current paper (new method) was developed as an improved version of the previous Monte-Carlo-based estimator (old method).
The main difference between these two methods is that the new method explicitly distinguishes measured and true birefringence, while the old method was formulated only based on the distribution of measured phase retardation. In the old method, a distribution transform function was determined by using simulated distribution of measured phase retardation, and the transform function was applied to experimentally measured phase retardations. Namely, this method manipulates the measured phase retardation and treats the manipulated result as a true phase retardation. Although it has provided rational estimation of phase retardation, its theoretical legitimacy was not fully clear.
On the other hand, the new method clearly distinguishes the measured and true birefringence in its theory. And these two types of birefringence are related by a well-established formula; Bayes' rule. And hence, the new method has more rational theoretical basis. This higher theoretical legitimacy would have resulted in the better estimation performance especially with samples with true phase retardation close to the edges of the measurable range, i.e., 0 and π rad.

ESNR enhancement by adaptive Jones matrix averaging
The Bayesian MLE requires not only a measured birefringence value but also an ESNR value to estimate the birefringence. As described in Eq. (8), the ESNR is defined from SNRs of pixels which are utilized to compute the birefringence. In our implementation the SNRs are computed from measured signal energies and measured noise floor. It should be noted that, a true SNR is a statistical value, while the computed (measured) SNR is an estimation of the true SNR. Although the estimated SNR well agree with the true SNR for high SNR values, it significantly departs from true SNR for low SNR. And hence, the ESNR defined from the measured SNRs also significantly departs from the true ESNR if the SNRs are low. It consequently results in failure of birefringence estimation.
To avoid this failure, adaptive Jones matrix averaging was applied before applying Bayesian MLE for in vivo measurement, which is frequently presented with low SNR. Since adaptive Jones matrix averaging is essentially a complex averaging, it enhances the SNR. The SNR values are then defined from signal intensities of the averaged OCT images and its noise floor.
In our implementation, the adaptive Jones matrix averaging was applied by using spatial kernel with a size of 5 (transversal) × 3 (depth) pixels. Although this averaging reduces the image resolution, it enhances the SNRs and consequently enhances the estimation accuracy of ESNR. In addition, the reduced resolution relaxes the required spatial accuracy of the pixel collocation utilized for Bayesian MLE. Namely, all pixels utilized for the estimation of a single location should be correlated to each other, structurally, optically, or pseudo-optically. The reduction of resolution by Jones matrix averaging numerically enhances the pseudo-optical correlation. It can positively affect in vivo measurement with protocol-2 (with spatial kernel) or measurement of a sample which has non-negligible motion.

Effect of speckle in estimation
In general, an OCT image is known to be dominated by speckle. Because of the speckle, some pixels in a pixel set utilized for estimation of single point can have bright appearance as bright speckles, while some others have dark appearance as dark speckles. It is noteworthy that the Bayesian MLE intrinsically apply higher weight to the bright speckles than the dark pixels. And hence the mle is not dominated by the estimation error which had potentially occurred by the dark pixels. The details of this intrinsic weighting process are following.
In the formulation in Section 2.2, the speckle is intrinsically modeled as fluctuation of signal (J all ). Namely, the speckle effect has been accounted by the model of the Bayesian MLE. In this model, bright and dark speckles are associated with high and low ESNRs.
In the estimation process, the likelihood function of each measurement is defined not only by the measured birefringence value but also by the measured ESNR value. In general, the likelihood function with high ESNR is sharp, while that with low ESNR is broadened as exemplified in Figs. 2(c) (10-dB ESNR) and 2(d) (20-dB ESNR). As described in Eq. (17), a posterior distribution is defined as the product of all likelihood functions of each pixel. It is clear that the sharp likelihood function of the bright speckle more likely dominate the shape of the posterior distribution than the broadened likelihood function of the dark speckle. This property of the Bayesian MLE results in intrinsic weighting of pixels based on its ESNR values.
This intrinsic weighting process can also be explained in information theory. The sharper likelihood function of a brighter speckle has higher amount of expectancy of information than broader likelihood function. And hence brighter speckle has more contribution to the estimation than the dark speckle.
Since the measured ESNR value is less reliable with darker pixels as discussed in Section 5.3, this intrinsic weighting would positively affect to the accuracy of the estimation.

Conclusions
We have presented a Bayesian MLE for birefringence measurements using JM-OCT. This estimator is based on a Bayesian rule that associates the distribution of the measured birefringence and the probability distribution of the true birefringence. This MLE rationally provides an estimation of the true birefringence and is asymptotically unbiased.
A Monte-Carlo method was utilized to obtain a likelihood function, which was then used to describe the association between the measured and true birefringence distributions. This numerical derivation of the likelihood function is an integral part of the estimator. It simplifies the implementation and accounts for the complex noise properties of JM-OCT.
The performance of the Bayesian MLE was examined using numerical simulations and experiments on samples with known birefringence. The results demonstrated that the Bayesian MLE performed well.
We also applied the Bayesian MLE to in vivo anterior eyes. The Bayesian mle images provided an improved contrast when compared to mean birefringence images, both for the anterior eye angle and a trabeculectomy bleb structure. In particular, trabecular meshwork was clearly visible at the anterior eye angle, and the high birefringence that is associated with fibrosis was visible in the trabeculectomy bleb. Further clinical studies would further emphasize the clinical utilities of this estimator.