Stochastic inversion of ocean color data using the cross-entropy method

Improving the inversion of ocean color data is an ever continuing effort to increase the accuracy of derived inherent optical properties. In this paper we present a stochastic inversion algorithm to derive inherent optical properties from ocean color, ship and space borne data. The inversion algorithm is based on the cross-entropy method where sets of inherent optical properties are generated and converged to the optimal set using iterative process. The algorithm is validated against four data sets: simulated, noisy simulated in-situ measured and satellite match-up data sets. Statistical analysis of validation results is based on model-II regression using five goodness-of-fit indicators; only R2 and root mean square of error (RMSE) are mentioned hereafter. Accurate values of total absorption coefficient are derived with R2 > 0.91 and RMSE, of log transformed data, less than 0.55. Reliable values of the total backscattering coefficient are also obtained with R2 > 0.7 (after removing outliers) and RMSE < 0.37. The developed algorithm has the ability to derive reliable results from noisy data with R2 above 0.96 for the total absorption and above 0.84 for the backscattering coefficients. The algorithm is self contained and easy to implement and modify to derive the variability of chlorophyll-a absorption that may correspond to different phytoplankton species. It gives consistently accurate results and is therefore worth considering for ocean color global products. © 2010 Optical Society of America OCIS codes: (010.4450) Oceanic optics; (010.7340) Water; (100.3190) Inverse problems. References and links 1. J. Zaneveld, “New developments of the theory of radiative transfer in the ocean,” in “Optical Aspects of Oceanography,” N. Jerlov, ed. (Academic Press,, London, 1973), pp. 121–134. 2. S. Duntley, “Light in the sea,” J. Opt. Soc. Am. 53, 214–233 (1963). 3. H. Gordon, O. Brown, and M. Jacobs, “Computed relationship between the inherent and apparent optical properties of a flat homogeneous ocean,” Appl. Opt. 14, 417–427 (1975). 4. A. Morel and L. Prieur, “Analysis of variation in ocean color,” Limnology and Oceanography 22, 709–722 (1977). 5. R. Walker, Marine Light Field Statistics, Wiley serie on pure and Appl. Opt. (John Wiley & Sons, INC., NW, 1994). 6. J. Kirk, “The relationship between the inherent and apparent optical properties of surface waters and its dependence on the shape of the volume scattering function,” (Oxford University Press, 1994), p. 283. 7. H. Gordon, O. Brown, R. Evans, J. Brown, R. Smith, K. Baker, and D. Clark, “A semianalytical radiance model of ocean color,” J. Geophys. Res. 93, 10909–10924 (1988). #117678 $15.00 USD Received 24 Sep 2009; revised 27 Nov 2009; accepted 27 Nov 2009; published 4 Jan 2010 (C) 2010 OSA 18 January 2010 / Vol. 18, No. 2 / OPTICS EXPRESS 479 8. S. Garver and D. Siegel, “Inherent optical property inversion of ocean color spectra and its biogeochemical interpretation I. Time series from the sargasso sea,” J. Geophys. Res. 102, 18607–18625 (1997). 9. F. E. Hoge and P. E. Lyon, “Satellite retrieval of inherent optical properties by linear matrix inversion of ocean radiance models: An analysis of model and radiance measurement errors,” J. Geophys. Res. 101, 16631–16648 (1996). 10. Z. Lee, K. Carder, C. Mobley, R. Steward, and J. Patch, “Hyperspectral remote sensing for shallow waters. 1. A semianalytical model,” Appl. Opt. 37, 6329–6338 (1998). 11. S. Maritorena, D. Siegel, and A. Peterson, “Optimization of a semianalytical ocean color model for global-scale applications,” Appl. Opt. 41, 2705–2714 (2002). 12. Z. Lee, K. Carder, and R. Arnone, “Deriving inherent optical properties from water color: A multiband quasianalytical algorithm for optically deep waters,” Appl. Opt. 41, 5755–5772 (2002). 13. H. Gordon, “Inverse methods in hydrologic optics,” Oceanologia 44, 9–58 (2002). 14. Z. Lee, K. Carder, C. Mobley, R. Steward, and J. Patch, “Hyperspectral remote sensing for shallow waters: 2. Deriving bottom depths and water properties by optimization,” Appl. Opt. 38, 3831–3843 (1999). 15. A. Albert and P. Gege, “Inversion of irradiance and remote sensing reflectance in shallow water between 400 and 800 nm for calculations of water and bottom properties,” Appl. Opt. 45, 2331–2343 (2006). 16. M. S. Salama, A. Dekker, Z. Su, C. Mannaerts, and W. Verhoef, “Deriving inherent optical properties and associated inversion-uncertainties in the Dutch lakes,” Hydrology and Earth System Sciences 13, 1113–1121 (2009). 17. H. Zhan, Z. Lee, P. Shi, C. Chen, and K. Carder, “Retrieval of water optical properties for optically deepwaters using genetic algorithms,” IEEE Trans. Geosci. Remote Sens. 41, 1123–1128 (2003). 18. M. Chami and D. Robilliard, “Inversion of oceanic constituents in case i and ii waters with genetic programming algorithms,” Appl. Opt. 41, 6260–6275 (2002). 19. P. Kempeneers, S. Sterckx, W. Debruyn, S. De Backer, P. Scheunders, Y. Park, and K. Ruddick, “Retrieval of oceanic constituents from ocean color using simulated annealing,” in “Geoscience and Remote Sensing Symposium,” Vol. 8 of IGARSS (IEEE International, 2005), vol. 8 of IGARSS, pp. 5651–5654. 20. W. Slade, H. Ressom, M. Musavi, and R. Miller, “Inversion of ocean color observations using particle swarm optimization,” IEEE Trans. Geosci. Remote Sens. 42, 1915–1923 (2004). 21. R. Souto, H. Campos Velho, S. Stephany, and M. Kampel, “Chlorophyll concentration profiles from in situ radiances by ant colony optimization,” in “4th AIP International Conference and the 1st Congress of the IPIA,” Vol. 124 of Journal of Physics (2008), pp. 1–12. 22. M.S. Salama and A. Stein, “Error decomposition and estimation of inherent optical properties,” Appl. Opt. 48, 4947–4962 (2009). 23. G. Dueck and T. Scheur, “Threshold accepting: A general purpose optimization algorithm appearing superior to simulated annealing,” Journal of Computational Physics 90, 161–175 (1990). 24. W. Gong, Y. Ho, and W. Zhai, “Stochastic comparison algorithm for discrete optimization with estimation,” in “Proceedings of the 31st IEEE Conference,” Vol. 1 of Decision and Controle (1992), pp. 795–800. 25. F. Glover, “Tabu search: A tutorial,” Interfaces 20, 74–94 (1990). 26. R. Rubinstein, “The cross-entropy method for combinatorial and continuous optimization,” Methodology and Computing in Applied Probability 2, 127–190 (1999). 27. R. Rubinstein and D. Kroese, The Cross-Entropy Method: A unified approach to combinatorial optimization, Monte-Carlo simulation, and machine learning, Information Science and Statistics (Springer, New York, 2004). 28. Z. Lee, “Remote sensing of inherent optical properties: Fundamentals, tests of algorithms, and applications,” Tech. Rep. 5, International Ocean-Colour Coordinating Group (2006). 29. R. Pope and E. Fry, “Absorption spectrum (380-700nm) of pure water: II, Integrating cavity measurements,” Appl. Opt. 36, 8710–8723 (1997). 30. C. Mobley, Light and water radiative transfer in natural waters (Academic Press, 1994). 31. A. Bricaud, A. Morel, and L. Prieur, “Absorption by dissolved organic-matter of the sea (yellow substance) in the UV and visible domains,” Limnology and Oceanography 26, 43–53 (1981). 32. O. Kopelevich, “Small-parameter model of optical properties of sea waters,” in “Ocean Optics,” Vol. 1 Physical Ocean Optics, A. Monin, ed. (Nauka, 1983), pp. 208–234. 33. T. Petzold, “Volume scattering functions for selected ocean waters,” in “Light in the Sea,” Vol. 12, J. Tyler, ed. (Dowden, Hutchinson and Ross, Stroudsburg, Pa. USA, 1977), pp. 150–174. 34. V. Singh, Entropy-based parameter estimation in Hydrology, Vol. 30 of Water Science and Technology Library (Kluwer Academic Publishers, Dordrecht, 1998). 35. C. Shannon, “A mathematical theory of communication,” Bell Syst. Tech. J. 27, 379-423, 623-656 (1948). 36. S. Kullback and R. Leibler, “On information and sufficiency,” The Annals of Mathematical Statistics 22, 79–86 (1951). 37. D. Kroese, S. Porotsky, and R. Rubinstein, “The cross-entropy method for continuous multi-extremal optimization,” Methodology and Computing in Applied Probability 8, 383–407 (2006). 38. R. Rubinstein and D. Kroese, Simulation and the Monte Carlo Method, Wiley Series in Probability and Statistics (2008), 2nd ed. 39. R. Srinivasan, Importance sampling: Applications in communications and detection (Springer-Verlag, Berlin, #117678 $15.00 USD Received 24 Sep 2009; revised 27 Nov 2009; accepted 27 Nov 2009; published 4 Jan 2010 (C) 2010 OSA 18 January 2010 / Vol. 18, No. 2 / OPTICS EXPRESS 48


Introduction
The aim of ocean color data inversion is to determine inherent optical properties of the water upper-layer from observed remote sensing reflectance. Theoretically, inherent optical properties (IOPs) can be derived from the radiance distribution and its depth derivative [1]. Ocean color data provide, however, the radiance at the surface in a few directions only. Therefore, semi-analytical models were developed to facilitate the inversion of ocean color data. These models are based on approximations that link remote sensing reflectance and the IOPs [2][3][4][5][6]. The general form of these models is that water remote sensing reflectance is proportional to the backscattering coefficient and inversely proportional to the absorption coefficient. Inversion of ocean color data using semi-analytical models has been investigated in many studies [7][8][9][10][11][12]. The scientific procedure to derive IOPs from ship/space borne ocean color data can be divided into three steps: i-forward modeling, use a semi-analytical ocean color model; ii-parametrization, define the minimal set of IOPs whose values completely characterize the observed remote sensing reflectance using the forward model; iii-inversion, use of ocean color observations to infer the actual values of IOPs. While the first two steps are mainly inductive, the third step is deductive. This paper is devoted to the third step, i.e. explain, implement and validate an inversion method for ocean color data. It is out of the scope of this work to review the literature (C) 2010 OSA on ocean color inversion methods and compare them, one may consult [13] for more details on this subject.
Generally, inversion of ocean color data falls under one of two methods, namely analytical deterministic or stochastic methods. Deterministic methods are based on gradient or pseudogradient techniques and have been extensively used for ocean color inversion [11,[14][15][16]. The main drawback of gradient-based methods is that they do not properly handel non-convex objective functions or many local optima. On the other hand, stochastic methods are less prone to be trapped in local optima and can deal with non-convex functions. The basic idea of stochastic methods is to systematically partition the region of feasible solutions into smaller subregions and move between them using random search techniques. Stochastic optimization techniques have been recently adopted for ocean color inversion, e.g. genetic algorithms [17,18]. Maritorena et al. [11] used simulated annealing to optimizing for the parameters in their semianalytical ocean color model. On the other hand, Kempeneers et al. [19] employed simulated annealing to derive ocean constituents. Swarm optimization [20] and ant colony method [21] were also used to derive water optical constituents from remote sensing reflectance. Salama and Stein [22] used entropy-based method to decompose and quantify the errors of derived IOPs.
There are other stochastic methods that have not been investigated yet for ocean color inversion, e.g. threshold acceptance [23], stochastic comparison method [24], tabu search [25] and cross-entropy [26]. Cross-entropy is one of the most significant developments in stochastic optimization and simulation in recent years [27]. It is a stochastic iterative method that searches for sequence of solutions which converges probabilistically to the optimal solution. The main objective of this paper is to develop a stochastic inversion algorithm for ocean color data based on the cross-entropy method. The performance of the algorithm and its stability to noise will be analyzed using simulated data. Validation exercises will be carried out against ocean color data of in-situ measurements and satellite match-up.
The reminder of this paper is organized as follow: in Section 2 we describe the ocean color paradigm: used ocean color model and its parametrization. The principles of cross-entropy are introduced in Section 3.1, followed by mathematical derivations for ocean color inversion in Section 3.2. The implementation of the inversion algorithm is presented in Section 3.3. In-situ measurements and ocean color satellite match-up data are described in Section 4 along with the employed initial values and statistical analysis. Inversion's performance and stability are analyzed in Section 5, followed by extensive validation exercise with in-situ data in Section 6. Thoroughly discussions of the developed algorithm, its results, limitations and possible extension are presented in Section 7. Main conclusions of this work are listed in Section 8.

Semi-analytical ocean color model
Remote sensing reflectance leaving the water surface can be related to physical and biological properties of water constituents using the model [7]: where, Rs w (λ ) is remote sensing reflectance leaving the water surface at wavelength λ ; g i are constants taken from [7]; t and n w are the sea−air transmission factor and water index of refraction, respectively. Their values are taken from [7,11,28]. The parameters b b (λ ) and a(λ ) are the bulk backscattering and absorption coefficients of the water column, respectively.
Four independently-varying constituents are considered to affect the optical properties of the water column, namely: phytoplankton green pigment i.e. chlorophyll-a (Chla), dissolved organic matter or gelbstoff, detritus and suspended particulate matter (SPM). The bulk absorption a(λ ) and backscattering b b (λ ) coefficients are modeled as being the sum of absorption and where, the subscripts denote the contribution of: water (w), chlorophyll-a (chla), combined effects of detritus and glebstoff (dg) and suspended particulate matter (spm). The absorption and backscattering coefficients of water molecules, a w and b b,w , were obtained from [29,30], respectively. The total absorption of chlorophyll-a a chla is approximated as [14]: where a 0 (λ ) and a 1 (λ ) are empirical coefficients. The absorption effects of detritus and gelbstoff are combined due to the similar spectral signature [11] and approximated using the model [31]: where s is the spectral exponent. The backscattering coefficient of SPM b b,spm is parameterized as [32]: where y is the spectral shape parameter of backscattering. The scattering phase function of SPM was assumed to follow the Petzold's San Diego Harbor scattering phase function [33]. Derived IOPs are called the set of IOPs and expressed in a vector notation as iop [16]: 3. Inversion of ocean color data using the cross-entropy method

Entropy and cross-entropy
Entropy is a numerical measure of information associated with probability distribution of derived IOPs or any hydrological parameter [34]. For a population with N sets of IOPs it is expressed as the Shannon entropy [35]: where iop is the set of derived IOPs [Eq. (7)]; E is the expectation; g(iop) is the probability distribution function (pdf) of the IOPs. The base of the logarithm is taken as e in which case the entropy is measured in "nats". Shannon's entropy calculated by Eq. (8) is defined to be the average amount of information contained in the IOPs. It should be noted that the entropy of IOPs does not depend on the actual values of IOPs, but only on its distribution g(iop). The joint-entropy between two pdfs g and f is: The cross-entropy is the Kullback-Leibler distance [36] which measures the divergence between the two distribution g and f as: Based on Eq. (10), the lower the expected cross-entropy, the closer is the distribution f to g. Therefore, minimizing the cross-entropy leads to the maximal similarity between the two distributions and vice versa. Eq. (10) can be rewritten in terms of Eqs. (8) and (9) as: where, H (g) and H (g, f ) are the entropy of g and the joint-entropy between g and f , respectively.

Inversion
The unknown IOPs [Eq. (7)] can be derived by matching modeled reflectance [Eq. (1)] to observed water remote sensing reflectance. The sought solution iop is the set of IOPs that produces the best-fit spectrum to the observed spectrum, Rs w (λ ). The best-fit spectrum can be searched by designing a performance function that measures the agreement between the observed and modeled reflectance. This function φ (iop) is arranged so that small values represent a close agreement, i.e. least-square: where, m is the number of spectral bands; Rs w (i) and Rs w m (i) are the observed and modeled water remote sensing reflectance at the ith wavelength, respectively. The objective now is to search for a set of IOPs, iop, that minimizes Eq. (12) to a very small value ε min , such: The basic idea of cross-entropy method [27,37] is to generate a family of probability distribution functions pdfs for the IOPs, f , and then converge them to an optimal pdf g. The optimal distribution g has all of its mass concentrated around the sought solution of the inversion problems iop, i.e. the variance of the optimal pdf is zero.
We start by converting the deterministic problem in Eq. (12) to a random one. The randomization can be performed by computing the probability of φ (iop) as being less than a certain value ε such: Equation (14) can be associated with an estimation problem of the form [38]: where E f is the expectation with respect to the pdf f ; are the discrete probability densities of IOPs; iop i is a randomly generated set of IOPs from f (iop i , iop * ) using initial mean iop * . Equation (15) is generally called the associated stochastic problem (ASP).
In importance sampling method [39], the ASP (15) can be rewritten using the optimal density g as: The change of measure with density in Eq. (16) is [26]: The idea now is to choose the reference parameter iop * such that the distance between f and g * is minimal. A suitable measure is the cross-entropy in Eqs. (10) and (11). The optimal pdf can be obtained by minimizing D(g * , f ) in Eq. (11) which is equivalent to maximizing −H (g * , f ) in Eq. (9). Substituting Eq. (17) in (9) and maximizing it for the reference parameters iop * we will have: For pdfs belonging to the natural exponent family solution of (18) can analytically be derived as [26,27]: Equation (19) can be used as an update formula of iop * in our iterative procedure as described in the next Section 3.3. Equation (19) is basically the average of IOP vectors iop i=1,...N that produced best-fit spectra to the observed spectrum, thus φ i ε.
The proofs of Eqs. (15) to (19) are out the scope of this work. The detailed derivations of the cross-entropy method and various applications are given in [26,27]. Their symbols are used, as possible, in this manuscript. De Boer et al. [40] provides an excellent tutorial on the crossentropy method. Its application to continuous multi-extremal optimization is given in [37] with approachable examples. Many links to references and examples on cross-entropy method can be found at http://www.cemethod.org.

Algorithm
Practically, the IOPs are derived using iterative procedure such that ε approaches ε min and the probability of the solution iop approximates 1, i.e. a degenerated pdf around iop with zero variance.
The algorithm is implemented in the following steps: 1. For the first iteration, t = 0, choose the initial values of the mean μ 0 = iop * 0 and standard deviation σ 0 . Sections 4.2 and 7.6 give more details on initial values.

Generate IOPs vectors
. Accept or reject each generated IOP depending whether its value is within a predefined physical bounds, i.e. constraints.
3. Forward the generated IOPs vectors to spectra using Eq. (1). Keep the reference between each IOPs set and its remote sensing reflectance.
5. Set ε equal to the sample quantile (1 − ρ), where ρ is predefined value, e.g. ∼10 −2 and evaluate the indicator function I φ (iop i ) ε . This is simply achieved by ordering the values of φ (iop i ) and selecting the elite samples that belong to the sample quantile (1 − ρ). 6. Corresponds these elite samples to their "elite" IOPs.
7. Derive an updated value of iop * from Eq. (19) and compute the new value of σ t+1 from the elite sample of IOPs.
9. If ε = ε min and σ t γ, where γ is a predefined small value ∼ 10 −5 , terminate, otherwise set t = t + 1 and iterate from step 2 to 9. The criterion σ t γ is equivalent to ∼ 1 in Eq. (16). This stopping criterium is adjusted for noisy data to cope with fluctuated values of ε as follow. Keep track of the last best ten candidates which have smallest values of ε.
Iterate while the variance of these ten candidates is larger than, say 10 −5 .

Data sets
Evaluation and validation of the proposed inversion method is carried out using four data sets: simulated, noisy-simulated, in-situ measured and ocean color match-up data sets.

Analysis
The generated IOPs are constrained to their physical bounds as described in 3 . Initial values of σ 0 are set as ς μ 0 , where ς is a factor varying between 2 and 10 at step 2 interval. This choice was to make the algorithm self contained and to limit the freedom of initialization to μ 0 . Optimal values of σ 0 are then searched using simulated annealing method [43]. Energy function is set to Eq. (15), with the updating in Eq. (19), and "temperature" parameters is set to σ 0 . The procedure iterates through a range of values and selects σ 0 that has the minimal ε. Using simulated annealing to initialize σ results in an algorithm with two loops, an outer loop iterating through σ 0 and an inner loop, which is basically the algorithm in (3.3). A normal distribution, N(μ, σ ), is used to generate the pdfs of IOPs. This, however, does not imply that the actual variability of IOPs follows the normal distribution, because the variance of the final pdf is zero. The number of samples in the generated pdfs is set to 100 and maximum number of iteration is set equal to 100. Goodness-of-fit parameters (slope, intercept, bias and R 2 ) between derived and known values are computed using model-II regression [44] for log-transformed data. The slope and intercept are for a model-II regression line between derived and known values. Perfect fit leads to unity slope and zero intercept and bias. The intercept is computed for the derived values on the Y axis. The values of R 2 is computed as the squared correlation coefficient between derived and known values. The bias is estimated for the log transformed data as: Computing the bias as shown in Eq. (20) means that negative bias indicates overestimation while positive bias indicates underestimation. The root mean square of error RMSE is calculated using Eq. (2.1) from [28] as: (21) where n is the number of data points. The fraction of valid retrieval fr is also computed during the statistical analysis. Derived IOP is considered un-valid if its value falls outside the defined constraints or trapped to zero solution. In our analysis we will use goodness-of-fit parameters obtained from inversion of noise-free IOCCG data set as a benchmark for comparison. In other words, we will compare goodness-of-fit parameters resulting from the inversion of noisy, NOMAD and SeaWiFS match-up data sets to those computed from the optimal situation of noise-free IOCCG data set.   to remote sensing reflectance using the table of Neckel and Labs [46], hereafter called MERIS-NER. The noise were generated using the normal distribution with mean equal to MERIS-NER and standard deviation equal to that of IOCCG simulated spectra. Additional condition was imposed such that the generated values are within ±70% of the original signal. It is believed that a maximum of ±70% off the observed reflectance value is a realistic threshold for an acceptable noise level. This mechanism of adding noise will leverage an average noise level, i.e. noise to signal ratio, about ± 32% of the original reflectance. The average is calculated over all spectral bands and over all spectra. Table 2 shows the introduced noise level per wavelength averaged over the IOCCG spectra as obtained from the noise level of MERIS-NER and standard deviation of IOCCG reflectance. Table 3 shows goodness-of fit parameters between derived and known IOPs as computed from the model-II regression. Derived values of a chla (440) are the most affected among other IOPs. The slope of the regression line is now off by 50% from the 1:1 line with large intercept   (5), was assumed to be comparable to the sum of measured values of detritus and gelbstoff absorptions. Figure 2 shows derived versus measured values of IOPs and Table 4 details goodness-of-fit parameters.

IOCCG data set
The method is adequate to derive the absorption coefficients of Chla, dg and the total absorption. The R 2 values are above 0.8 with slope and intercept that are of comparable magnitudes to the values in Table 1. The slope of derived a chla (440) is 4.5% off-unity with intercept of about -0.31. The RMSE value is merely twice, 0.64, as that of simulated IOCCG data while the bias is positive and about 0.18. Goodness-of-fit parameters of derived a dg (440) are slightly degraded when compared to the values in Table 1. The slope is off the 1:1 line by 20% with negative intercept ∼-0. 26 with positive bias of ∼0.8. The fit of derived total absorption coefficient follows the same trend as that presented in Table 1, i.e. it has better fit to measured values than a chla (440) or a dg (440) alone with R 2 ∼0.91. The slope is less than 4% off-unity with negative intercept. RMSE and bias values are, respectively, about 3 and 35 folds larger than those in Table 1.
Derived values of backscattering coefficients are less accurate in comparison to other IOPs and Table 1. The values of b b,spm (550) are underestimated at the lower end and over estimated at the higher end. There are also 5% of non valid retrievals. The solution was, basically, trapped to zero in these data points. The regression line has a slope which is 34% off-unity and a large intercept up to 2. RMSE and bias values are two folds, 0.363 and -0.108, larger that their counterpart in Table 1. The negative bias of b b,spm (550) indicates that there is slight overestimation. R 2 values is now 0.73 as compared to 0.99 in Table 1. Possible reasons for deriving less accurate IOPs from the validation data sets are discussed later in Section 7.

SeaWiFS match-up data set
Match-up data set contains spectra of SeaWiFS and NOMAD-measured IOP(s) that have similar geographical location and sampling time. SeaWiFS match-up data set consisted of 132 matches for the absorption coefficients and 29 matches for the backscattering coefficient. Figure 3 shows derived IOPs from SeaWiFS spectra versus measured values from NOMAD data set. Goodness-of-fit parameters are detailed in Table 5. Derived values of absorption coefficient are relatively accurate with R 2 being above 0.7. Especially chlorophyll-a absorption with R 2 value above 0.85, 5% off-unity slope and small intercept value ∼-0.2. The RMSE value is comparable to that in Table 1 and did not exceed 0.5. The magnitude of total bias, 0.064, is even smaller than its counterpart in Table 1, in absolute values. The positive bias of 0.064 indicates that the a chla (440) are slightly underestimated. Goodness-of-fit parameters of derived a dg (440) are comparable to those obtained from the optimal case ( Table 1). The slope is still ∼15% off-unity with intercept of about -1. The RMSE value has increased to 1 and R 2 is reduced to 0.7. Total absorption is derived with 8% off-unity slope and -0.4 intercept. The RMSE and bias values slightly increased, compared to Table 1, to 0.45 and 0.25. There is also a strong correlation, R 2 ∼0.91 between derived and measured values. Derived values of backscattering coefficient, similar to the NOMAD case, are less reliable. Model-II regression line has a slope that is 17% off-unity and large intercept of 0.93. The RMSE values slightly increased to 0.25 and the R 2 value dropped to 0.49.

Performance with simulated data
The developed inversion algorithm showed a good performance with simulated IOCCG data. The slope of model-II regression line was close to unity for all IOPs with intercept that did not exceed 0.35. The RMSE values were also acceptable and less than 0.5 whereas the R 2 value were above 0.96 for all IOPs and reaching 0.99 for the total absorption and backscattering coefficients. The RMSE value, ∼ 0.32, of Chla absorption coefficient is one eighth of RMSE ∼ 2.26 value reported in Salama et al. [16]. They used the same ocean color model (section 2) and IOCCG data set but employing a nonlinear minimization technique, i.e. constrained levenberg-marquardt.
The underestimation of a dg (440) and overestimation of a chla (440) compensated each other when computing the total absorption coefficient resulting in the smallest bias 0.003 which is almost 20 folds less than that of the backscattering coefficient.
The results of Fig. 1 support the statement made in Section 4.2: assuming a normal distribution to carry out the inversion does not imply that the actual variability of IOPs follows the normal distribution, because the variance of the solution pdf is zero. The discreet distribution of known values in Fig. 1(a) is uniform and close to uniform for known values in Fig. 1(b) and Fig. 1(c).

Stability
Measurement errors are related to intrinsic noise of the sensor and to residuals from subsequent corrections. Each sensor, ship or space borne, has a noise level that is related to its specification and sensitivity losses over time. In case of noisy data the cost function [Eq. (12)] will have a noise component. Generated IOPs and the updating in Eq. (19) are, however, independent of sensor noise. Therefore it is expected that the inversion method will filter out observation's noise as it was shown in Fig. 4. There is no mathematical prove for this noise-filtering effects but we could numerically demonstrate it in Table 3. The ability of cross-entropy method to filter noise during optimization was also demonstrated by Rubinstein and Kroese [27, chapter 6, pp.203-226] using numerical examples. This is one of the major advantages of the proposed method. It can basically filter the noise up to a maximum of ±70% of the observed signal, equivalent to 32% averaged over spectra and wavelengths ( Table 2). Figure 4 is shown to give a visual perception of the introduced noise level and the stability of inversion from noisy data. In Fig. 4 we plot the best-fit spectra to the noisy-data from which we derived the IOPs, noisy data and original noise-free spectra. We selected 6 spectra, indexed in the data itself as 1, 100, 200, 300, 400, 500. Figure 4 clearly shows how the method was able to filter considerable range of the noise. The derived IOPs are also acceptable for the imposed noise level with R 2 > 0.7 and RMSE < 0.6 ( Table 3). Chlorophyll-a absorption coefficient is the most affected by the introduced noise. This is because noise can randomly introduces dips in reflectance at the absorptions bands of chlorophyll-a around 440 nm and 665 nm which might be interpreted as high chlorophyll-a absorption. Derived values of absorption coefficient seem to be stable to sensor noise as the sum a dg (440) and a chla (440) may compensates the over/under estimations in both parameters. The small value of bias in derived b b,spm can be explained by the random nature of noises. Equal under/over estimation might be introduced by noise, thus eliminating each others when computing the bias.

Validation
The develop inversion algorithm was validate using ocean color data of in-situ measurements and satellite match-up. The derived values of a chla (440) and a(440) are in general reliable.
The slope values of model-II regression were 3-6 % off-unity with small values of intercept and bias and RMSE values below 0.6. The R 2 exceeded 0.8 and 0.9 for Chla and total absorption coefficients respectively. Derived values of a dg (440) were less accurate in both data sets with slope values about 15-20% off-unity and large values of intercept, bias and RMSE. An overestimation of a chla (440) is associated with underestimation of a dg (440) and vice versa. This type of degeneracy, is due to the overlapped absorption peaks at 440 nm of Chla, detritus and glebstoff. This trend was clearly illustrated by increased accuracy of total absorption coefficient over the accuracy of individual components, i.e. a chla (440) and a dg (440). Although derived values of backscattering coefficient were accurate and somehow stable to noise (Tables  1 and 3), less reliable results of SPM backscattering were obtained from NOMAD and Sea-WiFS match-up data. Model-II regression line deviated by up to 35% from unity with large intercept values. Moreover, the solution was trapped to zero in 5% of backscattering data. The R 2 values were reduced to 0.7 for the NOMAD and to 0.49 for SeaWiFS match-up. The sample size of SPM backscattering in the SeaWiFS match-up was, however, small 29. The small sample number increased the vulnerability of statistics in Table 5 to be influenced by outliers, i.e. the two upper-left points in Fig. 3(c). For instance, removing these two points will improved the goodness-of-fit of derived b b,spm (550) in SeaWiFS match-up data set to: [slope = 1.286, intercept = 1.532, RMSE = 0.143, bias = 0.024 and R 2 = 0.855]. The fraction of valid retrieval in this case will be f r = 27/29 = 0.931. There are other sources of uncertainty than the overlapped blue-absorption feature of a chla (440) and a(440) and the small sample of backscattering  in SeaWiFS match-up data, these are discussed in (7.4).

Uncertainty
Ocean color data inversion is associated with many sources of uncertainty that may affect the accuracy of derived IOPs. These sources can be summarized in four major components:// i-Spectral characteristics: NOMAD and SeaWiFS match-up spectra have on average 10 and 13 spectral bands, respectively. Their bands cover the spectral range from 411 nm to 683 nm. In IOCCG data set we used 33 bands covering the spectral range from 400 nm to 720 nm at 10 nm interval. The spectral characteristics of observed remote sensing reflectance has direct effect on the accuracy of derived IOPs. Large number of spectral bands will increase inversion's degree-of-freedom, i.e. number of band minus number of unknowns. In turn, there will be a higher probability in obtaining a better spectral fit to the observed spectrum, hence derived IOPs are less ambiguous. The extension of the spectral bands to 720 nm may improve the accuracy of derived Chla absorption and SPM backscattering coefficient. The red absorption peak of Chla round 665 nm is unique and facilitate a good separation of a chla (440) and a dg (440) from the total absorption. Further in the red part of the spectrum, >680 nm, water remote sensing reflectance is almost a direct function of SPM backscattering coefficient [47]. This direct functionality of b b,spm (550) and Rs w (> 680) may stabilize the inversion at this spectral range leading to more accurate values of b b,spm (550).
ii-Measurement error: We showed in Section 5.2 that although derived IOPs were stable to sensor noise, their accuracies were slightly degraded. In addition to sensor noise, measurement errors could be related to residuals from subsequent corrections. For example, ocean color satellite spectrum contains additional residuals from atmospheric correction (AC) and post-AC adjustments [48] that affect the retrieval of IOPs. The accumulation of sensor noise and subsequent correction error will increase the uncertainty of derived IOPs [22].
iii-Model approximation: Employed approximations in the forward-model (Section 2) may not precisely describe the optical processes that have caused the observed spectrum. The model and its parametrizations did not take into consideration the bidirectional effects of remote sensing reflectance. Assuming an isotropic angular distribution of the up-welling radiation may impose an additional error component that will propagate to the derived IOPs [49,50]. Moreover, each of the used parametrization has its own limitation. Equation (4) ignores the different phytoplankton species and the wide variability of Chla absorption as measured in nature [51-53]. Equation (5) combines the absorption effects of detritus and gelbstoff in one spectral shape and magnitude. In Eq. (6) the backscattering ratio of SPM is set equal to the Petzolds integrated volume scattering data ∼0.0182 which may not represent the actual values of sea particles [54]. Moreover, the power law of the backscattering spectral shape as modeled in Eq. (6) is inaccurate in the presence of absorption from non algae particles [55].
iv-Uniqueness: There are many sets of IOPs that may have caused the observed spectrum [56]. The proposed method derived most of these sets as elite samples and update the next iteration. The final solution can, therefor, be regarded as the optimal average of all probable sets of IOPs. This averaging on the one hand reduces the probability of having a spiked solution, i.e. large error, and on the other hand derives a smoothed solution between all possible sets of IOPs. Derived IOPs have, thus, an intrinsic error component that is associated with the used inversion method.
Spectral characteristics and measurement error are the major reasons of deriving more accurate IOPs values from IOCCG than from NOMAD and SeaWiFS match-up data set.

Convergence and processing time
The convergence of the method is guaranteed [27, proposition 3.19 page 83]. Table 6 gives insight about the average number of iterations that were needed for convergence. From Table 6 we can approximate the maximum number of iteration at 95% of confidence to be 31 iterations, i.e. 15 + 2 × 8. In Table 6 we, however, did not consider the preprocessing iterations to select the optimal initial value of σ 0 , as described in Sections 4.2. The values of Table 6 can roughly be multiplied by factor of five to consider the total iterations that were needed to select the optimal initial values σ 0 and derive the IOPs.
The total processing times averaged over spectra per data set are shown in Table 7. Total processing time was calculated as the overall time needed to generate the initial pdf, using simulated annealing, and derive the IOPs using cross-entropy. Processing time per spectrum seems to be a function of bands number and the degree of noise. On average, noisy-IOCCG spectrum has the largest processing time followed by noise-free IOCCG, NOMAD and SeaWiFS spectra as averaged over the corresponding data set. There might be a trade-off between processing time and number of bands. NOMAD data set has an average of 10 bands per spectrum and longer processing time in comparison to SeaWiFS match-up data set which has an average of 13 bands per spectrum and the shortest processing time. These observations (Tables 6 and 7) are not conclusive and are meant to give the reader an idea about number of iterations and processing time as related to used data sets.

Limitation
The major limitation of the presented inversion algorithm is its sensitivity to the starting pdf. Initializing a pdf from a normal distribution, N(μ, σ ), requires the two parameters μ 0 = iop and σ 0 . The initial values of σ 0 were set as ς μ 0 to reduce the freedom of initialization. We used simulated annealing [43] to define the optimal value of σ 0 . Defining the value of σ 0 using simulated annealing will limit the initialization to μ 0 on the cost of speed and increased dependency on μ 0 . Values of μ 0 were initialized using the expressions of Lee et al. [14] as briefed in appendix (A).

Algorithm extension
The proposed inversion algorithm can easily be extended to include the variability of Chla absorption that may correspond to different phytoplankton species. There are two approaches to derive the natural variability of Chla absorption: i-include one unknown for each Chla absorption curve; ii-find the Chla absorption curve that results in the best-fit. The first approach, i, is based on the assumption that Chla absorption varies within one observed spectrum or satellite pixel. This approach seems to be more suitable for coarse satellite pixel and /or productive coastal and estuarine waters, where different phytoplankton species may co-exist and/or the variability of Chla absorption is large. Whereas the second approach, ii, assumes that Chla absorption is constant for a spectrum, but vary between different spectra, or satellite pixels. This approach, ii, is more suitable for open ocean waters and do not require large increase of the number of unknowns as in the first approach i. We, therefore applied approach ii to derive the variability of Chla absorption from NOMAD and SeaWiFS match-up data sets. Hereafter, we give an explanation on how to derive the variability of Chla absorption from ocean color data using a modified version of our inversion algorithm (3.3). We simply added another unknown to Eq. (7). This unknown, denoted as ι, is the index of a Chla absorption curve in the used data set. We used reported values of normalized Chla absorption curves [51-53]. Equation (4) is adapted to become a chla (λ ) = a chla (440) × a chla (λ ), where a chla (λ ) is the normalized absorption coefficient, i.e. Chla absorption normalized to its value at 440 nm. Values of ι are drawn from a uniform distribution such that each normalized absorption coefficient gets an equal chance to be selected during the inversion. The elite samples in step 6, Section 3.3, is adjusted for ι such that only the best elite sample is selected. We applied the modified algorithm on NOMAD and SeaWiFS match-up data sets, only derived values of a chla (440) and a dg (440) will be discussed.
Deriving the natural variability of Chla absorption coefficient has slightly improved the accuracy of a chla (440) and a dg (440) when derived from NOMAD and SeaWiFS match-up spectra. The RMSE between known and derived values of normalized Chla absorption coefficients is computed using similar form to Eq. (21). Figure 5 shows the RMSE values of a chla (λ ) variability as derived from NOMAD and SeaWiFS match-up data sets. Since we normalized by a chla (440), small values of RMSE are expected in the vicinity of the blue absorption feature of Chla. RMSE values, however, increase from 440 nm to longer wavelength reaching to a maximum value around 560 nm and gradually decrease to a local minimum around 675 nm, the red absorption feature of Chla. RMSE values in Figs. 5(a) and 5(b) approximately mirror a standard Chla absorption curve with RMSE values of NOMAD being larger than those of SeaWiFS, especially at the red part of the spectrum. This is due to the reduced number and extent of NOMAD's spectral bands. The missing bands were mainly in the red spectral region which increased the uncertainty of derived Chla absorption at the red bands. The advantage of this modification is that we were able to derive an indication of Chla variability, rather than using the fixed regression coefficients a 0 and a 1 in Eq. (4) or one curve of normalized Chla absorption coefficient.
Potential implication of including phase function variation was not considered in the current work. Future work should, however, include changes in phase function, especially spectral ones.

Conclusions
In this paper we developed a stochastic inversion algorithm based on the cross-entropy method to derive inherent optical properties from ocean color data. The proposed inversion method was validated using four data sets: IOCCG and its noisy version, NOMAD and SeaWiFS match-up data sets. The followings are concluded: 1. The method is self contained and easy to implement with far fewer inputs, e.g. information about curvature is not needed. The only needed input is the observed reflectance and some initial values, in case the initializations of Lee et al. [14] can not be applied.
2. In all validation exercises, the derived IOPs have acceptable accuracy, R 2 > 0.7 (after removing the outliers) and RMSE did not exceed 1.1.
3. The derived IOPs are stable to sensor noise. The inversion can filter noise up to a maximum of ±70% of observed remote sensing reflectance.
4. The method can easily be modified to derive the variability of chlorophyll-a absorption coefficient that may correspond to different phytoplankton species.
5. The developed inversion was validated against variety of data sets, simulated, measured and satellite match-up. With careful initialization, the proposed inversion could be used for global derivation of IOPs from ocean color data.

A. Initial values
The initial values of μ 0 = iop were adopted from the work of Lee et al. [14] and used to initialize the pdfs of IOPs from N(μ 0 , σ 0 ): where, r 1 and r 2 are the ocean color ratios: r 1 = Rs w (440)/Rs w (550) and r 2 = Rs w (440)/Rs w (490), respectively. The values of σ 0 were related to μ 0 by a scaling factor ς and computed using simulated annealing.