Fast and optimal broad-band Stokes/Mueller polarimeter design by the use of a genetic algorithm

A fast multichannel Stokes/Mueller polarimeter with no mechanically moving parts has been designed to have close to optimal performance from 430-2000 nm by applying a genetic algorithm. Stokes (Mueller) polarimeters are characterized by their ability to analyze the full Stokes (Mueller) vector (matrix) of the incident light. This ability is characterized by the condition number, $\kappa$, which directly influences the measurement noise in polarimetric measurements. Due to the spectral dependence of the retardance in birefringent materials, it is not trivial to design a polarimeter using dispersive components. We present here both a method to do this optimization using a genetic algorithm, as well as simulation results. Our results include fast, broad-band polarimeter designs for spectrographic use, based on 2 and 3 Ferroelectric Liquid Crystals, whose material properties are taken from measured values. The results promise to reduce the measurement noise significantly over previous designs, up to a factor of 4.5 for a Mueller polarimeter, in addition to extending the spectral range.


Introduction
Polarimeters are applied in a wide range of fields, from astronomy [1,2,3], remote sensing [4] and medical diagnostics [5,6] to applications in ellipsometry such as characterizing gratings [7], nanostructures [8] and rough surfaces [9,10,11]. As all polarimeters are based on inverting so-called system matrices, it is well known that the measurement error from independent Gaussian noise is minimized when the condition number ($\kappa$) of these system matrices is minimized [12,13]. It has been shown that $\kappa = \sqrt{3}$ is the best condition number that can be achieved for such a system, and that this optimal condition number can be achieved by several different approaches using various optical components (e.g. rotating retarders [14], division of amplitude [15,16], and liquid-crystal variable retarders [17]). In many applications it is necessary to perform fast spectroscopic measurements (e.g. by using a Charge-Coupled Device (CCD) based spectrograph) [18]. In that case, the wavelength dependence of the optical elements prevents the polarimeter from being optimally conditioned over the full range simultaneously. A system based on two Ferroelectric Liquid Crystals (FLCs) has been reported to be fast and reasonably well conditioned over the visible or near-infrared spectral range [18,19,20]. By introducing a third FLC, a similar system has been proposed to have an acceptable condition number from the visible to the near infrared (430-1700 nm) [21]. The design of a system having the best possible condition number over a broad spectrum is a challenging optimization problem due to the large number of parameters; many optimization algorithms are prone to return local optima, and a direct search is too time consuming. To avoid an exhaustive search, we decided to employ a Genetic Algorithm (GA). A GA simulates evolution on a population of individuals in order to find an optimal solution to the problem at hand.
Genetic Algorithms were pioneered by Holland [22], and are discussed in detail in e.g. Ref. [23]. GAs have previously been applied in ellipsometry to solve the inversion problem for the thickness and dielectric function of multiple thin layers, see e.g. Refs. [24,25,26].

Overdetermined polarimetry
A Stokes polarimeter consists of a polarization state analyzer (PSA) capable of measuring the Stokes vector of a polarization state, see Fig. 1. The PSA is based on performing at least 4 different measurements along different projection states. A measured Stokes vector $S$ can then be expressed as $S = A^{-1} b$, where $A$ is a system matrix describing the PSA and $b$ is a vector containing the intensity measurements. $A^{-1}$ denotes the matrix inverse of $A$, which in the case of overdetermined polarimetry with more than 4 projection states denotes the Moore-Penrose pseudoinverse. The analyzing matrix $A$ is constructed from the first rows of the Mueller matrices of the PSA for the different states. The noise in the measurements of $b$ is amplified by the condition number of $A$, $\kappa_A$, in the inversion to find $S$. Therefore $\kappa_A$ should be as small as possible, which corresponds to making the measurements as independent as possible (i.e. using projection states that are as orthogonal as possible).
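The inversion described above can be illustrated numerically. The following is a minimal sketch, not the authors' code: it assumes NumPy and a hypothetical ideal PSA whose four projection states form a regular tetrahedron on the Poincaré sphere, and the names `analyzer_matrix` and `TETRAHEDRON` are introduced here purely for illustration.

```python
import numpy as np


def analyzer_matrix(stokes_axes):
    """Build A with one row 0.5 * [1, s1, s2, s3] per ideal analyzer state.

    Each row is the first row of the Mueller matrix of an ideal polarizer
    that fully transmits the unit Stokes axis (s1, s2, s3).
    """
    s = np.asarray(stokes_axes, dtype=float)
    return 0.5 * np.hstack([np.ones((len(s), 1)), s])


# Hypothetical ideal PSA: four states at the vertices of a regular
# tetrahedron inscribed in the Poincare sphere.
TETRAHEDRON = np.array([[1, 1, 1],
                        [1, -1, -1],
                        [-1, 1, -1],
                        [-1, -1, 1]]) / np.sqrt(3)

A = analyzer_matrix(TETRAHEDRON)
kappa = np.linalg.cond(A)          # 2-norm condition number (ratio of singular values)

S_true = np.array([1.0, 0.3, -0.2, 0.5])   # an arbitrary partially polarized state
b = A @ S_true                              # noise-free intensity measurements
S_est = np.linalg.pinv(A) @ b               # pseudoinverse also covers more than 4 states

print(f"kappa = {kappa:.4f}")
print("recovered S:", np.round(S_est, 6))
```

Using `np.linalg.pinv` rather than `np.linalg.inv` means the same lines work unchanged when `A` has more than four rows, which is the overdetermined case discussed in this section.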
A Mueller matrix $M$ describes how an interaction changes the polarization state of light, by transforming an incoming Stokes vector $S_{\mathrm{in}}$ into the outgoing Stokes vector $S_{\mathrm{out}} = M S_{\mathrm{in}}$. To measure the Mueller matrix of a sample it is necessary to generate at least 4 different polarization states by a polarization state generator (PSG) and measure the outgoing Stokes vector by at least 4 measurements for each generated state. The measured intensities can then be arranged in a matrix $B = AMW$, where the system matrix $W$ of the PSG contains the generated Stokes vectors as its columns. These generated Stokes vectors are found simply as the first column of the Mueller matrix of the PSG in the respective states. $M$ can then be found by inversion as $M = A^{-1} B W^{-1}$. The error $\Delta M$ in $M$ is then bounded by the condition numbers according to [27]

$$\frac{\|\Delta M\|}{\|M\|} \lesssim \kappa_W \kappa_A \frac{\|\Delta B\|}{\|B\|} + \kappa_A \frac{\|\Delta A\|}{\|A\|} + \kappa_W \frac{\|\Delta W\|}{\|W\|}. \qquad (1)$$

The condition number is given as $\kappa_A = \|A\| \|A^{-1}\|$, which for the 2-norm can be calculated from the ratio of the largest to the smallest singular value [28]. $\Delta A$ and $\Delta W$ are calibration errors, which increase with $\kappa$ when calibration methods using matrix inversion are applied. The PSG can be constructed from the same optical elements as the PSA, placed in the reverse order, which would give $\kappa_A = \kappa_W \equiv \kappa$. Since the error in $M$ is then proportional to $\kappa^2$, it is very important to keep this value as low as possible. If 4 optimal states can be achieved (giving $\kappa = \sqrt{3}$), no advantage is found by doing a larger number of measurements with different states, compared to repeated measurements with the 4 optimal states [14]. If, however, these optimal states cannot be produced ($\kappa > \sqrt{3}$), the condition number, and hence the error, can be reduced by performing more than 4 measurements. For an FLC-based polarimeter this can be done by using 3 FLCs followed by a polarizer as PSA, with up to 3 waveplates (WP) between the FLCs to reduce the condition number (see Fig. 2). A PSG can be constructed with the same elements in the reverse order.
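To make the Mueller inversion concrete, here is a short sketch under stated assumptions: NumPy is used, the PSA is a hypothetical ideal tetrahedral analyzer, and the PSG matrix is simply taken as the transpose of the PSA matrix (an idealization of "the same elements in reverse order"). The sample `M_true`, a horizontal linear polarizer, is chosen only as an example.

```python
import numpy as np

# Hypothetical ideal analyzer states (regular tetrahedron on the Poincare sphere).
axes = np.array([[1, 1, 1],
                 [1, -1, -1],
                 [-1, 1, -1],
                 [-1, -1, 1]]) / np.sqrt(3)
A = 0.5 * np.hstack([np.ones((4, 1)), axes])   # rows: analyzed states
W = A.T                                        # columns: generated Stokes vectors (assumption)

# Example sample: ideal horizontal linear polarizer.
M_true = 0.5 * np.array([[1, 1, 0, 0],
                         [1, 1, 0, 0],
                         [0, 0, 0, 0],
                         [0, 0, 0, 0]], dtype=float)

B = A @ M_true @ W                               # noise-free intensity matrix
M_est = np.linalg.pinv(A) @ B @ np.linalg.pinv(W)

kappa_A = np.linalg.cond(A)
kappa_W = np.linalg.cond(W)
print("kappa_A * kappa_W =", round(kappa_A * kappa_W, 6))
print("max reconstruction error:", np.abs(M_est - M_true).max())
```

With identical PSA and PSG conditioning the product of the two condition numbers equals the square of either one, which is why intensity noise in a Mueller measurement is far more sensitive to conditioning than in a Stokes measurement.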
Since each FLC can be switched between two states (this switching can be described as a rotation of the fast axis of a retarder by +45°), $2^3 = 8$ different states can be analyzed (generated) by the PSA (PSG). To accurately measure the Stokes vector, the system matrix $A$ needs to be well known. For a Mueller polarimeter generating and analyzing 4 states in the PSG and PSA, the eigenvalue calibration method (ECM) [29] can be applied. The ECM allows the measuring of the actual produced states by the PSA and PSG ($A$ and $W$), without relying on exact knowledge or modeling of the optical components. However, the ECM is based on the inversion of a product of measured intensity matrices $B$ for measurements on a set of calibration samples. This product becomes singular for a system analyzing and generating more than four states. A workaround of this problem is to choose the subset of 4 out of 8 states which gives the lowest $\kappa$ value, and build a $B$ matrix of those states to find 4 of the 8 rows (columns) of $A$ ($W$). More rows (columns) of $A$ ($W$) can then be found by calibrating on a different subset of the 8 states, giving the second lowest $\kappa$ value, and so on. By repeating the calibration on different subsets of states, all the 8 rows (columns) of $A$ ($W$) can be found with low relative error $\|\Delta A\| / \|A\|$ ($\|\Delta W\| / \|W\|$).
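The subset-selection workaround can be sketched as follows. This is an illustration only: the 8-state analyzer `A8` is a hypothetical ideal one whose states sit at the corners of a cube inscribed in the Poincaré sphere, and `rank_subsets` is a helper name introduced here, not part of the ECM itself.

```python
from itertools import combinations

import numpy as np

# Hypothetical 8-state PSA: Stokes axes at the corners of a cube.
signs = np.array([[x, y, z] for x in (1, -1) for y in (1, -1) for z in (1, -1)])
A8 = 0.5 * np.hstack([np.ones((8, 1)), signs / np.sqrt(3)])


def rank_subsets(A, k=4):
    """Return (kappa, row_indices) for every k-row subset, best conditioned first."""
    ranked = []
    for rows in combinations(range(A.shape[0]), k):
        ranked.append((float(np.linalg.cond(A[list(rows)])), rows))
    ranked.sort()
    return ranked


ranked = rank_subsets(A8)
best_kappa, best_rows = ranked[0]

# Walk down the ranking until every row of A8 has appeared in some subset.
covered, used = set(), []
for kappa, rows in ranked:
    if not set(rows) <= covered:
        covered |= set(rows)
        used.append((kappa, rows))
    if len(covered) == A8.shape[0]:
        break

print("best subset:", best_rows, "kappa =", round(best_kappa, 4))
print("subsets needed to cover all rows:", len(used))
```

For this idealized cube geometry the two best subsets are the two tetrahedra inscribed in the cube, so two calibration runs cover all eight states; a real instrument would generally need more subsets with gradually worse conditioning.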

Fig. 3. The four essential processes in a genetic algorithm: sexual reproduction, mutation, development, and the mating contest. Sexual reproduction is performed by multi-point genetic crossover, giving rise to the next generation of individuals. Mutation can be simulated with simple bit negation (e.g. 0 → 1 and vice versa). Development is the process where a genotype is interpreted into its phenotype, i.e. the binary genome is interpreted as a polarimeter design. In the mating contest, one evaluates the fitness of each individual's phenotype and lets the more fit individuals reproduce with higher probability than the less fit individuals.

Genetic optimization
In order to optimize $\kappa(\lambda)$, one can conceivably employ a variety of optimization algorithms, from simple brute-force exhaustive search to more advanced algorithms such as Levenberg-Marquardt, simulated annealing, and particle swarm optimization. Our group has previously performed optimization of a polarimeter design based on fixed components, namely, two FLCs and two waveplates. In this case, the optimization problem reduces to searching the space of 4 orientation angles. With a resolution of 1° per angle, this gives a search space consisting of $180^4 \approx 10^9$ states to evaluate; on modern computer hardware, this direct search can be performed. In order to optimize the retardances of the components as well, the total number of states increases to about $(10^9)^2 = 10^{18}$. Obviously, brute-force exhaustive search is infeasible for such large search spaces. A GA performs optimization by simulating evolution in a population of individuals (here: simulated polarimeters). The three pillars of evolution are variation, heritability, and selection. Our initial population must have some initial genetic variation between the individuals; hence, we initialize our population by generating random individuals. Heritability means that the children have to carry on some of the traits of their parents. We simulate this by either cloning parents into children (asexual reproduction) or by performing genetic crossover (sexual reproduction) in a manner that leaves children with some combination of the traits of their parents. Finally, selection is done by giving more fit individuals a larger probability of survival. For a sketch of the essential processes involved in a GA, see Fig. 3.
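The text does not state which selection scheme was used. As one common possibility, the sketch below shows fitness-proportionate ("roulette wheel") selection using only the Python standard library; the function name `select_parents` and the three placeholder designs are hypothetical.

```python
import random


def select_parents(population, fitnesses, n_parents, rng):
    """Draw parents with probability proportional to fitness (with replacement).

    This is fitness-proportionate selection, assumed here for illustration;
    the source only says that fitter individuals survive with higher probability.
    """
    return rng.choices(population, weights=fitnesses, k=n_parents)


rng = random.Random(0)                      # fixed seed so the demo is repeatable
population = ["design_a", "design_b", "design_c"]
fitnesses = [1.0, 3.0, 6.0]                 # higher is better

parents = select_parents(population, fitnesses, 1000, rng)
counts = {name: parents.count(name) for name in population}
print(counts)
```

Because sampling is done with replacement, a very fit individual can be chosen as a parent many times in one generation, while a weak one still has a small chance of contributing, which helps preserve variation.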
Our GA builds directly on the description given by Holland [22], using a binary genome as the genetic representation. In this representation, a string of 0s and 1s represents the genome of the individual. To simulate mutation in our genetic algorithm, we employ logical bit negation, i.e. 0 → 1 or vice versa. Sexual reproduction is simulated by using multi-point crossover, i.e. simply cutting and pasting two genomes together, as described by Holland [22].
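The two genetic operators described above can be written in a few lines. The following is a minimal sketch using only the standard library; genomes are represented as Python lists of 0/1 integers, and the function names `mutate` and `multipoint_crossover` are illustrative choices rather than the authors' implementation.

```python
import random


def mutate(genome, n_flips, rng):
    """Return a copy of the genome with n_flips distinct bits negated."""
    child = list(genome)
    for pos in rng.sample(range(len(child)), n_flips):
        child[pos] ^= 1                     # 0 -> 1 or 1 -> 0
    return child


def multipoint_crossover(mum, dad, n_points, rng):
    """Cut both genomes at n_points random positions and swap alternate segments."""
    cuts = sorted(rng.sample(range(1, len(mum)), n_points))
    child_a, child_b = [], []
    src_a, src_b = mum, dad
    start = 0
    for cut in cuts + [len(mum)]:
        child_a += src_a[start:cut]
        child_b += src_b[start:cut]
        src_a, src_b = src_b, src_a         # switch donor after every cut
        start = cut
    return child_a, child_b


rng = random.Random(1)
mum, dad = [0] * 16, [1] * 16
child_a, child_b = multipoint_crossover(mum, dad, n_points=2, rng=rng)
mutant = mutate(mum, n_flips=3, rng=rng)
print("child_a:", "".join(map(str, child_a)))
print("child_b:", "".join(map(str, child_b)))
print("mutant: ", "".join(map(str, mutant)))
```

Using all-zero and all-one parents in the demo makes the cut points directly visible in the printed children.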
The interpretation of the genome into a phenotype (development), in this case a polarimeter design, is done in a straightforward way. For each variable in the polarimeter's configuration, i.e. for each orientation angle and each retardance, we select $m$ bits in the genome (typically, $m = 8$) and interpret them as an integer in the range from 1 to $2^m$. The integer is subsequently interpreted as a real number in a predefined range, e.g., $\theta \in [0°, 180°]$. In order to avoid excessively large jumps in the search space due to mutations, we chose to implement the interpretation of bits into integers by using the Gray code, also known as the reflected binary code. The most important parameter values in our GA are shown in Table 2. Making good choices for each of these parameters is often essential in order to ensure good convergence.
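The development step can be sketched as a Gray-code decoder followed by a linear rescaling. This is an illustrative sketch: the helper names are hypothetical, and the linear mapping of the integer range 0 to $2^m - 1$ onto the parameter interval is an assumption (the text counts the integers from 1 to $2^m$, which differs only by an offset).

```python
def gray_to_int(bits):
    """Decode a reflected-binary (Gray) bit sequence, most significant bit first."""
    value = 0
    acc = 0
    for bit in bits:
        acc ^= bit                  # running XOR recovers each binary digit
        value = (value << 1) | acc
    return value


def int_to_gray_bits(value, n_bits):
    """Encode an integer as Gray-code bits, most significant bit first."""
    gray = value ^ (value >> 1)
    return [(gray >> shift) & 1 for shift in range(n_bits - 1, -1, -1)]


def decode_parameter(bits, low, high):
    """Map m Gray-coded bits linearly onto the interval [low, high] (assumed mapping)."""
    return low + (high - low) * gray_to_int(bits) / (2 ** len(bits) - 1)


# Neighbouring integers differ in exactly one bit, so a single bit-flip
# mutation can always reach an adjacent parameter value.
for n in (126, 127, 128, 129):
    print(n, "".join(map(str, int_to_gray_bits(n, 8))))

theta = decode_parameter(int_to_gray_bits(128, 8), 0.0, 180.0)
print(f"theta = {theta:.2f} degrees")
```

In plain binary, stepping from 127 to 128 changes all eight bits; in Gray code it changes one, which is the property the paragraph above relies on.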
After determining the phenotype, we must assign a fitness (the value of the objective function) to each simulated polarimeter individual. In order to do this, we first calculate $\kappa(\lambda)$. As discussed, $\kappa^{-1}(\lambda)$ maximally takes on the value $1/\sqrt{3}$. Hence, we define an error function, $e$, as

$$e = \frac{1}{N_\lambda} \sum_{n=1}^{N_\lambda} \left( \kappa^{-1}(\lambda_n) - \frac{1}{\sqrt{3}} \right)^4. \qquad (2)$$

In Eq. (2), $\lambda_n = \lambda_{\min} + (n - 1)\Delta\lambda$, with $n = 1, 2, \ldots, N_\lambda$ and $\Delta\lambda = 5$ nm. $\lambda_{\min}$ and $N_\lambda$ are determined by the wavelength range we are interested in. The choice of taking the difference between $\kappa^{-1}(\lambda)$ and the optimal value to the power 4 is made in order to "punish" peaks in the condition number more severely. As GAs conventionally seek to maximize the fitness function, we define an individual's fitness as

$$f = \frac{1}{e}. \qquad (3)$$

This definition is convenient because $f$ takes on real and positive values, where higher values represent more optimal polarimeter designs.
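The effect of the fourth-power penalty can be seen in a small standard-library sketch. It assumes that the error is the mean, over the 5 nm wavelength grid, of the fourth power of the deviation of the inverse condition number from $1/\sqrt{3}$, and that the fitness is the reciprocal of that error; both the averaging and the reciprocal are assumptions made for this illustration, and the two sample spectra are synthetic.

```python
import math

KAPPA_INV_OPT = 1 / math.sqrt(3)            # best achievable inverse condition number


def wavelength_grid(lam_min, lam_max, step=5.0):
    """Wavelengths lam_min, lam_min + step, ..., lam_max (all in nm)."""
    n = int(round((lam_max - lam_min) / step)) + 1
    return [lam_min + i * step for i in range(n)]


def error(kappa_inv_values):
    """Mean fourth-power deviation from the optimum (assumed form of the error)."""
    return sum((k - KAPPA_INV_OPT) ** 4 for k in kappa_inv_values) / len(kappa_inv_values)


def fitness(kappa_inv_values):
    """Reciprocal of the error (assumed form); infinite for a perfect design."""
    e = error(kappa_inv_values)
    return math.inf if e == 0 else 1 / e


grid = wavelength_grid(430, 2000)

# Two synthetic spectra with the same mean inverse condition number (0.5):
flat = [0.5 for _ in grid]
peaky = [KAPPA_INV_OPT if i % 2 == 0 else 1 - KAPPA_INV_OPT for i in range(len(grid))]

print("grid points:", len(grid))
print(f"fitness(flat)  = {fitness(flat):.0f}")
print(f"fitness(peaky) = {fitness(peaky):.0f}")
```

Although both spectra have the same average, the flat one scores a much higher fitness, which is the intended consequence of raising the deviation to the fourth power.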

Results
For the case of a polarimeter based on 3 FLCs and 3 WPs, we have minimized $\kappa(\lambda)$ by varying the orientation angle, $\theta$, and the retardance, $\delta$, of all the elements. This yields a 12-dimensional search space, i.e., 6 retardances and 6 orientation angles. $\theta$ is the angle between the fast axis of the retarder (WP or FLC) and the transmission axis of the polarizer (see Fig. 2), taken to be in the range $\theta \in [0°, 180°]$. The retardance, $\delta$, is modeled using a modified Sellmeier equation in which $A_{UV}$, $A_{IR}$, $\lambda_{UV}$, and $\lambda_{IR}$ are experimentally determined parameters for an FLC ($\lambda/2$@510 nm, Displaytech Inc.) and a quartz zero-order waveplate ($\lambda/4$@465 nm), taken directly from Ref. [19] (for the FLCs, $A_{IR} = 0$). $L$ is a normalized thickness, with $L = 1$ corresponding to a retardance of $\lambda/2$@510 nm for the FLCs and $\lambda/4$@465 nm for the waveplates. Each $L$ and each $\theta$ is represented by 8 bits in the genome. We use experimental values to ensure that our design is based on components that are as realistic as possible. The 3-FLC polarimeter design scoring the highest fitness is shown in Table 1. The wavelength range for which we optimized the polarimeter was from 430 to 2000 nm. To visualize the performance of this design, we show a plot of $\kappa^{-1}(\lambda)$ in Fig. 4. The inverse condition number, $\kappa^{-1}$, is larger than 0.5 over most parts of the spectrum, which is close to the optimal inverse condition number ($\kappa^{-1} = 1/\sqrt{3} = 0.577$). This is a great improvement compared to the previous 3-FLC design [21], for which $\kappa^{-1}$ oscillates around 0.33. The new design promises a decrease in noise amplification by up to a factor of 2.1 for a Stokes polarimeter, and up to a factor of 4.5 for a Mueller polarimeter. In addition, the upper spectral limit is extended from 1700 nm to 2000 nm. Wavelengths shorter than 430 nm were not considered, as the FLC material will be degraded by exposure to UV light.
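To show how a candidate design is turned into a condition number at a single wavelength, the sketch below builds the 8-row PSA matrix from standard Mueller matrices of ideal linear retarders and a horizontal polarizer. Several things here are assumptions rather than the authors' model: NumPy is used, dispersion is ignored (each element gets a fixed retardance instead of the Sellmeier model), the element order alternates FLC and waveplate, the sign convention of the retarder matrix is one of several in use (it does not affect the condition number), and the numbers in `design` are arbitrary placeholders, not the values of Table 1.

```python
from itertools import product

import numpy as np


def rotator(theta):
    """Mueller rotation matrix for a rotation of the reference frame by theta."""
    c, s = np.cos(2 * theta), np.sin(2 * theta)
    return np.array([[1, 0, 0, 0],
                     [0, c, s, 0],
                     [0, -s, c, 0],
                     [0, 0, 0, 1.0]])


def retarder(delta, theta):
    """Ideal linear retarder with retardance delta and fast axis at angle theta."""
    c, s = np.cos(delta), np.sin(delta)
    m0 = np.array([[1, 0, 0, 0],
                   [0, 1, 0, 0],
                   [0, 0, c, s],
                   [0, 0, -s, c]], dtype=float)
    return rotator(-theta) @ m0 @ rotator(theta)


# Ideal horizontal linear polarizer in front of the detector.
POLARIZER = 0.5 * np.array([[1, 1, 0, 0],
                            [1, 1, 0, 0],
                            [0, 0, 0, 0],
                            [0, 0, 0, 0.0]])


def psa_matrix(elements):
    """Stack the first Mueller row of the PSA for every FLC switching state.

    elements: (kind, delta, theta) tuples in the order the light meets them,
    with kind 'flc' (switchable by +45 degrees) or 'wp' (fixed waveplate).
    """
    n_flc = sum(kind == "flc" for kind, _, _ in elements)
    rows = []
    for switches in product((0, 1), repeat=n_flc):
        state = iter(switches)
        m = np.eye(4)
        for kind, delta, theta in elements:
            if kind == "flc":
                theta = theta + next(state) * np.pi / 4
            m = retarder(delta, theta) @ m
        rows.append((POLARIZER @ m)[0])
    return np.array(rows)


# Placeholder design: half-wave FLCs and quarter-wave plates at arbitrary angles.
design = [("flc", np.pi, np.radians(20.0)), ("wp", np.pi / 2, np.radians(70.0)),
          ("flc", np.pi, np.radians(110.0)), ("wp", np.pi / 2, np.radians(35.0)),
          ("flc", np.pi, np.radians(60.0)), ("wp", np.pi / 2, np.radians(150.0))]

A = psa_matrix(design)
kappa = np.linalg.cond(A)
print("A has shape", A.shape, "and inverse condition number", round(1 / kappa, 3))
```

Repeating this evaluation on a wavelength grid, with the retardances recomputed at each wavelength from a dispersion model, would give the spectrum of inverse condition numbers that the fitness is built from.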
Previous designs often suffer from $\kappa^{-1}(\lambda)$ oscillating as a function of wavelength, whereas our solution is more uniform over the wavelength range we are interested in. This uniformity in $\kappa(\lambda)$ will, according to Eq. (1), give a more uniform noise distribution over the spectrum.
To give some idea of how fast the GA converges, a plot of $f$ (see Eq. (3)) as a function of the generation number is shown in Fig. 5. The mean population fitness ($\mu$) and standard deviation ($\sigma$) are also shown. As so often happens with genetic algorithms, we see that the maximal and average fitness increase dramatically in the first few generations. Following this fast initial progress, evolution slows down considerably, before it finally converges after 600 generations. The parameters used in our GA to obtain these results are shown in Table 2.
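A complete generation loop using the Table 2 parameters might look like the sketch below. It is a toy illustration, not the authors' program: the fitness simply counts 1-bits instead of evaluating a polarimeter, selection is fitness-proportionate, one elite individual is copied unchanged each generation, and a mutation flips a single bit. All of these choices are assumptions; only the four constants come from Table 2, and the genome length of 96 bits corresponds to 12 variables of 8 bits each.

```python
import random

CROSSOVER_RATE = 0.7     # Table 2
CROSSOVER_POINTS = 2     # Table 2
MUTATION_RATE = 0.2      # Table 2
POPULATION_SIZE = 500    # Table 2


def toy_fitness(genome):
    """Stand-in objective (number of 1-bits, plus 1 to stay positive)."""
    return 1 + sum(genome)


def crossover(mum, dad, rng):
    """Multi-point crossover returning a single child."""
    cuts = sorted(rng.sample(range(1, len(mum)), CROSSOVER_POINTS))
    child, donors, start = [], (mum, dad), 0
    for i, cut in enumerate(cuts + [len(mum)]):
        child += donors[i % 2][start:cut]
        start = cut
    return child


def mutate(genome, rng):
    """Flip one randomly chosen bit in a copy of the genome."""
    child = list(genome)
    child[rng.randrange(len(child))] ^= 1
    return child


def evolve(n_generations, genome_length=96, seed=0):
    """Run the GA and return (best, mean) fitness for every generation."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(genome_length)]
           for _ in range(POPULATION_SIZE)]
    history = []
    for _ in range(n_generations):
        fits = [toy_fitness(g) for g in pop]
        history.append((max(fits), sum(fits) / len(fits)))
        elite = list(pop[fits.index(max(fits))])          # elitism (assumption)
        parents = rng.choices(pop, weights=fits, k=2 * (POPULATION_SIZE - 1))
        next_pop = [elite]
        for mum, dad in zip(parents[0::2], parents[1::2]):
            if rng.random() < CROSSOVER_RATE:
                child = crossover(mum, dad, rng)          # sexual reproduction
            else:
                child = list(mum)                         # asexual reproduction
            if rng.random() < MUTATION_RATE:
                child = mutate(child, rng)
            next_pop.append(child)
        pop = next_pop
    return history


history = evolve(n_generations=20)
for gen in (0, 5, 10, 19):
    best, mean = history[gen]
    print(f"generation {gen:2d}: best = {best}, mean = {mean:.2f}")
```

Replacing `toy_fitness` with a function that decodes the genome into angles and retardances and evaluates the condition-number spectrum would turn this loop into a polarimeter optimizer of the kind described in the text.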
A design using fewer components, in particular 2 FLCs and 2 waveplates, does have advantages. These advantages include increased transmission of light, as well as reduced cost and complexity with respect to building and maintaining the instrument. In addition, some applications have weight and volume restrictions [3]. For these reasons, we have performed genetic optimization of the 2-FLC design. In Fig. 6, we show the performance of two polarimeter designs for the wavelength ranges 430-1100 nm (compatible with an Si detector) and 800-1700 nm. Both of these polarimeter designs show condition numbers which are considerably better than those of previously reported designs. The numerical parameters of the two designs based on 2 FLCs are shown in Table 3.

Table 2. Genetic Algorithm parameters. The "crossover rate" is the probability for two parents to undergo sexual reproduction (the alternative being asexual reproduction). The parameter "crossover points" refers to the number of points where we cut the genome during crossover (sexual reproduction). "Mutation rate" is the probability for any given individual to undergo one or several bit flip mutations in one generation.

Parameter          Value
Crossover rate     0.7
Crossover points   2
Mutation rate      0.2
Population size    500
Our optimization algorithm can, with little effort, be applied to a wider range of polarimeter designs. Any optical component can be included into our GA; for example, one can include fixed waveplates of different materials, prisms, mirrors, and other types of liquid crystal devices. The material of each component could also be a variable, which could help alleviate the dispersion problem. The only requirement is that the retardance of the component in question must be possible to either model theoretically or measure experimentally. It is possible to optimize a polarimeter for a different wavelength range, simply by changing program inputs. Focusing on a narrower wavelength range typically results in better (lower) condition numbers than reported here. Evaluating different technologies, materials and components for polarimetry should thus be relatively straightforward. The task is not computationally formidable: we have used ordinary desktop computers in all our calculations.

Fig. 6. Condition number for two designs using 2 FLC retarders and 2 waveplates. By optimizing $\kappa(\lambda)$ over a narrower part of the spectrum, we can design good polarimeters with fewer components. The polarimeter designs labeled "Visible" and "IR" show our two designs, optimized for 430 nm < λ < 1100 nm and 800 nm < λ < 1700 nm, respectively. For comparison with our "NIR" design, we show the previous simulated design from Ref. [30]. The curve labeled "Commercial" shows the measured condition number of a commercial instrument (MM16, Horiba, 2006) based on the same (FLC) technology.

Conclusion
In conclusion, we have used genetic algorithms to optimize the design of a fast multichannel spectroscopic Stokes/Mueller polarimeter, using fast switching ferroelectric liquid crystals. We have presented three polarimeter designs which promise significant improvement with respect to previous work in terms of noise reduction and spectral range. Our approach requires relatively little computational effort. One can easily generate new designs if one should wish to use other components and materials, or if one wishes to focus on a different part of the optical spectrum. We hope that our designs will make polarimetry in general, and ellipsometry in particular, a less noisy and more efficient measurement technique.