Single-pulse, Kerr-effect Mueller Matrix LiDAR Polarimeter

We present a novel light detection and ranging (LiDAR) polarimeter that enables measurement of 12 of 16 sample Mueller matrix elements in a single, 10 ns pulse. The new polarization state generator (PSG) leverages Kerr phase modulation in a birefringent optical fiber, creating a probe pulse characterized by temporally varying polarization. Theoretical expressions for the Polarization State Generator (PSG) Stokes vector are derived for birefringent walk-off and no walk-off and incorporated into a time-dependent polarimeter signal model employing multiple polarization state analyzers (PSA). Polarimeter modeling compares the Kerr effect and electro-optic phase modulator–based PSG using a single Polarization State Analyzer (PSA) and a scattering sample yielding similarly good performance for both. We include results from an experimental demonstration of the Kerr effect PSG. © 2020 Optical Society of America under the terms of the OSA Open Access Publishing Agreement


Introduction
Stokes vector and Mueller matrix polarimetry has been applied broadly in both laboratory and remote sensing research. Polarimeters can be used to measure the state of polarization of light (Stokes vector polarimetry) or to characterize the polarization dependent reflection, transmission, and scattering of materials (Mueller matrix polarimetry). For example, laboratory polarimetry has found utility in a diverse range of applications including biological tissue analysis [1,2], nanostructure and metamaterial characterization [3,4] and materials characterization [5,6]. In the remote sensing domain, polarimetry has been employed for active imaging contrast enhancement of scattering targets [7][8][9], characterization of airborne particles [10][11][12], and vegetation classification [13,14].
Traditional Mueller matrix polarimeter designs, such as illustrated in Fig. 1(a), employ linear polarizers and rotating quarter wave plates to measure the Mueller matrix of a sample [15,16]. In this design, a series of laser pulses are transmitted through the PSG with each pulse exiting the PSG with a different polarization state due to the rotating quarter wave plate. Likewise, in the PSA, different return polarization states are analyzed for each pulse due to the rotating retarder. A sample's Mueller matrix is estimated through serial measurements of the signal transmitted/scattered from the sample, with each measurement employing a different illumination polarization state and receiver analyzer state. In general, to measure the full Mueller matrix, a minimum of 16 individual measurements is required and often results in a measurement duration of seconds to minutes, depending on the polarimeter design. More recent designs have employed electro-optic crystals and liquid crystals to produce the diversity of polarized transmit and analyzer states necessary to measure the Mueller matrix [17][18][19][20]. Although these alternative designs provide shorter measurement times compared to the rotating quarter wave design, they are ultimately limited by the need for multiple laser pulses for serial measurements. For measurements requiring high temporal resolution or high spatial resolution from a moving platform, multi-pulse measurement times may be unacceptable. Another drawback of conventional systems is the complex opto-mechanical designs and control electronics which drive up cost, size, weight, and power of sensors. LiDAR polarimeter design. The PSG exploits the Kerr effect by passing the pump laser through a birefringent optical fiber. The detector processing chain uses a beam splitter (if necessary) followed by one or more sub-PSA, each followed by a detector (D1, D2, D3). Each sub-PSA may be composed of a rotated stationary quarter wave plate followed by a linear polarizer or just a linear polarizer.
We propose a new architecture that simultaneously reduces hardware complexity and reduces measurement time to nanoseconds, or less, depending on laser pulse width and PSA detector bandwidth. The basic architecture is illustrated in Fig. 1(b); the PSG transmits a probe pulse with a well-defined distribution of polarization states in time and the PSA analyzes the signal scattered from the sample in time. If one knows the temporal polarization distribution of the probe pulse and the temporal polarization distribution of the scattered signal measured through the PSA, the intervening scattering sample Mueller matrix can be estimated. The probe pulse is produced by coupling a linearly polarized pump laser to a birefringent optical fiber; through the nonlinear Kerr effect, the time varying instantaneous pump power induces a time varying retardation in the fiber, which produces a temporal distribution of polarization states. Recently, we analyzed a more conventional way of producing this probe pulse through the use of an electro-optic retarder [21]. In this approach a 20 ns high-voltage ramp is applied to a Pockels cell as a laser pulse passes through. The voltage ramp induces a time varying retardation, which produces a probe pulse with a time varying polarization state. In contrast to the Pockels cell approach, instead of applying a time varying voltage, we apply a time varying pump laser power to produce a time varying laser polarization. This is particularly interesting for a PSG as it is a very simple opto-mechanical device with no electrical control and can be fusion spliced to a fiber laser to create a monolithic PSG. In addition, as will be shown below, the fiber length, pump power, and pump polarization angle can be tailored to optimize the probe pulse polarization distribution relative to measurement accuracy. The general PSA system is of the division of amplitude type and is shown in Fig. 1(b); it employs three sub-PSAs, which are generally sufficient to characterize the Stokes vector [22]. For example, one may employ a linear polarizer at zero degrees, a linear polarizer at 45 degrees, and a circular analyzer. In Fig. 1(b), each detector block measures both the transmitted and rejected signals from each polarizer so that information about the two orthogonal states is available to construct each Stokes vector element. The detector bandwidth is chosen to allow high fidelity measurements of the polarization modulation at each sub-PSA. Due to the static birefringence axes of the fiber and Pockels cell, both of these systems are limited to measuring 12 of 16 sample Mueller matrix elements [21]. As will be shown, reducing the PSA system to a single elliptical analyzer, measuring right and left elliptically polarized light, enables measurement of six Mueller matrix elements. An elliptical analyzer is a quarter wave plate at angle theta followed by a linear polarizer. Although using just a single elliptical analyzer limits measurement of the full Mueller matrix in general, using this concept in a monostatic configuration is consistent with measuring the Mueller matrix of scattering surfaces that might be encountered in a remote sensing application as their matrices tend to be diagonal [23][24][25][26][27][28]. In what follows, we will begin with a general PSA design employing three sub-PSAs but then narrow the treatment to the case focused on diagonal Mueller matrices. This is relevant to LADAR as the world is generally composed of scattering materials having diagonal Mueller matrices and is also relevant to laboratory applications requiring characterization of surfaces.
In the remainder of this paper we will develop the underlying theory describing the polarimeter operation and experimentally analyze the concept. In Section 2 we will derive the Stokes vector generated by the Kerr effect in the birefringent optical fiber and compare it to the Stokes vector generated with a Pockels cell. In Section 3 we will develop a polarimeter system model and consider the constraints linear algebra place on Mueller matrix measurement for a single PSA receiver. In Section 4, the Cramer-Rao Lower Bound is employed to estimate measurement accuracy and optimize system parameters. In Section 5. Monte-Carlo simulations are used to quantify measurement accuracy versus signal to noise. Lastly, Section 6 outlines measurement from an experimental demonstration.

Nonlinear optical polarization state generator
The Kerr effect is a well-known third order nonlinear optical process associated with processes such as self-focusing, phase modulation, and four-wave mixing. Although considerable research has focused on understanding the influence of self and cross-phase modulation on the polarization state of a laser transmitted through an optical fiber, it is not clear that this knowledge has transitioned to a useful application. As will be demonstrated below, phase modulation can be leveraged as a polarization state generator to produce an optical pulse with time varying polarization for an opto-mechanically simple, single pulse polarimeter PSG.
Consider the polarization maintaining fiber in Fig. 2; it has an indices of refraction n x and n y along the x and y axes, respectively. The inherent birefringence in the fiber, δn = n x − n y , results in a phase shift proportional to δn and fiber length L; it is independent of the laser power. At sufficiently high pump laser power there occurs a nonlinear phase shift proportional to the instantaneous optical power. Given the temporally varying power distribution, one can expect the pump laser to induce a temporally varying relative phase shift, or polarization. Now consider a field of the following form propagating in the fiber: where U, V represent slowly varying envelopes of z, t, k 0 = 2π/λ 0 , k x = k 0 n x , and k y = k 0 n y . We will use the following set of coupled differential equations corresponding to the limit of strong birefringence: where v gx = c/n gx , v gy = c/n gy , with n gx and n gy representing the group refractive indices along the x,y axes. β 0x,y = d 2 k/dω 2 denotes the dispersion coefficient evaluated at the carrier frequency ω 0 , n 2 stands for the nonlinear Kerr coefficient associated with silica glass, and A = 2/3 for glass fibers. [29] We assume a monochromatic laser with a hyperbolic secant pulse shape of the form where P 0 = A 2 0 , A 0 is the peak electric field strength, and τ 0 is a temporal half-width parameter. The pulse's Full Width at Half Maximum (FWHM) is related to τ 0 by We will consider an optical pulse of approximately 10 ns in duration [FWHM], so the dispersion coefficient in Eqs. (2) and (3) can be neglected. Now, we will analyze two cases: (1) a system ignoring walk-off effects, and (2) a system including walk-off effects.

System without walk-off effects
In this case, the solution of Eqs. (2) and (3) is given by: where τ = t/τ 0 − z/(v g τ 0 ), where τ 0 is related to the FWHM width of the power envelope through FWHM ∼ 1.76τ 0 , v g is the average group velocity, and α is the angle between the linearly polarized field and the fiber x-axis (herein designated the "slow axis") at the fiber input facet.

System with walk-off effects
In this case, the solution of Eqs. (2) and (3) is given by: where ψ and φ are given by: where l w = (τ 0 c)/(n x − n y ) represents the walk-off length, = zδn/τ 0 c, and F(τ − ), G(τ + ) are given by: Figures 3 and 4 show the evolution of the Stokes parameters, defined as: where E x , E y are obtained from Eq. (1). In Fig. 3, we observe that the walk-off effect is a slight shift of the values of the Stokes parameters along time, and it does not represent a problem for such large pulse duration (10 ns). Figure 4 shows the pulse polarization content on the Poincaré sphere.
In the limit of no walk off, the Stokes vector can be written as where we have written the instantaneous optical power as P(τ) = Ao 2 sech 2 τ and all Stokes components are normalized to S 0 . The transverse coordinate z should be evaluated at the fiber length L to obtain the Stokes vector at the fiber output. The term S 1 is dependent only on the optical power P(τ) and the input polarization orientation with respect to the fiber slow axis α. S 2 and S 3 terms include phase shift due to the birefringence of the PM fiber (δn) and a time varying phase term due to instantaneous power induced phase modulation (P(τ)n 2 cos(2α)).  If a Pockels cell is used to generate the time varying polarization probe pulse, the resulting PSG Stokes vector will have a form similar to Eq. (18) but the phase term in S 2 and S 3 will be ψ(t) = k 0 δnz + πV(t)/V π . Here V(t) is the time varying voltage applied to the crystal and V π is the half-wave voltage. The Kerr effect uses the time varying pulse power P(t) instead of voltage V(t) to achieve time varying polarization [30].
Through the Kerr phase dependence on P(τ) * z in Eq. (19), the temporal polarization distribution can be controlled through peak power, pulse shape, and fiber length. For Mueller matrix measurement, it is important to choose α to balance the energy partitioned between probe beam Stokes components in order to sample the target Mueller matrix components with sufficient sensitivity. The polarization angle, α, yielding equal power in each of the Stokes polarization states of Eq. (18) can be found from the conservation of energy S 2 1 + S 2 2 + S 2 3 = S 0 and by setting S 1 = S 2 = S 3 . From this consideration one finds α ≈ 27 o . Lastly, although Eq. (18) is written in the coordinate system of the fiber birefringence axes, it may be desirable to rotate the coordinates by α into the laboratory reference frame [24,31], leading to (20)

Model of receiver hardware
The receiver diagram in Fig. 1 allows for the possibility of using a beam splitter followed by multiple PSAs. There is a wide variety of PSAs that could be used; we consider only the following options for brevity: 1. Horizontal/Vertical linear polarizers (denoted H/V).
In each pair of outputs, the first (e.g. "H") is the polarization which is transmitted through the cube polarizer and the other (e.g. "V") is rejected from the polarizer and is propagated on a separate optical path to another detector. These three options provide direct measurement of the Stokes vector elements. We will explore all three options when practical, though the nominal configuration will be just option 3 with no beam splitter, with the QWP rotation angle θ left unspecified, which allows for generalization to an elliptical L/R PSA rather than simply a circular L/R PSA. Considering the atmospheric transmission A(R), the transmitter efficiency T t (p), the receiver efficiency T r (p), and the detector response D det , the terms affecting the received LiDAR signal can be grouped as where R is the one-way range to target and p denotes the detector polarization channel (H, V, P, M, R, L). Note that the detector response here is the optical response; the temporal detector response will be accounted for later. The terms T t (p), T r (p), and D can be calibrated at the instrument level. Let D = 2R/c indicate the round-trip travel time to target, where c is the speed of light. The signal exiting polarization channel p is where Q(R) is the laser beam-target overlap integral, M p is the known Mueller matrix [24,31] of polarization analyzer p, and M is the unknown Mueller matrix of the target. The scalar Q(R) and the factor A(R) within k(R, p) are potentially unknown but can be canceled out later by normalization the estimated Mueller matrix. Unless otherwise noted, throughout this section S lab can represent the transmitted Stokes vector from either the Kerr-based or Electro-Optic (EO)-based transmitter. Leaving θ unspecified for option 3, the Mueller matrices for the PSAs considered here are [24,31] Now we can find the output intensities (i.e. the first element of the Stokes vector) from each possible PSA pair. Let N ∈ {2, 4, 6} be the number of signals to be recorded by the fast detectors in Fig. 1; for example, N = 2 for any single analyzer. We assume that the beam splitter evenly divides the input intensity into N/2 equal streams, with a fraction 2/N of the intensity going into each PSA pair. Using (23)-(25) the possible output intensities are The estimation process must be carried out using a discrete time representation. We will model the conversion from continuous to discrete time by filtering the detector output though a filter with impulse response h(t) and frequency response H(f ) and then instantaneously sampling the filter output with sampling period T. We assume that the filter's frequency response has cut-off frequency F c ; ideally H(f ) will be a rectangle, but we need not assume that. Herein, we use the term "detector bandwidth" to refer to F c .
We will use k to index the sample time and K samples are collected per polarization channel. The effects of the detector filter, sampling, and additive noise v[k] can be represented by where denotes convolution. To account for the time dependence and the effects of this convolution, define using the transmitted Stokes vector from (18), where j is the index of the four elements of the Stokes vector. Then define sampled versions, stacked into four K × 1 vectors, The samples from the polarization channel outputs can be represented in vector form by Assume (temporarily) that all 6 polarization channels are used and that all 16 Mueller matrix elements are to be estimated. Then counting both outputs of each of (26)- (28), there are six output channels that are each sampled K times by the detectors. Stacking these samples per where v is a vector of samples of the additive noise process. We have included the unknown scale factor Q within the unknown vector x; the method for dealing with Q will depend on the application for which the estimates of x will be used. A detailed inspection of (26)-(28) is necessary to reveal the structure of H. It takes the form k r k r cos 2 2θ k r cos 2θ sin 2θ −k r sin 2θ k l −k l cos 2 2θ −k l cos 2θ sin 2θ k l sin 2θ where ⊗ denotes a Kronecker product. In a simpler scenario where only N of the 6 possible polarization channels are used and only M of the 16 possible Mueller matrix elements are assumed to be non-zero, we can omit rows and columns of H via where 1 denotes an "indicator" matrix. The N × 6 polarization channel indicator 1 P is obtained by deleting all but the N relevant rows of a 6 × 6 identity matrix, and the M × 16 Mueller matrix indicator 1 X is obtained by deleting all but the M relevant rows of a 16 × 16 identity matrix. Left-multiplying by 1 P or 1 X will select only the P or M rows of interest, and right-multiplying by 1 T X will select only the M columns of interest. Note that H k depends strongly on the rotation of the receiver's QWP, θ; and H s depends heavily on the transmitter parameters (i.e. for a Kerr system, the optical fiber parameters δ n , n 2 , and α; or for an EO system, the rotation of the transmitter's EO retarder, α, as well as the voltage ramp parameters V max and β, the ramp rate). Specifically, ignoring the detector's temporal impulse response, where is Hadamard (element-by-element) multiplication; and P and ψ contain K samples of the instantanteous optical power P(τ) and the phase argument from the S 2 and S 3 components of (18), respectively. The time samples are evaluated at times

Noise model
Since the detector output voltage is proportional to incident power, we define the Signal to Noise Ratio (SNR) as a ratio of detector voltage values. However, additive Gaussian noise present within the detector can cause the detector voltage to go negative. As such, we compute a ratio of absolute values of average detector voltages, leading to where σ is the standard deviation of the detector noise v[k]. Since σ is in the same units as the detector voltage, which is proportional to incident power, the denominator of (44) uses σ rather than σ 2 , which ensures that this SNR definition is a power ratio. We have also made use of the fact that the average magnitude of a Gaussian random variable is 2/π times the standard deviation of that Gaussian. Note that this SNR definition is with respect to the received signal, so factors such as atmospheric transmission and target reflectivity are included in the numerator.
In order to gain intuition as to how the various parameters affect the SNR, assume that during the observation window, the detector captures all but a negligible amount of energy of the received pulse, and that M 00 >>M ij for i, j 0 in (26). The first assumption is necessary anyway for accurate parameter estimation; the latter assumption is not strictly accurate, but M 00 is always the largest element and all of the other elements in the SNR computation will get multiplied by various sines and cosines which will further attenuate their effects, so this assumption is reasonable for the purpose of getting a sense of how various parameters affect the SNR. Using these assumptions, (There are two factors of N in the denominator; one is from within each y p (t) due to the beam splitter, and the other is because the SNR numerator is defined as an average across the N polarization channels.) The zeroth element in the transmitted stokes vector is just the pulse power. For the sech 2 pulse model in (4), Thus, the SNR can be approximated as This holds for either the proposed Kerr system or for a similar EO-based system, so long as it uses the same pulse shape. Each of the terms A, B, and C is unitless. Term A is essentially the propagation loss through the optical path, term B is a ratio of the laser pulse width over the duration of the range gate, and term C is the ratio of the pulse peak power over the standard deviation of the detector noise. To maintain a unitless ratio, both P 0 and σ should be represented in the same units, such as via the detector voltages they induce. The propagation loss terms k(R, p) from (21) should also be unitless. Due to the assumption that M 00 >>M ij for i, j 0, the SNR expression in (47) is only correct to within perhaps a factor of two, but it can be used to gauge how key parameter values will affect the SNR. The simulations in Section 5 will use (44) to evaluate the SNR for a given discrete time waveform.

Linear algebra analysis
We now consider linear algebra concerns that potentially limit receiver operation. Throughout, we consider the special case where only the elliptical (R/L) polarization channels are used. In that case, k r k r cos 2 2θ k r cos 2θ sin 2θ −k r sin 2θ k l −k l cos 2 2θ −k l cos 2θ sin 2θ k l sin 2θ Note that 1 P H k has a rank of 2; but regardless of the value of α (for either an EO or Kerr-based system), H s only has three linearly independent columns rather than four. This comes from (42), since the matrix has a factor with only 3 columns and thus has a maximum possible rank of 3. Thus, the overall rank of H is 2 × 3 = 6, meaning that the maximum number of Mueller matrix elements we can recover is six. In order to transform this into a well-posed problem, redundant columns of H must be removed by assuming some of the elements of the target Mueller matrix are zero. In the remainder of this section, we discuss the possible ways this can be done. Alternatively, the problem could also be resolved by adding additional polarization analyzers, at the cost of increased receiver complexity. Inspection of the 16 columns of H reveals that they are spanned by the six vectors listed in the top row of Table 1. The terms cos ψ and sin ψ are shorthand for the length-K vectors of samples of cos ψ(t) and sin ψ(t), and the scale factors of P(t) are omitted for notational brevity. Table 1 also shows the dependence of each column of H on the six spanning vectors. A necessary condition for a well-posed problem is that the elements depend on as many spanning vectors as Table 1

. Dependence of the columns of H on the six basis vectors {b n }. The M ij notation indicates the Mueller matrix elements corresponding to the indicated columns of H. Green entries are the elements to be estimated when a diagonal Mueller matrix is assumed. Regarding the blue elements, refer to the text between (52) and (53)
.
k r s 2 cos 2θ sin 2θ −k r s 3 sin 2θ k l s 0 −k l s 1 cos 2 2θ −k l s 2 cos 2θ sin 2θ k l s 3 sin 2θ As mentioned in Section 1, it is generally accepted that the Mueller matrix for a scattering sample is diagonal, so this is a reasonable assumption for analysis of scattering samples. The resulting spanning vector dependence is highlighted in green in Table 1; limiting the problem to those four elements results in dependence on four unique spanning vectors, so the problem is well posed. This is why the special case considered in this section is our default configuration -it allows recovery of the full Mueller matrix diagonal with only a single PSA and a single pair of detectors for its two outputs. Table 1 can also be used to show that it is (surprisingly) absolutely essential to retain both outputs of the linear polarizer, even under the assumption of a diagonal Mueller matrix. If only one polarization channel is retained, the basis vectors are each halved to only either the first K or last K elements, which would make {b 4 , b 5 , b 6 } redundant with {b 1 , b 2 , b 3 }. As such, the rank of H would be 3 yet there would be 4 unknowns, making the problem ill-posed.
We may wish to recover additional Mueller matrix elements, since up to six can be recovered. Inspection shows that the potential choices are extremely limited -we can include M 03 , and we can include either M 01 or M 02 ; these elements are highlighted in blue in Table 1. Any other choices lead to an ill-conditioned problem. These two choices can be formulated as For readability, our notation assumes that one computes all 16 columns of H using the Kronecker product structure of (40) and then deletes the columns corresponding to the removed elements of x, though that is not necessarily the most computationally efficient approach. In summary, for the default single PSA configuration, the only choices of Mueller matrix elements to estimate that lead to a well-conditioned problem are the three choices in (52), (54), or (56); or any subset thereof. All other Mueller matrix elements are implicitly assumed to be zero. In Section 5 we will evaluate the effects of making that assumption when the assumed-to-be-null values are small but non-zero.

Cramer-Rao bound analysis
We can use the Cramer-Rao Lower Bound (CRLB) to approximate the Mueller matrix estimation performance. Throughout this section, we assume that the Mueller matrix is diagonal, and that only the elliptical (R/L) channels are used.
As in prior work [21,32], we will assume the round-trip delay τ is known for the purposes of the CRLB, though it will be considered unknown later. For a model of the form y = Hx + v where v is white Gaussian noise, the CRLB is the inverse of the Fisher Information Matrix (FIM) J, given by [33] Cov where σ 2 is the variance of the noise from (44). We will use the CRLB to gain some intuition regarding the effects of the angle α (either the EO retarder rotation or the Kerr fiber rotation) and the QWP rotation θ. Assuming k h = k v and ignoring constants such as k h , k v , 1/2, and σ 2 , we have s 2 cos 2θ sin 2θ −s 3 sin 2θ s 0 −s 1 cos 2 2θ −s 2 cos 2θ sin 2θ s 3 sin 2θ For brevity of notation, define Using these definitions, Now we can gain some intuition by noting the dependence between the s vectors via the linear dependence among the elements of (20). For example, assuming the sin φ(t) term oscillates faster than the rise and fall of the pulse shape. Using the generic symbol # to indicate a non-zero value, That is, elements M 11 and M 22 are coupled with each other, whereas M 00 and M 33 are uncoupled. This is because element i, j of J −1 is a (very tight) bound on the covariance between M i,i and M j,j , and the inverse of a block diagonal matrix maintains the same block diagonal structure. This will be helpful in interpreting simulation results later. Figures 5 and 6 show a numerical evaluation of the CRLB for the EO and Kerr systems, respectively. The target Mueller matrix is known to be diagonal. The Mueller matrix data used was for oak bark at λ = 1.06 µm, at normal incidence [6]. The SNR was set to 10 dB using (44); since the SNR is system parameter dependent, the waveforms used to compute the noise variance σ 2 were generated using globally optimum values (α = 27.   Note that all plot values will scale linearly with σ; i.e. if the SNR was increased by a factor of two (or 3 dB) because sigma was halved, then the mantissa would be halved but the curve shape would remain the same. The plots are symmetric every 45 o for each angle, so only ranges of

Parameter estimation algorithm
The parameter estimation method that will be employed is in the family of least squares (LS) and Maximum Likelihood (ML) estimation; specifically, a separable least-squares style of estimation can be used [33]. The idea is to assume candidate values for any unknown parameters that have a nonlinear relationship with the observations (in this case, D), perform an LS or ML estimate of the parameters with a linear relationship with the observations (in this case, x), and iterate this approach across a range of values for D to find the value leading to the best overall fit. For the case of additive white Gaussian noise, the ML estimate will always reduce to some variant of LS fit (e.g. standard LS or nonlinear LS), so we use the two terms synonymously here.
Assuming white Gaussian noise, the ML estimate is x,D = arg min The choice for H will be as in (51) if the target Mueller matrix is assumed to be diagonal, or (53) or (55) if the six elements in (54) or (56) are assumed to be nonzero, respectively.

Feature estimation error
In the remainder of this section, we will simulate the transmitted waveforms and estimate different The Mueller matrix was from oak bark, measured at 1.0607µm at normal incidence, and normalized relative to M 00 : For both systems, we assume M is diagonal and only estimate the four elements along the diagonal. In order to remove the ambiguity caused by the unknown scale factor Q in (52), the estimated elements of x are divided by the estimate corresponding to M 0,0 , resulting in three normalized estimates: The performance metric will be the Normalized Root Mean Squared Error (NRMSE). Given that the true Mueller matrix and the estimates have each already been normalized by M 0,0 and M 0,0 , respectively, the NRMSE is where n tr ∈ {1, . . . , N tr } indexes the Monte Carlo trials that average over noise realizations. Figure 7 shows the NRMSE averaged over N tr = 1000 Monte Carlo trials, for an EO system [21] and the proposed Kerr-based system. There are two pairs of curves figures; in the two labeled "(diag)", the off-diagonal elements of M were set to zero when generating the waveforms; whereas in the other two curves, the full Mueller matrix of (68) was used, so the diagonal assumption is invalid. Either way, the estimator assumed the Mueller matrix was diagonal and only estimated the four diagonal elements. Figure 7 shows that, in this case, the error resulting from the assumption of a diagonal sample Mueller matrix has negligible effect on the measurement accuracy, and the estimation accuracy of a truly diagonal sample Mueller matrix has some marginal improvement as SNR increases. When the assumption of a diagonal Mueller matrix is met, performance improves monotonically as SNR increases; whereas the presence of residual off-diagonal elements causes modeling error that manifests as a "floor" on estimation performance and high SNR. The EOM and Kerr results are extremely similar, though the Kerr error in the M 1,1 estimate is about 46% higher than that of the EOM system (except in the presence of the error floor effect). However, this is offset by the fact that the EOM error is about 12% higher in the M 2,2 estimate, and even though that is a smaller percentage increase, the error in that estimate is higher so the error averaged across all three elements is very comparable between the EOM and Kerr systems. This comparison shows that the more compact and cost-effective Kerr-based system produces results comparable to those from the larger and more cumbersome EO system.
Note that the error floor is higher in elements M 1,1 and M 2,2 than M 3,3 . This is related to the analysis surrounding (64). The performance of the estimators for these two elements are coupled, so when one of them has increased error, so does the other.

Experimental demonstration
In this section, we demonstrate the Kerr PSG with an experimental setup similar to Fig. 1, but we implement the three PSAs serially, one pulse at a time. We employ a Keopsys fiber laser with wavelength = 1.064 µm, 10 ns FWHM pulse width and linear polarization. We used a Newport PM fiber with nominal parameters δn = 3.0 · 10 −4 and mode field diameter 7.7 µm. The experimental demonstration is constrained by laser coherence length and the fiber damage threshold. The optical path difference for the orthogonal polarization states in the fiber is OPD= L · δn whereas the coherence length is L coh = (2 ln 2/π) 1/2 λ 2 /(n · δλ) for a Gaussian spectral distribution. For loss of coherence to be negligible, we need fiber OPD L coh , or L 18 m for our laser with linewidth δλ ∼ 0.07 nm. We chose L = 5 m and pump power 500 W to achieve a reasonable level of polarization modulation with limited loss of coherence and risk of damage in the fiber. From Section 2.2 we calculate ∼ 6 ps which is ∼ 0.1% of the pulse width; this places the interaction in the "no walk-off" regime. Lastly, it should be pointed out that the experimental laser pulse shape was not sech 2 (t) but had a more complicated shape as shown by the grey line in the Fig. 8 S 1 plot. The solution to the coupled differential equations depends on the exact pulse shape employed. To get a first order estimate of the veracity of the model we replace the sech 2 (t) power distribution in Eq. (4) with the measured pump power envelope.
A half-wave plate was used to orient the incident linearly polarized laser at 68 o to the fiber slow axis. Three PSAs were used to characterize the emission from the fiber; there were two linear PSAs oriented at 0 o and 45 o to measure the S 1 and S 2 Stokes elements, respectively, and a circular polarization analyzer was used to measure S 3 . Two, 1 GHz bandwidth, Thorlabs pin detectors were used at each polarizer to record the orthogonal polarization signals and then construct the Stokes elements from the difference of measured signals. Figure 8 shows the recorded Stokes elements in comparison with the model predictions. The model was based on Eq. (19) and (20) and also allowed for some small angular misalignment of the fiber, wave plate, and polarizer. Reasonably good agreement between model and measurement is observed. The quantities S 2 1 + S 2 2 + S 2 3 and S 0 should be identical if the pulse is fully coherent. Inspection of these quantities yields noticeable differences in pulse shape suggesting there is some loss of coherence due to the long fiber length. Coherence could be maintained by employing a fiber with end caps so that shorter fiber length and higher pump power could be employed. Future work Fig. 8. Comparison of measured and modeled Stokes elements; the elements were normalized to the peak power of the incident pump pulse. In the S 1 plot, the gray line is pump envelope plotted on the negative S 1 axis for comparison to S 1 . The red and blue lines denote data and model fits, respectively. will employ a pulse shape more consistent with the sech 2 (t) pump envelope and a short length of end-capped fiber to maintain coherence.

Conclusion
We have developed and analyzed a radically new, single pulse Mueller matrix Light Detection and Ranging (LiDAR) polarimeter based on a temporally multiplexed architecture and a PSG based on the Kerr effect in a birefringent optical fiber. We have derived, to our knowledge, the first solution describing the Kerr effect in a polarization maintaining fiber that includes walk-off effects. A temporally multiplexed LiDAR polarimeter performance model was constructed for a PSG based on the Kerr effect in PM fiber and a Pockels cell. The primary difference in the description of the two PSGs is the phase function; proper selection of fiber and pump parameters can enable both systems to produce similar temporal polarization distributions. Application of Cramer-Rao lower bound and Monte-Carlo simulations enabled identification of optimum measurement parameters and measurement accuracy dependence on signal to noise ratio. Simulations showed that samples with a diagonal Mueller matrix could be measured with high accuracy measurements in a single ∼ 10 ns pulse at sufficient signal to noise ratio. Lastly, experimental demonstration of the Kerr PSG showed good agreement with model predictions. Future work will employ the single-pulse Kerr effect PSG in a polarimeter for Mueller matrix measurements.

Disclosures
The authors declare no conflicts of interest.