Extraction and identification of noise patterns for ultracold atoms in an optical lattice

To extract useful information about quantum effects in cold atom experiments, one central task is to identify the intrinsic quantum fluctuation from extrinsic system noises of various kinds. As a data processing method, principal component analysis can decompose fluctuations in experimental data into eigen modes, and give a chance to separate noises originated from different physical sources. In this paper, we demonstrate for Bose-Einstein condensates in one-dimensional optical lattices that the principal component analysis can be applied to time-of-flight images to successfully separate and identify noises from different origins of leading contribution, and can help to reduce or even eliminate noises via corresponding data processing procedures. The attribution of noise modes to their physical origins is also confirmed by numerical analysis within a mean-field theory.


Introduction
Cold atom systems provide a unique opportunity to obtain various information with high precision from an interacting many-body system, which can help us gaining physical understanding of strongly correlated systems. For instance, analysis of spatial noise correlations in the time-offlight (TOF) images can reveal density or spin correlations for atoms loaded in optical lattices (OLs) [1,2], or pairing correlation in a Fermi superfluid [2,3]. Measurements of in-situ density fluctuations have revealed the Pauli blocking effect [4], provided information of spin or density susceptibility of strong interacting gases [5], and verified the scaling law of a critical state in two-dimensional Bose gas [6].
To reveal quantum correlation and fluctuation effects in an ultracold atomic gas, a central task is to separate the extrinsic noises due to the imperfect experimental setup and state preparation from the intrinsic sources of quantum or thermal fluctuations. These noises are often coupled, warped by nonlinear effects, and buried in massive pixels, which make the task even harder. Principal component analysis (PCA) provides a great approach for solving this problem [7][8][9][10][11]. Nowadays this method is being widely used in computer science to help reducing data dimensions and find internal structure of massive high dimensional data. For a similar purpose, we use PCA to analyze time-of-flight (TOF) images of Bose-Einstein condensates (BECs) in one-dimensional (1D) OLs., where data dimensions are as many as the number of image pixels but the noise origins are much fewer. The application of PCA on the raw TOF data suggests that the leading noise sources in our experiment are fluctuations of atom number and spatial position. By preprocessing the raw data with normalization and adaptive region extraction methods, we can significantly reduce or even eliminate these two noises. As a result, PCA of the preprocessed data reveals more subtle structure of noises. We attribute the few dominant noise components with their corresponding physical origins, and compare experimental results with numerical simulations using physical parameters determined by experiments.
The core of PCA is to represent variations approximately using a minimal group of orthogonal vectors called principal components (PCs) while preserving most of information, by which we can retain main features of variations without being distracted by other less essential factors. As PCA is applied to experimental data, PCs acquire their physical meanings apart from their original concepts in mathematics. In our system, experimental data are time-of-flight images and PCs are effectively eigen modes of fluctuations in our experiments. Thus, the total fluctuation is a linear combination of these eigen modes, and the result of a specific TOF image denoted by A i can be represented by the average over images plus its fluctuation, namely A i =Ā + ε i j P j . Here, P j denotes different eigen modes of fluctuation, the absolute value of the coefficient ε i j describes how much P j contributes to the i-th measurement A i , and its sign indicates in which way P j influences A i . Because of the linearity of PCA, the eigen modes associated with different PCs have a one-to-one correspondence to different sources of fluctuations in a linear system. For a nonlinear system, such as an interacting many-body quantum system, where variations are warped by nonlinear interaction effects, the eigen modes of PCA are in general nonlinear combinations of various noises. However, as we will demonstrated below, the nonlinearity in the present system turns out to be sufficiently small, such that different noises can be decomposed efficiently using PCA.
The remainder of this manuscript is organized as follows. In Sec. 2, we briefly introduce the experimental setup. The protocols of the PCA method is explained in Sec. 3, and then implemented for BECs both in the absence and presence of OL in Secs. 4 and 5, respectively. By comparing with numerical simulation, we identify the physical origins of up to five dominant PCs in the noise of TOF images. Finally, we discuss some remarks of the PCA method and summarize in Sec. 6.

Experimental setup
The system we used here is similar to the one in our previous experiments [12][13][14], which is a hybrid trap composed of a quadrupole magnetic trap and an optical dipole trap, as shown in Fig. 1. Our BEC setup is as follows. A BEC of about N 0 = 2 × 10 5 87 Rb atoms in the |F = 2, m F = 2 state is first prepared in the trap with frequencies ω x = 2π× 28Hz, ω y = 2π× 50 Hz and ω z = 2π × 60 Hz. Within 40 ms the BEC is adiabatically loaded into a 1D OL along the x-direction. The lattice wavelength is 852 nm, and the height can be tuned within a range from 6E R to 21E R , where E R = 2 k 2 L /2m is the photon recoil energy. After 35 ms we turn off the harmonic trap and the OL simultaneously to release the BEC. The absorption images are taken in the x-z plane upon 31ms of free expansion with the size of each CCD pixel 6.8µm×6.8µm. Here, we use strong saturated near-resonance imaging laser to obtain the density distribution of atomic gas of high density, and calibrate the the imaging system to validate the TOF measurement [15]. Since absorption imaging is destructive, atoms have to be prepared repeatedly, resulting in variations from shot to shot inevitably. In our experiments, we took about 40 images at each lattice depth. We also studied 100 TOF images of BEC without OL as a preliminary experiment. For each raw TOF image, we implement an optimized fringe removal algorithm (OFRA) to eliminate the background interference of imaging light [16]. As a result, the residue fringes are much weaker in magnitude and can be easily discriminated from signals for the few leading principal components.

Protocol of PCA
First, we need to extract an h × w region of interest from the raw TOF images, and then apply PCA to those regions. The region should be as small as possible to reduce noises, but fully cover the area where atoms reside. Details of the method are described below and illustrated in

Vectorize image For the i-th image, retrieve the region of interest and transform it into a
Construct covariance matrix Stack δ i together to form a matrix X = [δ 1 , δ 2 , · · · , δ n ]. Then the covariance matrix S is obtained by S = 1 n X · X T . Decompose covariance matrix Compute the matrix V of eigenvectors which diagonalizes the covariance matrix with V −1 SV = D.
Reconstruct images If needed, reshape those d × 1 eigenvectors of interest back to h × w matrices to reconstruct feature images.
We use the Scree graph method [11] to determine the number of PCs to be retained. Specifically, we plot the eigenvalues in descending order and determine the turning point from which the curve flattens. The eigenvalues above the turning point are of significance and about to be retained.

Preprocessing Method
To identify the physical origins of the PCs, we also preprocess the raw TOF data to eliminate the fluctuations of atom number and spatial position, and compare the new outcome of PCA with the original ones. The preprocessing methods are listed as follows.
Normalization Normalization is designed to reduce atom number fluctuation in the region of interest. In TOF images, the atom number is determined as where s is the CCD pixel size, σ is the absorption cross section, I is the CCD pixel value when probing atoms, I 0 is the pixel value when there is probe light but no atoms, and I bg is the background when there is no light.
As weakly interacting BECs in our experiments can be described by a macroscopic wave function Ψ(r) = √ N φ(r) [17], our interest is the density distribution of the normalized wave function |φ(r)| 2 instead of the prefactor N. So it is safe to normalize the TOF data so that the pixel values in a region of interest are summed up to unity. By doing so, densities in different TOF images fall into the same range and become comparable.
To normalize the raw TOF data, we first eliminate the bias. In principle, the pixel values at positions where no atom is present should be zero. However, there may exist a finite signal as a bias in real experiments. One method to eliminate this noise is by simply taking the bias as the minimal value of pixels. A slightly more complicated but more robust method that is used in our experiment is taking the mean value of pixels where there is no atom as a uniform background noise, and then subtracting it from all pixels. After the elimination of bias, we can normalize the pixel values usingṽ where v i j denotes the raw intensity of pixel labeled by coordinate indices i and j.
Adaptive region extraction Another fluctuation of TOF images is the shift of the cloud position, which may be induced by experimental misalignments of trapping potential, optical lattice potential, or imaging camera. Adaptive region extraction is designed to select a region whose center is also the center of density distribution. We first set a criterion to determine the center of density distribution within an extracted region, then use the center as a new region center to extract a new region. We iterate this procedure until the region to be extracted becomes stable. Note that in general the coordinates of the cloud center are not integers. While simply rounding them to integers may cause artificial anisotropy in our images, we use interpolation to estimate the pixel values with non-integer coordinates.
For the choice of criterion, a simple method is to set the pixel with maximal value as the center of density distribution. This procedure works well in most cases provided that the CCD pixel noises are small enough. In the following discussion, however, we use a slightly generalized criterion which determines the center by a weighted mean of all pixels in the region, where the weight is chosen to be the pixel values.

PCA Analysis in the Absence of OL
As a preliminary experiment, we first analyze TOF absorption images of a 31ms free-expanding BEC released from the harmonic trap, whose fluctuation modes should be relatively simpler. Fig. 3(a) shows percentages of the total variance associated with the first 10 PCs. The green dashed line is a smoothed line that connects these ten points, from which we can easily tell the turning point resides between the third and fourth PCs and the critical value for retaining PCs is around 5%. So we reconstruct the feature images corresponding to the first three primary PCs, which are shown in Figs. 4(a)-4(c). Figure 4(a) is similar to the original absorbing image, which corresponds to atom number fluctuations. Figures 4(b) and 4(c), whose percentages are of the same order of magnitude, reflect the position uncertainty of the BEC along the xand z-directions, respectively. Their origin should be some mechanical effects such as a shift of the magnetic trap position or drift of the CCD camera, both of which can cause a position deviation of BEC in TOF images. To validate our attribution, we preprocess data using methods introduced in Sec. 3.2 to eliminate the number and position fluctuations, which respectively correspond to the feature images of PCs shown in Fig. 4(a) and Figs. 4(b-c). We then apply PCA to the resulting data, and plot the percentages and eigenvalues of the first 10 leading PCs in Fig. 3(b). As compared to the outcome without preprocessing, there are only two PCs left above the critical line (gray line in Fig. 3(b)), with a combined contribution of > 90%. We reconstruct feature images of the first four PCs, and find that PCs whose patterns are similar to Figs. 4(a-c) now take the first, third and fourth places. The eigenvalue of the PC depicted in Fig. 4(a) decreases by 36%, from 1.846 × 10 −5 to 1.180 × 10 −5 , even if its ratio becomes larger because other noises are suppressed stronger. The eigenvalues and ratios of PCs depicted in Figs. 4(b) and 4(c) are below the critical line at present, ready to be ignored in our analysis. A new PC takes the second place after preprocessing, corresponding to the variation of the transversal radius of the cloud as depicted in Fig. 4(d), which may be attributed to the breathing mode of BEC.
From this result, we can conclude that the atom number fluctuations are effectively reduced and the position fluctuations are nearly eliminated by our preprocessing. We stress that if one employs a simpler but coarse version of adaptive region extraction where the pixel coordinates is always rounded to integers, although the PCs associated with position fluctuations can still be significantly reduced in magnitude, they remain to be leading PCs as the truncation errors are relevant if the size of our CCD pixels are larger than the real spatial shifts.

PCA Analysis of BEC in OL
We now apply PCA to analyze TOF images of BECs in an OL. In Sec. 5.1, we present the first five leading PCs and their corresponding feature images for a typical optical lattice depth of 15E R . The experimental results are in good quantitative agreement with numerical simulation as discussed in Sec. 5.2. Finally, we discuss in Sec. 5.3 the variation of the leading PCs with optical lattice depth.

PCA results of BEC in OL
We analyze TOF images of BECs loaded in an OL with depth of 15E R with the PCA method introduced above, and use the Scree graph method to obtain the feature images of the first five leading PCs as shown in the top panels of Fig. 5(b-f). To see clearly the variation patterns of these images, we also integrate over the vertical (horizontal) dimension for Figs. 5(b) and 5(d-f) (Fig. 5(c)) by summing up pixel values, and show the columnar density by solid blue lines in the corresponding bottom panels. In the top panel of Fig. 5(a), we present a typical example of TOF image before preprocessing, while the blue line in the bottom shows the three interference peaks clearly.
The first PC, denoted by P 1 (Fig. 5(b1)), has the same pattern as the atom number fluctuation PC discussed in Sec. 4, except for the symmetric side peaks caused by the presence of OL. Indeed, even in the case of high lattice depth where the atoms residing on different lattice sites form a local quasi-condensate while the system as a whole does not possess long-range phase coherence, the side peaks are still present as a consequence of short-range correlation [18]. If we normalize the data, this PC will disappear or become significantly less important.
The second (P 2 ) and third (P 3 ) PCs as depicted in Figs. 5(c1) and 5(d1) show clear patterns of position fluctuation along the zand x-directions, respectively. If we extract the region of interest adaptively, these noise modes almost disappear. However, unlike the case without OL, if we eliminate the first three PCs with preprocessing method, a PCA on the resulting data gives a leading PC of number fluctuation mode again, indicating that the number fluctuation of BEC in OLs can not be normalized as effectively as in the case without OL.
The feature image of the forth PC P 4 as shown in Fig. 5(e1) is similar to that of the number fluctuation mode P 1 , but with two negative dips accompanying the interference peaks. As we will see in the next subsection, this pattern reflects fluctuations of the width of each peak in the TOF image.
The fifth PC (P 5 as in Fig. 5(f1)) is featured by a central dip and two side dips with an overall Gaussian profile. The profile strongly suggests an intimate relation to fluctuations of the normal fluid fraction. The presence of dips can be understood by noticing that within the constraint of atomic number conservation, the increase of thermal atomic number is accompanied by a decrease of condensation fraction, which in turn leads to a reduced visibility of interference pattern.
Before concluding this subsection, we emphasize that the patterns of P 4 and P 5 are very small variations which can strongly couple with background noises. It is very difficult to distinguish these noises by conventional analysis on the TOF images. As a comparison, PCA works very effectively to extract these information.

Numerical simulation and comparison with experiments
To validate our previous attribution of physical origins to different PCs, we perform a numerical simulation for BEC in an OL using time-split spectral algorithm (TSSP) algorithm [19,20]. We first calculate the ground state wave function of the BEC by solving the conventional Gross-Pitaevskii equation (GPE) within the potential generated by the OL and the magneto-optical hybrid trap. The wave function then undergoes a free expansion of 31ms governed by the time-dependent GPE. At the end, a Gaussian envelop is added to the density distribution to simulate the excited fraction, which can not be described properly with the GPE. The reason we use a Gaussian distribution to describe non-condensed particles is because the velocities of these atoms obey a Maxwell distribution after release.
To incorporate the fluctuation effects in our simulation, we extract the thermal fraction and the temperature of each shot by a bimodal fit of the raw TOF images. From Fig. 6, we find that the fluctuations of temperature T and thermal atom fraction P ex ≡ N thermal /N tot are about 100 nK and 15 ∼ 20%, respectively. We stress that these quantities, together with the average values of T and P ex for different lattice depth, are all directly measured from experiments with no fitting parameters.
We consider separately the tilt of OL, the fraction of excited atoms, and the width variation of interference peaks induced by defocusing effect, and compare the corresponding numerical results with experimental outcome of P 3 , P 4 , and P 5 , respectively. In Figs. 5(d2), 5(e2) and 5(f2), blue lines are experimental results and orange lines are simulation results. The details of our simulation results are as follows.
Potential energy gradient in OL As shown in Fig. 5(d), a translational shift along the xdirection is mainly manifested by P 3 . In the absence of OL, the eigenvalue of P 3 is of the same order in magnitude as that of P 2 , which characterizes position fluctuations along the z-direction  (see Fig. 3). In an OL, however, P 3 becomes more significant for most experimental realizations. This strongly indicates that there must be some effects related to the OL contribute to P 3 , in addition to the spatial shifts of the magnetic trap, the BEC and the camera. We attribute this fluctuation to a potential energy gradient between different lattice sites. One of the major gradient sources comes from the gravity field as the OL cannot be perfectly horizontal, which introduces a phase gradient ∆φ = mg sin θ · λt hold /2 between sites separated by λ/2, where m, g, t hold , and θ are the atom mass, the gravity acceleration constant, the holding time of OL, and the angle of tilting from the horizontal direction, respectively. As a result, the TOF pattern is shifted along the direction of OL. In our numerical simulation, we consider a random tilting with an angle of no more than ±1.8 × 10 −5 rad. We find a good agreement between the simulation and the experimental result, as shown in Fig. 5(d2). This observation suggests that the PCA method can reveal very small potential gradient in OL, which may have application in detecting microgravity. Although the sensitivity reported here is about two orders of magnitude smaller than the uncertainty of 10 −7 reached by measuring the 5th harmonic of Bloch oscillation of 88 Sr atoms in tilted optical lattices [21], our scheme can be easily implemented with 87 Rb atoms in a simpler experimental setup with conventional TOF techniques.
Fraction of excited atoms In our experiment, the fraction of normal state atoms can hardly be a constant because of the imperfection of our preparation and loading processes [22]. It is reflected by P 5 in Fig.5(f1). We emphasize that to obtain a quantitative agreement with the experimental observation of P 5 , in the numerical simulation we consider fluctuation of excited fraction, under a constraint of total particle number conservation. This requires a normalization of the solution of GPE after a Gaussian fluctuation is added.

Width variation caused by defocusing effect
The GPE is a nonlinear equation with an interaction potential term NU 0 |ψ (r, t)| 2 ψ (r, t). This repulsive interaction between atoms tends to broaden the atomic distribution in both spatial and momentum space, resulting a defocusing effect with wider peak widths in TOF images. Thus, the variation of atom number N can induce fluctuations to TOF signals, which can not be fully eliminated by a normalization of the BEC wave function. Another factor one needs to take into account is that while reducing three-dimensional GPE to 1D GPE, we have to assume a distribution along the y-and z-directions to reduce |ψ (r, t)| 2 to |ψ (x, t)| 2 . This means the density distributions along the y-and z-directions still have an influence on peak width along the x-direction. Fluctuation in these transversal directions as shown in Fig. 4(d) for the case without OL is hence another source of noise.

Variations with OL depth
We now turn to the PCA results for OLs of different depths. To quantify the trend of variation of PCs, we define a quantity where E P i is the eigen value of P i . In the following discussion, we combine P 2 and P 3 together and study γ P 2 +P 3 ≡ γ P 2 + γ P 3 because their underlying physical origins are the same. From Fig. 7(a), we notice that both γ P 1 and γ P 2 +P 3 decrease with increasing lattice depth, while γ P 4 and γ P 5 exhibit an opposite dependence. These trends are qualitatively consistent with our attribution of physical origins of noises, considering the fact that the fluctuation of excited (a) 6   fraction (P 4 ) and the interaction effect (P 5 ) become more severe as OL gets deeper. In fact, we numerically simulate peak width variation under density distribution fluctuation at different lattice depths with all other conditions fixed. As shown in Fig. 7(b), γ P 4 indeed grows when approaching the quantum phase transition with increasing OL depth, which agrees with the experiment result qualitatively. In fact, one would naturally expect that fluctuation of interference between different lattice sites will be significantly enhanced near the phase transition point.

Summary and final remarks
Before concluding, we emphasize that the eigenvectors given by PCA are uncorrelated but not necessarily independent with respect to physical noises. In other words, one single PC can reflect a combination of more than one physical origins, although in most cases only one source is dominating and can be clearly distinguished. In this paper, we demonstrate that PCA can be readily implemented to obtain accurate physical interpretations for noises in complex many-body systems. As a comparison, independent component analysis (ICA), a method based on PCA, can reach optimized and parameterized independence between basis vectors [23]. But it requires a clear understanding about the sources of noise and their effects on the system to properly set criteria of independence. An inappropriate parameter setting could lead to severe wrongful analysis. This kind of misleading is absent in PCA as it is a standard, non-parametric method, requiring no prior knowledge of the system. In summary, we introduce a method of PCA to analyze noise in TOF images of a BEC in a 1D OL. By investigating the corresponding feature images, we identify the physical origins associated to a few PCs of leading contribution. This understanding is then confirmed by a numerical simulation of a GPE with external sources of fluctuations. In particular, we can extract not only classical fluctuations such as a small tilt angle of the optical lattice laser, but also quantum fluctuations such as the fraction of non-condensed excited particles. Both factors are very weak effects that can not be extracted by conventional investigation of interference patterns. Based on the knowledge of the physical origins of leading PCs, we also design a preprocessing method to significantly reduce or even eliminate fluctuations of atom number and spatial position.