Analysis of Proton Bunch Parameters in the AWAKE Experiment

A precise characterization of the incoming proton bunch parameters is required to accurately simulate the self-modulation process in the Advanced Wakefield Experiment (AWAKE). This paper presents an analysis of the parameters of the incoming proton bunches used in the later stages of the AWAKE Run 1 data-taking period. The transverse structure of the bunch is observed at multiple positions along the beamline using scintillating or optical transition radiation screens. The parameters of a model that describes the bunch transverse dimensions and divergence are fitted to represent the observed data using Bayesian inference. The analysis is tested on simulated data and then applied to the experimental data.

A precise characterization of the incoming proton bunch parameters is required to accurately simulate the self-modulation process in the Advanced Wakefield Experiment (AWAKE). This paper presents an analysis of the parameters of the incoming proton bunches used in the later stages of the AWAKE Run 1 data-taking period. The transverse structure of the bunch is observed at multiple positions along the beamline using scintillating or optical transition radiation screens. The parameters of a model that describes the bunch transverse dimensions and divergence are fitted to represent the observed data using Bayesian inference. The analysis is tested on simulated data and then applied to the experimental data.

I. INTRODUCTION
Plasma can sustain high electric fields and can be used to produce accelerating gradients larger than in conventional particle accelerators [1][2][3][4][5]. The Advanced Wakefield Experiment (AWAKE) [6] is an experiment located at CERN that investigates proton-driven plasma wakefield acceleration [7]. AWAKE uses proton bunches from the CERN Super Proton Synchrotron (SPS) with the en-ergy of 400 GeV to accelerate a witness bunch of electrons in a ten-meter-long rubidium plasma source. The seeded self-modulation mechanism [8,9] is used to divide the long driver bunch into a group of shorter bunches that can resonantly drive wakefields [10,11].
A detailed understanding of the parameters of the incoming proton bunches is important for the understanding of the experimental data. The bunch and plasma parameters define the wakefield structure, and a compar- 1. The layout of the AWAKE experiment. Electron, proton, and laser beams that propagate from left to right are shown in blue, red, and green colors, respectively. The bottom subplot shows the self-modulated proton bunch at the plasma exit. Retrieved from [12].
ison of experimental observations to simulations requires correct bunch parameters. It has been shown [13] that results of plasma modeling are sensitive to the parameters of the driver bunch, and uncertainty in the input bunch parameters complicates the comparison of the simulation results and experimental measurements.
In this paper, we present a determination of the parameters of the unmodulated proton bunch by combining the data from multiple beam imaging systems. We test our analysis using simulated data and then apply it to experimental data to study variations of the parameters over a large number of events.
The paper is structured in the following way. In section II, a short overview of the AWAKE experiment is given with a focus on the proton bunch, as it plays the central role in our study. In section III, the experimental setup used in our analysis is described, an overview of the acquired data is given and preliminary investigations are discussed. A statistical model is presented and tested on simulated data in section IV. This is followed by section V, in which parameters and their prior probabilities are discussed, and results are presented. Finally, conclusions are presented in section VI.

II. PROTON BUNCH IN THE AWAKE EXPERIMENT
The proton bunches used in the AWAKE experiment are produced in the CERN accelerator complex. They are accelerated in the Linac 4, Proton Synchrotron Booster, and Proton Synchrotron accelerating stages and reach the required energy of 400 GeV in the SPS. After this, they are sent to the AWAKE facility with intervals of 15−30 s. Extraction of a proton bunch to the AWAKE facility will be further denoted as an 'event'.
In AWAKE, the proton bunch enters a ten-meterlong rubidium vapor section [14,15] together with the co-propagating laser pulse (see Fig. 1). According to the Run 1 baseline [16], the bunch is focused at z w = (5 ± 3) cm after the entrance to the rubidium section. The radial size is σ x,y = (0.20 ± 0.04) mm, longitudinal size is σ z = 6 − 8 cm and angular divergence is σ x,y = (4 ± 2) × 10 −5 rad, where σ represents a Gaussian standard deviation. The parameters of proton bunches, such as the bunch centroid and population, fluctuate from event to event.
The laser pulse [17] with a maximum energy of 450 mJ and a central wavelength of 780 nm ionizes the rubidium vapor, creating a relativistic ionization front that co-propagates with the proton bunch. The relativistic ionization front is much shorter than the typical period of the wakefields (> 3 ps) and it is used to seed the controlled self-modulation instability of the proton bunch. The resulting microbunches then act resonantly to drive large wakefields during the plasma section.
In the SPS accelerating ring, wire scanners are used to measure the transverse and longitudinal bunch emittances with the accuracy of 20% [6,18,19]. The Beam Quality Monitors [20,21] are used to measure longitudinal bunch profiles, and Beam Current Transformers (BCT) [22] are used to determine the bunch intensity during the whole accelerating cycle.
In the AWAKE experiment, transverse and longitudinal structures of the bunch can be observed before and after the plasma section using diagnostics based on Optical Transition Radiation (OTR) or scintillating light, which is produced when the bunch crosses light-emitting screens placed at 45 • to the beamline [6].

III. MEASUREMENTS AND DATA
Four beam imaging systems that capture images of the transverse profile of the unmodulated proton bunch before and after the rubidium vapor section were used in our analysis. Parameters of the beam imaging systems are summarized in Table I. The first three stations have CCD cameras with OTR screens, and the last station has a digital CMOS camera with a scintillating screen. Fig. 2 illustrates the relative positions of the measurement stations, the plasma section, and the envelope trajectory of the unmodulated bunch for the baseline parameters. It can be seen that the first two stations are located very close to the waist position, so they mainly carry information about the size of the bunch close to the focus. The last two stations are located much farther from the waist position, and they are primarily sensitive to the angular divergence of the bunch. We performed measurements on October 10, 2018, during which a dataset that consists of 672 events was collected. By 'event data' we denote 4 images from the beam imaging systems and the proton bunch population measured in the SPS using the BCT. Four types of proton bunches were requested from the SPS operators with the parameters summarized in Table II. The first two types correspond to the bunches with small, (7.77−10.30)×10 10 p + , and large, (23.20−28.00)×10 10 p + , populations. These events are intended to study the impact of the bunch population on the bunch emittance and focal size. The second two types represent bunches with or without longitudinal compression. The bunch compression is achieved via a rotation in longitudinal phase space using a voltage step with a fast rise time [23,24]. The typical longitudinal bunch length (rms) without the bunch rotation is ≈ 9.6 cm and with the bunch rotation ≈ 7.9 cm. The acquired dataset represents proton bunches commonly used in the later stages of Run 1 datataking period.
An example of the event data with a population of n = 24.04 × 10 10 p + is shown in Fig. 3. The central part of the images represents the OTR light on the first three screens and the scintillating light on the last screen produced by the proton bunch. The background noise is produced primarily by secondary particles that are generated upstream of AWAKE.

A. Preliminary Investigations
The analysis relies on knowledge of calibration factors and resolutions scales for the devices used. These are described briefly.

Pixel Calibration Factor
To determine the calibration factor of the pixel sizes, we use calibration frames that are engraved on the surface of each light-emitting screen. Parameters of the frames, such as width, height, and sizes of the engraving lines, are known to high precision. The screens are placed at an angle of 45°with respect to the beamline and rotated  on the horizontal axis. To calculate the horizontal calibration factor, we divide the absolute size of the frame by the number of pixels that correspond to it. To determine the vertical calibration factor, the absolute size of the frame is multiplied by cos(π/4) and then divided by the number of corresponding pixels. In the following, we assume that the pixel size includes the correct calibration factors. The resulting pixel sizes for each camera are specified in Table III.

Resolution Function
There are two types of resolution effects present in the measurements. The first is the resolution of the optical system of the camera. The second is the resolution of the light-emitting screen. We assume that they both are convolutions with Gaussian kernels and their superposing effect is also represented by Gaussian convolution with the variance given by the quadrature sum of these two components.
We estimate the optical camera resolution using the images of the calibration frame previously discussed in the pixel calibration. For this, we assume that the engravings have sharp edges and can be represented as a product of Heaviside step functions. The images of the frames captured by the corresponding cameras are used to determine the unknown optical resolution parameters.
The study [25] performed at the CERN HiRadMat test facility indicates that scintillating screens are characterized by a worse resolution compared to the OTR screens. We include the additional smearing from the scintillating screen in our analysis using nuisance variables and consider constant resolution parameters for 3 cameras with OTR screens. The detailed values of resolution parameters are specified in Table III.

Background Distribution
We divide the image from each camera into different regions (see Fig. 4). Those pixels located in the central region -which is approximately 4-5 standard deviations around the bunch centroid -are used to infer the parameters of the proton bunch. The remaining pixels are combined from multiple events to approximate the background distribution via binned histograms. We use separate background distributions for each camera and datasets with small and large bunch populations. An example of the background distributions from four different regions of one camera is shown in Fig. 4. It can be seen that all histograms significantly overlap, demonstrating that the background follows a similar distribution in different parts of the camera. Background distributions from the four different cameras are shown in Fig. 5. The histograms corresponding to cameras with the OTR screens have a similar structure. Namely, they all have a maximum at 0, long tails that extend to the amplitudes of > 2000, and a small bump of saturated pixels at the righthand side of the histograms. To avoid a possible negative impact of the saturation on the analysis, we discard those pixels that exceed the threshold values of 2700, 3400, 2400,1750, for each camera, respectively (see also dashed lines in Fig. 5). The histograms of signal and background show a smooth behavior up to these threshold values without signs of saturation.
The images produced by the last beam observing system have ∼ 10 times more pixels compared to the images from the three other cameras. To improve the run-time of the analysis, we average every 3×3 pixels from the last camera. Averaging of the pixels reduces the long tail of the distribution as can be seen in Fig. 5 (bottom subplot). In the following, we will assume that the background on each pixel of the last camera is well-modeled with a truncated (form 0 to 4095) Gaussian distribution. The distribution variance is determined from the histogram, and the mean is fitted with a free parameter.

IV. STATISTICAL MODEL
We perform the statistical analysis using a Bayesian approach. Prior probabilities about parameters λ, ν of the model M are updated to posterior probabilities using Bayes' theorem where P (D|λ, ν) is a likelihood that represents a probability of data given the model, P (λ, ν|M ) is a prior, and P (λ, ν|M, D) is a posterior probability distribution. The data from one event is denoted as D ≡ d j x,y , where d is a signal from one pixel, x, y are the row, column of the pixel and j represents the camera index. The dataset consists of multiple events {D} i where i denotes the event index. In Eq. 1, λ represents parameters of interest, and ν represents nuisance parameters. Some of the nuisance parameters are kept constant, and others are free parameters of the fit that will be marginalized at the final stage of the analysis.
We analyze events consecutively and the likelihood of one event is defined as a product of likelihoods of individual pixels where p(d j xy |λ, ν) is the probability of the detected signal for one pixel given the model parameters.
The d j xy are composed of a background noise, with a probability distribution denoted as P b (d j xy ), and a signal P p (d j xy |λ, ν) produced by the proton bunch. In the following, we will avoid indices x, y, j in d j xy for notational convenience if one pixel is considered. We assume that, for the given camera, the background is the same for all pixels and we approximate it by a binned histogram (see previous section). The resulting superposition of these two contributions is given by the convolution The convolution is computed numerically for the first three cameras due to the non-analytic form of the background, and it is computed analytically for the last camera, where the background is Gaussian. An example for such a convolution is shown in Fig. 6, where it is assumed that the signal from the proton bunch has a Gaussian distribution. The evaluation of the numerical convolutions is computationally expensive, but it is needed to include correctly the non-analytic features of the background.

A. Bunch Propagation Equation
We consider a model in which each particle of the bunch follows a linear equation of motion defined as where i denotes the particle's index, z w denotes the waist position, i.e., the coordinate along the beamline in which the radial bunch size is minimal, r 0 = (x 0 , y 0 ) is the particle position at the waist, and r = (dx/dz, dy/dz) ≈ (θ xz , θ yz ) is the particle angle with respect to the beamline in the x − z and y − z planes. The distance along the beamline is denoted by z and is defined as the trajectory of the bunch centroid. We measure distances in the transverse directions relative to the center of the proton bunch. To determine the center of the bunch, we use two parameters per camera, one for the x and one for the y directions, labeled as µ j,x , µ j,y with j denoting the camera index (see Table III).
The envelope equation that describes the transverse size of such a bunch along the beamline can be defined as where · denotes the average over the ensemble of particles. It is assumed that there is no correlation between the x and y projections of the bunch, and that r 0 and r are not correlated at the waist position, i.e., r 0 r = 0. For notational convenience, the bunch size at the waist will be denoted as σ 0 = r 2 0 , and the angular spread of the bunch as σ = r 2 . We consider two models that describe the distribution of protons in the beamline.
In the first model, we assume that the proton bunch is represented by a single Gaussian distribution in the transverse and longitudinal directions. The transverse distribution of the protons at each light-emitting screen can be described by a bivariate Gaussian distribution with the bunch size, σ(λ, ν), described by Eq. 5 y|σ(λ, ν), µ(λ, ν)), (6) and this model will be further denoted as 'Single Gaussian'.
In the second model, denoted as 'Double Gaussian', we assume that the proton bunch is represented by a mixture of two Gaussian components. These components have the same alignment but different sizes, waist positions, and angular divergence. In this model, the transverse distribution of the protons at the light-emitting screen is defined as I p (x, y|λ, ν, M 2 ) = αN (x, y|σ 1 (λ, ν), µ 1 (λ, ν)) + +(1 − α)N (x, y|σ 2 (λ, ν), µ 1 (λ, ν)) (7) where α controls the significance of each contribution. The component with larger angular divergence will be called 'halo' and the smaller 'core', and their parameters will be denoted by subscripts 'c' and 'h', respectively. The single and double Gaussian models are nested, and they predict the same result if α = 1.

B. Camera Modeling
Light with intensity proportional to the number of particles in the bunch is emitted when the proton bunch crosses the light-emitting screen. This light experiences optical smearing that can be represented by a convolution of I p (x, y|λ, ν) with a Gaussian kernel N (x, y|σ) with zero mean and resolution parametersσ, i.e.
The amount of light captured by one pixel is given by the integral over the pixel surface. We assume that the signal at each pixel is described by a Gaussian probability distribution with the mean given by i j I, and the standard deviation of f j √ I, where j denotes the camera index and i j , f j are coefficients of proportionality represented by free parameters (see Table III): This expression is used in Eq. 3 to compute the likelihood for one pixel. The likelihoods from all pixels are multiplied together to get the final event likelihood described by Eq. 2.  We sample the posterior distribution given by Eq. 1 using a Markov chain Monte Carlo (MCMC) algorithm. The likelihood is implemented in the Julia programming language, and the BAT.jl [26]

C. Model Validation
To validate that the analysis leads to the correct reconstruction of the parameters, we performed the following procedure: 1. True parameters of the models were defined.
2. Simulated events were generated that correspond to these parameters. The simulated events include background noise, lights fluctuations, optical smearing of cameras.
3. The analysis algorithms were applied and the resulting parameters were compared to the true values.
The validation procedure was performed for both single and double Gaussian models. Excellent agreement was found between the extracted and true parameters for both models. An example of the simulated data and the fitted model for the double Gaussian bunch density is presented in Fig. 7. A violin plot with the parameter distributions is shown in Fig. 8. It can be seen that true parameters are well within the 95% central interval of the posterior distribution. Fig. 8 shows that some of the parameters, such as the alignment of the bunch, can be reconstructed with great accuracy, with uncertainties smaller than 3% of pixel sizes. Another group of parameters, such as the transverse size and angular divergence, is characterized by uncertainties of a few percent. The nuisance parameters that describe the resolution function of the scintillating screen have the largest uncertainty. The camera with the scintillating screen is located at the largest distance from the waist position, and the bunch size is determined primarily by the angular divergence of the beam. The resolution is not a critical parameter, and it is not correlated significantly with the proton bunch parameters.
We have tested the sensitivity of the analysis to the truncation of the data. Pixels that exceed the threshold values determined from the experimental data were discarded from the analysis, and the change in the resulting posterior means is well within the uncertainties of the parameter values.

A. Parameters and Priors
The parameters that describe the proton bunch distribution are where the two vectors correspond to the single and double Gaussian models, respectively. In addition, the nuisance parameters are where j = 1 .. 4 denotes the camera index, and the bold font is used for the parameters that have the x and y components. A summary of all the parameters is given in Table III. The prior probability distributions for the proton bunch parameters are selected based on the AWAKE design report. The priors of the transverse size of the core and halo components of the bunch at the waist position (denoted as σ c,x , σ c,y , σ h,x , σ h,y ) are described by Gaussian probability distributions with means of 0.2 mm and standard deviations of 0.04 mm. The priors of the angular divergences of the bunch (denoted as σ c,x , σ c,y , σ h,x , σ h,y ) are described by Gaussian probability distributions with means of 4 × 10 −5 rad and standard deviations of 2 × 10 −5 rad. Initially, very broad prior ranges were considered for the proton bunch parameters. After learning about the typical locations of the posteriors, the prior ranges were restricted to reduce computational time. We truncate the angular divergence of the core component, denoted as σ c,x , σ c,y , to the range [1×10 −5 , 8×10 −5 ] rad; and the halo component, denoted as σ h,x , σ h,y , to the range [1×10 −5 , 4×10 −5 ] rad to clearly identify the halo and core components. The prior probability distributions for the waist positions of the core and halo components, denoted as z w,c , z w,h , are described by a Gaussian distribution with means of 2.774 m and standard deviations of 0.03 m. The coefficient that defines the intensity ratio for the halo and core component is denoted as α and is described by a uniform prior.
Parameters that represent the bunch centroid at the camera j along the x and y directions are denoted as µ x,j , µ y,j , and they are given by uniform prior probability distributions in the ranges specified in Table III. The pixel sizes along the x and y directions are denoted as ∆x j and ∆y j , and they are represented by the Dirac delta prior distributions (further denoted as 'constant prior'). The resolution parameters along the x and y directions, denoted asσ j,x ,σ j,y , are assumed to be constant for the cameras that have OTR screens. The prior for the camera with the scintillating screen was modeled with a mean of 3 pixels and a standard deviation of 1.5 pixels. The conversion of the proton bunch distribution into the pixel signal is performed by defining proportionality coefficients i j , that are described by uniform priors. The standard deviations of the Gaussian fluctuations of the light are defined by f j (see section IV). This parameter has a constant prior for those cameras where the numerical convolutions of the background and signal are computed. For the last camera, the signal fluctuation (f 4 ) and the mean of the background (p 4 ) are kept as free variables with uniform priors.

B. Results
We analyzed each event summarized in Table II independently, using single and double Gaussian bunch models. A comparison of the data from one event (with a small bunch population and the bunch rotation OFF) to the best-fit predictions from the two models is shown in Fig. 9. It can be seen that the double Gaussian model fits the data more closely than the single Gaussian model. The prediction from the single Gaussian model shows a visible discrepancy with the data in the first, third, and fourth cameras. It is especially evident for the events with a small bunch population. A much better data-model agreement is reached for the double Gaussian model, showing that the data fluctuations are covered reliably by the 95% central interval of the posterior probability distribution. A small disagreement of the distribution for Cam. 4 (see the bottom subplot in Fig. 9) resulted from the fact that some of the pixels were discarded from the analysis due to their saturated signals. In the following, we discuss only the results obtained by using the double Gaussian proton bunch model.
The transverse sizes of the halo and core components of the bunch at the waist position are shown in Fig. 10. The size of the core is larger than the size of the halo for all events, and both components are significantly smaller than the mode of the prior distribution. There is a correlation of the bunch size with increasing bunch population. The transverse bunch profile is elliptical.
The angular divergences for different events are shown in Fig. 11. Events with a larger bunch population have a smaller angular divergence of the halo component.  The waist positions of the halo and core components are shown in Fig. 12. The figure shows that the core component of the bunch is focused much closer to the expected waist position compared to the halo component, which is focused further downstream.
The bunch size at the waist and angular divergence can be used to compute the bunch emittance using the following engineering formula = 426.0 · 10 3 · σ[mm] · σ [10 −5 rad].
The posterior distributions of the halo and core emittances are summarized in Fig. 13. It can be seen that the bunch emittance increases with the increasing bunch population for both components. The x component of the emittance for the halo component is more strongly correlated with the bunch population than the y component.
The coefficient that shows the intensity of halo and core components is shown in Fig. 14. There is a correla- tion of α with the proton bunch population that shows that the halo component is more significant for the events with a larger bunch population.
In the analysis, we use nuisance parameters to determine the coordinates of the bunch at each screen. These coordinates can be combined to determine the relative alignments of the screens together with polar and azimuthal angles of the individual bunch centroids. By propagating individual bunch centroids to the waist position, the drift and jitter of the bunch can be computed. Fig. 15 shows that the bunch center drifts as a function of event number, which is proportional to the time at which measurement occurred. The standard deviation at the waist is σ(µ x ) ≈ 43 µm, σ(µ y ) ≈ 20 µm. Once the time-drift of the bunch is subtracted, the resulting jitters are σ(µ x ) ≈ 41 µm and σ(µ y ) ≈ 8 µm.
A summary of the average posterior parameters for events with small and large bunch populations is given in Table III for the double Gaussian model. A summary of the average proton bunch parameters for the single Gaussian model is given in Table IV.

VI. CONCLUSIONS
We have developed and applied an approach to analyzing the parameters of the proton bunch in the AWAKE experiment. In this approach, the data from multiple beam imaging systems that capture integrated radial bunch profiles were used to determine optimal parameters of the model that describe proton bunch propagation along the beamline. The fitting procedure was performed using a Bayesian approach, and the MCMC sampling was used to extract the posterior distributions. This approach will be used in future AWAKE runs to analyze the stability of the proton bunch parameters over long runs. Two models that describe the radial bunch density were considered. In the first model, the transverse bunch profile was represented as a Gaussian function, in the second model -as a mixture of two Gaussians denoted as halo and core. By definition, the halo component has a larger emittance compared to the core. These models have been tested using simulated events, and the results show that reconstruction of true parameters is possible with typical uncertainties of a few percent.
We have acquired a dataset with 671 proton bunch extractions, each characterized by varying proton bunch parameters, and applied the developed analysis scheme to this experimental data. It has been demonstrated that the double Gaussian model gives much better agreement with the experimental data compared to the single Gaussian model. The resulting posterior parameters for the double Gaussian model indicate the following: -The transverse size of the proton bunch at the waist position is smaller than the baseline parameters for all bunch populations.  Waist Core (m) -Sizes of the halo and core components increase with the increasing bunch population.
-The transverse bunch profile is elliptical (the horizontal size is smaller than the vertical by ≈ 8% and ≈ 40% for the core and halo components, respectively).
-The contribution of the halo component increases with the bunch population.
-The bunch emittance is smaller than the nominal parameters.
A systematic drift of the bunch centroid was observed of approximately 50 µm during 6 hours of measurements. The jitter of the bunch at the waist position after drift correction is σ(µ x ) ≈ 41 µm and σ(µ y ) ≈ 8 µm.