Generator of arbitrary classical photon statistics

We propose and experimentally demonstrate a device for generating light with arbitrary classical photon-number distribution. We use programmable acousto-optical modulation to control the intensity of light within the dynamic range of more than 30 dB and inter-level transitions faster than 500 ns. We propose a universal method that allows high-fidelity generation of user-defined photon statistics. Extremely high precision<0.001 can be reached for lower photon numbers, and faithful tail behavior can be reached for very high photon numbers. We demonstrate arbitrary statistics generation for up to 500 photons. The proposed device can produce any classical light statistics with given parameters including Poissonian, super-Poissonian, thermal, and heavy-tailed distributions like log-normal. The presented method can be used to simulate communication channels, calibrate the response of photon-number-resolving detectors, or probe physical phenomena sensitive to photon statistics.


I. INTRODUCTION
Statistics of light is of paramount importance for many applications in metrology, material probing, biomedical optics, and optical communications to name a few. Stellar correlation interferometry represents the first application directly exploiting statistical properties of light to improve the resolution of a star angular diameter measurement [1]. In detector metrology, characterization of single-photon detectors or photon-number-resolving detectors is based on probing a detector under test using optical signals with well-defined classical statistics or correlations, and evaluating the corresponding outputs [2][3][4]. Light with thermal statistics improves the efficiency of light-matter interaction like optical harmonic generation [5][6][7] and two-photon fluorescence [8]. Correlations inherently present in thermal light can enhance interferometric phase sensitivity [9] and facilitate sub-wavelength interference [10]. Ghost imaging is another counterintuitive technique, which requires correlated light signals obtained typically by splitting light with thermal statistics [11,12]. It was also shown that excess optical thermal noise can improve the quantum tomography of an unknown optical signal [13]. Alternatively, a distribution of loss in front of a single photon detector enables to estimate photon statistics of impinging optical signal [14]. Furthermore, generating arbitrary loss distributions allows us to faithfully simulate fading communication channels [15][16][17][18]. Asymmetric and heavy-tailed light statistics like log-normal, Lévy, and Weibull, which play a role in communication through turbulent atmosphere, are also important for optical rogue waves and other rare phenomena attracting a lot of attention recently [19]. The ability to produce light with extreme distributions of optical power or photon numbers would significantly help with simulations of these processes and also probing them.
The statistics of individual photons in an optical sig- * straka@optics.upol.cz nal is a fundamental expression of quantum properties of light. A semi-classical approach considers the probability density of photon occurrence proportional to the intensity of light [20]. Any photon statistics that adheres to this model is called classical. Using this principle, we can stochastically modulate optical intensity to obtain the corresponding photon statistics [21]. So far, the main focus has been on generating chaotic or pseudo-thermal light, the standard method for which is collecting light scattered by a rotating ground glass [22], a multi-mode fiber [23], or other scattering processes [24,25]. Such an approach is essentially a negative-exponential intensity modulation that results in Bose-Einstein photon statistics.
In this manuscript, we propose a device based on acousto-optical modulation that can be programmed to perform arbitrary intensity modulation. This can be used as a source of light with programmable photon statistics and also for simulation of transmission fluctuations in communication channels. We also propose a new method of obtaining the distribution of optical intensity from an arbitrary photon-number distribution, that is particularly useful for experimental work. We produce several statistics including Poisson, various super-Poisson, thermal, log-normal, bimodal, and uniform distributions. The generated statistics are characterized using a time-multiplexed photon-number-resolving detector and compared with theoretical expectations. Moreover, we demonstrate generating faithful tail behavior of photon statistics and producing highly bunched light. The proposed generator is also readily extensible to the pulse regime or other techniques of modulation.

II. PHOTON STATISTICS
Photon statistics in a light beam of constant power follows a Poisson point process. Let us consider a stochastic variable: the number of photons n in a time window of length τ , while P is a constant optical power and E the energy of one photon. It follows that the mean number of photons n = P · τ /E. Also, n follows the Poisson probability distribution p n = e − n · n n /n!. If the optical power P is a stochastic variable as well, typically changing in time, we have to consider an integrated optical intensity W (t) := t+τ t P(t)/E dt, a dimensionless quantity that is governed by a certain probability density P (W ). The probability distribution p n of the number of photons then follows a weighted mixture of Poisson distributions, given by Mandel's formula [20] Mandel's formula covers all possible photon statistics for classical states of light. Generating such statistics can be useful if one needs to simulate classical states with no regard to optical phase. Good examples could be found in practical quantum networks -simulation of optical channels or characterization of single-photon detectors.
In principle, Eq. (1) enables engineering photon statistics p n by modulating W using a corresponding P (W ). To this end, we employ an acousto-optical modulator (AOM) driven by a harmonic signal, the amplitude/power of which can be digitally controlled (see Fig. 1). An attenuated continuous optical beam passes through, while the first diffraction order is collected. The system therefore works effectively as a programmable attenuator with a dynamic range exceeding 30 dB and 128 discrete levels of attenuation. These parameters enable us to span a wide range of W with sufficient sampling to generate a wide variety of user-defined photon statistics, as demonstrated in Results.

III. CALCULATING INTENSITY DISTRIBUTION
If a photon statistics p n is given instead of the intensity distribution P (W ), we need to invert Mandel's formula. There are several different approaches found in the literature [26][27][28][29], but here we do not need a full inversion to this ill-posed problem. Both spans for n and W are infinite, but covering infinite ranges is experimentally impossible both on detection and generation side. Therefore, we always specify the photon statistics to be generated up to a certain n max . Because we also work with 128 discrete values of intensity W i in our experiment, Mandel's formula becomes a linear matrix transformation i A ni P i = p n with P i = P (W i ) and A ni = e −Wi W n i /n!. If we specify the right side, p n , the task is to find the solution vector P i .
Let us discuss the properties and dimensions of the problem. There are two additional constraints, the nonnegativity of probabilities P i ≥ 0 and summing to unity i P i = 1. The unity condition can be simply added as an additional equation, extending A and p, but the existence of a non-negative solution P is not guaranteed, even though the system of equations is strongly underdetermined: the dimensions of the matrix A extended by the aforementioned row are (n max + 2) × 128, where n max is set between 10 and 15 for our experiment. This formally leads to a highly multi-dimensional sub-space of solutions, where non-negativity may appear like a weak restriction, but it actually intersects the solution subspace with one orthant (1/2 128 ). Whether or not there is an intersection is neither likely nor trivial to evaluate for high dimensions. In lower dimensions, one can imagine intersecting one octant in 3D space with a plane, line or a point, which depends on their orientation and position and is by no means likely even for a hyperplane. Solution would depend on the choice of p n and the character of the transformation A. We do, however, have a degree of freedom in scaling the intensity levels W i . These are a product of the intensity at the input and the attenuation levels of the AOM system. Although the attenuation levels are fixed, the input power is a free parameter, expressed by the maximum available intensity W max , which is user-scalable and all other intensity levels derive from it. To summarize, our problem has a form The question now is whether we can be content with finding an approximate solution. In that case, we would have to choose a statistical metric that would define the optimal solution. Later in this work, we elect totalvariation distance for quantifying the agreement between model and data, but for the generation itself, we believe a physical argument is needed in order to designate "the closest photon statistics". To avoid this ambiguity, we restrict ourselves to precise solutions in this step.
We chose a Python-implemented non-negative least squares (NNLS) algorithm published in Ref. [30] as an efficient method of finding such a solution. When defining the photon statistics, we always kept n max sufficiently low and chose W max such that the solution would be exact within machine precision. Because the matrix rank is n max + 2, we typically end up with the same number of non-zero elements of P i .
If we relax the condition of precise solutions, we can extend n max at the expense of error. This approach will be investigated in the future.

IV. EXPERIMENTAL IMPLEMENTATION
In our experimental setup, we used a superluminescent diode due to its long-term power stability better Light from a superluminescent diode (SLED) is sent through an acousto-optical modulator (AOM), and the first diffraction order is coupled into a single-mode fiber and detected on a silicon single-photon avalanche diode (SPAD). The AOM is fed from a harmonic signal generator through a programmable attenuator with a parallel 7-bit interface connected to a microcontroller board. than 10 −4 (QPhotonics QSDM-810-2). It was centered around 810 nm and coupled to a single-mode fiber. Light was decoupled into free space and sent through an AOM (Brimrose TEM-125-10-800). The first diffraction order was collected into a single-mode fiber and measured. The AOM was driven by a 125-MHz harmonic signal generated in a harmonic signal generator and passed through a digital step attenuator (Mini-Circuits ZX76-31R75PP+). The attenuator was controlled by an ARM microcontroller (Arduino Due) through a 7-bit parallel interface [31]. There were 128 attenuation levels separated by 0.25 dB with a switching speed below 0.5 µs. The specified responses are 300 ns for the attenuator and 150 ns for the AOM. The diffracted optical intensity was approximately linear with respect to the RF signal power.
For detection, we used a silicon single-photon avalanche diode (SPAD, Excelitas SPCM CD3543H). The width of the detection window τ was 10 µs, which is much larger than the recovery time of the SPAD (23 ns), so that the SPAD can distinguish the number of photons incident during the detection window. Data were collected using an electronic time-to-digital converter with a timing resolution of 156 ps (UQDevices UQD-Logic-16).
The intensity modulation was a stepwise sequence of intensity levels and fully controlled by the microcontroller. The sequence was random with a user-specified statistical distribution implemented by an integrated hardware random number generator. The modulation period was 1 ms, which is much longer than the detection window. Together with fast level switching, we can consider the optical intensity in each detection window constant to model the response of the detector. The overall model of the data is then a statistical mixture of constant-intensity models governed by P i , just like the intensity itself.
The detection model has to take into account all relevant imperfections of the detector [2]. Some have no impact on the measurement. Finite detection efficiency is simply considered as part of the overall attenuation. Background counts are very low and contribute to an offset in intensity, which is accounted for in calibration. However, the effects of recovery time and afterpulsing have a measurable effect on the measured photon statistics. We therefore developed a model, where the detector is inactive for the duration of recovery time after each detection event. Recovery time can be measured directly (23 ns in our case). After recovery time, there is a certain probability of emitting an electronic afterpulse (2.35 %). The temporal probability distribution of afterpulses was established by a separate measurement. Also, there is a probability that an afterpulse will be emitted right after recovery time (in some literature called a twilight pulse). This probability is proportional to the mean count rate; in our case by a constant 2 · 10 −9 s.
All of these effects can be measured, simulated for each intensity W i and the result compared to measured data. We found that for constant intensities (Poisson light), the measured photon statistics differs from the predicted model by less than 6 · 10 −4 for each p n . All of our data that use NNLS inversion have accuracy comparable to this systematic error (for details, see Discussion).

V. RESULTS
In Figures 2 and 3, we present various generated photon statistics. We chose Bose-Einstein statistics for its physical significance; it is the statistics of a single mode of light, which is in thermal equilibrium with its environment. The second important statistics we chose is the log-normal. In this case, the log-normal distribution applies to optical intensity. Such light can occur in turbulent optical channels as the result of log-normal fluctuations of transmittance [15,17]. Finally, we chose some more complex forms of light, bimodal distributions and even a physically bizarre uniform distribution, to demonstrate the variety of statistics that can be realized.
One approach is to take a specified photon statistics p n and perform the NNLS inversion to obtain an intensity distribution. This inversion-approach is universal and does not require any prior knowledge or decomposition of P (W ). The other method (intensity-approach) is simply implementing a given intensity distribution. Here, both limited dynamic range and finite sampling need to be accounted for. The data show that both approaches can lead to accurate results (see Figs. 2, 3 and Table  I). For comparison and more detailed considerations, see Discussion.
For the measured data, we need to evaluate the accuracy of the generated photon statistics. We employ a model that arises from the intensity statistics P (W ) and models the SPAD response to any intensity W . If P (W ) was obtained by the inversion-approach, accurate photon statistics is expected only up to n max . Beyond that, high accuracy of the model is achieved if the photons statistics has already been covered enough, nmax n p n → 1.
To show the difference between model and data, we plot individual differences δp n in Figures 2 and 3. For quantification of the overall difference between the two probability distributions we chose a very conservative definition: total-variation distance ∆. It is defined as the maximum difference between probabilities of any possible set of samples {n} ⊆ N 0 (for definition details, see Methods). The distances for most photon statistics, and therefore maximal deviations in generated probabilities, are in the order of ∆ ∼ 10 −3 (see Table I).
The first distribution that we measured was the Bose-Einstein distribution: p n = n n /( n + 1) n+1 . In theory, it can be generated through negative exponential modulation of the optical intensity P (W ) = exp(−W/ n )/ n . For mean values 1 and 2 we successfully employed both approaches. For mean value of 10 and W max = 20, the intensity approach fares much worse due to unfavorable scaling of the intensity range required for optimal modulation (see Discussion).
Bose-Einstein distribution is also well-known for its photon bunching, as commonly measured by the autocorrelation function g (2) . Its value can be calculated from the photon statistics, but also measured using a Hanbury Brown-Twiss setup [1]. The data for one such measurement are shown in Fig. 2(d). We chose a 10-ns coincidence window, although this choice does not influence the result as long as the window is much shorter than the modulation period. The g (2) half-width of 1 ms corresponds to the period of modulation and the measured values at zero for both generation methods are within 1.98 ± 0.02 for distributions with n = 1, 2, which are reasonably well covered by the modulation range. For the mean number 10, g (2) drops to 1.6, because too much probability for n beyond n max is not covered in our NNLS inversion.
The second distribution, the log-normal, is given by , where ln(W ) is normally distributed with mean Ω and variance σ 2 . The corresponding photon statistics can be numerically calculated by applying Mandel's formula (1). Here, again, we used both approaches for mean value Ω = 1.
To demonstrate that complex distributions can be generated, as long as they are classical, we chose two concave and one uniform photon statistics. The first was given by a combination of Bose-Einstein statistics and normally convoluted Poisson statistics. The second is a combination of two convoluted Poisson statistics with two distinct peaks. The third one is a uniform distribution with n = 10, which is peculiar for its apparent non-classicality. Indeed, if specified in full range, p n≤20 = 1/21, p n>20 = 0, the distribution is nonclassical. However, if we restrict uniformity only to the first 11 elements, we obtain a classical photon statistics that is partially uniform with a falling tail for n > 10.
Finally, Fig. 4 shows a log-normal statistics in a scope up to 500 photons with total-variation distance ∆ = 1.5 · 10 −2 . Our goal was not to achieve as high precision for all p n as in previous results, but to demonstrate  For number of photons n, NNLS inversion was used to obtain P (W ), and for intensity W , distribution P (W ) was given directly. nmax represents the upper limit on the given probability space and Wmax is the maximum intensity. ∆ is the total-variation distance. Definitions of statistical notations follow. B-E( n ): Bose-Einstein photon statistics or negative exponential intensity distribution, where n = W . N ( W , σ 2 ): normally distributed intensity with mean W and variance σ 2 , or the corresponding photon statistics. log-N ( ln W , σ 2 ): log-normal distribution of intensity or the corresponding photon statistics. The moment parameters are the same as for the normal distribution, except here they pertain to ln(W ). uniform(n1,n2): uniform photon statistics pn = 1/(n2 − n1 + 1) for n1 ≤ n ≤ n2.
heavy-tailed scaling of the statistics by comparing measured data to the log-normal. We employed log-normal modulation in intensity and directly compared the measured photon statistics to the theoretical distribution. We did not take into account the detector response, instead, we extended the detection window to avoid too much saturation.

VI. DISCUSSION AND METHODS
a. Scope and extensions. The proposed method works with arbitrary form of intensity modulation. Other modulators like electro-optical, electro-absorption or micromechanical can be used. The advantage of the AOM is its stability, repeatability, high dynamic range and relatively fast response. We estimate that AOMs can reach at least 40 dB of dynamic range and a modulation speed up to several hundred MHz. The repeatability and stability are extremely good and indistinguishable from SLED and coupling stability. To the best of our knowledge, such parameters could not be simultaneously reached using other modulation techniques.
The response speed itself can be further improved by using electro-optical modulation or electro-absorption modulation. Particularly, we successfully tried using an  Table I. Orange bars represent the expected model. Note that the model is not strictly equal to theoretical pn, because it includes SPAD recovery time and afterpulses. Squares represent measured data. The difference between them, δp = p data − p model , is represented by blue points on a magnified scale. Lighter points and striped bars represent values beyond nmax. a -c: Bose-Einstein statistics. NNLS was used to calculate P (W ). d: typical shape of the autocorrelation function g (2) . e, f: Bose-Einstein statistics. Intensity was modulated with negative exponential distribution.  Fig. 2). Additional information in Table I. For brevity, we denote a normally distributed variable X with mean X0 and variance σ 2 as X ∼ N (X0, σ 2 ). a, b: desired statistics are based on log-normal intensity distributions. NNLS inversion was used to calculate P (W ). c: same as a, but using log-normal modulation in intensity.  4. Measured log-normal distribution log-N (2, 1 2 ) with pronounced heavy-tailed behavior. Black points denote experimental data and the orange curve is the theoretical expectation. Orange area denotes statistical confidence region of 2σ, so approximately 95 % of all data should be within. For this measurement, the modulation period was 2 ms. We also extended the detection window to 200 µs to avoid detector saturation for high photon numbers. Measurement time was 1000 s. delay (ms) 1 356 autocorrelation data g (2) FIG. 5. The maximum measured autocorrelation value that is achievable using the device in its present form; the result of a two-detector coincidence measurement. Blue curve represents data for a coincidence window of 10 ns, delay values were sampled by 10 µs and the uncertainty is lower than the thickness of the curve. The shape corresponds to stepwise intensity modulation with 1-ms period. The signal was a random mixture of two intensities with count rates on each detector approximately 3 Mcps (probability p = 6.6 · 10 −4 ) and 2 kcps (probability 1 − p). The intensities were chosen such that dark counts and recovery time of the detector would have the smallest effect.
electro-optical integrated Mach-Zehnder amplitude modulator. Compared to acousto-optics, this technology offers very high modulation speeds up to 40 GHz, but typically has only about 20 dB of range and poor long-term stability due to interferometric phase drift. However, an active phase lock can be implemented to solve this issue. That would increase modulation speeds necessary for shorter temporal modes.
In our particular generator, dynamic range is limited by the electronics driving the AOM, which can be in principle improved upon. Generally, dynamic range can be augmented using multiple modulators in a sequence. This would enable considerable scaling, increasing the range of the generated photon statistics (n max ) by orders of magnitude.
b. Pulse regime. The proposed scheme would work the same for pulsed light. Wide wavelength spectrum of the pulsed signal does not represent an issue, as the implemented generator uses a superluminescent diode with 20-nm-wide spectrum. On the detection side, a pulseddomain photon-number-resolving detector would be required.
In this work, we elected the continuous-wave regime to demonstrate the accuracy of the proposed generator. This enabled us to use photon-number resolution in time using one SPAD detector, which is accurate and available. Such an approach offers easily scalable detection range up to hundreds of photons with well-understood imperfections like afterpulsing, recovery time, detection efficiency, and dark counts. These either bear no effect or can be directly measured and taken into account. This detection technique enabled us to demonstrate the quality of generated statistics, but the same quality can be reached in the pulse regime.
c. Speed and timing. There are several important timing parameters: recovery time t D , the temporal width τ of the measurement mode, the period of the modulation t M , and the overall measurement time T . They need to be ascending by orders of magnitude, t D τ t M T . The recovery time is given by the detector, t D = 23 ns. We chose τ = 10 µs so that the SPAD can recognize the number of detections without too much saturation. The period t M = 1 ms so that the integral intensity W (t) is influenced only by the given modulation statistics P (W ) and does not get distorted by modulation transitions too much. The overall measurement time T = 100 s to gather enough statistical data. During this time, the fluctuations in the input power were negligible with respect to other systematic errors. For pulsed light and fast photon-number-resolving detector, the sub-500-ns modulation transition limits the possible repetition rate to 2 MHz, provided we desire pulse-to-pulse-independent statistics.
d. The SPAD model accuracy. The model of the SPAD assumes only such processes take place in the detector, that depend on the time of only the most recent count event. Apart from recovery time, we included the effect of afterpulses, including their temporal characteristic and their component that grows linearly with count rate [2]. This turns out to be accurate in the order of δp ∼ 10 −4 . It is important to note that no data fitting occurred. All parameters were established by an independent measurement beforehand to create a unified detection model. We used continuous Poissonian signals with constant count rates of 50, 100, 200, 500, 1000 and 1500 kcps. The outcomes of the model for various statistics were then compared to the respective raw data.
e. Systematic errors. The systematic errors in our data are caused by systematic errors of the SPAD model and by systematic errors in photon statistics on the generation side. For the inversion-approach, generated p n are precise, because we can measure intensity levels W i directly. Thus, the systematic error is virtually on the detection side, which is in accord with our measurements. Consequently, we believe the proposed generator could be potentially used for calibrating photon-number-resolving detectors response to various photon statistics.
For the approach, where P (W ) is given directly on an infinite scale, finite dynamic range of the modulation and discrete sampling do not permit a precise solution, and both sources of systematic error combine. The errors in the respective data conform to this explanation.
f. Total-variation distance. When measuring the number of photons n, the set of possible outcomes is N 0 = {0, 1, 2, . . . }. A photon-number distribution assigns a probability to each individual outcome n ∈ N 0 , and by extension to all sets of outcomes {n} ⊆ N 0 . We wish to compare two probability distributions in the broadest possible terms, which means for every set of outcomes. For a certain worst-case set {n} ∆ , the difference between probabilities is maximal and thus defined as the total-variation distance ∆. If we know the individual differences δp n = p n,data − p n,model , it follows from additivity of p n that the maximum difference can be obtained by summing all the positive or all negative differences. These positive and negative sums are equal, because we know that n δp n = 0 due to probability normalization. Therefore, the total variation distance is a simple sum ∆ = 1 2 n |δp n |. We picked this quantification because it is a standard statistical measure, it has a straightforward definition and conservative upper-bound character.
g. Intensity scaling. For the intensity-approach, there is a degree of freedom in the scale of W within a constant dynamic range. There is an optimum W max that gives the best match in p n , and it may vary depending on the definition of statistical distance, but only slightly. The optimal W max increases with the mean intensity W much faster than it does for inversion-approach. This is an interesting advantage of the inversion-approach as compared to explicitly given modulation, because it permits photon statistics engineering for intensities where the detector is less saturated. If we use the same W max , the inversion approach gives much better results for higher mean values.
h. Photon-number correlations. An important insight is provided by splitting the generated light on a balanced beam splitter and evaluating photon-number correlations between both outputs [6]. Examples of measured correlation coefficients are < 0.001 for Poissonian light, 0.48 for Bose-Einstein light (0.50 in theory), and 0.45 for log-normal light (0.46 in theory). The measured values depend on mean number of photons, beam-splitter balance and detection efficiencies. In theory, the correlation parameter for classical light intensity with mean µ and variance σ 2 is C = 1/(1+2µ/σ 2 ), assuming balanced detection.
Another well-known correlation metric is the autocorrelation function g (2) that can be calculated from the photon-number distribution or directly measured using the same two-way-splitting scheme and evaluating coincidence detections. Using aforementioned intensity moments, g (2) (0) = 1 + (σ/µ) 2 . The highest achievable value stems from dynamic range of the modulation d = W max /W min , and is then g (2) (0) = (1 + d) 2 /4d. It can be generated using a statistical mixture of the minimum and maximum intensity. Using the generator, we were able to reach g (2) (0) > 350 (see Fig. 5).

VII. CONCLUSION
In conclusion, we presented a generator of arbitrary classical photon statistics than can be fully programmed by the user. We generated various statistics including Poissonian, super-Poissonian, thermal, and log-normal. We demonstrated very high generation accuracy δp n < 10 −3 , which corresponds to the accuracy of the detection mechanism employed. Furthermore, we proposed an efficient inversion method to turn an arbitrary photon statistics into an optical intensity distribution. Finally, we demonstrated high-fidelity generation of photon statistics in the range of 500 photons.
The concept of the generator can be extended to any form of intensity modulation with possible increases in speed and range of generated statistics by orders of magnitude. The generator can also be straightforwardly used in a pulsed regime to produce single-mode states with given statistics. Another use is stochastic loss modulation to simulate realistic transmission channels. As a purely experimental advance, the proposed method of modulation is capable of putting out bunched light with much higher intensity than the conventional rotating glass approach. In addition, the presented generator can produce super-bunched light up to g (2) (0) ∼ 350.
The presented generator can be used to simulate communication channels, calibrate the response of photonnumber-resolving detectors, or probe physical phenomena sensitive to photon statistics.