Photon statistics and signal to noise ratio for incoherent diffraction imaging

Intensity interferometry is a well-known method in astronomy. Recently, a related method called incoherent diffractive imaging (IDI) was proposed to apply intensity correlations of x-ray fluorescence radiation to determine the 3D arrangement of the emitting atoms in a sample. Here we discuss inherent sources of noise affecting IDI and derive a model to estimate the dependence of the signal to noise ratio (SNR) on the photon counts per pixel, the temporal coherence (or number of modes), and the shape of the imaged object. Simulations in two and three dimensions have been performed to validate the predictions of the model. We find that, contrary to coherent imaging methods, higher intensities and higher detected counts do not always correspond to a larger SNR. Also, larger and more complex objects generally yield a poorer SNR despite the higher measured counts. The framework developed here should be a valuable guide to future experimental design.


I. INTRODUCTION
The scattering of a spatially and temporally coherent beam from an object gives rise to a far-field diffraction pattern consisting of constructive and destructive interference that encodes that object's structure, an effect that is utilised to obtain atomic-resolution images of the electron density of crystals with x-rays, for example. If measured in a similar way, the far-field pattern of light emitted by a luminous object, however, appears unstructured since the individual emitters of that object are mutually incoherent. Unlike in the case of coherent diffraction, the phase relationships of spherical waves emanating from elements of the object do not remain constant, and thus over the course of an exposure the measurement averages to the sum of the integrated intensities of those emitters. If, on the other hand, the far-field pattern is measured with an exposure time shorter than the coherence time of the light, we would indeed observe interference in the form of a speckle pattern [1,2]. Although this speckle pattern would change each time it is measured due to random fluctuations of the phases of the emitters, the integrated intensities nevertheless retain correlations. This is the basis for intensity interferometry of Hanbury Brown and Twiss, in which the signals measured in independent detectors are correlated. The method was first used to measure the correlation length for radio and visible stars to deduce their diameters [3].
Classen et al. [1] proposed to use intensity interferometry of x-ray fluorescence to reconstruct the three-dimensional arrangement of a particular species of atom in a sample such as a protein crystal, a method referred to as incoherent diffractive imaging (IDI). Fluorescence is generated by the transition of a valence electron into the core hole created by x-ray photoionisation. For transition metal elements such as Fe or Mn, the cross section for photoabsorption exceeds that of coherent scattering by a factor of about 250 [4], producing about 50 Kα fluorescence photons per coherently scattered photon. The wavelength of the emission is on the order of 1 Å and the lifetime, and thus the coherence time $\tau_c$, is of the order of 0.4 fs. It was suggested that the femtosecond-duration pulses produced by x-ray free-electron lasers would generate fluorescence from a sample within a burst that could then be measured with an integrating detector to compute the intensity-intensity correlation, as recently demonstrated by Inoue et al. [5]. As yet, only the width of the x-ray spot focused onto a fluorescing metal foil has been determined by this method; the image of a more complicated structure such as a crystal is yet to be demonstrated.
* fabian.trost@cfel.de
The design of IDI experiments requires an analysis of the signal to noise ratio (SNR) that can be achieved by the method and how this depends on various experimental parameters. Estimates of the achievable SNR in images obtained by intensity interferometry of general scenes have been presented in the context of astronomy [3,6-13]. These studies suggest the SNR scales with intensity (that is, with the number of photons measured per coherence mode) and should improve with the square root of the number of detector pairs (and hence correlations) that contribute to the measurement. However, prior works omit considerations of the consequences of performing pair-correlations of intensities measured simultaneously on many independent detectors (or detector pixels), as will be the case when fluorescing atoms are stimulated by a femtosecond-duration x-ray pulse. This, as we find here, has a profound influence on the achievable SNR.
The situation for IDI can be compared with coherent diffractive imaging (CDI) based on elastic scattering, where for a well-designed experiment the noise in the measured integrated intensities is dominated by the Poisson statistics of the photons, so that higher measured counts yield a higher SNR. Although this contention also holds for IDI, the situation is more complicated because of the way the signal is constructed from a correlation of intensities. It has not been readily obvious what other factors the SNR depends upon, and what conditions must be met for a feasible experiment. Our analysis is based upon a classical (wave optics) approach, combined with photon statistics, to determine the statistics of detected signals and the corresponding statistics of their correlations. After briefly reviewing the method of IDI in Sec. II, compared with CDI, we introduce the statistics of the correlation function in Sec. III. These are then used to estimate the relative SNRs in Sec. IV as a function of experimental parameters and the object shape, which we compare with numerical simulations. We assess the feasibility of imaging different types of structures using snapshot x-ray fluorescence measurements in Sec. V as well as the imaging of stars at high angular resolution using arrays of visible telescopes. Our results show that the complexity of the structure is a crucial factor in the ability to determine the first-order coherence function $g^{(1)}$ of the light field (equal to the normalized Fourier transform of the spatial distribution of emitters) from measurements of the second-order coherence function $g^{(2)}$ (the normalized intensity autocorrelation).

II. INCOHERENT DIFFRACTIVE IMAGING
FIG. 1. Schematic sketch of an IDI setup to demonstrate the differences from a coherent diffractive imaging (CDI) setup. Since fluorescence is emitted isotropically, the IDI detector placement does not depend on the incident beam (the detector can, for example, be placed where coherent diffraction is suppressed by polarization). In IDI, structural information is obtained by correlation, so the distance between two pixels gives rise to a certain $\mathbf{q}$, while in CDI $\mathbf{q}$ is defined relative to the incident beam.
Figure 1 depicts a general scattering experiment that gives access to both a CDI measurement and an IDI measurement. In CDI, the interference of elastically scattered waves is recorded as a diffraction pattern, shown here in the forward direction. For a particular position on the detector in the far field, specified by the direction of the wavevector $\mathbf{k}$, this interference can be calculated by summing over all rays originating from a source point (assumed here at infinity) and scattering from the elemental scatterers in the sample (e.g. atoms) to arrive at the detector. The relative phases of these rays depend on their path differences and are given by $(\mathbf{k}-\mathbf{K})\cdot\mathbf{r}$, where $\mathbf{r}$ is the position of the scatterer relative to some arbitrary origin and $\mathbf{K}$ is the common wavevector of the rays incident on the sample. The phases are further modulated by the complex-valued scattering factor $f$ of each scatterer, giving rise to a diffraction pattern $I(\mathbf{q}) = |\sum_i f_i \exp(i\,\mathbf{q}\cdot\mathbf{r}_i)|^2$, with $\mathbf{q} = \mathbf{k}-\mathbf{K}$. Since the scattering is elastic, the magnitudes of $\mathbf{k}$ and $\mathbf{K}$ are equal and the diffraction amplitudes represent Fourier components $\tilde\rho(\mathbf{q})$ measured on a spherical manifold of radius $2\pi/\lambda$ (for a wavelength $\lambda$) that passes through the origin $\mathbf{q} = 0$. This manifold is referred to as the Ewald sphere [14].
We can compare CDI to IDI by considering monochromatic fluorescence emitted from the sample and detected on the grey-coloured detector in Fig. 1. If measured with an exposure much shorter than the coherence time of the fluorescence, and ignoring for the moment any sources of noise or quantization as well as polarization, the waves originating from the elemental emitters of strength $s_i$ in the sample will interfere in a similar way to those formed by elastic scattering discussed above, as
$$I(\mathbf{k}) = \Big|\sum_i s_i \exp\!\big(i(\mathbf{k}\cdot\mathbf{r}_i + \phi_i)\big)\Big|^2. \tag{1}$$
A difference to CDI is that the phases $\phi_i$ of the waves are random and uncorrelated. The diffraction pattern is thus a speckle pattern, with values that follow a negative exponential distribution [2] and speckles of a width inversely proportional to the width of the object. An example of such a pattern is given in Fig. 2. The different realisations of the random phases $\phi_i$ each time a measurement is made will give rise to a different speckle pattern. At first sight it would seem that structural information would not readily be discernible, since over an ensemble $\{p\}$ of repeated measurements $\langle \exp(i(\phi_{i,p}-\phi_{j,p}))\rangle_p = \delta_{ij}$, so that $\langle I_p(\mathbf{k}_1)\rangle_p = \sum_i |s_i|^2$ becomes featureless. For an object consisting of $N_E$ identical emitters, this is equal to $N_E |s|^2$. However, if we instead take correlations of integrated intensities at positions $\mathbf{k}_1$ and $\mathbf{k}_2$ within an exposure, and then average these correlations over many patterns, then it can be seen that
$$\langle I_p(\mathbf{k}_1)\, I_p(\mathbf{k}_2)\rangle_p = \Big(\sum_i |s_i|^2\Big)^2 + \Big|\sum_i |s_i|^2 \exp\!\big(i(\mathbf{k}_1-\mathbf{k}_2)\cdot\mathbf{r}_i\big)\Big|^2. \tag{2}$$
Furthermore, additional averaging can be carried out over all pairs of pixels with a given displacement $\mathbf{q} = \mathbf{k}_1 - \mathbf{k}_2$, so that
$$\big\langle\langle I_p(\mathbf{k})\, I_p(\mathbf{k}-\mathbf{q})\rangle_{\mathbf{k}}\big\rangle_p = \Big(\sum_i |s_i|^2\Big)^2 + \Big|\sum_i |s_i|^2 \exp(i\,\mathbf{q}\cdot\mathbf{r}_i)\Big|^2. \tag{3}$$
For a pixelated detector that covers a particular solid angle of the emission from the object (such as seen in Fig. 1), the samples of $\mathbf{q}$ cover a volume of reciprocal space given by the autocorrelation of the Ewald sphere surface provided by that solid angle [1]. Therefore, compared with a coherent diffraction pattern that encodes data only on a two-dimensional manifold of reciprocal space, IDI encodes three-dimensional information.
We note that although Eqn. 3 was derived using classical wave optics and the assumption of random phases, this reproduces the derivation based upon quantum mechanics presented in Classen et al. [1]. This equation shows that the autocorrelation of the values measured within the coherence time of emitters is equal to the square of the total emitted power added to the square modulus of the Fourier transform of the distribution of emitters $S(\mathbf{r}) = |s(\mathbf{r})|^2$, as
$$G^{(2)}(\mathbf{q}) = \big\langle\langle I_p(\mathbf{k})\, I_p(\mathbf{k}-\mathbf{q})\rangle_{\mathbf{k}}\big\rangle_p = \tilde S(0)^2 + |\tilde S(\mathbf{q})|^2, \tag{4}$$
where $\tilde S(\mathbf{q})$ is the Fourier transform of $S(\mathbf{r})$. Obtaining the emitter structure $S(\mathbf{r})$ from the square modulus of its Fourier transform is the same phasing problem that faces CDI, and this can be tackled using the same tools, such as iterative projection algorithms [15,16]. It is convenient to carry out analysis on normalised quantities, $g^{(1)}(\mathbf{q}) = \tilde S(\mathbf{q})/\tilde S(0)$, so that Eqn. 4 becomes
$$g^{(2)}_{\rm TLS}(\mathbf{q}) = 1 + |g^{(1)}(\mathbf{q})|^2. \tag{5}$$
More generally, for measurements made with only partial temporal coherence, the contrast of the $g^{(1)}$ signal will be reduced by the visibility $\beta$ with $0 < \beta < 1$, so that
$$g^{(2)}_{\rm TLS}(\mathbf{q}) = 1 + \beta\,|g^{(1)}(\mathbf{q})|^2. \tag{6}$$
This is known as the Siegert relation [1,17]. Thus far, the derivation has been purely classical, and this relationship holds for thermal light source (TLS) emitters, hence the subscript in Eqns. 5 and 6. In the case of inner-shell x-ray fluorescence from single atoms, we cannot assume a thermal light source. Instead, a more accurate model assumes a source composed of single photon emitters (SPEs). In this case, the intensity autocorrelation of Eqn. 5 must be corrected to account for the inability of an atom to emit another photon within a coherence time [1]:
$$g^{(2)}_{\rm SPE}(\mathbf{q}) = 1 - \frac{2}{N_E} + |g^{(1)}(\mathbf{q})|^2. \tag{7}$$
For objects with a large number of emitters, the difference between these expressions for SPEs and TLSs vanishes. Here we restrict the analysis to the TLS case.
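The relation between the intensity correlation and $|g^{(1)}|^2$ can be checked numerically with a small Monte Carlo sketch. The example below is only an illustration under simplifying assumptions that are not part of the original analysis: a one-dimensional chain of equal-strength emitters, a single mode, no photon counting noise, and two detection directions described by scalar wavenumbers. The function names `g1_squared` and `simulate_g2` are hypothetical. For a finite number of equal-strength emitters with random phases, the estimate is expected to agree with $1 + |g^{(1)}|^2$ only up to a correction of order $1/N_E$.

```python
import cmath
import math
import random


def g1_squared(positions, q):
    """|g1(q)|^2 for equal-strength emitters (normalised |Fourier transform|^2)."""
    amplitude = sum(cmath.exp(1j * q * r) for r in positions)
    return abs(amplitude) ** 2 / len(positions) ** 2


def simulate_g2(positions, k1, k2, n_patterns, seed=0):
    """Monte Carlo estimate of <I(k1) I(k2)> / (<I(k1)> <I(k2)>) for one mode."""
    rng = random.Random(seed)
    # The geometric phase factors exp(i k r) are the same for every pattern.
    steer1 = [cmath.exp(1j * k1 * r) for r in positions]
    steer2 = [cmath.exp(1j * k2 * r) for r in positions]
    sum_i1 = sum_i2 = sum_i1i2 = 0.0
    for _ in range(n_patterns):
        # Each pattern (exposure) draws new, uncorrelated emitter phases.
        phases = [cmath.exp(2j * math.pi * rng.random()) for _ in positions]
        i1 = abs(sum(a * p for a, p in zip(steer1, phases))) ** 2
        i2 = abs(sum(a * p for a, p in zip(steer2, phases))) ** 2
        sum_i1 += i1
        sum_i2 += i2
        sum_i1i2 += i1 * i2
    return n_patterns * sum_i1i2 / (sum_i1 * sum_i2)


if __name__ == "__main__":
    chain = [float(i) for i in range(20)]  # emitter positions, unit spacing
    for q in (2 * math.pi / 20, 2 * math.pi):  # a zero of g1 and a Bragg peak
        estimate = simulate_g2(chain, 0.3, 0.3 - q, 10000, seed=1)
        print(q, estimate, 1 + g1_squared(chain, q))
```

Pairs of directions separated by a reciprocal lattice vector of the chain give an estimate close to 2, while pairs at a zero of $g^{(1)}$ give an estimate close to 1.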

III. SOURCES OF NOISE
As evident from Eqn. 6, the determination of $|g^{(1)}(\mathbf{q})|^2$ ideally requires measurements with a high intensity per pixel (sufficient to neglect shot noise), a large number ($N_P$) of recorded exposures, and full temporal coherence. Such an ideal of course cannot be met, and we examine here the effects on the $g^{(2)}$ signal for non-ideal conditions. We make a distinction between two sources of noise as well as a reduction of contrast. The first source of noise is that caused by a finite integrated intensity per pixel and the quantum nature of light, commonly known as "shot noise". As this noise produces Poissonian statistics we simply refer to it as "Poisson noise". The second source of noise arises due to the finite number of patterns, which we refer to as "phase noise" since its origin lies in the random phases of the emitted waves, which give rise to the speckle nature of the patterns of Eqn. 1. This noise has also been called "wave interaction noise" or "photon excess noise" by Hanbury Brown and Twiss, but was neglected in their signal to noise calculations [6,7].
We must also consider effects that reduce the visibility of $g^{(2)}$ measurements, such as those due to polarization states, temporal incoherence, insufficient sampling of the speckles, or energy spread. Since the signal is modulated by the visibility $\beta$, maximizing it is as important as minimizing noise. Temporal coherence can be considered in terms of modes, whereby photons in a mode are mutually coherent (giving rise to interference) and incoherent to those in separate modes. The visibility is equal to the inverse of the number of modes, $M = \beta^{-1}$, which can be expressed as [17]
$$M = \frac{2}{1+P^2}\left[\frac{1}{T}\int_{-T}^{T}\Lambda\!\left(\frac{\tau}{T}\right)|\gamma(\tau)|^2\,d\tau\right]^{-1}, \tag{8}$$
where $0 \le P \le 1$ denotes the degree of polarization, $T$ the measurement time, $\Lambda(\tau) = 1-|\tau|$ for $|\tau| \le 1$ and 0 otherwise, and $\gamma(\tau)$ is the complex degree of temporal coherence. For example, a Lorentzian spectrum (characteristic of inner-shell fluorescence [18]) with a coherence time $\tau_c$ yields [17]
$$M = \frac{2}{1+P^2}\left[\frac{\tau_c}{T} + \frac{1}{2}\left(\frac{\tau_c}{T}\right)^2\left(e^{-2T/\tau_c}-1\right)\right]^{-1}. \tag{9}$$
For measurement times sufficiently greater than the coherence time,
$$M \approx \frac{2}{1+P^2}\,\frac{T}{\tau_c}. \tag{10}$$
For x-ray fluorescence and a polarisation-insensitive detector, $P = 0$, giving a factor 2 to the number of modes due to the two orthogonal polarisation eigenstates (the analysis of Inoue et al. [5] neglected this factor of 2).
Fig. 2 shows four examples of integrated intensities simulated for fluorescence from a small simple cubic crystal with a single emitter located in each of its 15 × 15 × 15 unit cells. The simulation assumed a lattice constant $a = 5$ Å, an emission wavelength of $\lambda = 2$ Å, and a detector placed 50 mm from the sample with pixels of area 100 µm × 100 µm. The four patterns were simulated for different values of the mean intensity per emitter, characterised by the mean detected counts $\mu$ per pixel, and of the number of modes $M$. The high numbers of detected photons per emitter are not physically realisable for single atoms, but the examples serve to illustrate the salient features of such patterns and the noise sources. When the emitters are very strong, as illustrated with $\mu = 20$ in Figs. 2a and b, the speckle pattern can be discerned, with a speckle width inversely proportional to the object width as discussed in Sec. II. The visibility of the speckles is much reduced when the number of modes is increased from 1 to 20 (Fig. 2b), which would be the case for a measurement with a polarisation-insensitive detector and an exposure time $T \approx 10\,\tau_c$. Poisson noise can also be seen in these patterns, but that noise dominates when the number of photons detected per emitter is reduced, as in the simulations depicted in Figs. 2c and d. There the effect of the modes is harder to see.
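As a quick numerical illustration of how the number of modes grows with the exposure time, the following sketch evaluates the mode count for a Lorentzian line. It assumes the standard closed form for a Lorentzian spectrum from the statistical optics literature [17], multiplied by the polarisation factor $2/(1+P^2)$; the function name `number_of_modes` and its arguments are hypothetical and chosen only for this example.

```python
import math


def number_of_modes(t_over_tau_c, polarization=0.0):
    """Number of modes M for a Lorentzian line (assumed closed form).

    t_over_tau_c : exposure time T in units of the coherence time tau_c
    polarization : degree of polarization P, between 0 and 1
    """
    x = t_over_tau_c
    # Inverse number of temporal modes:
    # tau_c/T + (1/2) (tau_c/T)^2 (exp(-2T/tau_c) - 1).
    # expm1 keeps the expression accurate for very short exposures.
    inverse_temporal_modes = 1.0 / x + 0.5 * math.expm1(-2.0 * x) / x**2
    polarization_modes = 2.0 / (1.0 + polarization**2)
    return polarization_modes / inverse_temporal_modes


if __name__ == "__main__":
    for ratio in (0.1, 1.0, 10.0, 100.0):
        print(ratio, number_of_modes(ratio, polarization=0.0))
```

For an unpolarised measurement with $T = 10\,\tau_c$ this gives a value close to the 20 modes quoted for Fig. 2b, and for a fully polarised, very short exposure it tends to a single mode.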
We model the statistics of the photon correlations by considering an object consisting of $N_E$ emitters that emit monochromatic spherical waves with random relative phases. The measured energy on a pixelated detector for a single mode is then given by Eqn. 1. Since the sum of the waves with random phases can be viewed as a random walk in the complex plane, the energy is a random variable with the negative exponential distribution function $P_{\exp}(x, \mu_0) = \frac{1}{\mu_0} e^{-x/\mu_0}$, where $\mu_0$ is the average energy per pixel and mode [2]. The detected energy for $M$ modes consists of the sum of $M$ such random variables. This generates an Erlang distribution described by
$$P_{\rm Erlang}(x; \mu, M) = \frac{(M/\mu)^M\, x^{M-1}\, e^{-Mx/\mu}}{(M-1)!}, \tag{11}$$
where $\mu$ is taken to be the average energy per pixel of the measurement, so that $\mu = M\mu_0$. The variance is then given by ${\rm Var}_{\rm Erlang} = \mu^2/M$, representing the "phase noise". Since photons are detected as countable particles, the Erlang distribution must be combined with a Poisson distribution, generating a negative binomial (NB) distribution [19] as
$$P_{\rm NB}(x; \mu, M) = \frac{\Gamma(x+M)}{\Gamma(M)\,x!}\left(1+\frac{M}{\mu}\right)^{-x}\left(1+\frac{\mu}{M}\right)^{-M}, \tag{12}$$
where now the random integer variable $x$ describes the number of detected photons at a pixel and $\mu$ is again the mean counts per pixel. The variance is given by
$${\rm Var}_{\rm NB} = \mu + \frac{\mu^2}{M}. \tag{13}$$
The first term $\mu$ is equal to the variance for a Poisson distribution (obtained when $M \to \infty$) and as such represents the contribution of "Poisson noise", while the $\mu^2/M$ term is equivalent to the variance of the Erlang distribution and so can be considered to be due to "phase noise" and modes. This can be related back to the examples in Fig. 2. The "phase noise" is dominant in Fig. 2a, and increasing the number of modes decreases the variance, as seen in Fig. 2b. In Fig. 2d, "Poisson noise" is dominant. We also recognise that $P_{\rm NB}$ becomes the Bose-Einstein distribution [20] for $M = 1$, given by
$$P_{\rm BE}(x; \mu) = \frac{\mu^x}{(1+\mu)^{x+1}}. \tag{14}$$
Incoherent diffractive imaging requires the autocorrelation of measured counts as described in Eqn. 3. The simplest way to perform this correlation is to multiply the counts of two single-pixel detectors.
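The split of the count variance into a Poisson part $\mu$ and a phase-noise part $\mu^2/M$ can be verified directly from the negative binomial probabilities. The sketch below is an illustration only; it assumes the usual parameterisation of the negative binomial distribution by the mean $\mu$ and the number of modes $M$, and the helper names `nb_pmf` and `nb_moments` are hypothetical.

```python
import math


def nb_pmf(x, mu, modes):
    """Negative binomial probability of x counts for mean mu and M modes."""
    log_p = (math.lgamma(x + modes) - math.lgamma(modes) - math.lgamma(x + 1)
             - x * math.log1p(modes / mu) - modes * math.log1p(mu / modes))
    return math.exp(log_p)


def nb_moments(mu, modes, x_max=2000):
    """Mean and variance obtained by summing the probabilities up to x_max."""
    mean = sum(x * nb_pmf(x, mu, modes) for x in range(x_max))
    second = sum(x * x * nb_pmf(x, mu, modes) for x in range(x_max))
    return mean, second - mean**2


if __name__ == "__main__":
    for modes in (1, 4, 20):
        mean, variance = nb_moments(2.0, modes)
        # Compare with the Poisson term plus the phase-noise term.
        print(modes, mean, variance, 2.0 + 2.0**2 / modes)
```

The truncation at `x_max` is a convenience of this sketch and needs to be increased for very large mean counts.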
Initially, for the sake of simplicity, we assume that the counts of both detectors are uncorrelated. Their correlation then follows the distribution of the product of two NB-distributed random variables. The expectation value of this product distribution is $\mu_{\rm NB\cdot NB} = \mu^2$, where $\mu$ remains the expectation value of the detected counts: $E(I) = \mu$. The variance of this product distribution is given by
$${\rm Var}_{\rm NB\cdot NB} = E(I^2)^2 - \mu^4 = \left[\mu + \mu^2\left(1+\frac{1}{M}\right)\right]^2 - \mu^4. \tag{15}$$
This relation describes the variance of the correlation of signals measured in two single-pixel detectors (for instance the two telescopes of Hanbury Brown and Twiss [21]), or for coincidence measurements made between two detectors out of a multiple detector array (as proposed using the Cherenkov Telescope Array [22]). When, on the other hand, measurements are made using a pixelated detector, where the counts in many detector pairs are acquired simultaneously, the discrete $AC(q)$ is given by a sum of such products. In order to investigate the effect of performing measurements with many pixels, we assume a set of $J$ NB-distributed values $I(j)$, representing the photon counts at the pixels $j$. Further, we assume, for the sake of simplicity, that the angular positions of these pixels are evenly spaced along a line of $k$ positions (that is, a one-dimensional array). We keep our assumption that the $I(j)$ are uncorrelated. We further assume periodic boundary conditions, $I(j+J) = I(j)$. The auto-correlation then becomes
$$AC(q) = \frac{1}{J}\sum_{j=1}^{J} I(j)\, I(j-q). \tag{16}$$
Each term within this sum follows the product distribution with a variance given by Eqn. 15. Note that even if there is no correlation between the single multiplicands $I(j)$ (per our assumption), there may still be a covariance between the summands of Eqn. 16. For $I(j)I(j-q)$ and $I(l)I(l-q)$ there is no covariance if $|j-l| \neq q$, but if $|j-l| = q$ there will be. As an example, consider $q = 1$:
$$\begin{aligned} {\rm Var}\big(AC(1)\big) &= \frac{1}{J^2}\sum_{j=1}^{J}\Big[{\rm Var}\big(I(j)I(j-1)\big) + \sum_{l=j\pm 1}{\rm Cov}\big(I(j)I(j-1),\, I(l)I(l-1)\big)\Big] \\ &= \frac{1}{J}\Big[{\rm Var}\big(I(j)I(j-1)\big) + \sum_{l=j\pm 1}{\rm Cov}\big(I(j)I(j-1),\, I(l)I(l-1)\big)\Big], \end{aligned} \tag{17}$$
since the condition $j-l = q$ (and likewise $l-j = q$) appears $J$ times within the auto-correlation sum.
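The discrete autocorrelation with periodic boundary conditions can be written in a few lines. This is a minimal sketch, assuming that the sum of products is normalised by the number of pixels $J$; the function name `autocorrelation` is hypothetical.

```python
def autocorrelation(counts, q):
    """Average of I(j) * I(j - q) over a one-dimensional periodic pixel array."""
    n_pixels = len(counts)
    # The modulo implements the periodic boundary condition I(j + J) = I(j).
    total = sum(counts[j] * counts[(j - q) % n_pixels] for j in range(n_pixels))
    return total / n_pixels


if __name__ == "__main__":
    counts = [1, 2, 3, 4]
    print(autocorrelation(counts, 0), autocorrelation(counts, 1))
```

Every pixel value enters two different products for a given $q \neq 0$, once as $I(j)$ and once as $I(l-q)$ with $l = j+q$, which is the origin of the covariance between summands.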
Noting that the covariance is ${\rm Cov}(X, Y) = E(XY) - E(X)E(Y)$ for random variables $X$ and $Y$, we find that the terms in the last sum of Eqn. 17 are equal to $E(I(j)) \cdot E(I^2(j-q)) \cdot E(I(j-2q)) - \mu^4$. The expectation value of the square of a negative binomial distributed variable is $E(I^2) = \mu + \mu^2(1 + 1/M)$, and therefore the last sum in Eqn. 17 equals $2(\mu^4/M + \mu^3)$. The first term of the second line of Eqn. 17 is equal to ${\rm Var}_{\rm NB\cdot NB}$, so the complete sum gives the variance of the auto-correlation, expressed per pixel pair (that is, multiplied by the number of summands $J$), as
$${\rm Var}_{\rm AC} = {\rm Var}_{\rm NB\cdot NB} + 2\left(\frac{\mu^4}{M} + \mu^3\right) = \mu^2 + 4\mu^3 + \frac{2\mu^3}{M} + \frac{4\mu^4}{M} + \frac{\mu^4}{M^2}. \tag{18}$$
It is important to note that Eqn. 15, and thus also Eqn. 18, were derived under the assumption of an absence of correlation between the measured counts, even though this correlation is what would generate the signal that pertains to the structure of the emitting sample.
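The bookkeeping of the variance terms can be made explicit in code. The sketch below assumes the form described in the text, namely the variance of a product of two uncorrelated NB-distributed counts plus the covariance with the two neighbouring summands, quoted per pixel pair; the function names are hypothetical.

```python
def second_moment_nb(mu, modes):
    """E(I^2) of a negative binomial variable with mean mu and M modes."""
    return mu + mu**2 * (1.0 + 1.0 / modes)


def var_product(mu, modes):
    """Variance of the product of two uncorrelated NB-distributed counts."""
    return second_moment_nb(mu, modes) ** 2 - mu**4


def var_autocorrelation(mu, modes):
    """Per-pair variance of the autocorrelation, including the covariance
    of each summand with its two neighbouring summands."""
    neighbour_covariance = 2.0 * (mu**4 / modes + mu**3)
    return var_product(mu, modes) + neighbour_covariance


if __name__ == "__main__":
    for mu in (0.01, 1.0, 100.0):
        print(mu, var_product(mu, 1), var_autocorrelation(mu, 1))
```

For a single mode the expression reduces to $\mu^2 + 6\mu^3 + 5\mu^4$, so the $\mu^2$ term dominates at low counts and the $\mu^4$ term at high counts.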
Further, it should be noted that besides the contribution to $\langle I(\mathbf{k})\, I(\mathbf{k}-\mathbf{q})\rangle_{\mathbf{k}}$ from the desired $|g^{(1)}(\mathbf{q})|^2$ arising from the structure, there are contributions to the background that are also correlated when measured simultaneously within a single pattern. While averaging over many patterns smooths the background to a constant value, a $\mathbf{q}$-dependence of the variance can persist. The background cannot be strictly defined, therefore, as the uncorrelated (or zero covariance) contributions to $\langle\langle I_p(\mathbf{k})\, I_p(\mathbf{k}-\mathbf{q})\rangle_{\mathbf{k}}\rangle_p$. This is illustrated in Appendix C, where the variance is calculated for detectors consisting of two pixels and of many pixels. It is seen there that the correlations contributing to the background change the form of the variance of $g^{(2)}(\mathbf{q})$ from that indicated by Eqn. 18, which can only be considered an approximation. As a consequence, increasing the number of pairs of pixels within the same pattern is not equivalent to collecting more patterns. In the following we continue with this approximation and explore the validity of the results by comparing with simulations.

IV. SIGNAL TO NOISE RATIO
We now aim to determine the dependence of the signal to noise ratio (SNR) of IDI on the various kinds of noise discussed above. For this discussion we define our signal as ${\rm Sig} = |G^{(1)}(\mathbf{q})|^2 = |\tilde S(\mathbf{q})|^2$ and the noise as the standard deviation of the background, as discussed in the previous section. This signal is proportional to the square of the measured counts, so we can also write ${\rm Sig} = \mu^2\, |g^{(1)}(\mathbf{q})|^2$. This is different to CDI, where the signal scales linearly with $\mu$.
In the following we examine various situations and different kinds of fluorescing samples which require different simulation methods (detailed in Appendices A and B). The detector arrangements also differ, according to sampling requirements, placing the different cases on quite different scales and making direct comparisons somewhat artificial (for example imaging a crystal versus a single non-periodic object). We therefore concentrate on separately studying the dependence of the SNR on varying intensity, numbers of modes, and object shape to gain an understanding of how to best design experiments.
Following Eqn. 18 we express the SNR as
$${\rm SNR}(\mathbf{q}) = \frac{1}{M}\,\frac{{\rm Sig}(\mathbf{q})}{\sqrt{{\rm Var}_{\rm AC}}}\,\sqrt{N_P\, C(\mathbf{q})}, \tag{19}$$
where $C(\mathbf{q})$ is the multiplicity, equal to the number of pixel pairs with the same wave-vector difference. The multiplicative factor $1/M = \beta$ accounts for the visibility of $|G^{(1)}|^2$. One should note that increasing $C(\mathbf{q})$ does not have the same effect on the SNR as increasing $N_P$, and this term saturates at some point. Even if we consider a detector covering $4\pi$ with infinite sampling, a single pattern will still suffer from phase noise, since the assumption of independent photon counts forming the background cannot be maintained. A simple, analytic example is discussed in Appendix C.
As a first example we consider a crystal with $n \times n \times n$ simple cubic unit cells. We assume each unit cell consists of one cluster of single photon emitters that are so close to each other that they are indistinguishable and can be treated as one emitter. The crystal then consists of $N_E = n^3$ emitters, each isotropically emitting on average $N_\gamma$ photons per mode and pattern. The expected mean counts per detector pixel therefore is $\mu = \Omega N_E N_\gamma M$, where $4\pi\Omega$ is the solid angle of a pixel (here, for the sake of simplicity, assumed to be constant over the whole detector). The autocorrelation signal $G^{(2)}(\mathbf{q}) = \langle\langle I_p(\mathbf{k})\, I_p(\mathbf{k}-\mathbf{q})\rangle_{\mathbf{k}}\rangle_p$ obtained from the measured fluorescence photon counts of the crystal consists of a uniform background with strong peaks at the reciprocal lattice points (Bragg peaks), as shown in Fig. 3. The $|G^{(1)}|^2$ map that is extracted from the autocorrelation can be written as
$$|G^{(1)}(\mathbf{q})|^2 = (\Omega N_\gamma M)^2 \prod_{d \in \{x,y,z\}} \frac{\sin^2(n\,a\,q_d/2)}{\sin^2(a\,q_d/2)}, \tag{20}$$
where $a$ is the lattice constant. We then define the signal that is extracted from such a map as the values integrated over Bragg peaks, which in the limit of large cubic crystals is proportional to the number of emitters:
$$\int_{\rm Bragg} \prod_{d \in \{x,y,z\}} \frac{\sin^2(n\,a\,q_d/2)}{\sin^2(a\,q_d/2)}\; d^3q \;\propto\; n^3 = N_E. \tag{21}$$
This yields a signal as described by
$${\rm Sig} \propto (\Omega N_\gamma M)^2\, N_E, \tag{22}$$
and thus, the SNR is
$${\rm SNR} \propto \frac{\Omega^2 N_\gamma^2\, M\, N_E\, \sqrt{N_P\, C(\mathbf{q})}}{\sqrt{\mu^2 + 4\mu^3 + 2\mu^3/M + 4\mu^4/M + \mu^4/M^2}}, \qquad \mu = \Omega N_E N_\gamma M. \tag{23}$$
A word of caution about Eqn. 23 is warranted, since it indicates that pixels of larger solid angle should result in higher SNR. In fact, as the pixel size is increased, the number of modes increases in accordance with a loss of contrast [23]. Since this effect can be treated by an appropriate adjustment of the number of modes, we ignore the "speckle sampling" effect in the remainder of this paper to keep the treatment as simple as possible.

A. SNR as function of mean counts
To test the SNR expression in Eqn. 23, we first investigate the dependence of the SNR of simulated data on $\mu$, or more precisely on $N_\gamma$ (the number of photons emitted per mode by a cluster of indistinguishable emitters). Details of the simulations are given in Appendix A, and in all simulations in this paper the object (emitter density) consists only of real and positive values. We performed simulations with three-dimensional crystals, from which two slices through $G^{(2)}(\mathbf{q})$ are shown in Fig. 3.
In Fig. 4, the SNR of the two Bragg peaks highlighted in Fig. 3 is plotted as a function of $N_\gamma$, which was changed by adding more emitters to the cluster in each unit cell, keeping the size of the crystal constant. This is effectively the same as increasing the intensity (emitted number of photons per mode) of each emitter. We observe that the SNR increases with increasing sample intensity but appears to approach an asymptotic value. This is because for a small number of photons Poisson noise is dominant, yielding ${\rm SNR} \propto N_\gamma$, whereas for a sufficiently large number of photons per pixel phase noise becomes important, which yields a constant SNR for a fixed number of patterns and modes. This can also be seen from Eqn. 23, where the low-signal and high-signal limits are
$${\rm SNR} \propto \Omega N_\gamma \sqrt{N_P\, C(\mathbf{q})} \quad (\mu \ll 1), \qquad {\rm SNR} \propto \frac{\sqrt{N_P\, C(\mathbf{q})}}{N_E \sqrt{4M+1}} \quad (\mu \gg 1).$$
[...] signals than the Bragg peaks of crystals. Since the signal could not be readily separated from background and noise, the simulated $G^{(2)}$ was fitted to the ground truth via $G^{(2)} = O + S \cdot |g^{(1)}|^2 + \epsilon$, where the fit parameter $S$ can be interpreted as the signal, $O$ as the background, and $\epsilon$ as the noise. A more detailed description of the simulations is given in Appendix B. In Fig. 5a, the SNR is plotted for four different objects: two very sparse ones, one crystal-like object, and one "dense" object with spatial frequencies giving a continuously filled Fourier space. The plots of these SNRs were scaled to asymptote to unity, for comparison. This also demonstrates the limits of the theory, with its assumption of uncorrelated values following a negative binomial distribution, when applied to the case of correlated values with structural information. As can be seen in the figure, the theory fits quite well for objects with sparsely populated $g^{(1)}(\mathbf{q})$ signals (e.g. for the "Crystal" object in Fig. 5a), since most of the detected counts are indeed uncorrelated in such objects.
We mention in passing that in the limit of dense and unstructured objects, like the "Dense" object in Fig. 5b, we were able to fit the $G^{(2)}$ variance, for a single mode, as ${\rm Var}_{\rm DenseObj} = \mu^4 + 6\mu^3 + \mu^2$. Because of the strong dependency of the variance of $G^{(2)}$ on the characteristics of the object, as seen by the discrepancies between Eqn. 19 and the simulations in Fig. 5a, we keep the expression of the variance of Eqn. 18 for further discussions, but need to keep these limits in mind when fitting this model to the simulated data.

B. SNR as function of modes
Here we discuss the dependence of Eqn. 19 on the number of modes and again make use of the simulations of 3D and 2D objects. We assume each mode to have the same mean counts $\mu_0$, so that the total mean counts per pattern is $\mu = M\mu_0$. The SNR then follows the form
$${\rm SNR} \propto \frac{\mu_0\, \sqrt{N_P\, C(\mathbf{q})}}{\sqrt{(1+\mu_0)\,(1+\mu_0+4M\mu_0)}}. \tag{24}$$
As a first example, we consider a simulation of the crystal with $N_E = 15 \times 15 \times 15$ unit cells and emitters. This simulation was performed in a similar way to that described in the previous section, with mean counts per mode and pixel of $\mu_0 = 1.35$. The reduction of the visibility $\beta = M^{-1}$ with increased modes, according to Eqn. 6, can be seen in the plot of the inverted signal to background ratio (SBR) in Fig. 6a. The SNR obtained in the simulations is plotted in Fig. 6b and found to scale with the number of modes in accordance with the expression in Eqn. 24.
The influence of $\mu_0$ on the mode-dependent SNR was investigated using simulations of the 2D "Dense" object from Fig. 5b. The variance is plotted in Fig. 7a [...] Fig. 7c and 7d, together with the analytic prediction from Eqn. 24. We can see from Fig. 7 that the SNR declines much more slowly with respect to $M$ for $\mu_0 = 0.01$ than for $\mu_0 = 1$. In the limit of very low $\mu_0$ the dependence of the SNR on $M$ becomes negligible:
$${\rm SNR} \propto \mu_0\, \sqrt{N_P\, C(\mathbf{q})} \qquad (\mu_0 \ll 1,\; M\mu_0 \ll 1). \tag{25}$$
A negligible dependence of the SNR on the number of modes in the limit of low $\mu_0$ (where the contribution of Poisson noise greatly exceeds the phase noise) was already described by Hanbury Brown and Twiss when they stated that "...the signal to noise ratio is independent of changes in the optical bandwidth, ..." [21]. Similar statements can be found in [22,24]. Roughly speaking, a slight increase of $\mu$ leads to less Poisson noise while phase noise is still negligible, and therefore $\mu$ compensates for the weaker visibility caused by a larger number of modes.
In the limit of high intensity per mode, on the other hand, we obtain
$${\rm SNR} \propto \frac{\sqrt{N_P\, C(\mathbf{q})}}{\sqrt{1+4M}} \qquad (\mu_0 \gg 1), \tag{26}$$
and therefore it is expected that under such circumstances an increase in the number of modes will be significantly detrimental to the SNR.
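The two limits of the mode dependence can be reproduced with a few lines of code. The sketch below assumes the per-pair variance of Eqn. 18 in the form reconstructed from the terms described in Sec. III, a signal proportional to $\mu^2$ with visibility $1/M$, and omits the common factor $\sqrt{N_P\, C(\mathbf{q})}$; the function names are hypothetical.

```python
import math


def var_autocorrelation(mu, modes):
    """Assumed per-pair variance of the autocorrelation (see Sec. III)."""
    second_moment = mu + mu**2 * (1.0 + 1.0 / modes)
    return second_moment**2 - mu**4 + 2.0 * (mu**4 / modes + mu**3)


def relative_snr(mu0, modes):
    """Relative SNR for mean counts mu0 per mode and pixel (arbitrary scale)."""
    mu = modes * mu0               # total mean counts per pixel
    signal = mu**2 / modes         # signal ~ mu^2, reduced by the visibility 1/M
    return signal / math.sqrt(var_autocorrelation(mu, modes))


if __name__ == "__main__":
    for mu0 in (0.01, 1.0, 100.0):
        print(mu0, [round(relative_snr(mu0, m), 4) for m in (1, 2, 5, 20)])
```

With these assumptions the relative SNR simplifies to $\mu_0/\sqrt{(1+\mu_0)(1+\mu_0+4M\mu_0)}$, which is nearly independent of $M$ for small $\mu_0$ and falls as $1/\sqrt{1+4M}$ for large $\mu_0$.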

C. Dependence of SNR on the size and shape of the object
In section IV A we saw that the shape of the emitting object has a significant influence on the SNR of $G^{(2)}$. Here we return to the 3D crystal with $N_E = n \times n \times n$ unit cells and emitters, and examine how the SNR scales with the overall size of the crystal. To this end we define the proportionality constant $\alpha = \Omega N_\gamma M$, in an analogous fashion to $\mu_0$ in the previous section, so that $\mu = \alpha N_E$. With this we rewrite the SNR of Eqn. 23 as
$${\rm SNR} \propto \frac{\alpha}{M}\, \frac{\sqrt{N_P\, C(\mathbf{q})}}{\sqrt{1 + \left(4 + \frac{2}{M}\right)\alpha N_E + \left(\frac{4}{M} + \frac{1}{M^2}\right)\alpha^2 N_E^2}}. \tag{27}$$
In Fig. 8 the SNR is plotted as a function of the crystal size $N_E$ for three different emitter "efficiencies" $\alpha$, all for the case of a single mode. Somewhat unintuitively, bigger crystals give lower SNR. In the limit of large $\alpha$ the SNR behaves as $1/N_E$, as indicated in Fig. 8b, where the reciprocal of the SNR is plotted against $N_E$. However, the SNR becomes less dependent on $N_E$ and the curve becomes flatter for smaller $\alpha$. This may seem to be an improvement over larger $\alpha$, but for a given crystal size a smaller $\alpha$ gives lower SNR. As discussed in section IV A, a greater $\alpha$ generally leads to a better SNR. However, as mentioned earlier, increasing $\alpha$ by increasing $\Omega$ alone will reduce contrast and reduce SNR.
In conventional crystallography, which makes use of coherent scattering from the crystal, larger crystals clearly produce higher SNR than small ones. In that case the SNR is proportional to the square root of the number of photons diffracted per Bragg peak, which by an analysis equivalent to Eqn. 21 is proportional to $\sqrt{N_E}$ (assuming scattering by the emitting atoms). Equation 27 and the simulations of Fig. 8 show the opposite behaviour in IDI. Even though we have assumed perfect conditions (i.e., $M = 1$) in the simulations, in a real experiment there are at least two other factors in favour of choosing smaller crystals. The first is that larger crystals lead to smaller speckles, which therefore require smaller pixels or a larger crystal to detector distance, which, for a finite detector, reduces the maximum resolution and results in a decrease in $\mu$. Secondly, large crystals can lead to the situation that, even for exactly simultaneous emission, the difference of paths to the detector from atoms at the extremes of the crystal can exceed the speed of light times the coherence time, contributing an additional source of modes.
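The counter-intuitive size dependence for IDI can be illustrated numerically for a single mode. The sketch below assumes a Bragg-peak signal proportional to $\alpha^2 N_E$ and the single-mode variance $\mu^2 + 6\mu^3 + 5\mu^4$ with $\mu = \alpha N_E$, which follows from the uncorrelated-counts approximation of Sec. III; constant prefactors and $\sqrt{N_P\, C(\mathbf{q})}$ are omitted, and the function names are hypothetical.

```python
import math


def var_ac_single_mode(mu):
    """Assumed per-pair autocorrelation variance for M = 1."""
    return mu**2 + 6.0 * mu**3 + 5.0 * mu**4


def relative_snr_crystal(alpha, n_emitters):
    """Relative Bragg-peak SNR of a crystal with n_emitters emitters (M = 1)."""
    mu = alpha * n_emitters              # mean counts per pixel
    signal = alpha**2 * n_emitters       # integrated Bragg-peak signal
    return signal / math.sqrt(var_ac_single_mode(mu))


if __name__ == "__main__":
    for alpha in (1e-4, 1e-2, 1e2):
        row = [relative_snr_crystal(alpha, n**3) for n in (2, 5, 10)]
        print(alpha, row)
```

For large $\alpha$ the product of SNR and $N_E$ is constant, so the SNR falls as $1/N_E$, whereas for small $\alpha$ the SNR is almost independent of $N_E$ but lower in absolute terms.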
The diminishing ability to image larger objects by IDI arises because, as the object gets larger and more complex, the number of intensity-intensity products that do not contribute to a correlated signal grows faster than the number that do. This is apparent since |S(q)/S(0)| 2 = |g (1) (q)| 2 ≤ 1 for all q, so the background always exceeds, or is at least as large as, the signal for any q. Since the distribution of emitters s(r) is always real and positive, as the object becomes larger |S(q)/S(0)| generally becomes smaller at any given q as the spectral power is distributed into more "channels". This is the case if the additional emitters added to a structure (to make it bigger) are resolvable. Those emitters added close to others (such as considered in the single clusters of emitters in the crystals, above) will tend not to reduce |g (1) (q)| at q = 0.
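The statement that |g (1) (q)| 2 never exceeds unity and tends to shrink as resolvable emitters are added can be illustrated with a few lines of Python. This is a minimal sketch under assumptions that are not taken from the paper: equal-strength point emitters placed at random on a line of unit length, and q values sampled far beyond 2π divided by the object extent, so that the emitters are resolvable. For such an arrangement the q-averaged value is roughly 1/N E .

```python
import cmath
import math
import random


def g1_squared(positions, q):
    """|S(q)/S(0)|^2 for equal point emitters at the given 1D positions."""
    s_q = sum(cmath.exp(1j * q * r) for r in positions)
    return abs(s_q) ** 2 / len(positions) ** 2


def mean_g1_squared(n_emitters, n_q=200, seed=0):
    """Average of |g1(q)|^2 over randomly chosen, well-resolved q values."""
    rng = random.Random(seed)
    # hypothetical test object: emitters scattered over a unit interval
    positions = [rng.uniform(0.0, 1.0) for _ in range(n_emitters)]
    qs = [rng.uniform(200.0, 2000.0) for _ in range(n_q)]
    return sum(g1_squared(positions, q) for q in qs) / n_q


if __name__ == "__main__":
    for n in (2, 4, 16, 64):
        print(n, "emitters: mean |g1|^2 =", mean_g1_squared(n))
```

The printed averages fall as emitters are added, which is the "more channels" argument in numerical form: the zero-frequency term grows with the square of the number of emitters, while the power at any other resolvable frequency grows only linearly on average.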
To investigate the proposition that more complicated objects have lower SNR, we carried out simulations of IDI analyses of patterns of non-periodic objects constructed in such a way as to give a Fourier spectrum G (1) (q) consisting of discrete narrow Gaussian-shaped peaks equally spaced in a ring at a particular reciprocal distance q 1 , as shown in Fig. 9. The complexity of the object is set by the number of Fourier frequencies, which follows from the number of Gaussian peaks, without changing the resolution or overall shape of the object in real space. The object is parameterised by the number of frequency components in the ring at q 1 , given by 2c, ensuring a centrosymmetric transform to maintain a real and positive real-space emitter density. The number of photons per emitter and per pixel is again specified as α = ΩN γ M , and when α is constant the mean counts per pixel is proportional to c. We compute the SNR based upon obtaining the signal as the integrated value of |G (1) (q)| 2 of any one of the (non-central) peaks. Since the strengths of these peaks do not change with c we assume the signal to be Sig ∝ α 2 . With the mean counts per pixel µ = αc, we expect that the SNR scales as (28)
The SNR obtained from the simulations based on the parameterised objects is plotted as a function of the complexity parameter c in Fig. 10. The case of low intensity, with α = 0.001, is shown in Fig. 10a and scales as 1/c, as expected from Eqn. 28, which is plotted as the solid line. Simulations with high photon counts, setting α = 100, are summarised in Fig. 10c, which shows that the SNR scales even more strongly in this case, as 1/c 2 , again in agreement with Eqn. 28. The simulations support the assertion that the SNR never improves as the object becomes more complex, but instead most probably becomes worse.
This analysis also implies that in the imaging of stars by intensity interferometry, recovering an image of a binary star (or of a planet transiting a star) [22] requires overcoming a lower SNR than would be achieved for a single star.

V. DISCUSSION AND CONCLUSIONS
The method of IDI detects x-ray fluorescence that is generated on the exposure of a sample to a short duration of ionising radiation such as a pulse of x-rays from a free-electron laser. If this generating pulse is of a duration that is comparable to the coherence time of the fluorescence (typically less than 1 fs) then the angular distribution of the detected fluorescence will be influenced by the interference of waves originating from the various emitters in the sample. The phases of the emitted waves will be random and different from shot to shot, but the correlations of photon counts measured in a single shot, averaged over many shots, yield a sum of two terms: one that is formed from persisting phase relationships (due to the structure and proportional to the square of the Fourier transform of the structure of emitters) and one due to correlations of purely random phases. In the limit of a large number of averages, this second term approaches a constant that is at least as large as the square of the zero-frequency component of the emitting structure.
An insight gained here from the model and simulations of the IDI measurement, both based upon a classical description of wave interference combined with Poisson photon statistics, is that the optimisation of an IDI experiment, and in particular the requirement of the total number of single-shot patterns to recover the Fourier form factors of the structure of emitters in an object, depends strongly on the size and complexity of the object. This is apparent from the fact that the background term in the correlation always exceeds the magnitude of all other spatial frequencies of the Fourier spectrum of the object, and as the object becomes more complex the ratio of frequencies to the zero frequency diminishes. Every emitter in the object adds to the background (and therefore the noise) more than it adds to the signal. In the case of crystals, it was shown in Sec. IV C that an increase in the number of unit cells always decreases the SNR of a particular signal (here, the integrated strength of a Bragg peak). Indeed, for a low number of detected fluorescence counts per emitter, the SNR is inversely proportional to the total number of emitters in the sample. Likewise, the SNR decreases when the number of distinguishable emitters in the object increases, which is the case when the G (2) ( q) function is obtained at a higher resolution (corresponding to higher magnitudes q).
We find also that noise depends not only on Poisson statistics due to photon counting, but also on the structure of the background term. Poisson statistics are of course familiar from coherent diffraction such as crystallography, where the SNR usually rises in proportion to the square root of the measured counts. In IDI, in addition, the random phases of the emitted waves give rise to a standard deviation in the correlation signal that is proportional to the mean (rather than the square root of the mean). This phase noise was discussed in the context of "interferometry of intensity fluctuations in light" by Hanbury Brown and Twiss [6,7] (there called "wave interaction noise"), but not considered by them in further analysis. We find that phase noise leads to a saturation of the SNR at high intensities, as discussed in section IV A, indicating that higher emission from a given object does not give a proportionally higher SNR. Furthermore, as discussed in Sec. IV B, IDI is sensitive to loss of contrast due to mutually incoherent modes (caused, for example, by polarization states, pulse duration relative to the coherence time, finite pixel solid angle, path differences of light emitted from points across the object, and so on). In the limit of high detected counts per mode the SNR is proportional to 1/√(1 + 4M) for M modes, while for low intensities the influence of modes vanishes due to the dominance of Poisson noise.
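The difference between phase noise and Poisson noise can be made concrete with a short Monte Carlo sketch in Python. It is an illustration under simplifying assumptions chosen here (a single pixel, a single mode, unit-amplitude emitters with independent uniformly distributed phases, no photon counting); the function names are hypothetical. For N emitters the classical intensity then has mean N and variance N² − N, so its standard deviation is close to the mean rather than to the square root of the mean.

```python
import cmath
import math
import random


def speckle_intensities(n_emitters, n_shots, seed=0):
    """Classical single-pixel intensity |sum_j exp(i phi_j)|^2 for many shots."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_shots):
        amplitude = sum(cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi))
                        for _ in range(n_emitters))
        values.append(abs(amplitude) ** 2)
    return values


def mean_and_std(values):
    """Sample mean and (population) standard deviation."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(var)


if __name__ == "__main__":
    for n in (50, 200):
        mean, std = mean_and_std(speckle_intensities(n, 3000))
        print(f"N = {n}: mean = {mean:.1f}, std = {std:.1f}, "
              f"Poisson std for the same mean = {math.sqrt(mean):.1f}")
```

Quadrupling the number of emitters quadruples both the mean and the standard deviation of the intensity, whereas a purely Poisson-limited signal of the same mean would only double its standard deviation. This is the origin of the saturation of the SNR at high intensities described above.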
Previous analyses of the feasibility of imaging using intensity interferometry have considered only the simple case of intensity measurements using two detectors [21] (or multiple detectors where correlations are performed between pairs of detectors (baselines) independently [24]). This is equivalent to using two pixels per exposure (in a larger detector). Here we have examined the case of using a detector with N Pix pixels, giving N Pix (N Pix − 1)/2 correlations to compute and average for a given reciprocal space vector difference q, and found that this is not equivalent to averaging N Pix (N Pix − 1)/2 different shots with a two-pixel detector. That is, the SNR does not necessarily grow with the square root of the number of correlations that can be performed in an N Pix -pixel detector, so one cannot make a simple extrapolation from the two-pixel case. This is because the products formed from different combinations of pairs of photon counts exhibit correlations since some pairs share values, as discussed in Sec. III. Instead, in the limit of a large number of correlations per shot, the standard deviation of the background (√Var AC ) can be considerably larger than expected for two detector pixels.
Our results indicate that IDI may offer utility in structure determination of single molecules at x-ray FELs, using the highest possible incident intensities (providing the highest possible number of detected fluorescence photons per atom per pixel per mode), pulse durations comparable to the coherence time, and small object extent (allowing a large solid angle Ω of pixels). The total fluorescence counts from single molecules will be much lower than from macroscopic objects (e.g. the molecule in crystallized form), but the inverse dependence of SNR on the number of emitters shows that the measurement would actually be greatly improved compared with those macroscopic objects. Thus with an optimised detection scheme, IDI could potentially provide element-specific structural information to complement weak coherent scattering [25].
For the IDI simulations of 3D crystals, we assume a 500×500-pixel detector with a pixel size of 100×100 µm 2 , placed at a distance of 50 mm from the sample. We consider a cubic crystal sample consisting of simple cubic unit cells with a lattice constant of 5 Å and with one emitter per cell. Each snapshot pattern is simulated by generating a random phase φ ∈ [0, 2π) for each emitter and mode. The combined scalar wave function arising from the emission of all emitters is calculated for each pixel, making use of the far-field approximation and considering a wavelength of 2 Å. Furthermore, we neglect the quadratic decay of intensity with distance, which is equivalent to the assumption that each pixel covers an equal solid angle. To ensure an accurate representation of the recorded signal, the wave function was evaluated on a grid of nine points that sub-divides each pixel. The continuously-valued intensity for a pixel centered at r pix therefore reads
where r pix,s are the sampling positions within the pixel at r pix and M is the number of mutually incoherent modes.
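The snapshot simulation described in this paragraph can be sketched in Python as follows. This is an illustrative reading of the text, not the code used for the results: the 3×3 sub-pixel grid placed at the centres of nine sub-cells, the normalisation by 9M, the sign convention of the far-field phase, and all function and parameter names are assumptions made here.

```python
import cmath
import math
import random


def snapshot_intensity(emitters, pixel_centers, pixel_size, distance,
                       wavelength, n_modes=1, rng=None):
    """Continuous-valued intensity per pixel for one shot (arbitrary units).

    emitters      : list of (x, y, z) emitter positions in metres
    pixel_centers : list of (X, Y) pixel centres on a flat detector in metres
    Each pixel is sampled on a 3x3 sub-grid, and the intensities of the
    n_modes mutually incoherent modes are added.
    """
    rng = rng or random.Random()
    k = 2.0 * math.pi / wavelength
    offsets = [-pixel_size / 3.0, 0.0, pixel_size / 3.0]
    # one independent random phase per emitter and per mode
    phases = [[rng.uniform(0.0, 2.0 * math.pi) for _ in emitters]
              for _ in range(n_modes)]
    image = []
    for (X, Y) in pixel_centers:
        total = 0.0
        for dx in offsets:
            for dy in offsets:
                x, y = X + dx, Y + dy
                norm = math.sqrt(x * x + y * y + distance * distance)
                # outgoing wave vector pointing from the sample to the sub-pixel
                kx, ky, kz = k * x / norm, k * y / norm, k * distance / norm
                for mode in phases:
                    amp = sum(cmath.exp(1j * (kx * ex + ky * ey + kz * ez + phi))
                              for (ex, ey, ez), phi in zip(emitters, mode))
                    total += abs(amp) ** 2
        # assumed normalisation: average over sub-pixel points and modes
        image.append(total / (9 * n_modes))
    return image


if __name__ == "__main__":
    a = 5e-10  # lattice constant of 5 Angstrom
    crystal = [(i * a, j * a, l * a)
               for i in range(3) for j in range(3) for l in range(3)]
    pixels = [(ix * 1e-4, 0.0) for ix in range(-5, 6)]
    print(snapshot_intensity(crystal, pixels, 1e-4, 50e-3, 2e-10,
                             rng=random.Random(1)))
```

With this normalisation a single emitter gives unit intensity in every pixel, and the mean intensity over many shots equals the number of emitters. The sketch keeps the equal-solid-angle assumption of the text by omitting any 1/distance² factor.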
The continuously-valued intensity I c is then rescaled according to the fraction of the pixel's solid angle Ω (here assumed to be equal for all pixels) and the number of photons per emitter N γ , to achieve a certain µ. After that scaling, a Poisson discretization is applied: I(r pix ) = PoissSampl((µ/⟨I c ⟩) I c (r pix )). The auto-correlation is calculated as follows
where Π(a, b) is defined as a modified top-hat function equal to unity if |a j − b j | < ∆Vox/2 ∀ j and zero otherwise. ∆Vox represents the voxel edge size in a discretized G (2) -space. The usage of Π therefore represents a nearest-neighbor interpolation of q. If we do not have a spherical 4π-detector, the number of possible realizations of q generally varies. Therefore, we define the function C(q) as the density of realizations, which reads
The G (2) is then obtained by averaging N P patterns (independent auto-correlations)
To obtain the variance of G (2) (q), we perform the whole simulation twice with exactly the same parameters (but with different realisations of the random phases) to obtain two independent estimates of G (2) . The variance is then estimated by the C(q)-weighted, squared difference of these two autocorrelations: (A5)
It should be noted that we have used quite small crystals (starting from 5 × 5 × 5 unit cells) in our simulations. Therefore, the Bragg peaks that arise in G (2) (q) have non-negligible side maxima that are not easily distinguished from fluctuations in the background. To avoid this we chose to set the integration limits to the positions of the first-order minima q 1st min = ±2π/(∛(N E ) a). Even so, the signal within this integration boundary is only proportional to N E in the limit of large crystals. Therefore, we calculate peak weighting factors (with the first-order correlation function given by Eqn. 20), which are used to scale the integrated Bragg peaks obtained from the simulated G (2) (q).
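The processing chain described above (rescaling, Poisson sampling, binned pairwise products, normalisation by the density of realisations) can be sketched in Python as follows. This is a minimal illustration under assumptions made here rather than the original implementation: the rescaling uses the detector-averaged intensity, all ordered pixel pairs including self-pairs are summed (so the q = 0 voxel collects squared counts), rounding implements the nearest-neighbour binning, and the simple Poisson sampler is only suitable for small means. All names are hypothetical.

```python
import math
import random


def poisson_sample(mean, rng):
    """Draw one Poisson variate (Knuth's multiplication method, small means only)."""
    limit = math.exp(-mean)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def discretise(intensity, mu, rng):
    """Rescale a continuous pattern to mean mu and apply Poisson sampling."""
    mean_i = sum(intensity) / len(intensity)
    return [poisson_sample(mu * v / mean_i, rng) for v in intensity]


def autocorrelate(counts, q_pixels, voxel):
    """Return (AC, C): summed count products and number of pixel pairs per q voxel."""
    ac, c = {}, {}
    for ia, qa in zip(counts, q_pixels):
        for ib, qb in zip(counts, q_pixels):
            # nearest-neighbour binning of q = qa - qb (role of the top-hat function)
            key = tuple(round((a - b) / voxel) for a, b in zip(qa, qb))
            ac[key] = ac.get(key, 0) + ia * ib
            c[key] = c.get(key, 0) + 1
    return ac, c


def g2_from_patterns(patterns, q_pixels, voxel):
    """Average the C(q)-normalised autocorrelations of several patterns."""
    total, c = {}, {}
    for counts in patterns:
        ac, c = autocorrelate(counts, q_pixels, voxel)
        for key, value in ac.items():
            total[key] = total.get(key, 0) + value
    return {key: total[key] / (len(patterns) * c[key]) for key in total}


if __name__ == "__main__":
    rng = random.Random(0)
    q_pixels = [(float(i),) for i in range(8)]      # 1D detector, unit q spacing
    continuous = [1.0 + math.cos(0.8 * i) for i in range(8)]
    shots = [discretise(continuous, 2.0, rng) for _ in range(100)]
    print(g2_from_patterns(shots, q_pixels, 1.0))
```

The dictionary keys are integer voxel indices of q, so the same function works for one-, two- or three-dimensional q vectors. Running the whole chain twice with different seeds gives two independent estimates whose difference can serve as input to a variance estimate of the kind described in the text.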
The signal and background can now be obtained as the fit parameters (S, B) of the best-fit model (B4)
Then the variance is calculated by (B5)
It should be noted that for the fitting (Eqn. B4) and the calculation of the variance (Eqn. B5) the zero-frequency component (q x = q y = 0) is ignored. This is done because that component follows a different distribution (the square of a negative-binomially distributed value) from that of the "autocorrelated negative binomial distribution" discussed in section III.
Appendix C: Examples for the dependence of the variance of G (2) on the detector configuration and correlations within the background term
In Sec. III the derivation of the variance of the autocorrelation, Var AC (q), depends upon the strong assumption that the counts measured at different detector pixels are uncorrelated. Since this assumption may seem quite unsatisfactory, here we illustrate some of the problems one faces when dropping it. We also demonstrate that increasing N P and increasing C(q) do not have the same effect on the SNR in the limit of large values.
We consider a simple one-dimensional arrangement of emitters and further simplify the analysis by adopting the high-intensity limit where Poisson statistics can be neglected (no Poisson noise) and the calculated detected energies are not necessarily discrete. Certainly, the inclusion of Poisson statistics will not make the situation in any way less complicated.
As a first sample we choose two emitters at the positions r 1 = 0 and r 2 = R. We further assume that the photon signals are measured with two independent detectors (or two detector pixels) at the positions k 1 = 0 and k 2 = q. The correlation product can then be written as
\[ I(k_1 = 0) \cdot I(k_2 = q) = \sum_{j,j',l,l'}^{2} e^{i(k_1 (r_j - r_l) + \phi_j - \phi_l)} \, e^{-i(k_2 (r_{j'} - r_{l'}) + \phi_{j'} - \phi_{l'})} = [2 + 2\cos(\phi_1 - \phi_2)] \, [2 + 2\cos(q \cdot R - \phi_1 + \phi_2)] , \tag{C1} \]
and by averaging over many realisations of phases we obtain G (2) as
\[ G^{(2)}(q) = 4 + 2\cos(q \cdot R) . \tag{C2} \]
This expression might seem to contradict Eqn. 5 or Eqn. 7 since we now have g (2) = 1 − 1/N E + |g (1) | 2 . This is because Eqn. C1 is not a perfect representation of TLS emitters since that equation only allows up to two photons to be emitted per emitter (and only in the case j = j' = l = l'). However, the expression also does not correspond to the pure SPE case. These differences are not relevant to the SNR discussion in the main part of this paper, since they vanish in the limit of large N E . However, because of the small number of emitters in this example we are able to calculate the variance of G (2) analytically by integrating over all possible combinations of the random phases. Generally, for objects with N E emitters (and therefore N E random phases φ) the variance reads (C3)
For our two-emitter object, we therefore obtain the variance as Var = 18 + 16 cos(q · R). If we alter the situation to use more than two independent detectors (say, an infinite number of detector pixels in this thought experiment) covering the full relevant area from q = 0 to q = 2π/R, we can write the correlation as
\[ \frac{R}{2\pi} \int_0^{2\pi/R} I(k, \phi_1, \phi_2) \cdot I(k + q, \phi_1, \phi_2) \, dk = 4 + 2\cos(q \cdot R) \quad \forall \, \phi_1, \phi_2 . \tag{C4} \]
Here the variance is obviously zero. That may not seem so surprising, since, under the assumption of uncorrelated photon counts, more detector pixels could be seen as equivalent to more patterns.
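The two-emitter results quoted above can be checked numerically with a short Python sketch. The approach is an assumption chosen here for convenience: the phase average is taken on a uniform grid (which integrates these trigonometric polynomials exactly up to rounding), the first phase is fixed to zero because only phase differences matter, and the function names are hypothetical.

```python
import itertools
import math


def intensity(k, positions, phases):
    """|sum_j exp(i (k r_j + phi_j))|^2 for point emitters on a line."""
    re = sum(math.cos(k * r + p) for r, p in zip(positions, phases))
    im = sum(math.sin(k * r + p) for r, p in zip(positions, phases))
    return re * re + im * im


def two_detector_stats(q, positions, n_grid=32):
    """Mean and variance of I(0) * I(q) over a uniform grid of emitter phases."""
    steps = [2.0 * math.pi * i / n_grid for i in range(n_grid)]
    products = []
    # the first phase is fixed to zero because only phase differences matter
    for rest in itertools.product(steps, repeat=len(positions) - 1):
        phases = (0.0,) + rest
        products.append(intensity(0.0, positions, phases)
                        * intensity(q, positions, phases))
    mean = sum(products) / len(products)
    var = sum((p - mean) ** 2 for p in products) / len(products)
    return mean, var


def full_detector_value(q, positions, phases, k_max, n_k=256):
    """Average of I(k) * I(k + q) over k in [0, k_max) for one set of phases."""
    ks = [k_max * i / n_k for i in range(n_k)]
    return sum(intensity(k, positions, phases) * intensity(k + q, positions, phases)
               for k in ks) / n_k


if __name__ == "__main__":
    R, q = 1.0, 0.7
    print(two_detector_stats(q, [0.0, R]),
          (4 + 2 * math.cos(q * R), 18 + 16 * math.cos(q * R)))
    print(full_detector_value(q, [0.0, R], (0.3, 2.1), 2 * math.pi / R))
```

For two detectors the mean and variance of the product reproduce 4 + 2 cos(q · R) and 18 + 16 cos(q · R). The k-averaged product returns 4 + 2 cos(q · R) for any choice of the two phases, which is why the variance vanishes in the "infinite" detector case for this object.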
(C6)
We also calculate the case for the "infinite" detector in analogy to Eqn. C4 and obtain
\[ \frac{R}{4\pi} \int_0^{4\pi/R} I(k, \phi_1, \phi_2, \phi_3) \cdot I(k + q, \phi_1, \phi_2, \phi_3) \, dk = 9 + 4\cos\left(\frac{q \cdot R}{2}\right) + 2\cos(q \cdot R) + 4\cos\left(\frac{q \cdot R}{2}\right) \cos(\phi_1 - 2\phi_2 + \phi_3) . \tag{C7} \]
The different integration boundary to that of Eqn. C4 is required to sample the full diffraction information. Since the smallest distance in the three-emitter setting is half of the distance between the two emitters in the previous example, the integration range of Eqn. C4 in q-space must be doubled. We see that, as opposed to the case in Eqn. C4, the single-pattern measurement is dependent on the random phases. Therefore, averaging over pixels within a single pattern is not equivalent to averaging over more realisations of patterns with fewer pixels. In other words, the effect of C(q) on the SNR is limited. After averaging over the random phases in Eqn. C7 we obtain the same result as in Eqn. C5, as expected. When calculating the variance using Eqn. C7 we obtain
\[ \mathrm{Var}_{\mathrm{inf\,det}}(q) = 4 + 4\cos(q \cdot R) . \tag{C8} \]
This variance differs from that with only two independent detectors not only in terms of scaling, but also in terms of its dependence on q, as seen in Fig. 11a. These differences originate from the fact that the intensity measurements within one pattern are not only correlated due to the emission structure of the object, but also because the terms that form the background are correlated. This also leads to the situation that the maxima of the SNR (see Fig. 11b) are not necessarily at the same q-positions as the maxima of the G (2) .
FIG. 11: One-dimensional object consisting of three incoherent emitters (with a distance of 0.5). (a) Variance as a function of q for two detectors separated by q (solid black line) and for a 1D-detector of infinite sampling (dashed blue line), covering the full q space. Note that there is not only a difference in scaling but also in the form of the variance. (b) SNR as a function of q for the same object and detector configuration as in (a). Note that for the "infinite" detector, the SNR maxima are not at the points of maximal signal (see Eqn. C5).
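The residual phase dependence of the "infinite" detector for three emitters, and the variance of Eqn. C8, can be checked with the following Python sketch. As an assumption made here, the emitters sit at 0, R/2 and R, the k-integral is replaced by a uniform sum over one full period 4π/R, and the phase average is taken on a uniform grid with the first phase fixed to zero; the function names are hypothetical.

```python
import itertools
import math


def intensity(k, positions, phases):
    """|sum_j exp(i (k r_j + phi_j))|^2 for point emitters on a line."""
    re = sum(math.cos(k * r + p) for r, p in zip(positions, phases))
    im = sum(math.sin(k * r + p) for r, p in zip(positions, phases))
    return re * re + im * im


def k_averaged_product(q, positions, phases, k_max, n_k=64):
    """Single-pattern correlation: average of I(k) * I(k + q) over [0, k_max)."""
    ks = [k_max * i / n_k for i in range(n_k)]
    return sum(intensity(k, positions, phases) * intensity(k + q, positions, phases)
               for k in ks) / n_k


def full_detector_stats(q, positions, k_max, n_grid=16):
    """Mean and variance, over the random phases, of the k-averaged product."""
    steps = [2.0 * math.pi * i / n_grid for i in range(n_grid)]
    values = [k_averaged_product(q, positions, (0.0,) + rest, k_max)
              for rest in itertools.product(steps, repeat=len(positions) - 1)]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, var


if __name__ == "__main__":
    R, q = 1.0, 0.9
    three = [0.0, R / 2, R]
    print(full_detector_stats(q, three, 4 * math.pi / R),
          (9 + 4 * math.cos(q * R / 2) + 2 * math.cos(q * R),
           4 + 4 * math.cos(q * R)))
    print(full_detector_stats(q, [0.0, R], 2 * math.pi / R))
```

For the three-emitter object the variance over phases stays finite even though every pattern uses the full detector, whereas for the two-emitter object it is zero up to rounding. Adding pixels therefore cannot replace averaging over independent shots once the object has more than one distinct inter-emitter distance sharing a spatial frequency.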