Reconstruction of an object from diffraction intensities averaged over multiple object clusters

A projection operator is derived for use in iterative phase retrieval algorithms when the Fourier intensity data is an average over the intensity from multiple clusters of identical objects. The projection operator is a generalization of the magnitude projection for conventional phase retrieval for a single object, and is applicable when the relative orientations and positions of the objects within the clusters are known. Simulations demonstrate that an iterative projection algorithm equipped with this new projection operator can successfully reconstruct an object from the averaged Fourier intensities from multiple clusters, each containing multiple copies of the object.


Introduction
Coherent diffractive imaging (CDI) is a lensless imaging technique in which the scattering density of objects are computationally inferred from the diffraction they produce when illuminated by a coherent radiation wavefront. Far-field diffraction patterns typically provide a direct measurement of the intensity (modulus squared) of the Fourier transform of the objectʼs scattering density, and the scattering density is typically proportional to the objectʼs electron density in the case of x-ray diffraction. Deducing the scattering density of objects from their Fourier intensities is the goal of CDI and is also known as phase retrieval. Such deductions are inverse problems that are ill-posed in general with the potential of having multiple solutions. However, it is well known that a sufficiently finely sampled Fourier intensity in conjunction with knowledge that the object is finite in size gives sufficient information to specify a unique scattering density up to trivial ambiguities [2,18]. Given that the two aforementioned criteria are met, iterative projection algorithms (IPAs) are an effective method for arriving at a practical reconstruction of the object [6,10]. The method of CDI using IPAs has been successfully demonstrated experimentally with different types of radiation and objects in the past, and forms a useful imaging technique in fields such as microscopy and astronomy [14,19].
In this paper we consider cases where a CDI experiment provides us with the average diffracted intensity from a collection of identical objects in which the relative spatial positions and orientations of the objects are known. Such a collection of objects is referred to here as a 'cluster.' An example of where this problem can arise is in recent serial femtosecond crystallography experiments with x-ray freeelectron lasers. In this application, one hopes to recover the electron density of the crystal asymmetric unit from the averaged x-ray diffraction intensity obtained from a large ensemble of nanocrystals of varying sizes, shapes, and surface terminations [5,11,12,20]. This particular reconstruction technique has been called 'shape transform phasing' and does not suffer from the conventional problem of crystallographic under-sampling because it utilizes diffraction intensities measured away from the Bragg condition. Unlike conventional structure factor amplitudes measured in protein crystallography, which are almost completely insensitive to the nature of the crystal surface, inter-Bragg diffraction from finite crystals varies dramatically according to the specific occupancies of molecules at the surface of the crystal. The key idea encountered in shape transform phasing that is of relevance to this paper is that the ensemble-averaged diffraction from varying finite crystals can be reduced, approximately, to an incoherent average over the diffraction from different types of unit-cell configurations [3,4,11]. In cases where one unit-cell arrangement dominates the diffraction intensities, conventional iterative phasing strategies may be effective [13]. In general, however, the scattering density of one molecule must be reconstructed with consideration of the averaged diffraction, using knowledge of the support of the molecule and the symmetry relationships within and between the different types of unit cell [8,11]. Here we consider general clusters of objects that are not restricted to crystallographic unit-cell configurations as in the case of shape transform phasing.
In this paper we derive and demonstrate a projection operator that may be used to reconstruct an object when only the sum of the Fourier intensities corresponding to various clusters of that object is available. We assume that the relative positions and orientations of the common object in each cluster are known a priori, but these positions and orientations need not correspond to any particular symmetry group. The basics of IPAs for CDI applications are outlined in section 2, the projection operator is described in section 3, simulations verifying the projection are shown in section 4 and conclusions are drawn in section 5.

Reconstruction from the Fourier intensity of one object
Reconstructing the scattering density of an object ( ) f x from knowing only its Fourier intensity ( ) I q can be achieved by IPAs [6,10,14,15]. The object scattering density is represented by a vector f where each sample in ( ) f x constitutes one entry in f . Each of those samples in ( ) f x then in turn corresponds to one independent dimension and f can be thought of as a point in the corresponding multidimensional vector space. Prior knowledge and experimental data are then treated as constraints which can be described by sets of equations, forming various manifolds in the vector space. The phase retrieval problem then becomes that of finding a point in the intersection of all the constraint manifolds. Intuitively, the constraints for CDI require that the solution (a) reproduces the measured diffracted intensities, and (b) satisfies whatever is assumed to be known a priori about the object.
The search for the constraint manifold intersection proceeds in an iterative manner through the application of projection operators. These operators make the smallest change to their input such that their output satisfies the corresponding constraint. The two major constraints in CDI are the support constraint and the Fourier magnitude constraint, with their corresponding projection operators denoted here by P S and P M , respectively. The projection operator P S sets the value of the samples outside a finite region  in real space where the diffracting object is thought not to exist to zero, and leaves the sample values inside , where the diffracting object is thought to exist, unchanged. Denoting the support function by ( ) s x , gives The projection P M sets the magnitude of a complex-valued sample in Fourier space to the measured value while leaving the phase of that complex-valued sample unchanged. The set of all complex numbers that have the same magnitude defines a circle on the complex plane, therefore the projection involves moving the target sample value radially on the complex plane until it lies on the circle. The Fourier transform operation and its inverse, denoted here by the operators  and  -1 , respectively, are required to move the sample values back-and-forth between real space and Fourier space and so are incorporated into the projection operator P M , giving Various IPAs are in use for coercing the iterate to arrive at the constraint intersection, which is viewed as a fixed point in the dynamical systems perspective of the algorithm. Not all fixed points are solutions; most represent local minima in the landscape described by the distance between the constraint manifolds. The simplest IPA is the error reduction (ER) algorithm [10] where the ith iterate ( ) f i is updated according to the rule ( ) The ER update rule moves the iterate steadily towards a fixed point but is unable to escape and explore other regions of the vector space should that fixed point not be a solution. A flow diagram depicting the overall procedure of the ER algorithm for reconstructing a single object from its Fourier intensity is shown in figure 1. An alternative algorithm which is pleasingly symmetric is the difference map (DM) algorithm [6] in which the ith iterate is updated according to the rule where R S and R M are called relaxed projections and are given by the diffracted intensity averaged over a total of J clusters where the jth cluster is composed of K j identical copies of an object ( ) f x . The scattering density of the jth cluster can be written as where R jk is the rotation matrix that gives the orientation of the kth object in the jth cluster, and the translation vector t jk shifts the kth oriented object to the correct location in the jth cluster. There are in total N = å = K j J j 1 copies of the object in all J clusters. The set of all N objects is denoted by The above description is vectorized as follows to facilitate analysis in the later sections. Write the scattering density of the common object ( ) f x as the vector f , the scattering density of the kth object in the jth cluster ( ) f x jk as the vector f jk , and the scattering density of the jth cluster ( ) c x j as the vector c j . The vectorized density of the kth object in the jth cluster, f jk , can be related to the vectorized density of the common object, f , via where L jk is a matrix that carries out all the rotation and translation operations on f to generate object k in cluster j.
The vectorized density of the jth cluster, c j , can be similarly related to the vectorized density of the common object via where L j is a linear operator (matrix) that replicates f and carries out all the rotation and translation operations for the fs to generate the set of all K j objects in the jth cluster. An explicit form for the matrices L j and L jk depends on the properties of the vectorization that maps the samples of the scattering densities into their corresponding vectors, specifically, it depends on the order of stacking of the samples into a column vector. With the formalism described by equations (7) and (8), more general operations than equation (6) can be included in our model, such as arbitrary spatial permutations of density samples that are not derived purely from rotation and translation operations. We note that in many cases the operations given by equations (7) and (8) may also incorporate some interpolation of rotated objects, which would occur, for example, in the case of a Cartesian grid of densities with rotations that are not integer multiples of p 2. Equation (8) may also allow object densities to have spatial overlaps. Although in three dimensions we may safely assume that most objects have no physical overlap, we consider in this paper the general case of overlapping objects because it is relevant to the problem of forming two-dimensional projection images of three-dimensional objects.
The inverses of equations (7) and (8) will be important to the proposed methods in the paper and are now described. The mapping from the kth object in the jth cluster back to the object that generated it can be done via where + L jk is the pseudoinverse of L jk . The mapping from the jth cluster back to a common object can similarly be performed by where + L j is the pseudoinverse of L j . For convenience, we define the equivalent operator notation forms of equations (7) and (9) that act on the unvectorized scattering densities, as Figure 1. General flow diagram of the ER algorithm for reconstructing a single object from its Fourier intensity. Gray areas indicate potential regions of non-zero values in the computational array holding the object. The operators P M and P S are as described in the text. The quantities being directed into the operator boxes represent the information required by the particular operation.
The operators  jk and jk 1 are referred to as the forward object mapping and the inverse object mapping, respectively. Similarly, we define the equivalent operator notation forms of equations (8) and (10) that act on the un-vectorized scattering densities of the common object and the jth cluster as The operators  j and j 1 are referred to as the object-tocluster mapping and the cluster-to-object mapping, respectively.
The invertibility of both  j and  jk is problem-dependent since there is in general a one-to-many mapping between the elements of f and the elements of c j (e.g. when there is more than one object in the cluster), and also between the elements of f and the elements of f jk (e.g. when interpolations are required for object rotations). Disregarding the effects of such operations as interpolations due to rotations, the mapping  jk is in principle one-to-one since it maps one sample of the common object to one sample in the ( j, k )th copy of the object. However,  j is a one-to-many mapping by definition since it maps one sample of the common object to more than one sample in the jth cluster. The invertibility of the object-tocluster mapping  j can be understood via the matrix L j . If the objects do not overlap, L j simply replicates and permutes the replicated entries of f to construct c j . The matrix L j in that case is always invertible with the inverse given by where the superscript T denotes matrix transposition. If the objects do overlap, then the pseudoinverse, + L j , contained within j 1 is appropriate as it corresponds to minimizing the two-norm between the resulting common object and the jth cluster. Further discussion on the invertibility of L j and L jk is given in section 3.4.
Returning now to the diffraction experiment, the summed diffracted intensity from a total of J clusters, where each cluster is weighted by a real-valued multiplicative factor w j , is given by The weighting could represent, for example, the relative frequency of occurrence of the different clusters. The reconstruction problem can now be stated as follows: find a set of N objects Reconstruction of an object from the averaged diffracted intensity of multiple clusters of identical copies of that object then corresponds to deducing ( ) f x from knowing the intensity data ( ) I q data and the fact that all N objects are the same and of finite extent. The associated R jk and t jk for each object and the weights w j for each cluster are assumed in this paper to be known.
The solution to the above problem can be found using IPAs, provided that appropriate projection operators are utilized. There are at least two approaches to the above problem of developing suitable projection operators. Approach A is to solve for the set of all clusters { ( )} c x j while enforcing the relationships between the objects in the clusters, and a support constraint on each cluster. The object ( ) f x is then extracted from the resulting clusters. Approach B is to solve directly for the set of all objects { ( )} f x jk , without explicitly reconstructing each cluster, and estimating ( ) f x from the resulting set of objects. Both approaches achieve the same goal and are different perspectives on the same problem.

Approach A
In approach A, the object is reconstructed from the averaged diffracted intensity of multiple clusters by reconstructing the clusters themselves [16]. For the case of a weighted sum of diffraction intensities considered here, where the weights are not necessarily equal, the iterates of the IPA must be scaled accordingly in order for there to be a closed-form expression for the Fourier magnitude projection P M . Applying the scaling the summed diffracted intensity from equation (15) becomes The IPA will now operate on ( ) ¢ c x j and the reconstruction at the termination of the algorithm can be obtained via the inverse scal- j . The Fourier magnitude projection, P M , for approach A is similar to that described by Elser and Millane [9] who considered the case of phase retrieval from diffraction intensity data that is averaged over multiple orientations of a single object. The Fourier magnitude projection is given by [16] (17). Geometrically, at a given sample point q in Fourier space, the constraint manifold specified by equation (17) in terms of ( ) ¢ C q j corresponds to the surface of a J 2 -dimensional hypersphere in J 2 -dimensional space (one-dimension for each of the real and imaginary components of ( ) ¢ C q j ). The projection operator P M in equation (18) then brings its input onto the hypersphere via the rescaling ( ) ( ) I I q q data . The constraint manifold being a hypersphere ensures that such a rescaling is able to bring the input onto the constraint surface in a distance-minimizing way. For a set of weights w j that are not equal, the constraint surface described by equation (15) in terms of ( ) C q j is a hyper-ellipsoid. The effect of the change of variables given by equation (16) is therefore to rescale the axes such that the hyper-ellipsoid becomes a hypersphere.
The real-space projection, P S , for approach A has the following steps: (i) apply the cluster-to-object mapping, j 1 , to the current estimate of the jth scaled cluster, ( ) ¢ c x j , to obtain an estimate of the common object for all = ¼ j J 1, , , (ii) account for the number of objects in each cluster by weighting the estimate of the common object by K j , (iii) calculate the weighted average of all such common object estimates with the appropriate weightings (as derived in appendix A), (iv) enforce the support of the common object, ( ) s x , on the result of the weighted average to arrive at an updated common object, and (v) apply the object-to-cluster mapping,  j , and the scaling w j , to the updated common object to generate an update to each cluster. The real space projection P S for approach A described above can be written as The real-space operation given in equation (19) becomes only approximately distance-minimizing if there are object overlaps or irreversible effects from interpolations. In those cases a single operation that maps a common object ( ) f x to the set of all clusters { ( )} c x j must be defined and the pseudoinverse of that operation can be calculated to map the set of all clusters back onto a common object in a distance-minimizing way. The overall flow diagram of the ER algorithm implementing approach A is shown in figure 2.
Approach A estimates the scattering density of the set of J clusters { ( )} c x j independently at each iteration, and is a variant of a related problem studied by Millane and Chen [17] in which the objective is to reconstruct a set of unrelated objects from their averaged Fourier intensities. That problem uses the same Fourier magnitude projection equation (18), but the averaging step in the real space projection is omitted since the objects (clusters) are unrelated. The phase retrieval problem in that case is not uniquely solvable in general because it depends on the number of unique constraints (samples of the union of the object autocorrelations) and the number of unknown parameters (samples of the object densities). Uniqueness can be evaluated for particular cases as described by Millane and Chen [17]. For the case of clusters of identical objects considered here, the number of unknowns is fixed and does not grow with increasing numbers of clusters, and thus non-uniqueness is less problematic.

Approach B
In approach B, developed here, the object ( ) f x is reconstructed from the averaged diffracted intensity of multiple clusters by first reconstructing the set of all objects { ( )} f x jk and then estimating ( ) f x from the set. A Fourier magnitude projection P M is required that enforces the relationship that the averaged diffracted intensity derived from the current estimate of each object ( ) f x jk in each cluster must equal the measured data ( ) I q data , without explicitly reconstructing the individual clusters. For the case of a weighted sum of diffraction intensities considered here, where the weights are not necessarily equal, the iterates of the IPA must be scaled accordingly in order for there to be a closed-form expression for the Fourier magnitude projection P M (as described in appendix B). Applying the scaling the summed diffracted intensity from equation (15) becomes The IPA will now operate on ( ) ¢ f x jk and the reconstruction at the termination of the algorithm can be obtained via the inverse scaling, jk jk j j . The Fourier magnitude projection, P M , for approach B can be found by deriving the corresponding constraint manifold and using the method of Lagrange multipliers (as General flow diagram of the ER algorithm for reconstructing multiple clusters of objects from their weighted averaged Fourier intensity using approach A. In this example, there are two clusters that each contain two objects. Gray areas indicate potential regions of non-zero values in the computational array holding the object. The operators P M and P S are as described in the text, the symbol  is understood to contain the collection of all   ¼ , , J 1 operators, and similarly  -1 contains the set of their inverse. The weighted averaging step in P S is also understood to be incorporated within  -1 . The quantities being directed into the operator boxes represent the information required by the particular operation. detailed in appendix B) and is given by where ( ) I q is calculated from the current estimate of the set of all objects { ( )} ¢ f x jk via equation (21). Geometrically, as elaborated in appendix B, for a given sample q in Fourier space the constraint manifold specified by equation (21) in terms of ( ) ¢ F q jk , corresponds to the surface of a J 2 -dimensional hypercylinder embedded in a N 2 -dimensional space (one-dimension for each of the real and imaginary components of the Fourier transforms of the N objects). The projection P M specified by equation (22) then brings its input onto the hypercylinder in a distance-minimizing fashion.
The real-space projection P S for approach B has the following steps: (i) apply the inverse object mapping, jk 1 , to the current estimate of the scaled kth object in the jth cluster, to generate a set of estimates for the common object, (ii) calculate the weighted average of all such common object estimates with the appropriate weightings (as derived in appendix A), (iii) enforce the support of the common object, ( ) s x , on the result of the weighted average to arrive at an updated common object, and (iv) apply the forward object mapping,  jk , and the scaling K w j j to the updated common object to generate an update to each ( ) ¢ f x jk . The real space projection P S for approach B described above can be expressed as The real-space operation given in equation (23) becomes only approximately distance-minimizing if there are irreversible effects from interpolations. In that case a single operation that maps a common object ( ) f x to the set of all objects { ( )} f x jk must be defined and the pseudoinverse of that operation can be calculated to map the set of all objects back onto a common object in a distance-minimizing way. The overall flow diagram of approach B as implemented using the ER algorithm is shown in figure 3. As expected, one can see that for J=1, = K 1 1 , = w 1 1 , = t 0 11 and R 11 being the identity matrix, the averaged diffracted intensity given by equation (15) becomes | ( )| F q 11 2 and the projection operator in equation (22) reduces to the Fourier magnitude projection in conventional phase retrieval for a single object given by equation (2).

Modified approach B
Since the objects in the clusters are all identical, approach B can in principle be simplified in the following way. We can select out only a single object ( ) ¢ f x jk , for a single ( ) j k , , generated by the Fourier space projection equation (22), and thus eliminate both the estimation of the other copies of the object and the averaging step in real-space. For example, choosing j=1, k=1, and setting  11 as the identity operator, the Fourier magnitude projection for modified approach B is given by equation (22) but only ( ) ¢ P f x M 11 need be calculated. The operation in real-space for modified approach B has two steps: (i) enforce the support of the common object, ( ) s x , on the current estimate of ( ) ¢ f x 11 , weighted by the constant factor K w 1 1 1

/
, and (ii) apply the forward object mapping,  jk , and the scaling K w j j , to the common object to generate the updated The overall flow diagram of the ER algorithm implementing modified approach B is shown in figure 4. The advantage of the modified approach is that it reduces the computational cost of the Fourier space projection by a factor of N. Furthermore, it eliminates the need for any inverse mapping back onto the common object such as inverse rotations, and hence reduces artefacts arising from any interpolation that may be required. A disadvantage is that the simplified real-space operation is no longer a projection since the  (20), rendering the projection operator equation (22) not distance-minimizing, and hence convergence of IPAs cannot be guaranteed. Furthermore, the real-space constraint applied is weaker since there is no averaging over other estimates of the object, which may lead to more iterations being required for the potential convergence.

Comparison of the approaches
A key distinction between approaches A and B is that it is necessary in A to extract the individual object densities from their constituent clusters in real-space, for example via the pseudoinverse matrix + L j as described above, whereas in approach B, the determination of + L j is not necessary since, by design, the object densities are the quantities extracted via the Fourier modulus projection operator. Only the inverse object mappings, + L jk , are needed in approach B, which in principle always have a unique output since the forward object mapping,  jk , is one-to-one in theory. However in practice the mapping may be one-to-many depending on the interpolation process applied.
If object overlap occurs, then the problem appears to be solvable in many, but not all, cases. In non-solvable cases there would be insufficient independent data for obtaining the correct reconstruction, i.e. at least one of the L j or L jk would be rank-deficient and thus not uniquely invertible. In those non-solvable cases no algorithm (including methods A and B) will be able to reconstruct the correct object since the required information is simply absent. An apparently sufficient but not necessary condition for L j to be rank-deficient, for clusters consisting of rotated copies of an object, is when the set of rotation operator matrices R jk forms a group.
The uniqueness of a constraint satisfaction problem in general can be characterized by the 'constraint ratio' which is the ratio of the number of independent equations to the number of independent unknowns [9,17]. The constraint ratio can be applied to determine the necessary, but not necessarily sufficient, conditions for uniqueness of an inverse problem. The constraint ratio is in this case of reconstructing from the averaged diffracted intensity of multiple clusters, related to the independent portions of the sum of all clusters' autocorrelation [16]. The exact form of the constraint ratio for this type of phase retrieval problem is therefore dependent on the spatial relationships of the objects in the clusters.
It can be shown that the simpler operation produces an output that also satisfies the Fourier magnitude constraint, equation (15) but is not distance-minimizing in general and applies only to certain translations and rotation groups. Equation (25) corresponds to equation (18) with the clusters replaced by the objects. The operation given by equation (25) is not strictly a projection since even though it moves the initial point in the N 2 -dimensional space onto the constraint surface given by equation (15) for arbitrary K j , w j and J, it only minimizes the distance corresponding to that movement when the surface is a J 2 -dimensional hypersphere, i.e., when w j are all equal and = K 1 j for all = ¼ j J 1, , . However, despite this shortcoming, equation (25) was demonstrated by Kirian et al [11] to still be effective in the context of shape transform phasing. In that case, convergence was also helped by using a dynamic support that tightens around the object instead of the cluster.

Simulation results
The new Fourier magnitude projection operator equation (22) was tested through simulations. The two-dimensional 'Autumn' image resized to a size of 129 by 129 pixels, was chosen as the object to be reconstructed. The pixel values of this image were purely real and had a minimum value of zero and a maximum value of 1. Clusters of the Autumn image were formed by rotating and translating multiple copies of the original image. The rotations were carried out with nearestneighbor interpolation and the translations consist of integer numbers of pixel spacings. The Fourier intensities of the clusters were calculated using the discrete Fourier transform and summed together, with equal weights, i.e., = w 1 j for all = ¼ j J 1, , , to represent the diffracted intensity data ( ) I q data . Noise was not added since the objective here was to investigate the effectiveness of the proposed projection operator.
Iterative phase retrieval was carried out using the Fourier magnitude projection operator equation (22) and the realspace projection operator equation (23). The support was fixed as a tight box around the original image. Extra constraints such as positivity were not enforced. The simulation volume was obtained by zero-padding the support box by a factor of three in each dimension and the translations t jk were chosen such that the Fourier intensity data were sufficiently sampled with respect to the maximum size of the cluster for the particular ensemble of clusters tested. The IPA used for the reconstruction was the DM algorithm given by equation (4) with its parameter β set to 0.9. The IPA was started such that the initial iterate has uniformly random sample values in real space contained within the support. Convergence of the algorithm was measured by calculating the normalized root mean square error, E, between ( ) I q data and the averaged diffracted intensity obtained from the current estimate of the object, ( ) I q , i.e. where ( ) I q is given by equation (15). The final reconstruction for an IPA run was obtained by taking the iterate at the iteration where the error metric E was the minimum within the maximum number of iterations. The quality of the reconstruction was measured in real space by the normalized root mean square error, e, between the density of the true object, ( ) f x true , and the current estimate of the object density ( ) f x , i.e.
The first simulation is for four clusters of 3, 4, 5, and 7 randomly displaced Autumn images with random rotations as shown in figure 5(a). Three thousand iterations of the IPA were carried out. Reasonable reconstruction of the object was obtained within 500 iterations as shown in the error plot in figure 5(e). It can be seen that the reconstruction, displayed in figure 5(d), is slightly blurry. This is due to the interpolation process when the inverse rotation operator is applied to each object during each iteration of the iterative algorithm.
The second simulation is for the same four clusters of 3, 4, 5, and 7 randomly displaced Autumn images but with their rotations now restricted to random integer multiples of 90 o , as shown in figure 6(a). Three thousand iterations of the IPA were carried out. Successful reconstruction of the object was obtained within 1000 iterations as shown in figures 6(d) and (e). The restriction of the rotations to integer multiples of 90 o means that no interpolation is required during the application of the inverse rotation operator and a good reconstruction is obtained.
The third simulation is for one cluster of two Autumn images such that their rotations are 0 o and 180 o , respectively. The two objects are displaced slightly along their diagonal, as shown in figure 7(a). Three thousand iterations of the IPA were carried out. Reconstruction of an object that satisfies the Fourier intensity of the cluster was obtained as shown in the error plot figure 7(e). However, the object-to-cluster mapping matrix L 1 is rank-deficient in this case and there is insufficient information to reconstruct the object. This is shown by the reconstruction displayed in figure 7(d) that is in error. Inspection of figure 7(d) shows that the reconstruction is accurate where the objects do not overlap, but is inaccurate in the region of overlap where it is contaminated by a mixture of the two rotated objects.

Conclusions
The key result presented here is the new Fourier magnitude projection operator equation (22) and the combination of this operator with a weighted averaging method in real-space. This approach is appropriate for use in reconstructing an object from arbitrarily weighted Fourier intensities averaged over an arbitrary number of clusters of a common object. In practical terms, this corresponds to the problem in which the far-field diffraction intensity is obtained from a variety of molecular clusters of different configurations, but where the intensities have been averaged. This problem arises in the context of serial femtosecond nanocrystallography, since coherently illuminated nanocrystals with imperfect surfaces can indeed be viewed as clusters of identical objects, but where complete three-dimensional diffraction information cannot be obtained in a single shot, on a single crystal shape. Other situations where a superposition of coherently and incoherently summed diffraction intensity from objects can arise include molecular imaging experiments with translationally disordered crystals [1] and in time-resolved pump-probe experiments when only a certain fraction of the molecules are promoted to an excited state [21]. The Fourier magnitude projection operator presented here, in conjunction with the weighted averaging real-space projection, may be applicable to the above situations as IPAs equipped with these projections are (i) not limited to crystallographic space groups (or any particular cluster configurations), (ii) able to handle arbitrary diffracted intensity weightings, and (iii) has the advantage of requiring a simpler real-space projection operator compared to methods that explicitly solve for the clusters themselves.
The effectiveness of the projection operator is confirmed by incorporating it into a standard IPA and reconstructing simple two-dimensional objects under ideal, noiseless conditions. We expect that the algorithm will also be effective with realistic objects in three-dimensions, since iterative phasing algorithms are known to be more robust in higher dimensions. However, this expectation, along with the impact of noise and systematic errors in the diffraction data, needs further investigation.
The projection operators presented here can be applied to the more general situation of reconstructing independent, nonidentical objects within clusters by appropriately altering the real-space averaging step. However, the issue of uniqueness and ambiguities that will subsequently arise with an increased number of independent unknown parameters has not been considered in detail here and is the subject of current research. Similarly, one can envision a generalization to cases in which the weights of each cluster, i.e. the relative occurrence of each cluster type that contributes to the summed intensities, are initially unknown. Since the phase retrieval algorithm solves for many unknown phases, we expect that the addition of a small number of additional unknowns (the weights) is a tractable problem, which we intend to address in future work. In this appendix we first explain the geometric form of the Fourier magnitude constraint given by equation (15), then we set out to derive the projection operator given by equation (22). The derivation will show why the particular normalization factor, K w j j , that was employed for approach B was necessary.
At a particular = q q p , equation (15) can be written as The quantity z jk is a complex number with real and imaginary parts denoted here by x jk and y jk , respectively. Equation (B.1) relates x jk and y jk to the known quantity I and describes a surface in a multi-dimensional space where each of the real numbers x jk and y jk for all = ¼ k K 1, , j and = ¼ j J 1, , represents one independent dimension in this space. The constraint surface given by equation A change of variables can therefore rotate the coordinate system such that the quadratic form is put into its diagonal form, i.e., one that has no cross terms. It can be shown that the Equation (B.4) shows that the quadratic form given by equation (B.2) can be described by just J 2 variables in the appropriate coordinate system and furthermore, becomes a J 2 -dimensional hypersphere. However, since there are now more dimensions ( N 2 ) than independent variables describing the surface ( J 2 ), the surface is a hypercylinder. Moreover, the weighting factor w j can in general be different for different j and so the constraint surface is in general a hyper-elliptical cylinder. Equation (B.1) thus describes the surface of a 2 -dimensional space, corresponding to the case treated by Elser and Millane [9]. The above geometric representation is for a particular = q q p . The geometry of the entire constraint manifold over all Fourier intensity samples = ¼ p P 1, , is then the intersection of all P J 2 -dimensional hypercylinders in a PN 2 -dimensional space. We now derive the the projection operator onto the above manifold. Consider again a particular = q q p . Because z jk is related to the real space object scattering density values by a unitary operator that combines together the translations, rotations and the Fourier transformation operations, namely, ( ) ( · ) p = z F R q q t exp i2 jk jk p p jk T , minimizing the distance between two points in the N 2 -dimensional space also