On the scaling of multicrystal data sets collected at high-intensity X-ray and electron sources

The need for data-scaling has become increasingly evident as time-resolved pump-probe photocrystallography develops rapidly at high-intensity X-ray sources. Several aspects of the scaling of data sets collected at synchrotrons, XFELs (X-ray Free Electron Lasers), and high-intensity pulsed electron sources are discussed. They include laser-ON/laser-OFF data scaling and inter- and intra-data set scaling.


I. INTRODUCTION
With the increasing application of pump-probe photocrystallography over a range from milliseconds to femtoseconds, sample crystals have increasingly needed to be replaced due to laser-induced radiation damage. Even more dramatically, X-ray-induced damage is dominant in femtosecond experimentation as performed at X-ray free electron laser (XFEL) sources and high-brightness electron diffraction facilities, causing the sample to explode immediately after the generation of a diffraction pattern. In that case, the subsequent analysis must be based on multi-crystal data sets, which implies that data-scaling becomes a crucial aspect of the processing of the data, as has been recognized in many reported studies; see, for example, Hattne et al. (2014) and Kirian et al. (2011). We will discuss three types of scaling and provide references to more explicit descriptions where necessary.

II. CONCEPTS
Time-resolved pump-probe experiments can be performed by repeated measurements on a reversible system, with different pump-probe delays if necessary, or, in serial time-resolved crystallography, by following non-reversible processes, such as chemical or biological reactions, at different time points. In the first case, the RATIO method can be used, in which laser-ON and laser-OFF frames in a single setting are collected in rapid succession; in the second case, the time-zero dark and serial light-on structures have to be collected separately. Important quantities in time-resolved crystallography are the response ratio $R$ and the corresponding relative intensity change $\eta$, defined for each reflection $\mathbf{H}$ as
$$R(\mathbf{H}) = \frac{I_{ON}(\mathbf{H})}{I_{OFF}(\mathbf{H})}, \qquad \eta(\mathbf{H}) = \frac{I_{ON}(\mathbf{H}) - I_{OFF}(\mathbf{H})}{I_{OFF}(\mathbf{H})} = R(\mathbf{H}) - 1. \qquad (1)$$
In the RATIO method of data analysis, the ON/OFF ratios $R$ are the basic quantities on which the refinement is based (Coppens et al., 2009).
One of the aims of the studies is the calculation of photodifference maps, showing the light-induced change in electron density, either as a function of time if a dynamic process is being recorded or as due to light-induced electronic excitation (Zheng et al., 2007; Makal et al., 2011; and Fournier and Coppens, 2014b). Such maps are defined in expressions (2a) and (2b). The use of structure factors from an accurately refined reference model minimizes the experimental errors in (2b), which will then essentially result from the ratio determination. The maps include the effect of structural rearrangements and temperature changes induced by the laser exposure, the former including electron density changes on excitation, the latter however typically being a much smaller effect than that of the features due to the atomic motions. An example of a photodifference map, showing density changes resulting from molecular excitation of a [CuI[(1,N) bis(triphenylphosphine)] complex (Makal et al., 2012), is shown in Figure 1. A comparable map on a dirhodium complex can be found in a 2011 publication (Makal et al., 2011).

A. ON/OFF scaling
For each frame pair $l$ collected with the consecutive ON/OFF exposure technique with different incident beam intensities, a scale factor $K_l$ can be defined that accounts for the difference in incident beam intensity between the two exposures. The scaled ON/OFF ratio for each $\mathbf{H}$ in the frame pair $l$ is obtained by applying this factor to the observed ratio. For ON/OFF pairs of well-populated frames, this scale factor can be estimated by analysis of the very low-order reflections, as the relative intensity changes $\eta$ go to zero in the forward scattering direction, in which all electrons scatter in phase. For example, the y-intercept of a linear regression of the photo-Wilson plot (Schmøkel et al., 2010) can be used to estimate the factor $K_l$. This argument neglects any increase in diffuse scattering on exposure, which may be significant if conversion percentages are large. Once the difference between the incident beam intensities of the laser-ON and laser-OFF frames has been corrected, the inter-set scaling can be performed using the procedures described in Section III B.
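The intercept estimate can be illustrated with a minimal sketch. It assumes that the photo-Wilson plot is a straight line of ln R against s^2, with s = sin θ/λ, of the form ln R = ln K_l − 2ΔB s^2, and that the scaled ratio is obtained as R/K_l. Both conventions, as well as the function name and the synthetic numbers, are assumptions made for illustration and are not taken from the text.

```python
import numpy as np

def estimate_on_off_scale(s, ratios):
    """Fit ln(R) = ln(K_l) - 2*dB*s^2 and return (K_l, dB).

    s      : sin(theta)/lambda for each reflection (hypothetical input)
    ratios : observed ON/OFF intensity ratios R for the same reflections
    """
    s2 = np.asarray(s, dtype=float) ** 2
    ln_r = np.log(np.asarray(ratios, dtype=float))
    # np.polyfit returns the coefficients highest degree first: slope, intercept
    slope, intercept = np.polyfit(s2, ln_r, 1)
    k_l = float(np.exp(intercept))   # intercept at s = 0 (forward scattering)
    delta_b = float(-slope / 2.0)    # assumed thermal term of the plot
    return k_l, delta_b

# Noise-free synthetic frame pair with an assumed scale factor of 1.20
s = np.linspace(0.05, 0.60, 50)
ratios = 1.20 * np.exp(-2.0 * 0.15 * s**2)

k_l, delta_b = estimate_on_off_scale(s, ratios)
scaled_ratios = ratios / k_l         # assumed convention: R_scaled = R / K_l
```

With real data the fit would be restricted to well-measured reflections and weighted by their uncertainties; the unweighted fit here only shows where the intercept enters.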

ON/OFF scaling, special cases
In the time-resolved synchrotron experiment, it is feasible to collect laser-ON and laser-OFF frames in rapid succession, thus making sure that crystal quality and other conditions, such as incident beam intensity or crystal size, are the same. This eliminates the need for relative scaling of the ON and OFF frames, which is otherwise necessary for calculation of the ratios in expression (2b), following expression (5). This is not the case when radiation damage becomes excessive at XFEL sources. When a liquid jet injector is used to pass a stream of nanocrystals through the photon beam (Spence et al., 2012), no consecutive ON/OFF exposure is feasible, as the crystals are destroyed immediately after the diffracted beams have been emitted. However, the nanocrystals may be stationarily positioned in arrays, which may be done by use of a crystallization chip with multiple basins (Mueller, 2014), an ultra-thin silicon nitride membrane (Hunter et al., 2014), or similar. In this case, it is possible to take a first exposure with the incident beam attenuated below the destruction limit to collect the OFF-frame, followed by a second exposure at full intensity for the ON-frame (Mueller et al., 2015). The consecutive ON/OFF-exposure technique is advantageous even in this case, as the OFF and ON reflections are measured on the same sample crystal, but the method still requires the ON/OFF scaling described above, because the OFF beam has been attenuated to preserve the sample and, apart from that, the X-ray pulses at XFELs vary in intensity. If the ON and OFF data have been collected on different samples, data set merging and scaling must first be performed separately on all ON and all OFF data, after each of the patterns has been successfully indexed, prior to ON/OFF scaling.

B. Inter-set scaling
Two different scaling methods have been developed. Both are based on the assumption that the effects of the thermal increase and that of the induced structural changes on excitation are proportional to the laser intensity and penetration depth. The Absolute-Average-System-Response (AASR) Method is simple and fast. It has been briefly described earlier and applied to synchrotron data sets on a bridged binuclear Rh compound (Makal et al., 2011). The more sophisticated Weighted Least-Squares (WLS) Method is based on the fit of a non-structural model to the intensity changes $\eta$ in the different data sets.

The AASR method
The $\eta$ values can be both positive and negative. However, if the effect of the thermal increase and that of the induced structural changes on excitation are proportional, the average absolute value of the $\eta$ values over each data set should be scaled to a common value to put the $\eta$ values of the different sets on a common scale. Thus, a scale factor can be defined for each set $i$ that brings the average absolute $\eta$ of that set to the common value, giving the scaled $\eta$ values. The method is fast, but, apart from the assumption of proportionality of structural and thermal effects, it relies on a considerable overlap of the reflections in the various data sets, especially in cases in which the structural changes, and therefore the scattering changes, are very anisotropic.
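A minimal sketch of this kind of scaling is given below. The choice of the common value (here the mean of the per-set averages), the function name, and the example numbers are assumptions for illustration; the text only requires that the average absolute response be brought to a common value.

```python
import numpy as np

def aasr_scale(eta_sets):
    """Scale each data set so that its mean |eta| equals a common value.

    eta_sets : list of 1-D arrays, one array of eta values per data set
    Returns (scale factors, list of scaled eta arrays).
    """
    mean_abs = np.array([np.mean(np.abs(e)) for e in eta_sets])
    common = mean_abs.mean()          # assumed common value: mean over the sets
    factors = common / mean_abs       # one scale factor per data set
    scaled = [k * np.asarray(e, dtype=float) for k, e in zip(factors, eta_sets)]
    return factors, scaled

# Two hypothetical sets measuring the same reflections; the second set
# responds half as strongly (e.g., lower laser power or excited-state population).
set_a = np.array([0.10, -0.20, 0.30])
set_b = 0.5 * set_a

factors, scaled = aasr_scale([set_a, set_b])
```

In this idealized example both sets contain the same reflections, so the scaled values coincide exactly; with limited overlap between sets the averages are taken over different reflections, which is the limitation noted in the text.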

The WLS method
The weighted least-squares method as applied to the $\eta$ values defined in expression (1) (Fournier and Coppens, 2014a) follows a procedure described by Hamilton et al. (1965) for the layer scaling of Weissenberg photographs and further developed by Fox and Holmes (1966).
The function minimized is the weighted sum of squared residuals over all sets $i$ and reflections $j$,
$$\sum_i \sum_j w_{obs}^{(i,j)} \left[ \eta_{obs}^{(i,j)} - \eta_i^{model}(\mathbf{H}_j) \right]^2 ,$$
in which the parameter-based model for $\eta(\mathbf{H})$ in set $i$, $\eta_i^{model}(\mathbf{H})$, is given by $Q_i \langle \eta_{model} \rangle(\mathbf{H})$, with $\langle \eta_{model} \rangle(\mathbf{H})$ being the calculated average $\eta$ for reflection $\mathbf{H}$ over all sets; $w_{obs}^{(i,j)}$ is the weight of reflection $j$ in set $i$; and $Q_i$, the relative excited-state population, is defined as $Q_i = P_i / \langle P_i \rangle_{all\ sets}$, with $P_i$ the excited-state population in set $i$. $\langle \eta_{model} \rangle(\mathbf{H})$ and $Q_i$ are the variables in the WLS refinement. The scaled $\eta$ values of set $i$ are then obtained from the observed values and the refined $Q_i$. The values of $\langle \eta_{model} \rangle(\mathbf{H})$ can be used in the calculation of the photodifference map defined in expressions (2a) and (2b).
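The structure of such a refinement can be sketched as follows. The alternating update scheme, the normalization of the Q_i to a mean of 1 (mirroring the definition of Q_i as a population relative to the average over all sets), and the division of the observed values by Q_i to obtain scaled values are assumptions made for this illustration. The sketch is not the procedure of the cited work and omits any restraints.

```python
import numpy as np

def wls_scale(set_idx, refl_idx, eta, weights, n_iter=200):
    """Minimize sum_ij w_ij * (eta_ij - Q_i * m_H)^2 by alternating updates.

    set_idx  : data set index i of each observation
    refl_idx : unique-reflection index H of each observation
    eta      : observed eta values
    weights  : observation weights
    Returns (Q, m), with Q normalized to a mean of 1 and m the model average eta.
    Every reflection and every set must have at least one observation.
    """
    set_idx = np.asarray(set_idx)
    refl_idx = np.asarray(refl_idx)
    eta = np.asarray(eta, dtype=float)
    w = np.asarray(weights, dtype=float)
    n_sets = int(set_idx.max()) + 1
    n_refl = int(refl_idx.max()) + 1
    q = np.ones(n_sets)
    m = np.zeros(n_refl)
    for _ in range(n_iter):
        # Best m_H for fixed Q: weighted linear least squares per reflection
        qi = q[set_idx]
        num = np.bincount(refl_idx, weights=w * qi * eta, minlength=n_refl)
        den = np.bincount(refl_idx, weights=w * qi**2, minlength=n_refl)
        m = num / den
        # Best Q_i for fixed m: weighted linear least squares per data set
        mh = m[refl_idx]
        num = np.bincount(set_idx, weights=w * mh * eta, minlength=n_sets)
        den = np.bincount(set_idx, weights=w * mh**2, minlength=n_sets)
        q = num / den
        # Fix the arbitrary overall scale so that the mean of Q is 1
        mean_q = q.mean()
        q = q / mean_q
        m = m * mean_q
    return q, m

# Noise-free synthetic example: 3 sets, 4 reflections, all measured in every set
true_m = np.array([0.10, -0.20, 0.05, 0.30])
true_q = np.array([0.5, 1.0, 1.5])
set_idx = np.repeat(np.arange(3), 4)
refl_idx = np.tile(np.arange(4), 3)
eta = true_q[set_idx] * true_m[refl_idx]
weights = np.ones_like(eta)

q, m = wls_scale(set_idx, refl_idx, eta, weights)
scaled_eta = eta / q[set_idx]        # assumed convention for the scaled values
```

A fixed iteration count is used for brevity; a practical implementation would monitor the change in the minimized function and stop on convergence.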

Intra-data set scaling
A logical consequence of the anisotropic optical nature of all but cubic crystals is the anisotropy of light absorption and therefore of light-induced changes such as photoreactions or molecular excitation. In addition, when samples with dimensions of several tens of microns or larger are used in the experiments, light penetration is typically limited, which will cause an orientation dependence of the light absorption in irregularly shaped samples.
In a molecular crystal, the optical anisotropy is in first approximation dominated by the molecular properties. The main absorption direction represented by the transition dielectric dipole moment integral vector is a common output of theoretical calculations. As a sample is rotated during data collection or when different crystals are used for collection of single frames, the angle between the transition moment and the incident beam will change, leading to variation of the absorption. To minimize (but not eliminate) the effect, the laser pump-beam should be circularly rather than linearly polarized, as illustrated in Figure 2.
The intensity of absorption at an instant $t$ is proportional to $|\mathbf{E}|^2 |\boldsymbol{\mu}|^2 \cos^2\theta$ (Malus's law), in which $\mathbf{E}$ is the electric field vector of the laser beam, $\boldsymbol{\mu}$ is the transition moment vector, and $\theta$ is the angle between them. Integration over the electric field plane of the circularly polarized laser leads to an average intensity of the absorption proportional to $\sin^2\gamma$, with $\gamma$ the angle between $\boldsymbol{\mu}$ and the laser direction. Clearly, the effect is strongly dependent on the relative orientation of the crystal. When the sample is rotated around the laser direction, no variation due to absorption anisotropy should be observed. On the other hand, when the sample is rotated around an axis perpendicular to the laser direction, the anisotropy may be pronounced.
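The origin of the sin²γ dependence can be made explicit with a short derivation, sketched here for an ideal circularly polarized beam propagating along n, with γ the angle between the transition moment μ and n. The unit vectors e1 and e2 spanning the plane perpendicular to n, the angular frequency ω, and the phase α are notation introduced only for this sketch.

```latex
\begin{align}
\mathbf{E}(t) &= |\mathbf{E}|\left(\mathbf{e}_1\cos\omega t + \mathbf{e}_2\sin\omega t\right),
  \qquad \mathbf{e}_1 \perp \mathbf{e}_2, \quad \mathbf{e}_1,\mathbf{e}_2 \perp \mathbf{n}, \\
\mathbf{E}(t)\cdot\boldsymbol{\mu} &= |\mathbf{E}|\,|\boldsymbol{\mu}|\,\sin\gamma\,\cos(\omega t - \alpha), \\
\left\langle \left(\mathbf{E}\cdot\boldsymbol{\mu}\right)^2 \right\rangle_t
  &= |\mathbf{E}|^2\,|\boldsymbol{\mu}|^2\,\sin^2\gamma\;
     \frac{1}{2\pi}\int_0^{2\pi}\cos^2 u\,\mathrm{d}u
   = \tfrac{1}{2}\,|\mathbf{E}|^2\,|\boldsymbol{\mu}|^2\,\sin^2\gamma .
\end{align}
```

Only the component of μ lying in the plane of the rotating field, of magnitude |μ| sin γ, contributes. The averaged absorption therefore vanishes when μ is parallel to the beam and is largest when μ is perpendicular to it, and a rotation about n leaves γ, and hence the absorption, unchanged.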

IV. EXAMPLE OF AN APPLICATION
A synchrotron study was done on the tetranuclear complex Ag$_2$Cu$_2$L$_4$ (L = 2-diphenylphosphino-3-methylindole) (Jarzembska et al., 2014) at beamline 14-ID of BioCARS at APS. The complex shows significant structural changes on excitation by 355 nm laser light, the silver-silver contact in the complex shortening by 0.38(3) Å on ligand-to-metal charge transfer. Four data sets were collected on different samples. Details are summarized in Table I.
Merging of the four data sets leads to a global completeness of 46.7% with a redundancy of multiply measured and equivalent ON/OFF ratios of 5.4. The achieved resolution is controlled by instrumental limitations to a maximal $\sin\theta/\lambda$ value of 0.61 Å$^{-1}$.
FIG. 2. Relation between the transition dielectric dipole moment integral vector $\boldsymbol{\mu}$, the direction of a circularly polarized laser beam $\mathbf{n}$, and the $\varphi$ rotation axis. The direction of $\mathbf{n}$ is fixed during the experiment, but the $\boldsymbol{\mu}$ vector will rotate around the $\varphi$-axis during multi ON/OFF frame data collection, thus changing the $\gamma$ angle.

Unique data set collected with a pair redundancy $\langle N \rangle = 10$ and $\varphi$ angle steps $\Delta\varphi = 1°$, while all others were collected with $\langle N \rangle = 5$ and $\Delta\varphi = 2°$.

Results of inter- and intra-data set scaling of the four sets of data collected are illustrated in Figure 3. The horizontal red and black lines represent the effect of the inter-data set scaling with the WLS and AASR methods, respectively. For data set 3, both horizontal lines superimpose. Inter-set scaling differences are apparent, especially for sets 1 and 2. However, the effect of intra-data set scaling is much more pronounced, especially for set 3, as shown by the curved lines in black and red, obtained using the AASR and WLS methods, respectively. In the case of the WLS method, a restraint is added to the WLS error function to reduce the variation between adjacent $\varphi$ settings, which smooths the lines. The inter-scaling relative populations obtained with the different scaling methods described above are summarized in Table II.
The atomic shifts on excitation obtained in a joint refinement against the four individual sets are summarized in Table III. The results are remarkably independent of the details of the analysis.
The robustness of the results is likely due to the large redundancy averaging at 5.4 in the experimental ratios. In a joint refinement against the four data sets, as described above, the effect of equivalent reflections being affected differently will average out in the final results.
A. Intra-data set scaling of set 3
The pronounced variation of the experimental ratios with the $\varphi$-angle allows an experimental determination of the direction of the transition moment integral vector $\boldsymbol{\mu}$. Application of the $\sin^2\gamma$ dependence, discussed in Section III B 3, to the WLS data of set 3 leads to a direction equal to $\boldsymbol{\mu}_{normalized} = [-0.955,\ 0.073,\ -0.287]$. This may be compared with the values from Gaussian/DFT(HSEH1PBE)/LANL2DZ calculations for the four lowest energy transitions (Table IV). The first two transitions, which have the highest oscillator strengths, have a transition-moment direction oriented close to the a-axis, in agreement with the experimental result. The agreement between theory and experiment is quite reasonable and supports the validity of the experimental methods used.

V. CONCLUSIONS
We conclude that the anisotropy of absorption followed by molecular excitation can be pronounced but, as expected, depends strongly on the orientation of the sample with respect to the laser beam. When a large number of individual data sets are collected, as necessitated by sample destruction in a very intense X-ray beam and, to a much lesser extent, by laser damage in synchrotron experiments, anisotropy effects, evident from intra-data set scaling, may average in joint refinements, or in a unique-set refinement performed when all sets are merged. Inter-data set scaling is essential if photodifference maps are to be examined. ON/OFF scaling must be used when the consecutive ON/OFF-exposure method, applied to the same sample with the same incident beam intensity, cannot be used.

TABLE III. Atomic shifts in the refined structure models obtained by joint model refinements performed with the program LASER2010 (Vorontsov et al., 2010) against independently merged data sets with and without intra-set scaling using the scaling methods AASR or WLS. The atomic shifts reported in the original paper (Jarzembska et al., 2014)