multiflexxlib: A Python package for data reduction and visualization for the cold-neutron multi energy wide angle analyzer MultiFLEXX

A Python software package for data reduction and visualization of the continuous angle multiple energy analysis (CAMEA) type detector backend MultiFLEXX is presented. The software concept focuses on unambiguous, automated aggregation of experimental data and preservation of raw data structure in graphical representation, enabling on-the-fly analysis of experimental data from MultiFLEXX with minimal amount of user input, reducing confusion and human error in studies involving multiple parameters. The software also provides a set of interfaces for versatile tweaking of graphing parameters, facilitating generation of production-quality graphs for use in publications. The software enhances the role of MultiFLEXX as a swift mapping option available at the cold-neutron triple-axis spectrometer FLEXX. © 2018 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license (http


Introduction and motivation
In materials science, interatomic forces are best investigated using inelastic neutron scattering (INS). Due to the periodicity of the single crystals investigated, these forces give rise to collective excitations of the lattice as a whole. These can take the form of waves of moving lattice planes or waves of precessing magnetic moments, where the forces investigated are, respectively, the electrostatic repulsion between atoms or the quantum mechanical exchange force between atomic magnetic moments. They generally take the form of quasi-particles with well-defined momenta and energy, where the relationship between momentum (denoted Q) and energy (denoted ħω) defines the so-called dispersion relation. These relations define the dynamics of the material in question, and are investigated when neutrons create or annihilate such quasi-particles, losing or gaining momentum and energy in the process of inelastic scattering. In some systems, such as disordered magnetic systems [1,2], it is often interesting to study the dispersion relation over a large portion of (Q, ω) parameter space, referred to as a mapping study. Mapping studies can be very time-consuming, due to the large number of data points to be gathered and the generally low count rate in INS experiments. Such studies can be accelerated by collecting neutrons at multiple scattering angles (2θ) and with multiple final energies for each angular channel simultaneously. This concept is known as Continuous Angle Multiple Energy Analysis (CAMEA) [3,4].

* Correspondence to: Hahn-Meitner-Platz 1, 14109 Berlin, Germany. E-mail address: habicht@helmholtz-berlin.de (K. Habicht).
The new MultiFLEXX [5,6] backend built for the cold-neutron triple-axis spectrometer (cTAS) FLEXX [5,7] at the BER II neutron source, HZB, collects neutrons in 31 2θ angular channels separated by 2.5°, with 5 fixed final energies from 2.5 meV to 4.5 meV on each 2θ channel. The MultiFLEXX backend is predominantly operated in constant-energy mode, where 2-D constant-energy slices of the scattering function S(Q, ω) are generated.

While MultiFLEXX greatly improves the mapping capability of cTAS FLEXX, its use has met with several challenges. Experiments on MultiFLEXX often involve multiple parameters such as temperature and magnetic field. The number of variables can quickly grow and become difficult to manage during an experiment. The difficulty is compounded by the fact that MultiFLEXX experiments often involve multiple measurement passes that partially interleave or repeat each other, possibly with varying step sizes. Additionally, the fixed angular offset between angular channels on MultiFLEXX means that the measurement is carried out on a set of points in (Q, ω) space that do not form an orthogonal grid. Because the number of data points on MultiFLEXX is relatively small, further reducing such data by binning into an orthogonal grid can give rise to large errors in (Q, ω) coordinates, and to artifacts from bins that contain few or no data points. This concern is largely alleviated in time-of-flight (ToF) measurements by their much larger number of data points, and ToF data are the main focus of modern software development such as Mantid [8] and Horace [9].

To address these challenges, multiflexxlib is designed as a highly automated data reduction and visualization package that enables users and instrument scientists to manage and analyze data from MultiFLEXX correctly and quickly.
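The non-orthogonal sampling of (Q, ω) space can be illustrated with the standard scattering-triangle relations, |Q|² = k_i² + k_f² − 2 k_i k_f cos 2θ and ħω = E_i − E_f, with E = 2.0721 k² (E in meV, k in Å⁻¹). The following is a minimal sketch and not part of multiflexxlib: the function names, the evenly spaced final energies and the 20° starting angle of the first channel are assumptions made for illustration only.

```python
import math

# hbar^2 / (2 m_n) in meV * Angstrom^2, so that E = H2M * k^2
H2M = 2.0721

# Assumed channel layout (illustrative): evenly spaced final energies and
# a hypothetical 20 degree position for the first of the 31 channels.
FINAL_ENERGIES = [2.5, 3.0, 3.5, 4.0, 4.5]
TWO_THETA = [20.0 + 2.5 * i for i in range(31)]


def wavevector(energy_mev):
    """Neutron wavevector magnitude (1/Angstrom) for an energy in meV."""
    return math.sqrt(energy_mev / H2M)


def channel_coordinates(ei, ef, two_theta_deg):
    """Return (|Q|, energy transfer) for one analyzer channel."""
    ki, kf = wavevector(ei), wavevector(ef)
    cos_tt = math.cos(math.radians(two_theta_deg))
    q = math.sqrt(ki * ki + kf * kf - 2.0 * ki * kf * cos_tt)
    return q, ei - ef


def detector_grid(ei):
    """List of (2theta, Ef, |Q|, energy transfer) for every channel."""
    return [
        (tt, ef) + channel_coordinates(ei, ef, tt)
        for tt in TWO_THETA
        for ef in FINAL_ENERGIES
    ]
```

For a single incident energy, `detector_grid(5.8)` yields 155 points. Channels that share the same 2θ have different |Q| for each final energy, which is why the measured points do not line up on an orthogonal grid.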

Software design
Fig. 1 shows the data aggregation pipeline of multiflexxlib. Raw data is first parsed into Scan objects. Scan objects add a layer of abstraction over raw data by providing a set of interfaces for accessing experiment metadata, generating the UB-matrix [10] if not explicitly provided, applying the UB-matrix to data points, and keeping track of detector characteristics such as normalization factors. While multiflexxlib currently only accepts scan files from MultiFLEXX as a data source, the Scan class can be further abstracted into a Python abstract base class to support a common interface for other CAMEA-type backends. Users can access Scan objects directly to perform basic diagnostics and export data if necessary, but Scan objects are mainly intended to be fed into the data aggregation routine and combined into BinnedData objects.

The aggregation routine identifies points as being identical or different in experiment conditions and angles through a process called binning. The bins are defined through a series of bin edges; any values that fall between two consecutive bin edges are considered identical in the data aggregation process. The binning process first sorts a relevant physical quantity, such as incoming neutron energy, in ascending order, and then iterates through the sorted values, adding a bin edge whenever two successive values differ by more than the instrument control precision tolerance. This way, bin edges are created based on proximity of values. This avoids the pitfall of splitting values that are intended to be identical into two bins because a pre-defined bin edge lies too close to a nominal value, and it relieves users of the burden of debugging such scenarios and manually keeping track of bin edges. A larger tolerance can be passed into the aggregation routine if needed. Alternatively, a smaller tolerance can be used if a focused study is performed with a step length smaller than the usual tolerance. The binning routine can also be operated in a mode using a regular grid of bins if preferred.

The aggregation routine first bins datasets based on experiment conditions, including initial energy (E_i), temperature and magnetic field, and subsequently aggregates data from identical conditions based on 2θ angles, sample rotation angles (A3) and final energies (E_f). The user is only required to specify the binning tolerance explicitly if it is to be overridden from instrument defaults. This way, data aggregation is performed with a high degree of robustness and automation. multiflexxlib makes extensive use of the pandas [11] library in data aggregation, especially its ability to embed arbitrary data types directly in its DataFrame data structure. The matplotlib [12] library is used for graphics generation in this package.
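The proximity-based binning described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the multiflexxlib implementation: the function names are hypothetical, the tolerance is assumed to be positive, and placing each edge midway between the two separated values is a choice made here, since the text does not state where an edge is placed.

```python
from bisect import bisect_left


def proximity_bin_edges(values, tolerance):
    """Bin edges derived from gaps between sorted values.

    An edge is added wherever two successive sorted values differ by more
    than `tolerance` (assumed > 0). Edge placement at the midpoint of the
    gap is an assumption of this sketch.
    """
    ordered = sorted(values)
    if not ordered:
        return []
    edges = [ordered[0] - tolerance]          # outer edge below the minimum
    for low, high in zip(ordered, ordered[1:]):
        if high - low > tolerance:
            edges.append(0.5 * (low + high))  # gap found: start a new bin
    edges.append(ordered[-1] + tolerance)     # outer edge above the maximum
    return edges


def assign_bins(values, edges):
    """Index of the bin that each value falls into."""
    return [bisect_left(edges, v) - 1 for v in values]
```

With incident energies `[5.80, 5.801, 5.799, 4.0, 4.002]` and a tolerance of 0.01, the routine produces two bins, one around 4.0 meV and one around 5.8 meV, without the user having to specify any edge near the nominal values.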
The BinnedData object contains aggregated data from Scan objects derived from individual scan runs. BinnedData objects generate 2-D constant-energy plots and 1-D subsets (referred to as ''cuts'') by spawning the corresponding objects. The BinnedData object keeps track of measurement parameters, and is the main point of interaction for interactive and scripted use. Care is taken to design an interface that is both easy to use and versatile enough for advanced needs: the most basic usage, generating a 2-D colormap plot from raw data, takes only 3 lines of code with no required parameters in function calls. If further customization is needed, matplotlib objects can be accessed through provided properties, enabling the entire range of customization possible with the matplotlib package. The aggregated data and 1-D cuts can be exported to comma-separated values (CSV) files if required.

Constant-energy mapping
Constant-energy maps of magnetic excitations in the antiferromagnet MnF₂ [13,14] generated using multiflexxlib are shown in Fig. 2. A common strategy for adapting a non-orthogonal scatter of data points to 2-D colormaps is to interpolate over the input data. While this technique is adequate for a sufficiently fine mesh of input data, the measured points of MultiFLEXX can, depending on scan parameters, be sparse compared to the size of features in the measured excitation spectra. Fig. 2a and b show graphs generated by performing linear interpolation over the non-orthogonal grid of measured data points. It can be seen that the data point density information is obscured in this case.

To address this issue, multiflexxlib performs a 2-D Voronoi partition [15] of the input data points in reciprocal space in the scattering plane. Alternatively, the Voronoi partition can be performed in angles and subsequently converted to reciprocal coordinates. A similar approach is adopted by multiplot [16] for the FlatCone [17] multianalyzer at ILL. This way, each measured point is represented as a discrete region of Q-space that is closer to the corresponding data point than to any surrounding point. Users are thus provided with an intuitive and unambiguous representation of raw data structure and measured point density. The simplicity of the principle also ensures that the software is versatile and robust when processing scans that are partially overlapping, partially repeating, done with variable step intervals, or that contain non-working detector tubes. It is worth mentioning that this method is essentially identical to a nearest-neighbor interpolation over an infinitely fine grid, but performing a Voronoi partition has the advantage of not requiring an extremely fine interpolated mesh grid, and the result can be zoomed in without causing pixelation.

multiflexxlib also supports non-orthogonal axes such as the hexagonal a-b plane. The creation of axes and 2-D plots has its own public interface to facilitate integration into the users' own graph generation routines or other software packages.
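The equivalence between the Voronoi partition and nearest-neighbor interpolation noted above can be made concrete with a small sketch. The code below is illustrative only and does not reproduce multiflexxlib: the function names are hypothetical, and it finds cell membership by brute-force distance comparison rather than by constructing cell polygons (for which a library routine such as `scipy.spatial.Voronoi` could be used instead).

```python
import math


def owning_cell(query, points):
    """Index of the measured point whose Voronoi cell contains `query`.

    By definition of the Voronoi partition, this is the measured point
    nearest to `query`.
    """
    return min(range(len(points)), key=lambda i: math.dist(query, points[i]))


def rasterize(points, intensities, xs, ys):
    """Nearest-neighbor image on a finite grid.

    This is the pixelated counterpart of the Voronoi plot: as the grid
    spacing goes to zero, the image converges to the Voronoi cells.
    """
    return [
        [intensities[owning_cell((x, y), points)] for x in xs]
        for y in ys
    ]
```

Every pixel simply takes the intensity of the nearest measured point, so no value is ever invented between data points. Drawing the cells as polygons gives the same picture without choosing a grid spacing.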

Constant-energy 1-D cut and dispersion relation plot
It is also useful to extract 1-D cuts from measured data. When performed on experimental data from ToF measurements, such cuts usually involve binning data into a set of bins, typically with multiple measured data points per bin due to the large number of available measured points [9]. When treating MultiFLEXX data, the lower data point density makes setting lateral bin sizes tricky: an overly small setting leads to too few data points being accepted in cut generation, so users have to tweak bin settings carefully. multiflexxlib provides an alternative cutting method that operates on the Voronoi partition of measured data. The method draws a line segment between the specified cut start and end points, and each data point whose Voronoi region is crossed by the segment is subsequently projected onto the cut axis. Fig. 3 shows 1-D cuts generated using multiflexxlib. It can be seen that the algorithm automatically takes advantage of a higher data point density when available, without user intervention, and indicates how many data points contribute to the representation of a feature in the excitation. Both properties are useful in on-the-fly analysis. The user is also relieved of the burden of manually keeping track of and setting the lateral tolerance, as it adapts to the size of the Voronoi tessellation cells. Manual setting of rectangular bin sizes is also supported in a separate function, which is better suited to analysis with sufficient data point density, where multiple points per bin are available. Cuts can be made along any direction with both methods. For both cutting methods, an inspect method is provided that overplots bin edges on data points, as shown in Fig. 3a and b. In the case shown in Fig. 3, the ''bins'' shown are actually the Voronoi partition cells corresponding to the data points involved in the 1-D cut.
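The Voronoi-based cut can be sketched as follows. This is a simplified illustration, not the multiflexxlib algorithm: the function name and the `samples` parameter are hypothetical, and crossed cells are found approximately by probing the segment at many closely spaced positions and looking up the nearest data point, instead of intersecting the segment with the cell polygons exactly. Very thin cells may be missed if `samples` is too small.

```python
import math


def voronoi_cut(points, intensities, start, end, samples=2000):
    """Approximate 1-D cut through the Voronoi partition of `points`.

    Returns a sorted list of (fractional position along cut, intensity)
    for every data point whose Voronoi cell is hit by a probe on the
    segment from `start` to `end`.
    """
    dx, dy = end[0] - start[0], end[1] - start[1]
    length_sq = dx * dx + dy * dy

    # Step 1: find the cells crossed by the segment (sampling approximation).
    crossed = set()
    for n in range(samples + 1):
        t = n / samples
        probe = (start[0] + t * dx, start[1] + t * dy)
        nearest = min(
            range(len(points)), key=lambda i: math.dist(probe, points[i])
        )
        crossed.add(nearest)

    # Step 2: project each selected data point onto the cut axis.
    cut = []
    for idx in crossed:
        px, py = points[idx]
        frac = ((px - start[0]) * dx + (py - start[1]) * dy) / length_sq
        cut.append((frac, intensities[idx]))
    return sorted(cut)
```

No lateral bin width appears among the arguments. A point is included exactly when its cell reaches the cut line, so the effective tolerance follows the local point density.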
(Q, ω) dispersion relation plots can be created by vertically stacking constant-energy 1-D cuts. Fig. 4 shows data for the (Q, ω) dispersion relation for magnons in MnF₂ from (0.5 0 0.05) to (1.5 0 0.05) (r.l.u.). Data from different final energy channels are normalized using $I \propto \Omega V_{\mathrm{res}} = \Omega\, k_f^3 \tan\theta_A$ [19], where $k_f$ is the wavevector of an energy channel, and $\theta_A$ is half of the analyzer takeoff angle of the channel. It is worth mentioning, though, that the normalization factor does not completely account for differences in the resolution ellipsoid between final energy channels. Nevertheless, such a (Q, ω) dispersion relation graph can be useful for visualizing dispersion relations on-the-fly in lieu of a full analysis accounting for resolution effects.

Conclusions and future work
While a lot of effort has been invested in the development and construction of MultiFLEXX and other multiplexing backends, software for data visualization has lagged considerably behind. The large number of variables commonly involved in MultiFLEXX experiments also makes data analysis prone to human error. This is especially true of on-the-fly analysis during experiments. Due to the widespread use of interpolation in graphics generation, users are tempted to perform scans with unnecessarily high data point density. The ambiguity of interpolated data can also lead to misinterpretations. All these factors can lead to sub-optimal utilization of beamtime. By providing the automated and unambiguous data aggregation and visualization toolbox multiflexxlib, the authors hope to address these difficulties and improve the scientific output of MultiFLEXX and other multiplexing backends that share a common concept [3,4,20]. The authors wish to add quantitative analysis functionality, such as modeling and fitting of experimental data, in the future, which remains a challenging task [6].

Fig. 2. Constant-energy maps of magnetic excitations in MnF₂ measured with MultiFLEXX with E_i = 5.8 meV and E_f = 3.0 meV. (a) and (c) are generated from one measurement pass; (b) and (d) are generated from two interleaving measurement passes with a 1.25° offset in 2θ angles. (a) and (b) are plotted using bilinear interpolation; (c) and (d) are plotted using the Voronoi tessellation method.

Fig. 3. 1-D constant-energy cuts of MnF₂ excitations from (1 0 0) to (1.5 0 0) (r.l.u.) with E_i = 5.8 meV and E_f = 3.0 meV. The left column is generated from one scan pass: (a) shows the Voronoi cells of the data points involved in the cut, and (c) shows the cut results. The right column shows results generated from two interleaving scan passes with a 1.25° offset in 2θ angles.