XRF and XRPD data sets in ternary mixtures with high level micro-absorption and/or preferred orientations problems for phase quantification analysis☆☆☆

Micro-absorption (MA) and/or preferred orientations (PO) are two among the major problems affecting quantitative phase analysis (QPA) by X-ray Powder Diffraction Data (XRPD) in industrial samples such as minerals and ores, additives, cements, friction materials, coal combustion by-products. Typically XRPD data are coupled to elemental analysis by X-ray Fluorescence (XRF) to facilitate phase recognition and quantification when elements heavier than sodium are present. Graphite and urea are typical examples of large production industrial commodities showing such analytical problems. The present article provides a recipe to produce sets of data of increasing difficulty to test the performances of different approaches and/or software’s for QPA by XRPD in graphite, zinc acetate and urea containing samples. Graphite, due to its platelet morphology, can exhibit orientation and was chosen because it is possible to control its PO degree by sieving. Simplex-centroid design augmented was used for the design of the experiments to select the mixtures with the more possible homogeneous exploration of the ternary experimental domains, from pure phase to equal-weighted mixtures. The different data sets collected on the four experimental domains by XRF and XRPD are provided and stored as a repository on Mendeley Data. Using the same approach, additional data sets sets with different composition and/or experimental setup can be added by us or any other contributor with the same DoE approach to create a wide open access data set of standardized X-ray powder diffraction and X-ray fluorescence data.


a b s t r a c t
Micro-absorption (MA) and/or preferred orientations (PO) are two among the major problems affecting quantitative phase analysis (QPA) by X-ray Powder Diffraction Data (XRPD) in industrial samples such as minerals and ores, additives, cements, friction materials, coal combustion by-products. Typically XRPD data are coupled to elemental analysis by X-ray Fluorescence (XRF) to facilitate phase recognition and quantification when elements heavier than sodium are present. Graphite and urea are typical examples of large production industrial commodities showing such analytical problems. The present article provides a recipe to produce sets of data of increasing difficulty to test the performances of different approaches and/or software's for QPA by XRPD in graphite, zinc acetate and urea containing samples. Graphite, due to its platelet morphology, can exhibit orientation and was chosen because it is possible to control its PO degree by sieving. Simplex-centroid design augmented was used for the design of the experiments to select the mixtures with the more possible homogeneous exploration of the ternary experimental domains, from pure phase to equal-weighted mixtures. The different data sets collected on the four experimental domains by XRF and XRPD are provided and stored as a repository on Mendeley Data. Using the same approach, additional data sets sets with different composition and/or experimental setup can be added by us or any other contributor with the same DoE approach to create a wide open access data set of standardized X-ray powder diffraction and X-ray fluorescence data.  Table   Subject Materials science Specific subject area Quantitative phase analysis by XRF and XRPD Type of data Text files How data were acquired X-ray fluorescence X-ray powder diffraction Data format Raw Parameters for data collection XRF and XRPD raw profiles were collected on four different sample groups of mixtures, designed by the simplex-centroid augmented scheme with increasing effects of micro-absorption and/or preferred orientations. Description of data collection XRF data were collected in air condition on a Rigaku NEX QC with the X-ray tube at 20 kV and 200 μA. XRPD data were collected by a Bruker D8 Advance in the 2 θ range of 10 °to 120 °with the X-ray tube ad 40 kV electric potential

Value of the Data
• Chemists, earth and materials scientists, engineers by both industrial and academic fields, interested in solid-state matter analysis can benefit of these data to test and improve the performances and explore the potentialities and limitations of the analytical methods of solid samples. Improving quality control process in commodity production can improve materials usages and reduce wastes and resource consumption. These data will benefit researchers in developing and improving methods, and software and/or analytical approaches can be benchmarked and ranked on the same sets of data to have a clearer view of their accuracy and limitations for quantitative phase analysis in complex samples. • The data sets available in the literature are natural samples or mixtures prepared with concentrations suitable to reflect real sample compositions [1][2][3][4][5][6][7] . Conversely, the data set D1-4 are the first available XRF/XRPD data sets designed using the DoE theory, therefore by a method-centred approach. A simplex-centroid augmented experimental plan for mixture was used: in this way, the mixture concentrations do not mimic real samples but are optimised to properly sample the experimental domain. This allow the best possible precision in evaluating and comparing methods and/or software used for XRPD/XRF data analysis. • Methods for phase analysis by X-ray powder diffraction data in presence of preferred orientation and micro-absorption are needed in many industrial applications for the quality control of raw and refined materials. This data set was built measuring samples containing urea and graphite which suffer of such problems and can hardly be quantified with traditional methods of analysis by XRF and XRPD data. The provided data set was built analyzing samples and substances showing different degrees of micro-absorption and preferred orientations, in order to produce standard data set on which different methods can be used for quantification purposes. Patterns and spectra of the pure phases are provided, giving all the necessary tools for both supervised and unsupervised analysis. • The provided data set is open source and open access and new XRPD/XRF data sets built with the same simplex-centroid design approach but with different samples and/or experimental setup can be added by us or any other contributor. In this way, a wide standardized database of experimental data will be available to carefully test and compare the performances, limitations and possibilities of assessed or new methods of analysis.

Data Description
The actual (v.4) version of the repository on Mendeley Data is a collection of 80 raw files organized as described in the following directory tree diagram: Each level-1 directory (D1, D2, D3 and D4) has the name of the corresponding mixture experimental domain, which is accurately described in the next section. The data set consists in a couple of measurements (one XRF spectrum and one XRPD pattern) collected on each sample of the four sets of experiments. XRF data are provided in two-columns .txt files: intensity (a.u.) vs. energy (keV). XRPD data are instead provided three-column .xye files, in which are reported the intensity (a.u.), the 2 θ angles (degrees) and the error associated to the measurement (a.u.).
Both XRF and XRPD sets were not pre-processed and are given as obtained by the instruments. Each set of data was collected on mixture samples designed with a simplex-centroid augmented design on four different ternary mixture spaces. The approach, based on DoE, guarantees a full and homogeneous exploration of such experimental domains. The data on experimental set D1 (composed by BaSO 4 , Bi 2 O 3 and graphite with diameter lower than 90 μm) are reported in Fig. 1 . For readability purposes, XRF data are plotted from 0 keV to 25 keV to better show the principal signals belonging to each pure phase. In the database, data are provided in the full collection range, from 0 keV to 50 keV, although almost no signal is present above 25 keV. As can be seen in Fig. 1 , both samples D1-Ba and D1-Bi , belonging to the pure phases of respectively BaSO 4 and Bi 2 O 3 , show the signals of the detectable atoms with this techniques: barium, sulfur and bismuth. However, since the measurement were performed at 20 kV of electric potential (which is the optimal condition for mixtures containing medium and high Z elements) the signal of the sulfur appears very small. The 20 kV measurement condition was chosen to also allow comparison of XRF data with a 20 kV SEM/EDX measurement. SEM/EDX data are not present in this dataset because EDX is not competitive with XRF concerning both analysis duration/complexity and accuracy in quantification, it was used to assess the morphology of mixture components. Sample D1-Gr , belonging to graphite, shows Compton scattering only, except for a small signal near 6 keV which was attributed to small impurities of iron that appeared after sieving the powders ( K α 1 Fe = 6 . 404 keV , K α 2 Fe = 6 . 391 keV , K β 1 Fe = 7 . 057 keV ). The physical laws that regulate the signal intensity are not linear combinations of the signal of the pure phases, which is one of the principal obstacles of quantification by XRF. In Fig. diffraction patterns of the same samples are shown. Data were collected in the 2 θ range of 10 °to 120 °, since there were no signals below 10 °and peaks belonging to the heavy phases were well visible until very high angles. Samples D1-Ba and D1-Bi both have patterns very rich in signals with principal peaks intensities in the order of magnitude of 10 3 . Those many signals cause peak superposition, resulting in very complex XRPD pattern such as the one in mixture D1-Ba Bi . Sample D1-Gr shows a very intense peak and few other signals that are about two orders of magnitude lower, which is typical of oriented samples. Effects of micro-absorption can be observed by looking and the differences in the signal intensities: in principle the scattering power increases with increasing Z numbers of the element in each phase. However, the high density of barite and bismite, especially in the case of bismite showing large crystallites, cause and increase absorption of X-ray and net reduced diffraction efficiency. However, barite and bismite both have high densities, which result in crystals where the absorption of the incident X-ray is higher. Data set D2 shows data very similar to data set D1, but in this second collection of data, graphite contains a 30% of large particles (diameters greater than 90 μm) resulting in a much more oriented system with large PO effects. In fact, XRPD data of sample D2-Gr show an intense peak which is an order of magnitude greater than the one in sample D1-Gr . The effects of preferred orientation and of the micro-absorption are more evident, as the intensities of the peaks in mixtures shows complex features without any clear mathematical relation to those belonging to the pure phases. Data set D3 was collected on an experiment set of mixtures in which graphite was substituted by zinc acetate, therefore a salt with a metallic element which can produce a XRF signal. In Fig. 2 (a), in addition to the previously described signals of barite and bismite, the XRF spectrum of zinc acetate can be observed ( D3-Zn ), with two characteristic signals corresponding to its K α and K β ( K α 1 Zn = 8 . 639 keV , K α 2 Zn = 8 . 616 keV , K β 1 Zn = 9 . 572 keV ). Zinc acetate, pure and in mixture, is characterized by a principal peak ( Fig. 2 (b)), 10 to 100 times more intense than the others, which flatten all the other peaks in the pattern. The effect of PO is comparable to the one that affect data set D2. Data set D4 was collected on a ternary mixture experimental domain composed by barium sulfate, bismuth oxide and urea. XRF spectrum of sample D4-Ur does not show peaks except for a large band between 14 keV to 19 keV, which is related to the Compton scattering of carbon, nitrogen and oxygen, which do not have an XRF detectable signal. In XRPD data plotted in Fig. 2 (d), urea shows a behaviour very similar to zinc acetate and graphite, with a very intense characteristic peak that is two orders of magnitude greater than the most intense peaks of barium sulfate and bismuth oxide.

Simplex-centroid design augmented
Design of experiments (DoE) is a practice following the "parsimony principle", consisting in using the smallest number of samples able to give the greatest possible amount of information. This can be done designing experiments homogeneously within the experimental domain, avoiding lined up samples and empty unexplored areas. Most common approaches such as D-optimal, K-optimal, Box-Behnken and Doelhert designs [8] are not suitable for the exploration of a space in which variables cannot be fully independent [9] . Indeed, in mixtures, a fundamental relation states that the sum of every weight fraction x i of all the q components has to be equal to 1, clearly stating the impossibility of the independence principle: For this reason, the choice of best DoE approach for mixtures must consider designs appositely made for systems showing such a dependency [9,10] . Among the many existing solutions (such as simplex-lattice design [11] , axial design, ...) simplex-centroid design augmented was chosen because of the low number of experiments required to have a full homogeneous exploration of the mixture space spanning from pure phases to equal-weighted mixtures. Simplex-centroid design is a component-independent experimental design for mixtures in which experiments are generated by the equation: Where n is the number of experiments and q is the maximum number of components of the mixture (corresponding to the dimensionality of the mixture space). While the number of the experiments is defined by Eq. 2 , the composition of each mixture is defined by the generic binomial coefficient q m , where m represents the number of desired components in the mixture. This means: for m = 1 , q 1 pure components, permutations of (1 , 0 , 0 , . . . , 0) for m = 2 , q 2 mixtures, permutations of (1 / 2 , 1 / 2 , 0 , . . . , 0) for m = 3 , q 3 mixtures, permutations of (1 / 3 , 1 / 3 , 1 / 3 , . . . , 0) . . . for m = q, q q mixture of the q components ( 1 A graphical representation of this experimental design applied to a three-components system, mixture, which results in three additional experiment for a three-component mixture domain.

Materials
Blanc

Data collection
A Rigaku NEX QC with the X-ray tube at 20 kV in electric potential and 200 μA in current was used for XRF measurement in air condition. Rigaku's filter A was used during the measurements.
A Bruker D8 Advance with a Lynxeye XE-T detector and a Cu source ( λ = 1.54 Å ) was used for XRPD pattern collection in the 2 θ range of 10 °to 120 °). The instrument was equipped with an auto-sampler with nine positions and an air scatter knife. On both the primary and secondary optics Soller slits 2.5 °opening were positioned and automatic divergence slits were used with a fixed sample illumination of 10 mm length. Goniometer radius was 280 mm. Analyses were run without sample rotation as in preliminary measurements it was observed to negatively affect the quality of data. In fact, the large differences in density cause a segregation during rotation because heavy and light phases are concentrated in the bottom and top parts of the sample holder respectively. Such phenomenon could be avoided by pressing the sample in a pellet but this would cause an increased and not controllable degree of PO. The tube was set at 40 mA in current and 40 kV in electric potential. Powders were observed with a STEMI 508 microscope with a 2x frontal optic equipped with Zeiss fiber optics halogen bulbs and a LED ring for illumination. Digital high resolution images of the powders were collected with a 20 MPx SONY sensor camera. A Hitachi FLEXSEM 10 0 0 equipped with AZtecOne Oxford EDS, was used for electron microscopy analysis and elemental analysis, in order to evaluate samples' features. The source of the electrons was a tungsten filament at 20 kV. Powders were measured without coatings.

Sample preparation
Substances were only manually ground prior measurement in an agate mortar to simulate an industrial content in which simple and fast measurements are preferred (i.e. avoid Mc Crone grinding prior XRPD measurement). Graphite, the compound showing layered morphology and thus more prone to PO, was then sieved (90 μm mesh diameter) and both the finer and the larger fractions were retained with the scope of preparing samples with different but controlled presence of PO. Images of pure phases, collected with a STEMI 508 microscope with a 2x frontal optic, are shown in Fig. 4 . 4.5 g of each sample were prepared. The quantity was chosen accordingly to the UNI EN 15309:2007 Italian standard methodology for the determination of elemental by X-ray fluorescence [12] as there are no particular requirements for XRPD.
Mixtures sets, whose compositions are detailed in Table 1 , were prepared according to a Simplex-centroid design augmented and characterized by the following: four sets of samples Table 1 Compositions and ID of every produced sample. The four experimental design are ordered in an increasing degree of influence of the micro-absorption and PO effect. Graphite quantities in experiments set D2 are reported as total weight fraction outside the brackets and in relative quantities of the larger and smaller fractions inside the brackets.   were then produced, in order to simulate an equal number of different systems with different typical obstacles for the quantitative phase analysis: complexity of the mixture by the increasing peak superposition, preferred orientations and high differences in density and particle size of the phases are also present, to enhance the effects of micro-absorption. The 4 data sets have been built with the following features: • Experiments set D1: the three components are barite, bismite and the sieved graphite ( φ < differences in density and in the particle size of the three pure phases and the presence of a phase (graphite) which does not have a characteristic XRF signal. • Experiments set D2: identical to experimental set D1, but 30% in weight of the used graphite belonged the fraction with diameter larger than 90 μm after sieving. As a result, an important PO effect in addition to the previously described issues of experiment set D1 is introduced in experiment set D2. • Experiments set D3: graphite is substituted by zinc acetate, which possesses an XRF signal ( K αZn = 8 . 63 keV , K βZn = 9 . 57 keV ), shows PO and has a lower density, which enhances the micro-absorption effects and a larger unit cell, increasing peak superposition effects. • Experiments set D4: in this set, the third component is urea, which once again does not possesses a XRF signal, has a slight PO effect, has larger particles (as can be seen in Fig. 4 ) and has a lower density than zinc acetate, with increased with in micro-absorption effects due to density difference with respect to barite and bismite.
SEM-EDS analysis was used for electron microscopy analysis and elemental analysis, in order to evaluate samples' correct dispersion and differences in particle size. In this paper, an example of the distribution of the mixtures prepared is reported in Fig. 5 , where the dispersion of the three phases (barium sulfate, bismuth oxide and zinc acetate) can be observed. Although being accurately mixed, the differences in the average particle size of the pure substances affect the homogeneity of the powder mixture. In particular, a large agglomerate of almost 300 μm can be observed on the top-right corner of the figures.

Open Database Rules
Any one can add a XRPD data set provided that: • XRPD data and instrumental conditions are provided to the corresponding author to produce a new updated version of the Mendeley Data repository; • the simplex-centroid augmented approach must be used to produce ten XRPD patterns and, if possible, the ten corresponding XRF spectra for each proposed dataset; • all the instrumental data set to both reproduce the experiments and using parametric approach for data refinement and/or analysis must be provided.