RxpsG: a new open project for Photoelectron and Electron Spectroscopy data processing

Data analysis and plotting are an important part of any scientist's research work. Once the experiments are concluded, software for data reduction (background selection and subtraction, peak fitting, graphical visualization) is generally used to obtain a correct interpretation of the results. RxpsG is a free, open source software package based on the R platform, with a user-friendly interface oriented to the manipulation of X-ray Photoelectron Spectroscopy (XPS) and Auger Electron Spectroscopy (AES) data. All the features needed to analyze XPS and AES spectra are implemented, and the software allows immediate data reporting. Although RxpsG is primarily devoted to electron and photoelectron spectral analysis, it allows any data in text format to be loaded and processed. RxpsG is a project open to contributions and to the implementation of new procedures. In this work we describe the capabilities of the software and its most important features.


Introduction
Photoelectron spectroscopies are important tools in surface science because the chemical composition of the analyzed materials can be derived by measuring the kinetic energy of X-ray photoemitted and Auger electrons [1,2]. Electron emission is based on the photoelectric effect, explained by A. Einstein, who received the 1921 Nobel Prize for this work. High energy photons impinging on a material may release part or all of their energy to the electrons of the atoms. If the absorbed energy is sufficient, the excited electrons may escape the atomic orbital and be emitted into the material matrix.
Here they propagate in all directions, experiencing elastic and inelastic scattering with the material atoms. Electrons generated within a few nanometers of the surface may be emitted into the vacuum, where they are collected by an energy analyzer that measures their kinetic energy. This is the reason why XPS is a surface analysis technique. The same holds for AES electrons. In this case, X-rays or an electron beam generate a hole in a core atomic orbital. This leads to relaxation processes in which the released energy is transferred to electrons in outer atomic orbitals, the Auger electrons, which are emitted. The kinetic energy of Auger and X-ray emitted electrons depends on the electronic structure of the emitting atoms and can be used to identify them, making XPS and AES sensitive to the material composition. For this reason these techniques, developed in the sixties, have experienced continuous improvement and are key tools in materials science, chemistry and industry for the development of new materials.
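The link between the measured kinetic energy and the binding energy that identifies the emitting atom can be illustrated with the standard photoemission energy balance, BE = hν − KE − φ. The R sketch below is only illustrative: the Al Kα photon energy is the usual laboratory value, while the work function and the kinetic energies are assumed numbers chosen for the example.

```r
# Energy balance of photoemission: BE = h*nu - KE - phi (all values in eV)
photon_energy <- 1486.6   # Al K-alpha line, a common laboratory X-ray source
work_function <- 4.5      # assumed spectrometer work function (instrument dependent)

# Hypothetical kinetic energies measured by the analyzer
kinetic_energy <- c(C1s = 1197.1, O1s = 950.1)

# Binding energies used to identify the emitting element
binding_energy <- photon_energy - kinetic_energy - work_function
print(round(binding_energy, 1))
```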
The development of better detectors combined with increasing computation power has fostered the creation of sophisticated software packages for spectral data analysis. Data analysis is an important process of data reduction encompassing visualization, noise rejection through smoothing, modeling and replotting. A powerful data processing environment providing a complete list of options for spectral analysis can greatly simplify the data manipulation while providing the means for a correct interpretation.
Several commercial packages exist to process XPS and AES data. Unlike these, RxpsG is an open source project for spectral analysis.
It is developed using the R platform not only for the computational part but also for controlling all the graphical user interfaces (GUIs) needed to simplify the use of the software. As for native file formats, RxpsG can load the general purpose ISO 14976 VAMAS format and the PXT binary or DOS binary formats (Scienta-Omicron and the old Scienta-Seiko formats, respectively). In addition, it can load any kind of data in ASCII format. On demand, specific routines can easily be integrated in the program to load other data formats. RxpsG includes many features, such as subtraction of various backgrounds, smoothing with different filters, peak fitting with a list of functions and elemental quantification, together with powerful graphical features allowing suitable and personalized data representation. Analyzed data are saved in RData binary format for archiving or may be exported in ASCII format for other applications. Reporting is very easy since any information regarding fits and quantification can be copied and pasted into text documents. In addition, all plots can easily be saved in the main graphical formats (metafile, postscript, pdf, bmp, png, tiff and jpeg) in UNIX based systems and then imported into document editors, or simply copied and pasted directly in the case of Microsoft Windows systems. Finally, a manual describing in detail all the RxpsG options is provided with the software.
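The load, archive and export cycle described above can be sketched with base R alone. This is not the RxpsG loading routine: the two-column file layout, the variable names and the synthetic spectrum are assumptions made for illustration.

```r
# Write a small two-column ASCII spectrum to a temporary file (synthetic data)
ascii_file <- tempfile(fileext = ".txt")
energy <- seq(280, 290, by = 0.5)
counts <- round(1000 * exp(-(energy - 285)^2 / 2) + 50)
write.table(data.frame(energy, counts), ascii_file,
            row.names = FALSE, col.names = TRUE, sep = "\t")

# Load the ASCII data back with base R
spectrum <- read.table(ascii_file, header = TRUE, sep = "\t")

# Archive the object in binary RData format and restore it in a separate environment
rdata_file <- tempfile(fileext = ".RData")
save(spectrum, file = rdata_file)
restored <- new.env()
load(rdata_file, envir = restored)

# Export again as ASCII (comma separated) for use in other applications
csv_file <- tempfile(fileext = ".csv")
write.csv(restored$spectrum, csv_file, row.names = FALSE)
```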
RxpsG has been used for spectral analysis for some years, during which it was tested and gradually improved, and it has contributed to several publications [3][4][5][6][7][8]. This article describes the main features of RxpsG and its application to the analysis of real XPS spectra.

Software architecture
R is a free software environment for statistical computing and graphics (see the R platform documentation for more details about its features and environment characteristics [9]). R runs on a wide variety of UNIX platforms, on Windows and on MacOS, so RxpsG can run under these operating systems. The software is released in source code form under the terms of the Free Software Foundation's GNU General Public License. RxpsG follows a modular structure that allows its capabilities to be easily extended by implementing customized file handling routines, specific processes for data analysis, or other procedures for image processing or statistical analysis. In addition to the base R libraries, the RxpsG macros need a list of additional packages which have to be loaded in the R environment: baseline [10], deSolve [11], digest [12], FME [13], gWidgets2 [14], gWidgets2tcltk [15], lattice [16], latticeExtra [17], memoise [18], minpack.lm [19], NORMT3 [20], RColorBrewer [21], rootSolve, MASS [22], signal [23], sm [24], SparseM [25], wavelets [26].
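Before starting RxpsG it can be convenient to check which of the packages listed above are already present. The following is a minimal sketch using base R only; the installation call is left commented out because the availability of each package on a given repository is not guaranteed.

```r
# Additional packages listed in the text as required by RxpsG
required <- c("baseline", "deSolve", "digest", "FME", "gWidgets2",
              "gWidgets2tcltk", "lattice", "latticeExtra", "memoise",
              "minpack.lm", "NORMT3", "RColorBrewer", "rootSolve", "MASS",
              "signal", "sm", "SparseM", "wavelets")

# Packages that are not yet present in the local R library
missing_pkgs <- setdiff(required, rownames(installed.packages()))

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
  # Uncomment to try installing them from the configured repository:
  # install.packages(missing_pkgs)
} else {
  message("All required packages are installed.")
}
```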
The RxpsG architecture is displayed in Fig. 1. Experimental data are loaded into the program main memory (the R global environment, .GlobalEnv). Generally, XPS or AES data files are composed of several parts. We describe the analysis of XPS data files; however, the description applies equally to AES data files. A typical XPS analysis is composed of a survey, i.e. a wide spectrum at low energy resolution, and/or a list of core-lines, i.e. spectra acquired at higher energy resolution. File selection and loading are made with a dedicated interface and all the spectral components are automatically visualized. The data structures are represented in Fig. 2. The structure of the Core-Line object is represented by a block containing a list of slots. The number of these blocks corresponds to the number of core-lines acquired in the experiment. In the real case of an XPS analysis composed of three spectra, namely the ''survey'', the ''C 1s'' and the ''O 1s'' core-lines, the corresponding XPS-Sample structure contains three Core-Line blocks. Each Core-Line structure is formed by slots dedicated to the raw data, to the region selected for background subtraction (the @RegionToFit slot), to the baseline function used for this purpose (the @Baseline slot) and to all the information related to peak fitting (fitting functions used, best fit function, residuals etc., stored in the @Components and @Fit slots).
Thus, the definition of the XPS-Sample and Core-Line structures enables the dynamic allocation of memory to store an initially unknown number of XPS spectra. In addition, specific ''methods'' can be defined for each class of objects. This allows well defined procedures to be applied to each specific kind of data object. As an example, the plot() function acts differently on the XPS-Sample class, where just raw data are visualized, and on Core-Line class objects where, after analysis, the baseline, the fit components and the best fit are automatically shown. The GUI collects all the methods, helping the user to apply the different data manipulation procedures correctly.
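The idea of slot-based objects with class-specific methods can be sketched with R's S4 system. This is not the RxpsG source: the class names, the generic describe() and the slots Symbol, RawData, Filename and CoreLines are hypothetical, and only the slot names RegionToFit, Baseline, Components and Fit are taken from the text.

```r
library(methods)

# Hypothetical class names; only RegionToFit, Baseline, Components and Fit
# are slot names mentioned in the text, the others are assumptions.
setClass("CoreLineSketch", representation(
  Symbol = "character", RawData = "list", RegionToFit = "list",
  Baseline = "list", Components = "list", Fit = "list"))

# The sample holds a list that can grow to any number of core-lines
setClass("XPSSampleSketch", representation(
  Filename = "character", CoreLines = "list"))

# One generic, two methods: the behavior depends on the class of the object
setGeneric("describe", function(object) standardGeneric("describe"))
setMethod("describe", "CoreLineSketch", function(object) {
  state <- if (length(object@Fit) > 0) "fitted" else "raw"
  paste0(object@Symbol, " (", state, ")")
})
setMethod("describe", "XPSSampleSketch", function(object) {
  vapply(object@CoreLines, describe, character(1))
})

# Build a sample with three spectra, appended one at a time
xps_sample <- new("XPSSampleSketch", Filename = "example.vms")
for (symbol in c("survey", "C1s", "O1s")) {
  core_line <- new("CoreLineSketch", Symbol = symbol,
                   RawData = list(energy = numeric(0), counts = numeric(0)))
  xps_sample@CoreLines[[length(xps_sample@CoreLines) + 1]] <- core_line
}

# Mark the last core-line as analyzed by filling its Fit slot
xps_sample@CoreLines[[3]]@Fit <- list(converged = TRUE)
print(describe(xps_sample))
```

Keeping the core-lines in a list slot is one simple way to store an initially unknown number of spectra; other designs, such as a class that itself extends list, are equally possible.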
Finally, the RxpsG software was tested on a database of 220 curves (courtesy of the National Institute of Standards and Technology) described in [27] to compare its performance with the results of packages performing similar analyses. The results presented in [27][28][29] were used as a reference, showing that the precision of the RxpsG processing routines compares well with that of other analysis programs.

Software description
The software features appear in self-explanatory user interfaces where the operations are logically organized to make data handling as easy as possible. The more complex tasks require individual settings or the selection of an appropriate function to model the experimental data. To help select the most appropriate function, RxpsG is provided with a manual in which all the options are described. The manual also contains all the references relevant to specific operations (such as the definition of the baselines, of the fit functions and of the filters).
There are well recognized commercial software packages performing similar operations. We observe that data reduction of electron and photoelectron spectra is well consolidated. Although commercial packages show distinct peculiarities, all of them perform similar basic operations: background subtraction using different functions, smoothing, peak fitting with dedicated lineshapes, quantification and data visualization. RxpsG also performs these tasks, using intuitive, user-friendly GUIs. However, compared to existing packages RxpsG has some advantages: (i) it is a free, open source project which is expected to be continuously improved and extended by the XPS user community. New GUIs performing specific operations can easily be integrated in the RxpsG main body; (ii) any code in the R-project environment must be accessible. The ''body()'' function of R allows the user to extract and visualize all the routines of each of the libraries, including the RxpsG package. Users are free to modify and personalize the code according to their needs; (iii) the RxpsG software is developed in R and is therefore compatible with all the packages of this environment. Any function of R is immediately available provided that the relevant library is installed. These functions can be applied to XPS spectral data directly using the R-Studio interface or through GUIs which the user may freely implement. As an example, the R project offers several libraries for multivariate analysis which can be selected.
Once the most suitable package is installed, one can directly perform the analysis by applying the package functions to the XPS-Sample data loaded in the global environment of R.
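Both points can be tried in a plain R session. The sketch below first prints the body of a function from the stats package, then applies a multivariate method (principal component analysis via prcomp(), chosen here only as an example) to a matrix of synthetic spectra; with real data the rows would be filled from the loaded XPS-Sample objects.

```r
# Inspect the source of any R function; here a function from the stats package
print(body(stats::sd))

# Synthetic data set: five spectra (rows) built as mixtures of two Gaussian peaks
energy <- seq(280, 295, by = 0.15)
peak <- function(center) exp(-(energy - center)^2 / (2 * 0.8^2))
weights <- c(0.1, 0.3, 0.5, 0.7, 0.9)
spectra <- t(sapply(weights, function(w) w * peak(285) + (1 - w) * peak(288)))

# Principal component analysis with the base stats package
pca <- prcomp(spectra, center = TRUE, scale. = FALSE)
explained <- pca$sdev^2 / sum(pca$sdev^2)
print(round(explained, 4))
```

Because the synthetic spectra differ only in one mixing weight, essentially all the variance is carried by the first principal component.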
Finally, there are some features which are specific to RxpsG. Some of them are:
• Move fit Components to check the material stoichiometry;
• Peak fitting algorithms: not only least squares minimization but also conjugate gradient and pseudo-random methods may be used in case of convergence problems;
• Quantification interface allowing easy change of the RSF and inclusion/exclusion of selected fit components;
• Valence band analysis: the VB edge is defined using linear, non-linear threshold and Hill sigmoid methods;
• Data filtering: up to 7 different filters are implemented. Note that FIR or IIR filters are needed for second derivative spectral analysis [30];
• Spectral data processing: simple math operations on pairs of spectra, joining of spectra, copying of the spectral analysis between core-lines;
• Manual and automatic peak identification;
• Customized data replotting to prepare figures for publication.
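As an illustration of why filtering and second derivative analysis go together, the following sketch smooths a synthetic peak with a simple FIR filter (a centered moving average) and then differentiates it twice by finite differences. The filter length and the noise-free test signal are assumptions; the filters implemented in RxpsG are not reproduced here.

```r
# Synthetic core-line: a single Gaussian peak on a regular energy grid (no noise)
energy <- seq(280, 290, by = 0.1)
intensity <- exp(-(energy - 285)^2 / (2 * 0.7^2))

# FIR smoothing: centered 5-point moving average (edge points become NA)
fir_coeff <- rep(1 / 5, 5)
smoothed <- as.numeric(stats::filter(intensity, fir_coeff, sides = 2))

# Second derivative by finite differences on the smoothed signal
step <- 0.1
second_deriv <- diff(smoothed, differences = 2) / step^2
# diff() shortens the vector by two points; element k refers to energy[k + 1]
deriv_energy <- energy[2:(length(energy) - 1)]

# The most negative second derivative marks the peak position
peak_position <- deriv_energy[which.min(second_deriv)]
print(peak_position)
```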

Analysis menu options
As mentioned, RxpsG is dedicated mainly to the analysis of XPS and AES spectra. A collection of features is implemented to make spectral processing easy. When loaded, VAMAS data are corrected for the analyzer response (if these data are included in the file). In the Info-Help menu, the option XPS-Sample Info provides all the information related to the experiment and the list of core-lines acquired. Selecting one of them shows its acquisition conditions.
The basic tasks for spectral data reduction are summarized in the Analysis menu shown in Fig. 3. All the options are implemented in interactive graphical windows or appropriate GUIs. Let us review the various options:
• Analyze: defines the baseline for background subtraction and the fitting functions;
• Fit constraints: allows the definition of the constraints needed for a correct peak fitting;
• Fit Lev.Marq., Fit Mod. Fit: two different methods for peak fitting are implemented in RxpsG. The first applies the Levenberg Marquardt algorithm. The second performs a model fitting using several models based on least squares minimization, on mixed models or on random models (based on the Metropolis algorithm);
• Move Components: with this feature the user can control the element stoichiometry by manually moving the single fit components while ensuring that a reasonable fit is obtained;
• Quantify: performs elemental quantification, using appropriate sensitivity factors to calculate the elemental concentrations;
• Energy Shift: acquisition of XPS spectra on non-conductive samples requires charge compensation, leading to a shift of the energy scale. This option allows alignment of the energy scale with respect to a reference energy value;
• Process Core-Line: this option offers a list of operations on the core-line: core-line duplication or removal, copying of a whole core-line, of the baseline alone or of the complete peak fit from another XPS-Sample, addition of a constant, multiplication by a constant, differentiation, combination of two core-lines, baseline subtraction;
• Reset Analysis: this option deletes single fit components or part of the analysis performed on the core-line to optimize the data processing;
• Sprucing up: spikes or single data points may be corrected directly by editing the spectral data values;
• Element Identification, Core-Lines and Auger Tables: spectral features in a wide spectrum can be recognized through automatic procedures whose sensitivity can be changed by selecting convenient threshold values. Once peaks are correctly identified they can be assigned to the corresponding chemical elements. This option also provides the XPS and Auger tables listing the energy positions for all the chemical elements. Clicking on an element shows the positions of the corresponding core-lines and/or Auger features;
• VMS Data Transmission Correction: performs the data transmission correction. This operation is now performed by default when the list of correction values is provided by the instrument manufacturer for the acquired spectra.
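The arithmetic behind the Quantify and Energy Shift options can be sketched in a few lines. The formula is the usual normalization of peak areas by relative sensitivity factors (RSF); the areas, the RSF values and the reference energy below are hypothetical numbers, not values used by RxpsG.

```r
# Hypothetical integrated peak areas (after background subtraction) and
# hypothetical relative sensitivity factors; real RSFs depend on the instrument
areas <- c(C1s = 12000, O1s = 9000, Si2p = 2400)
rsf   <- c(C1s = 1.00,  O1s = 2.50, Si2p = 0.80)

# Normalize each area by its RSF, then convert to atomic percentages
normalized <- areas / rsf
atomic_percent <- 100 * normalized / sum(normalized)
print(round(atomic_percent, 1))

# Energy shift: align the scale so that a reference peak sits at a chosen value
measured_c1s  <- 286.3   # hypothetical measured position (eV) on a charging sample
reference_c1s <- 284.8   # assumed reference value (eV)
energy_scale  <- seq(280, 295, by = 0.1)
aligned_scale <- energy_scale + (reference_c1s - measured_c1s)
```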

Plot menu options
The Plot menu is dedicated to the graphical data visualization (see Fig. 4).
• Plot: as with double clicking on an XPS-Sample name, displays the spectra;
• Spectrum selection: allows the selection of the desired core-line;
• Overlay Spectra: the interpretation of spectra frequently relies on the comparison of data from different XPS-Samples to better understand the effect of specific treatments or synthesis methods. Spectra can be overlaid using this option, which includes the selection of plotting styles (colors, line patterns, symbols, 2D, 3D);
• Custom Plot: preparing data for publication is normally a time consuming job. This option is intended to simplify the work by allowing a step by step construction of the figure (size of axis numbers and labels, size of title, colors, line patterns and symbols, legends and annotations);
• Two Y-scale Plot: sometimes it is necessary to represent in the same figure data extending over rather different Y ranges. This option plots data using two different Y scales. To simplify the correlation between each data set and its Y scale, they are drawn using the same color;
• Annotate: adds annotations to the figures;
• Zoom and Cursor: it is possible to zoom on single parts of the wide and core-line spectra. The ''cursor'' option returns the cursor position (energy, intensity) when clicking on the spectrum shown in the graphical window;
• Graphic Device Options: by default the program sets the graphic device for Windows systems. Here it is possible to change this default and set the device appropriate for the operating system in use. It is also possible to select the file format used to save the content of the graphic window for importing pictures into a text editor;
• Set the Analysis Window Size: by default medium dimensions are set, which are suitable for normal PC screens. For laptops these dimensions could be too large and may be reduced.

Info menu options
The Info-Help menu provides information regarding the XPS-Sample.
• XPS Sample Info: returns information about the acquisition conditions of the raw data and about the analysis;
• Core-Line Fit Info: returns information regarding the background function used, the fitting lineshapes and the corresponding fitting parameters;
• Analysis Report: summarizes the results of the analysis performed on each of the analyzed core-lines;
• Help: opens the RxpsG manual describing each of the software features.

Illustrative examples
Some of the capabilities of RxpsG are illustrated in the following examples.

Complex peak fitting
Fig. 5 shows the peak fitting of the Cerium 3d core-line. The fitting was performed using a spline for background subtraction and ten Gaussian components. A discussion of the data is published in [6,7]. The FWHMs of the fitting components were forced to be equal by using the Fit Constraints GUI. The Levenberg Marquardt algorithm was then used to obtain the best fit.
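An equal-FWHM constraint of this kind amounts to letting several components share one width parameter. The sketch below shows the idea on synthetic data with two Gaussians. It uses base R's nls(), whose default algorithm is Gauss-Newton, as a stand-in for Levenberg Marquardt; the minpack.lm package listed among the dependencies offers nlsLM() with a similar formula interface. Peak positions, heights and noise level are assumptions.

```r
set.seed(1)

# Synthetic spectrum: two Gaussian components sharing the same width, plus noise
energy <- seq(0, 20, by = 0.1)
gauss <- function(x, height, mu, sigma) height * exp(-(x - mu)^2 / (2 * sigma^2))
intensity <- gauss(energy, 100, 8, 1) + gauss(energy, 60, 12, 1) +
  rnorm(length(energy), sd = 1)
spectrum <- data.frame(energy, intensity)

# Equal-FWHM constraint: the single parameter `sigma` is used by both components
fit <- nls(intensity ~ h1 * exp(-(energy - mu1)^2 / (2 * sigma^2)) +
                       h2 * exp(-(energy - mu2)^2 / (2 * sigma^2)),
           data = spectrum,
           start = list(h1 = 90, mu1 = 7.8, h2 = 50, mu2 = 12.3, sigma = 1.2))

# Common FWHM of the two components
fwhm <- 2 * sqrt(2 * log(2)) * coef(fit)[["sigma"]]
print(round(coef(fit), 3))
print(round(fwhm, 3))
```

Sharing a parameter reduces the number of free variables, which generally makes fits with many overlapping components more stable.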

Copying fits and refined analysis
An example of peak fitting performance is shown in Fig. 6A. The fit concerns the carbon 1s core-line acquired on Highly Oriented Pyrolytic Graphite (HOPG) exfoliated under vacuum. Graphite being a semimetal, its C 1s line has to be fitted with the Doniach Sunjic (DS) function [31]

$$DS(E) = \frac{\Gamma(1-\alpha)\,\cos\left[\pi\alpha/2 + (1-\alpha)\arctan(E/\gamma)\right]}{\left(E^{2}+\gamma^{2}\right)^{(1-\alpha)/2}}$$

where Γ is the gamma function, γ represents one half of the FWHM of the DS lineshape and α accounts for the asymmetric tail on the high binding energy side of the C 1s generated by electron losses.
Fig. 6A shows the best fit obtained by applying a Shirley background subtraction and using a pure DS function. In this first case the original DS function is unable to describe the tails of the C 1s. Deviations from the original data are observed at ∼283 eV and in the region at ∼288 eV, and the loss features are not correctly described by the DS tail. A much better result is obtained using a DS function plus Gaussian broadening, as reported in Fig. 6B. Observe the tail at lower binding energy, where no deviations from the original spectrum are now present; the loss features at ∼292 eV are also properly described. Fig. 6C shows an example of peak fitting of the C 1s from reduced graphene oxide (rGO). Graphene oxide (GO) is obtained from graphite exfoliation in acids. rGO is obtained from GO flakes by thermal or chemical reduction. In this case the fitting procedure is delicate because the intensity of the tail at high binding energy derives not only from the carbon atoms organized in hexagons as in graphite, but also from the presence of residual oxygen atoms (i.e. carbon-oxygen bonds). To solve the problem, the HOPG fit was first used to describe the graphitic carbon component of rGO and then the contributions of the oxidized carbon components were added. Using the Process Core-Line option, the best fit of the HOPG was superimposed on the rGO carbon core-line. Then a Gaussian broadening was added to the original DS lineshape to describe the presence of defects typical of rGO. The curve asymmetry was kept fixed during the fitting procedure. The higher intensity in the range 286 eV-290 eV was described using three additional Gaussian components assigned respectively to C-OH, O-C-O and -O-(C=O) bonds (see Fig. 6C).
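The behavior of the DS lineshape discussed in this example can be explored numerically. The function below is a sketch of the commonly quoted Doniach Sunjic expression, with Γ the gamma function; the argument names, the sign convention on the binding energy axis and the parameter values are assumptions, and no Gaussian broadening is included.

```r
# Doniach-Sunjic lineshape on a binding energy axis.
# position, half_width (gamma) and asymmetry (alpha) are illustrative names.
doniach_sunjic <- function(binding_energy, position, half_width, asymmetry) {
  # Sign chosen so that the asymmetric tail falls on the high binding energy side
  x <- position - binding_energy
  gamma(1 - asymmetry) *
    cos(pi * asymmetry / 2 + (1 - asymmetry) * atan(x / half_width)) /
    (x^2 + half_width^2)^((1 - asymmetry) / 2)
}

energy <- seq(280, 295, by = 0.05)
symmetric  <- doniach_sunjic(energy, 284.4, 0.3, 0)     # reduces to a Lorentzian
asymmetric <- doniach_sunjic(energy, 284.4, 0.3, 0.15)

# Compare the two sides of the peak, 3 eV away from the nominal position
high_be <- doniach_sunjic(287.4, 284.4, 0.3, 0.15)
low_be  <- doniach_sunjic(281.4, 284.4, 0.3, 0.15)
print(c(high_be = high_be, low_be = low_be))
```

With zero asymmetry the expression reduces to a Lorentzian of half width γ, while a positive α raises the intensity on the high binding energy side.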

Estimation of the valence band top
The upper edge of the valence band (VB) is indicative of the conducting, semiconducting or insulating properties of a material. For example, to understand transport properties across the interface of a semiconductor heterostructure, an accurate knowledge of the valence-band offset ∆VB t is needed. This provides information on the electron-hole recombination rates that determine the efficiency of many semiconductor devices, such as thin-film solar cells. In the following example we show the GUI dedicated to estimating the top of the VB.
Fig. 7A shows the main GUI page. Here the VB background and the region where the VB-top analysis is performed must be defined. This task is easily accomplished through an interactive graphic window where the edges of the background (red circles in Fig. 7A) can be adjusted on the VB spectrum by simply using the mouse. The Confirmation and Reset buttons of the GUI are used to proceed with or restart the analysis. The second GUI page is dedicated to three VB fitting procedures: linear, non-linear and Hill Sigmoid, the last of which is represented in Fig. 7B. Again the graphic interface window presents the VB portion to fit. The values required to fit the VB can be defined just by clicking with the mouse (in this example the Hill Sigmoid maximum, the inflection point and the minimum, represented by the green crosses in Fig. 7B). The Add Hill Sigmoid, Fit and Reset Fit buttons respectively add the sigmoid function through the defined points, perform the best fit and reset the analysis if needed. Finally, on pressing the ''Estimate VB top'' button, the fitted Hill sigmoid is used to compute the valence band top, represented by the orange cross in Fig. 7B.
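The simplest of the three procedures, the linear one, can be sketched as the intersection between a straight line fitted to the leading edge and the background level. This is a generic illustration of linear extrapolation on synthetic data, not the RxpsG implementation; the energy windows and the edge position are assumptions, and the Hill sigmoid method is not reproduced.

```r
# Synthetic valence band edge on a binding energy scale (eV):
# flat background below the edge, linear rise above it
energy <- seq(-2, 8, by = 0.1)
true_edge <- 1.5
intensity <- 5 + 40 * pmax(0, energy - true_edge)
vb <- data.frame(energy, intensity)

# Background level estimated where no valence states contribute
background <- mean(vb$intensity[vb$energy < 0.5])

# Straight line fitted to the leading edge (region chosen by the user)
edge_region <- subset(vb, energy > 2.5 & energy < 4.5)
edge_fit <- lm(intensity ~ energy, data = edge_region)
intercept <- coef(edge_fit)[["(Intercept)"]]
slope <- coef(edge_fit)[["energy"]]

# VB top: intersection of the fitted line with the background level
vb_top <- (background - intercept) / slope
print(vb_top)
```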

Graphical performances
A final example is dedicated to the graphical visualization of data.
Fig. 8 shows some of the possibilities offered by the graphical option Overlay Spectra. The GUI makes it possible to set any of the graphical parameters. Title, axis labels and numbers can be changed and resized to fit the requirements of standard publications. In B/W, different line patterns or symbols can be used (Fig. 8A). Color figures use solid lines and open circles to plot data (Fig. 8B, E). Spectra can also easily be visualized in 3D in two different fashions (Fig. 8C, D) or in ''waterfall'' format (Fig. 8E, F), with the addition of labels and annotations (Fig. 8F).
An additional option offered by Overlay Spectra is the representation using multiple panels. Spectra can be plotted in one graph or in separate graphs. The latter option is useful when figures must be kept separate for clarity, as in the case of spectra shown together with baseline, fitting components and best fit. An example of this situation is shown in Fig. 9, which represents the effect of thermal annealing on the C 1s spectrum acquired on silicon oxycarbide samples. Fig. 9A shows the spectrum of the untreated sample. Raw data are represented in black and fit components with light gray dotted lines, which mask the baseline (a dashed blue line). Finally, the best fit is represented in red. The spectrum name and the data file name are reported in the upper part of the panel to identify the parent sample. Fig. 9B displays the C 1s acquired on the silicon oxycarbide sample annealed at 1000 °C. A strong reduction of the C 1s intensity is observed, likely due to carbide decomposition. The figure conventions are the same as those of Fig. 9A.
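An overlay with vertical offsets can be reproduced with base R graphics, as a rough sketch of what such a plot involves. The spectra, offsets and styles below are synthetic and arbitrary, and the code does not use the lattice-based routines on which RxpsG relies.

```r
# Three synthetic spectra on a common energy axis
energy <- seq(280, 292, by = 0.1)
spectra <- sapply(c(284.5, 285.0, 285.5),
                  function(center) exp(-(energy - center)^2 / (2 * 0.6^2)))
colnames(spectra) <- c("sample A", "sample B", "sample C")

# A constant vertical offset between curves gives a simple "waterfall" layout
offsets <- 0.5 * (seq_len(ncol(spectra)) - 1)
shifted <- sweep(spectra, 2, offsets, "+")

# Draw into a PDF file; the binding energy axis is reversed, as is customary in XPS
plot_file <- tempfile(fileext = ".pdf")
pdf(plot_file, width = 6, height = 4)
matplot(energy, shifted, type = "l", lty = 1:3, col = "black",
        xlim = rev(range(energy)),
        xlab = "Binding energy (eV)", ylab = "Intensity (a.u.)",
        main = "Overlay with vertical offset")
legend("topleft", legend = colnames(spectra), lty = 1:3, col = "black")
dev.off()
```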

Conclusions
RxpsG is an open source project based on the R platform. This ensures that the software is compatible with the most widespread operating systems, such as UNIX, Windows and MacOS. The aim of this package is to provide an easy and powerful tool to analyze XPS and AES data. At present, the software reads data files in VAMAS, PXT and ASCII formats. However, the software is very flexible and its compatibility may be extended to any other data format by implementing an appropriate loading routine. The possibility of reading ASCII data enables the software to be used to analyze any kind of data. Smoothing, background subtraction and peak fitting can easily be performed in RxpsG, as can the visualization of data using different styles. RxpsG is published under an open source GNU license and is an open project. The authors encourage readers to contribute by implementing additional features to satisfy further needs. This will help to keep the software independent of any manufacturer and, at the same time, up to date.

Declaration of competing interest
No author associated with this paper has disclosed any potential or pertinent conflicts which may be perceived to have impending conflict with this work. For full disclosure statements refer to https://doi.org/10.1016/j.softx.2019.100282.