1 INTRODUCTION

Experimental capabilities to determine any physical quantity are limited by the characteristics of the measuring instruments in use. For this reason, directly measured distributions are called instrumental. However, real distributions of physical quantities or at least their estimates are of interest for analysis and physical conclusions. Differences between instrumental and real spectra can be due to finite energy, angular, and spatial resolutions of instruments and to the presence of noise and other sources of false or distorted signals in measuring instruments. Information on such instrumental effects and the possibility to reconstruct them in a mathematical model of an instrument allow one to obtain qualitative estimates of real distributions within a special mathematical approach, where the so-called unfolding problem is formulated [13]. Methods of solving this problem were reviewed in detail in, e.g., [48].

Statistical estimates of measured quantity distributions are obtained using various methods, which can be conditionally classified into two large groups:

• histogram, where the entire range of a physical quantity is divided into discrete intervals (bins);

• binless, not requiring such division.

In this work, we consider unfolding methods applied to histogram representations of measurements because they are widely used in various areas of modern experimental physics, in particular, in experiments on the physics of elementary particles and space physics (see, e.g., [912]).

To reconstruct distributions in the histogram approach, the set of values of a continuous physical quantity (or a neighborhood of the desired value of a discrete quantity) is divided into intervals. In this case, the description of instrumental distortions is reduced to the probabilities of detection of a random variable in different bins under the condition that the real value of the quantity falls in its bin. These probabilities constitute the so-called migration matrix [1, 2]. The distortion of the distribution appearing in the measurement implies the convolution of the real distribution with this matrix. Then, the real distribution is estimated by folding the measured distribution with the inverse migration matrix. However, this estimate is unstable [3, 13, 14]; consequently, it is necessary to choose other approaches.

Some unfolding methods allow one not only to take into account information on the character of distortions but also to use a priori assumptions on the features of the reconstructed distribution of the physical quantity (e.g., continuity or smoothness) [15]. This idea underlies the so-called regularization algorithms [3, 5, 16] widely used to correct one-dimensional distributions in modern experiments [4, 1720].

However, the considered physical quantity in the general case can be multidimensional; e.g., its components can be a set of characteristics of an elementary particle (the energy, mass, direction of motion, etc.). In this case, the adaptation of unfolding methods is complicated and is still unsolved for some algorithms. In this work, we propose an approach to transfer well-proved unfolding algorithms applied in the one-dimensional analysis to the case of multidimensional distributions. This approach is also tested using simulation data for cosmic rays detected by the PAMELA space instrument [21], including the magnetic rigidity and polar and azimuth angles of entry of the particle into the instrument.

2 FORMULATION OF THE PROBLEM AND REVIEW OF METHODS FOR THEIR SOLUTIONS IN THE ONE-DIMENSIONAL CASE

Let the characteristics of N particles be measured in the experiment using the appropriate instruments (the detection of the particle by an instrument is called an event). We divide the set of real values of a quantity into n cells (multidimensional bins) \(({{\Delta }_{1}},{{\Delta }_{2}}, \ldots ,{{\Delta }_{n}})\), where \({{\Delta }_{i}} \subset {{\mathbb{R}}^{k}}\). The discretization of the real distribution of the physical observation is a set of probabilities of entry of the characteristics of the particle into the corresponding bins: \({\mathbf{p}} = ({{p}_{1}},{{p}_{2}}, \ldots ,{{p}_{n}})\). The set of mathematical expectations of the numbers of particles in these bins, \(\boldsymbol{\tau} = ({{\tau }_{1}},{{\tau }_{2}}, \ldots ,{{\tau }_{n}})\), where \({{\tau }_{i}} = N{{p}_{i}}\), is also called the real distribution. The set of measured values of the quantity is also divided into intervals \((\Delta _{1}^{'},\Delta _{2}^{'}, \ldots ,\Delta _{{n'}}^{'})\), where \(\Delta _{i}^{'} \subset {{\mathbb{R}}^{k}}\) (the same division into bins is often used for real and measured values, but such divisions can be different in the general case). The set of the numbers of particles detected in the experiments in the selected bins is denoted as \({\mathbf{m}} = ({{m}_{1}},{{m}_{2}}, \ldots ,{{m}_{{n'}}})\) and is called the measured multidimensional spectrum.

Distortions caused by a finite resolution of instruments, as well as by noise or other instrumental effects, are described by the migration matrix R, where the element \({{R}_{{ij}}}\) is the probability that the real value from the jth bin is recorded as that falling in the ith bin. Then, the mathematical expectations \({\kern 1pt} \boldsymbol{\nu} = ({{\nu }_{1}},{{\nu }_{2}}, \ldots ,{{\nu }_{{n'}}})\) of the numbers of events detected in the selected bins are calculated as ν = Rτ.

The main aim is to develop methods of а consistent statistical estimate of unknowns p or mathematical expectations τ proportional to the unknowns from the measured spectrum m. This problem is called the deconvolution or unfolding problem. The comparative analysis of unfolding methods is of applied interest, including the comparison of methods with each other and the comparison of estimates of the real distribution obtained with different realizations of one method (with different algorithm parameters, methods of selection of bins, and features of the migration matrix) [16, 22].

Under natural assumptions, the sequence of detected events in each bin can be treated as a Poisson flow [2], which makes it possible to obtain maximum likelihood estimates of the realnumber of events in the selected bins. The solution of the corresponding problem is given by the estimate \({\kern 1pt} \hat {\boldsymbol{\tau} } = {{R}^{{ - 1}}}{\mathbf{m}}\) [3, 23], which also corresponds to a “naïve” estimate of the real distribution (i.e., under the assumption that ν = m, the system of equations Rτ = m is solved by inverting the migration matrix). Such an estimate of the real distribution is consistent and unbiased and has the smallest standard deviation among all unbiased estimates [3, 24]. Nevertheless, its standard deviation is often fairly large and the estimate appears unstable (sensitive to perturbations of the measured distribution); for this reason, its application in practice is complicated [3, 13, 14, 23].

Therefore, it is necessary to search for other methods to obtain estimates, in particular, those based on the artificial introduction of bias. In practice, the following two approaches are particularly outstanding [25].

(i) Iterative methods where the statistical estimate of the real distribution is successively refined with the early end of the iterative process. This idea underlies the D’Agostini Bayesian algorithm [26] and some its modifications, as well as some other iterative procedures [27].

(ii) The minimization of the likelihood function with an introduced “penalty” term (idea of regularization). Such an approach is used in the singular value decomposition (SVD) method [28] involving the Ti-khonov regularization, in the TUnfold algorithm [16, 29], and in some other methods [30].

Program implementations of the D’Agostini method, SVD method, and TUnfold algorithm are included in the libraries of the ROOT program package [31], which is used in experimental high-energy physics, as well as in the related RooUnfold package [32], which is designed to solve the unfolding problem. The authors of methods to reconstruct the spectrum use their own program developments implementing particular problem solution algorithms [3337].

The regularization approach is attractive because it can be used to obtain estimates reflecting the expected features of the reconstructed spectrum. In particular, the Tikhonov regularization used in the SVD method reflects the smoothness of the distribution, but the existing implementations of this algorithm are applicable to reconstructing only one-dimensional spectra because the corresponding regularization function is based on the difference between values in the linearly enumerated neighboring bins. At the same time, in the multidimensional case, there is no enumeration of bins where the spatially close bins have close numbers; consequently, the mentioned algorithms in the existing form are inapplicable to reconstructing multidimensional distributions. Nevertheless, it is possible to search for a modification of the regularization function based on the degree of proximity of one- or multidimensional bins, which would allow one to adapt the existing algorithms to a wider class of problems and to increase the accuracy of statistical estimates of one-dimensional spectra. A possible variant of such a modification is proposed in the next section.

3 MULTIDIMENSIONAL REGULARIZATION METHOD

Before the description of the proposed method, we list some main concepts used to adapt the approaches described above.

(i) Migration between bins described by the matrix R should be taken into account and included in the multidimensional case.

(ii) The continuity or smoothness of the real multidimensional distribution should be taken into account. The regularization approach is natural in this context.

(iii) The regularization term corresponding to these features of the spectrum is based on the differences between values of quantities in spatially close bins (it is necessary to distinguish bins that are close or neighboring from those that are not).

(iv) In the limit of one-dimensional quantities, the method should be reduced to existing approaches (e.g., the SVD algorithm), for which features of the application to reconstruct the spectrum are studied.

We search for an estimate of the real distribution τ as a solution of the minimization problem:

$$\Phi (\boldsymbol{\tau} {\kern 1pt} ) = (R\boldsymbol{\tau} - {\mathbf{m}}{{)}^{{\text{T}}}}(R\boldsymbol{\tau} - {\mathbf{m}}) + \alpha S(\boldsymbol{\tau} ) \to \mathop {\min }\limits_{\boldsymbol{\tau}} ,$$

where the first term corresponds to the likelihood function without additional bias (this term is minimal at τ = R–1m), the regularization function S(τ) describes given features of the desired distribution [38], and the numerical parameter α reflects the contribution of the regularization term to the function used to estimate the spectrum.

In the one-dimensional case, neighboring bins are defined naturally: bins are enumerated in the order of increasing characteristic and bins with numbers differing by unity are considered as neighboring. In particular, according to this definition, the spectrum can be estimated using the regularization function

$$S(\boldsymbol{\tau} {\kern 1pt} ) = \sum\limits_{i = 2}^{n - 1} {{({{\tau }_{{i - 1}}} - 2{{\tau }_{i}} + {{\tau }_{{i + 1}}})}^{2}},$$

which reflects the smoothness of the desired distribution [28]. This function can also be written in the form

$$S(\boldsymbol{\tau} {\kern 1pt} ) = (C\boldsymbol{\tau} {{)}^{{\text{T}}}}(C\boldsymbol{\tau} ),$$

where C is the matrix representing the proximity of neighboring bins and has the form

$$C = \left( {\begin{array}{*{20}{r}} 1&{ - 1}&0&0& \ldots &0&0 \\ { - 1}&2&{ - 1}&0& \ldots &0&0 \\ 0&{ - 1}&2&{ - 1}& \ldots &0&0 \\ \ldots & \ldots & \ldots & \ldots & \ldots & \ldots & \ldots \\ 0&0&0&0& \ldots &2&{ - 1} \\ 0&0&0&0& \ldots &{ - 1}&1 \end{array}} \right).$$

The proposed modification of the general approach is based on the regularization function with the matrix that represents complex relations of neighborhood or proximity of bins. For a binary relation (two bins are either neighboring or not) in the one- or multidimensional case, we use the Kirchhoff matrix K = (kij), where

$${{k}_{{ij}}} = \left\{ {\begin{array}{*{20}{l}} {\deg ({{\Delta }_{i}}),}&{{\text{if}}\quad i = j;} \\ { - 1,}&\begin{gathered} {\text{if}}\quad i \ne j,\;{\text{and}}\;{\kern 1pt} {{\Delta }_{i}}\; \hfill \\ {\text{and}}\;{{\Delta }_{j}}\;{\text{are neighboring}}; \hfill \\ \end{gathered} \\ {0,}&{{\text{otherwise}}.} \end{array}} \right.$$

Here, degΔi is the number of bins neighboring the bin Δi. For example, spatial bins having a common segment of the boundary can be considered as neighboring. It is easy to verify that such matrix K in the one-dimensional case with the standard binary neighborhood relation coincides with the above matrix C, and the estimate obtained for the real spectrum corresponds to the results of regularization algorithms described above [28, 29].

The restriction of the binary approach becomes noticeable for multidimensional bins: first, bins that do not have a common segment of the boundary can also been considered as neighboring (close) and, second, the degree of proximity of adjacent bins can be different (and can depend on the size of the adjacent segment, on the sizes of bins themselves, and on other parameters). The development of the proposed idea can lead to nonbinary proximity relations for bins. To  this end, we introduce the weight coefficient wi, Δj) ≥ 0 of the neighborhood of the bins Δi and Δj (if i ≠ j). The binary case corresponds to wi, Δj) = 1 for neighboring bins and wi, Δj) = 0 for non-neighboring bins. However, a flexible definition of the weight coefficients is possible: they can be considered proportional to the degree of proximity defined by a prechosen method (by the adjacent region, size of bins, distance between the centers of gravity of bins, and other parameters). There are no stringent algorithms of choosing elements of the matrix K; the choice of the weight coefficients wi, Δj) depends on the features of the problem under consideration.

In this notation, the regularization function is constructed with the generalized Kirchhoff matrix K whose elements have the form

$${{k}_{{ij}}} = \left\{ {\begin{array}{*{20}{l}} {\sum\limits_{v \ne i} w({{\Delta }_{i}},{{\Delta }_{v}}),}&{{\kern 1pt} {\text{if}}\quad {\kern 1pt} i = j;} \\ { - w({{\Delta }_{i}},{{\Delta }_{j}}),}&{{\kern 1pt} {\text{if}}\quad {\kern 1pt} i \ne j.} \end{array}} \right.$$

Correspondingly, the estimate of the real distribution is obtained as the minimum of the function

$$\Phi (\boldsymbol{\tau} {\kern 1pt} ) = (R\boldsymbol{\tau} - {\mathbf{m}}{{)}^{{\text{T}}}}(R\boldsymbol{\tau} - {\mathbf{m}}) + \alpha {{(K\boldsymbol{\tau} )}^{{\text{T}}}}(K{\kern 1pt} \boldsymbol{\tau} ).$$

For the multidimensional case, the successive (linear) enumeration of bins is used. For this reason, the structure of the function does not change, but relations between bins are determined by the form of the matrix K rather than by their enumeration. Since the proximity relation is now defined for not only for neighboring bins, it is possible to obtain estimates of the distribution for the case of strongly correlated parameters.

The point of minimum is determined as the solution of an overdetermined extended system of equations:

$$\left[ {\begin{array}{*{20}{c}} {R{{K}^{{ - 1}}}} \\ {\sqrt \alpha {\kern 1pt} I} \end{array}} \right]K\boldsymbol{\tau} = \left[ {\begin{array}{*{20}{c}} {\mathbf{m}} \\ 0 \end{array}} \right],$$

where I is the identity matrix.

Since the matrix K is degenerate, it should be perturbed to calculate the inverse matrix and to correctly solve the problem. This can be done by adding a constant ξ to its diagonal elements, i.e., by using the matrix K + ξI instead of the matrix K. Further, the matrix K–1 is calculated under the assumption of this substitution.

To solve the system, we applied the approach used in the initial SVD unfolding algorithm [28]. Let the system of linear equations be represented in the form Aτ = m, where A is a matrix. The singular decomposition of \(l \times k\) matrix A is its representation in the form \(A = US{{V}^{{\text{T}}}}\), where U and V are the orthogonal \(l \times l\) and \(k \times k\) matrices, respectively, and S is the diagonal \(l \times k\) matrix called the singular matrix (diagonal elements\({{s}_{i}}\) are nonnegative and are called singular values). The matrices U and V can be chosen such that the singular values are aligned in decreasing order. If \(l = k\), all matrices are square.

The exact solution of the system of linear equations Aτ = m using the singular decomposition can be described by the following algorithm [28].

(i) The system USVTτ = m is written.

(ii) The substitution z = VTτ reduces the system to the form USz = m.

(iii) The multiplication of the latter system from the left by the inverse matrix U–1= UT gives the system Sz = UTm.

(iv) The substitution d = UTm reduces the system to the form Sz = d.

(v) The matrix S is diagonal; therefore, the components of the solution are constructed as zi = di/si.

(vi) The solution of the system is written in the form τ = Vz.

For ill-conditioned matrices A, di and si are small; hence, small perturbations of these values result in large errors in the solutions zi and the solutions τ of the initial system.

The extended system of equations is solved similarly with the singular expansion, but intermediate solutions zi now have the form [28]

$${{z}_{i}} = \frac{{{{d}_{i}}}}{{{{s}_{i}}}}\frac{{s_{i}^{2}}}{{s_{i}^{2} + \alpha }}.$$

At \(s_{i}^{2} \gg \alpha \), the solutions almost coincide with the considered direct solution of the problem. At small singular values si, the correction coefficient becomes noticeable. The authors of the SVD algorithm [28] recommend to take the square of the “last large” singular value as the regularization coefficient, i.e., to align the diagonal elements si of the matrix S in decreasing order and to take \(\alpha = s_{k}^{2}\), where sksk+ 1.

To reduce errors, the equations of the initial linear system are renormalized before its solution (for each equation to make the same “contribution”) [28].

4 RESULTS

The algorithm was tested in application to the simulation of the proton flux with the parameters corresponding to the Galactic component of cosmic rays, which is measured in the PAMELA space experiment [21]. The magnetic rigidity of the particle and the direction of arrival characterized by the zenith and azimuth angles were measured for each event. The simulation provided a set of pairs of real and measured values of the rigidity and zenith and azimuth angles. The region of real values of these quantities (rigidity from 0.1 to 200 GV, zenith angle from 0° to 45°, and azimuth angle from 0° to 360°) was divided into three-dimensional bins: the considered region of the celestial sphere was divided into domains with nearly equal areas, and the rigidity range was nonuniformly divided in each aperture segment. The resulting set of data was divided into two samples: one of them was used to construct the migration matrix and the other was involved in the test. The three-dimensional real and measured spectra of the considered quantities, as well as the migration matrix, were constructed for the resulting three-dimensional bins.

The three-dimensional spectrum was reconstructed using the proposed modified SVD algorithm. The Kirchhoff matrix was obtained using the binary neighborhood relation for three-dimensional bins: wi, Δj) = 1 if bins Δi and Δj have a nonzero adjacent area and wi, Δj) = 0 otherwise. The square of the largest singular value in the decomposition of the matrix of the regularized system was taken for the regularization parameter α (the procedure of choosing the parameters is described in Section 3).

We consider one-dimensional projections of the three-dimensional spectrum, e.g., on the magnetic rigidity axis for a certain aperture segment. The spectrum reconstructed by the proposed algorithm near the azimuth angle \(\phi = 42^\circ \) and zenith angle \(z = 17^\circ \) is shown by the thick solid line in the upper panel of Fig. 1 in comparison with the real and measured spectra given by the dotted and thin solid lines, respectively. The relative deviation of the measured and reconstructed spectra from the real one is accepted as a measure of accuracy (error) of the method. These deviations for the measured and reconstructed spectra are represented by the empty and filled circles in the lower panel of Fig. 1, respectively.

Fig. 1.
figure 1

Reconstruction of the spectrum in the chosen range of the zenith and azimuth angles. (Upper panel) (Dotted line) Real, (thin solid line) measured, and (thick solid line) reconstructed spectra. (Lower panel) Relative deviations of the (empty circles) measured and (filled circles) reconstructed spectra from the real spectrum.

According to Fig. 1, the proposed unfolding algorithm underestimates the real spectrum at low magnetic rigidities (below 1 GV) and overestimates it at magnetic rigidities from 1.5 to 3 GV. The average error in the considered range from 0.35 to 40 GV decreases from \({{\delta }_{{{\text{meas}}}}} = 0.076\) to \({{\delta }_{{{\text{unf}}}}} = 0.040\). For comparison, the direct algorithm of spectrum regularization (the application of the inverse migration matrix to the measured spectrum) gives the reconstructed spectrum with the average error \({{\delta }_{{{\text{inv}}}}} = 0.088\), which exceeds not only the average errors of the proposed unfolding algorithm but also measurement errors (deviation of the measured spectrum from the real one).

The presented error was calculated for the algorithm with the single division of the sample into two subsamples used to construct the migration matrix and to test it. A deeper analysis of the systematic and statistical errors can be performed in a series of computational experiments each with the division of the initial sample and with further estimation of deviations of the reconstructed spectrum from the real one. This analysis will be reported elsewhere.

The proposed algorithm can also be used to reconstruct the spectrum in the one-dimensional case, more flexibly taking into account the neighborhood relation between bins. The only physical characteristic studied for each particle is the magnetic rigidity. Similar to the multidimensional case, the performed numerical simulation provides a set of pairs of real and measured values of the rigidity. Further, the real and measured spectra are constructed in a similarly specified range of rigidity nonuniformly divided into intervals. The migration matrix is constructed for this division and simulation data. The algorithm is applied to the measured rigidity distribution in a test sample, and the reconstructed spectrum is compared to the corresponding real spectrum (for the test sample).

The spectrum was reconstructed using the proposed algorithm, where the weight coupling coefficient between bins decreases exponentially as \(w({{\Delta }_{i}},{{\Delta }_{j}}) = c{{q}^{{ - |i - j|}}}\) (we consider the case c = q = 2) with an increase in the distance between them (in number). The result was compared to the estimate obtained by the initial SVD method [28]. Figure 2 shows the relative differences (in each bin) of the spectra reconstructed using the (solid line) proposed algorithm and (dash-dotted line) SVD algorithm from the real spectrum corresponding to the horizontal dotted line. For the correct representation, we excluded the first bin, where both algorithms give a high relative error.

Fig. 2.
figure 2

(Color online) Errors of the spectrum reconstruction with the (dash-dotted line) classical SVD method and (solid line) proposed modified SVD method versus the bin number.

It is seen that the proposed algorithm suppresses large oscillations in the estimated spectrum at high magnetic rigidities (right side of the spectrum). The average relative deviation is also reduced compared to the deviation for the classical SVD algorithm. Thus, the proposed algorithm provides a better reconstruction of the spectrum also in the one-dimensional state.

It is noteworthy that the developed algorithm involves the regularization term representing, in particular, the continuity and smoothness in each of the parameters of the multidimensional distribution. Consequently, it is applicable for such distributions, which is confirmed by the performed analysis. In particular, spectra of cosmic rays that are characterized by different exponents in different energy ranges are distributions with the listed properties. The estimate of the real distribution reconstructed with the developed algorithm preserves these kinks in the spectrum.

5 CONCLUSIONS

To summarize, an approach has been proposed to adapt the singular value decomposition (SVD) unfolding regularization algorithms (including the SVD algorithm widely used in the physics of elementary particles) to the application in a wider class of cases. In particular, we have described and applied an unfolding algorithm that allows the reconstruction of smooth multidimensional spectra and can also be used in the one-dimensional case. The possibility to apply the proposed algorithm to distributions characterized by, e.g., nonmonotonic changes in one or several variables should be analyzed separately.

The proposed modified algorithm at this stage with the initial data considered in this work is not worse and is even better than the initial algorithms in accuracy of the spectrum reconstruction (e.g., in average relative error in bins). For the multidimensional case, it also allows one to obtain sufficiently accurate estimates of spectra for the migration matrix that do not satisfy stringent requirements of diagonality. Consequently, there is the expectation that the proposed approach is promising and can be further adapted to solve relevant applied problems.