The problem of identifying an unknown substance by the radiographic method

This paper considers the problem of partial identification of the chemical composition of an unknown medium by multiple X-raying of this medium. The sample of the unknown substance is assumed to be homogeneous in its chemical composition, and the photon flux is assumed to be collimated both in direction and in energy. A mathematical model for the identification problem is formulated. The proposed approach to solving the problem is based on the singular value decomposition of a matrix. At the first stage of the solution, the problem is reduced to finding singular values and singular vectors for a series of systems of algebraic equations that are linear with respect to products of the unknown quantities. Then, based on the data obtained, a special function is constructed, called the indicator of the distinguishability of substances, which yields sufficient conditions for the distinguishability of various substances. Based on tabular data, calculations were made for a number of specific groups of chemical elements.


Introduction
The problem of identifying a substance and the problem of finding the chemical composition of a substance by radiographic methods are interesting from a theoretical point of view and are of undoubted practical value. Radiographic methods are useful when nondestructive testing of a product is required or when direct access to the object of study is difficult or undesirable. The number of scientific publications on this topic by both Russian and foreign researchers remains quite high; we note among them [1,2]. At the same time, authors use different approaches to the problem under consideration, and these approaches may differ significantly depending on the specific situation. In this paper, the problem of identifying an unknown substance is formulated, and the results of studying the behavior of some singular values associated with it are presented. An estimate is also given of the maximum relative error of radiation measurement at which the identification problem is successfully solved for some specific groups of substances.

Preliminary remarks and problem statement
In what follows, we assume that the investigated sample $G_0$ of the unknown substance $Z_0$ is homogeneous in its chemical composition and that all chemical elements that make up $Z_0$ are present in a predetermined, known list of elements $X_1, X_2, \dots, X_M$. The sample is irradiated with a photon flux collimated both in direction and in energy and travelling along a fixed straight line. The use of collimators before and after the substance under study makes it possible to isolate from the initial radiation flux mainly those photons that did not interact with the substance. We neglect the interaction of photons with air and assume that the flux is attenuated only on the section of the trajectory of length $l$ that passes through the sample. In each measurement experiment, all photons have an energy from a fixed (discrete) set of radiation energies

$$E_1, E_2, \dots, E_{\bar K}. \tag{1}$$

When carrying out calculations for specific substances and energies, we used numerical data taken from [3], where, in particular, all the information we need is given for 20 energy values. Therefore, we restrict ourselves to the case $2 \le M \le \bar K = 20$. Let $h_k = h(E_k)$ and $H_k = H(E_k)$ be the densities of the radiation fluxes entering and leaving $G_0$ at the energy $E_k$, $k = 1, 2, \dots, \bar K$; let $\mu_0 = \mu_0(E_k)$ be the radiation attenuation coefficient of the substance $Z_0$, and $\mu_i = \mu_i(E_k)$, $i = 1, 2, \dots, M$, the radiation attenuation coefficients of $X_1, X_2, \dots, X_M$; let $\rho_0$ be the density of the substance $Z_0$, $\rho_i$ the density of $X_i$, and $w_i$ the mass fraction of the element $X_i$ in the substance $Z_0$. Taking into account the physics of the radiation transfer process, we always assume that $h_k, H_k, \mu_0, \mu_i, \rho_0, \rho_i > 0$; $w_i \ge 0$; $i = 1, 2, \dots, M$; $k = 1, 2, \dots, \bar K$. Having X-rayed the sample $G_0$ at different energies $E'_1, E'_2, \dots, E'_M$, all of which are included in the list of energies (1), one obtains the following system of equations and conditions [4]:

$$l\,\rho_0 \sum_{i=1}^{M} \frac{\mu_i(E'_j)}{\rho_i}\, w_i = \ln\frac{h(E'_j)}{H(E'_j)}, \quad j = 1, \dots, M, \tag{2}$$

$$\sum_{i=1}^{M} w_i = 1, \quad w_i \ge 0, \quad i = 1, \dots, M. \tag{3}$$
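The attenuation model behind this system can be illustrated with a short sketch. It assumes the usual exponential attenuation law, $H = h\,e^{-\mu_0 l}$, together with the mixture rule $\mu_0/\rho_0 = \sum_i w_i\,\mu_i/\rho_i$; both are assumptions of this illustration rather than formulas quoted from [4]. The coefficient values, the function names and the sample composition are hypothetical placeholders, not data from [3].

```python
import math

# Hypothetical mass attenuation coefficients mu_i / rho_i in cm^2/g at two
# energies. Placeholder values for illustration only, not the data of [3].
MASS_ATTENUATION = {
    "C": [0.0636, 0.0445],
    "H": [0.1263, 0.0875],
}


def log_attenuation(rho0, mass_fractions, length_cm=1.0):
    """Return b_j = ln(h_j / H_j) for every energy, using the mixture rule."""
    n_energies = len(next(iter(MASS_ATTENUATION.values())))
    b = []
    for j in range(n_energies):
        mixture = sum(w * MASS_ATTENUATION[element][j]
                      for element, w in mass_fractions.items())
        b.append(length_cm * rho0 * mixture)
    return b


def outgoing_flux(incoming, b):
    """Attenuate the incoming flux densities h_j to H_j = h_j * exp(-b_j)."""
    return [h * math.exp(-b_j) for h, b_j in zip(incoming, b)]


# Hypothetical hydrocarbon: density 0.7 g/cm^3, 84% carbon and 16% hydrogen.
b = log_attenuation(0.7, {"C": 0.84, "H": 0.16})
H = outgoing_flux([1000.0, 1000.0], b)
```

In this sketch the measured pair (incoming flux, outgoing flux) at each energy determines one logarithm $\ln(h/H)$, which is the only way the measurements enter the linear system.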
We denote by $E^{(n)} = (E^{(n)}_1, E^{(n)}_2, \dots, E^{(n)}_M)$ the vector formed by energies $E^{(n)}_1 < E^{(n)}_2 < \dots < E^{(n)}_M$ from set (1). There are $C_{20}^{M} = 20!/(M!\,(20 - M)!)$ such vectors in total, and they are ordered lexicographically. The superscript $n$ in $E^{(n)}$ is the ordinal number of the vector in this ordering. By $N = N(M)$ we denote the set of positive integers $\{1, 2, \dots, C_{20}^{M}\}$. We rewrite (2) as $l\,Ay = b$, or

$$l \sum_{i=1}^{M} a_{ji}\, y_i = b_j, \quad j = 1, \dots, M, \tag{4}$$

where $a_{ji} = a_i(E'_j) = \mu_i(E'_j)/\rho_i$, $y_i = \rho_0 w_i$, $b_j = b(E'_j) = \ln\bigl(h(E'_j)/H(E'_j)\bigr)$. We consider (4) as a system of linear algebraic equations in which the matrix $A$ and the vector $b = (b_1, \dots, b_M)^T$ are known, and $y = (y_1, \dots, y_M)^T = (\rho_0 w_1, \dots, \rho_0 w_M)^T$ is an unknown vector (the superscript $T$ hereinafter denotes transposition). The specific value of the known length $l$ is insignificant for us; therefore, to simplify the notation, we assume $l = 1$ cm from now on. The components of the vector $y$ are then measured in g/cm$^2$. As previous studies [4] have shown, for almost any choice of the set of energies $E'_1, E'_2, \dots, E'_M$ and of the chemical elements $X_1, X_2, \dots, X_M$, the matrix $A$ is non-degenerate but ill-conditioned. In what follows, we denote the solution space of system (4) by $Y$ and assume it to be Euclidean with the natural (canonical) orthonormal basis $e_1, e_2, \dots, e_M$, so that $y = \sum_{i=1}^{M} y_i e_i = \sum_{i=1}^{M} \rho_0 w_i e_i$ and the norm is $\|y\| = \bigl(\sum_{i=1}^{M} (\rho_0 w_i)^2\bigr)^{1/2}$.

Next, we compare the unknown substance $Z_0$ with the substances known to us, which are included in the list of substances $\mathcal{L} = \{Z_1, Z_2, \dots, Z_P\}$, $P \ge 1$. We assume that $Z_0$ and each $Z_p \in \mathcal{L}$ contain only chemical elements from the list $X_1, X_2, \dots, X_M$. Let us explain in which case we consider the substances $Z_0$ and $Z_p \in \mathcal{L}$ to be different. Let …

Identification problem. Suppose that, for a substance $Z_0$ of unknown composition, X-raying at a certain set of energies $E'_1, E'_2, \dots$ and the accompanying measurements have yielded the values $l$, $h(E'_j)$, $H(E'_j)$ for each of these energies.
It is required to establish whether the substance $Z_0$ is included in the list $\mathcal{L} = \{Z_1, Z_2, \dots, Z_P\}$ of substances of known chemical composition (that is, whether the equality $Z_0 = Z_p$ holds for some $p$). Let us give some explanations.
1) The list $\mathcal{L}$ may include, for example, explosive, poisonous and hazardous substances prohibited for transport. Lists of such substances are, as a rule, known in advance and are relatively small. If the unknown substance $Z_0$ is not in the list $\mathcal{L}$, it can be transported, and its chemical composition is not significant. If the substance $Z_0$ cannot be distinguished from some substance $Z_p \in \mathcal{L}$, it must be studied by other methods. This approach to the problem is quite acceptable, for example, during the customs inspection of cargo.
2) The problem does not require determining the values $\rho_0, w_1, \dots, w_M$, i.e. finding the chemical composition of the substance $Z_0$.
3) The energies at which the sample $G_0$ is X-rayed can be chosen arbitrarily, but all of them must be included in list (1). The number of these energies can also be arbitrary. Each measurement experiment yields (with some measurement errors) a pair of values $h$, $H$, and for definiteness we assume that only one measurement experiment is carried out at each energy $E'_j$.
If there were no errors in the measurements of the quantities $h$ and $H$, then, after solving the (non-degenerate) system (4) and using relations (3), all the unknowns $\rho_0, w_1, \dots, w_M$ could easily be found.
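The error-free case can be illustrated with a minimal sketch for two elements. It assumes $l = 1$ cm, unknowns $y_i = \rho_0 w_i$, and the normalization $\sum_i w_i = 1$, so that $\rho_0 = \sum_i y_i$ and $w_i = y_i/\rho_0$. The matrix entries, the function names and the sample composition are hypothetical and are not taken from [3].

```python
def solve_2x2(a, b):
    """Solve the 2x2 linear system a @ y = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    if det == 0.0:
        raise ValueError("matrix is degenerate")
    y1 = (b[0] * a[1][1] - a[0][1] * b[1]) / det
    y2 = (a[0][0] * b[1] - a[1][0] * b[0]) / det
    return [y1, y2]


def composition_from_y(y):
    """Recover the density rho0 and the mass fractions w_i from y_i = rho0 * w_i."""
    rho0 = sum(y)  # follows from sum(w_i) = 1
    return rho0, [y_i / rho0 for y_i in y]


# Hypothetical matrix of mass attenuation coefficients (rows: energies).
A = [[0.0636, 0.1263],
     [0.0445, 0.0875]]
y_exact = [0.7 * 0.84, 0.7 * 0.16]  # hypothetical substance
b_exact = [sum(A[j][i] * y_exact[i] for i in range(2)) for j in range(2)]

rho0, w = composition_from_y(solve_2x2(A, b_exact))
```

With exact right-hand sides the composition is recovered up to rounding. The determinant of such a matrix is small compared with its entries, which is one way to see why measurement errors in the right-hand side are strongly amplified.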

Estimation of measurement errors and perturbation of the solution of system (4)
Let us give the necessary explanations regarding measurement errors. Let the sample $G_0$ be X-rayed at some energy $E'$ from list (1), so that $\sum_{i=1}^{M} a_i y_i = b = \ln(h/H)$. Let $\tilde h = h + \delta h$, $\tilde H = H + \delta H$, where $h$ and $H$ are the exact (unknown to us) values of the densities of the radiation fluxes entering and leaving $G_0$, and $\tilde h$ and $\tilde H$ are the same quantities found from measurements with some unknown errors $\delta h$ and $\delta H$. Let $\tilde b = b + \delta b$ be the corresponding representation for the right-hand side, so that $\tilde b = \ln(\tilde h/\tilde H)$, $\delta b = \tilde b - b = \ln(\tilde h/\tilde H) - \ln(h/H)$. Let $\varepsilon_h$ and $\varepsilon_H$ be the maximum relative measurement errors of the incoming and outgoing radiation that the measuring devices allow for any energy $E'$; then $|\delta h|/h \le \varepsilon_h$, $|\delta H|/H \le \varepsilon_H$. We always assume that $0 \le \varepsilon_h + \varepsilon_H < 1$. Using the results of [5] and denoting the total relative measurement error by $\varepsilon = \varepsilon_h + \varepsilon_H$, we can obtain estimate (5) for $|\delta b|$ in terms of $\varepsilon$. Hence, we obtain a bound on $\|\delta b\| = \bigl[\sum_{j=1}^{M} (\delta b_j)^2\bigr]^{1/2}$. In what follows, we assume that the quantity $\varepsilon$ is much less than unity and, to simplify the formulas, that for any set of energies $E^{(n)} = (E^{(n)}_1, E^{(n)}_2, \dots, E^{(n)}_M)$ for which system (4) is considered, the (approximate) relation $\|\delta b\| \le \sqrt{M}\,\varepsilon = \hat\varepsilon$ holds.
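The quality of the approximation $\|\delta b\| \le \sqrt{M}\,\varepsilon$ for small errors can be checked numerically. The sketch below does not reproduce the estimate from [5]; it uses the elementary identity $\delta b = \ln(1 + \delta h/h) - \ln(1 + \delta H/H)$, whose extreme values are reached when the two relative errors have opposite signs and maximal magnitude. The function names are hypothetical.

```python
import math


def worst_case_delta_b(eps_h, eps_H):
    """Largest possible |delta b| when |dh|/h <= eps_h and |dH|/H <= eps_H."""
    if eps_h < 0.0 or eps_H < 0.0 or eps_h + eps_H >= 1.0:
        raise ValueError("relative errors must satisfy 0 <= eps_h + eps_H < 1")
    # Incoming flux overestimated, outgoing flux underestimated.
    upper = math.log1p(eps_h) - math.log1p(-eps_H)
    # Incoming flux underestimated, outgoing flux overestimated.
    lower = math.log1p(eps_H) - math.log1p(-eps_h)
    return max(upper, lower)


def rhs_norm_bound(eps_h, eps_H, m):
    """Bound on the Euclidean norm of delta b for m independent measurements."""
    return math.sqrt(m) * worst_case_delta_b(eps_h, eps_H)


eps = 1e-3                                      # total relative error
exact = rhs_norm_bound(eps / 2, eps / 2, m=2)   # elementary worst case
approx = math.sqrt(2) * eps                     # approximate relation of the text
```

For errors of this size the elementary worst case exceeds $\sqrt{M}\,\varepsilon$ only in the higher-order terms, which is consistent with calling the relation approximate.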
Let us estimate the perturbation of the solution of system (4) in the case when all perturbations of the right-hand side of (4) belong to the unit ball $\Omega = \{\delta b \mid \delta b \in \mathbb{R}^M, \|\delta b\| \le 1\}$. Further, unless … It was shown in [8] that the boundary $\partial\Pi$ of the set $\Pi \subset Y$ is defined by the equality $\partial\Pi = \{\omega \cdot \Phi(\omega) \mid \omega \in S^{[M-1]}\}$, that is, it coincides with the graph of the function $\omega \cdot \Phi$ in polar coordinates.

Now let $y^{(0)}$ and $y^{(1)}$ be the points of $Y$ corresponding to the unknown substance $Z_0$ and a known substance $Z_1 \in \mathcal{L}$ (we do not know $y^{(0)}$), $y^{(0)} \ne y^{(1)}$; let $\omega^{(1,0)}$ be the unit vector along $y^{(1)} - y^{(0)}$; let $\tilde y^{(0;n)}$ be the solution of system (4), found with an error, from the results of X-raying $G_0$ on some set of energies $E^{(n)}$; and let $D^{(n)} = [A(E^{(n)})]^{-1}\Omega$, $\hat\varepsilon = \sqrt{M}\,\varepsilon$. Then, in the above notation, the following statement is true [8].

Statement. …

Taking into account that $\hat\varepsilon = \sqrt{M}\,\varepsilon$, from item (c) and formula (6) we can obtain a simple (approximate) sufficient condition (7) under which the inequality $Z_0 \ne Z_1$ is always established from the results of one series of X-raying on the set of energies $E^{(n_1)}$, where $n_1 = n_1(\omega^{(1,0)}) \in N(M)$ is a number at which the minimum in (6) is reached. For example, for $M = 2$, the elements hydrogen and carbon, and the pair of substances $Z_0$, divinyl (CH$_2$)$_2$(CH)$_2$, and $Z_1$, isooctane (CH$_3$)$_3$CCH$_2$CH(CH$_3$)$_2$, using the data from [3] and formula (6), one obtains from (7) the value 2.3159 and the bound $\varepsilon < 8.33 \cdot 10^{-3}$. It is clear from the above that the function $\Phi$ is very useful for analyzing how well (or how badly) two specific substances can be distinguished; therefore, it can be called the indicator of the distinguishability of substances [8]. Before proceeding to the description of the calculations performed, we give one more explanation.
We assume that for any $M$ and $n \in N(M)$ the singular values of the matrix $A = A(M, E^{(n)})$ are always numbered so that $\sigma^{(n)}_1 \le \sigma^{(n)}_2 \le \dots \le \sigma^{(n)}_M$; then

$$[\sigma^{(n)}_1]^{-1} \ge [\sigma^{(n)}_2]^{-1} \ge \dots \ge [\sigma^{(n)}_M]^{-1}. \tag{8}$$

It is clear that for every $M$ and $n \in N(M)$ the numbers $[\sigma^{(n)}_1]^{-1}, \dots, [\sigma^{(n)}_M]^{-1}$ are the lengths of the semiaxes of the perturbation ellipsoid of the solution of system (4) in the case when the norm of the perturbation of the right-hand side is $\|\delta b\| = 1$. Let us fix $M$, and let $n' \in N(M)$ be a number such that … For the number $n = n'$, the set of numbers (8) is denoted by $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_M$, and for the number $n = n''$, a similar set of numbers (8) is denoted by $\nu_1 \ge \nu_2 \ge \dots \ge \nu_M$. In [8], it was shown that the number $\lambda_1$ determines the best guaranteed accuracy of solving the problem of finding the chemical composition. For the identification problem, the best result largely depends on the number $\nu_M$.
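The relation between singular values and the semiaxes of the perturbation ellipsoid can be sketched for two elements, where the singular values of a 2x2 matrix follow in closed form from the trace of $A^T A$ and the determinant of $A$. The table of coefficients is a hypothetical placeholder, not the data of [3], and the selection rule at the end (the smallest longest semiaxis) is only one possible criterion, assumed here for illustration.

```python
import itertools
import math

# Hypothetical mass attenuation coefficients in cm^2/g: one row per energy,
# columns are the two elements (C, H). Placeholder numbers only.
TABLE = [
    (0.1514, 0.2944),
    (0.0870, 0.1729),
    (0.0636, 0.1263),
    (0.0445, 0.0875),
]


def singular_values_2x2(a):
    """Return (s_min, s_max) of a non-zero 2x2 matrix."""
    t = sum(x * x for row in a for x in row)            # trace of A^T A
    d = abs(a[0][0] * a[1][1] - a[0][1] * a[1][0])      # |det A| = s_min * s_max
    s_max = math.sqrt((t + math.sqrt(max(t * t - 4.0 * d * d, 0.0))) / 2.0)
    return d / s_max, s_max


def semi_axes(table):
    """Map the ordinal n of each energy pair to (longest, shortest) semiaxis."""
    axes = {}
    pairs = itertools.combinations(range(len(table)), 2)  # lexicographic order
    for n, (j1, j2) in enumerate(pairs, start=1):
        s_min, s_max = singular_values_2x2([table[j1], table[j2]])
        # Assumes a non-degenerate matrix, so that s_min > 0.
        axes[n] = (1.0 / s_min, 1.0 / s_max)
    return axes


axes = semi_axes(TABLE)
# Energy pair whose perturbation ellipsoid has the smallest longest semiaxis.
n_best = min(axes, key=lambda n: axes[n][0])
```

The longest semiaxis bounds the error of the solution in the worst direction, while the shortest one describes the most favorable direction, which is the one that matters when two specific points of the solution space have to be separated.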

Some results of the performed calculations
When performing the calculations, two main objectives were set. The first was to elucidate the behavior of the sequences of values of the numbers $\nu_M$ and $\sqrt{M} \cdot \nu_M$, $M = 2, 3, \dots$, since the minimal possible error in solving system (4) depends on them. The second objective was to calculate the values of the quantities $\varepsilon^*$ (see formula (7)) for some specific groups of chemical elements and pairs of substances consisting of these elements. For the first part of the work, a computer program was written which, based on the data from tables [3], formed for some groups of chemical elements all possible matrices $A = A(M, E^{(n)})$, $M = 2, 3, \dots, 10$, $n \in N(M)$, found their singular values and singular vectors, and then performed all the necessary calculations. At $M = 2$, the group of chemical elements consisted of hydrogen and carbon. Then, as $M$ increased, new elements were added to them, and at $M = 10$ the group of elements {H, C, N, O, F, Na, Mg, P, S, Cl} was obtained. Some of the results obtained in this case are presented in table 1. … corresponding to the given $M$. As $M$ grows, the numbers $\nu_M$ decrease. The numbers $\sqrt{M}\,\nu_M$ in the third line also decrease. It is these numbers that are most important for us, since they are included in the denominator of formula (7) at $\omega^{(1,0)} = \omega_\bullet(M)$. A decrease in $\sqrt{M}\,\nu_M$ with increasing $M$ indicates that, as the dimension of the identification problem increases, the requirements for the accuracy of radiation measurements will probably not increase, and the success of solving the identification problem for a pair of substances $Z_0$ and $Z_1$ will depend mainly on the distance $\|y^{(1)} - y^{(0)}\|$ between the points $y^{(1)}$ and $y^{(0)}$ and on the relative position of the unit vectors $\omega^{(1,0)}$ and $\omega_\bullet(M)$ (see formula (6)).
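The amount of work such a program has to do can be estimated with a short sketch: for every number of elements from 2 to 10 it counts the energy vectors that can be chosen from 20 tabulated energies, and it finds the ordinal of a given vector in lexicographic order. The function and variable names are hypothetical; energies are represented by their indices 1 to 20.

```python
import itertools
import math

N_ENERGIES = 20

# Number of energy vectors, hence of matrices, for each number of elements M.
counts = {m: math.comb(N_ENERGIES, m) for m in range(2, 11)}
total_matrices = sum(counts.values())


def ordinal(indices, n_energies=N_ENERGIES):
    """1-based position of an increasing index tuple in lexicographic order."""
    target = tuple(indices)
    all_vectors = itertools.combinations(range(1, n_energies + 1), len(target))
    for position, vector in enumerate(all_vectors, start=1):
        if vector == target:
            return position
    raise ValueError("indices must be increasing and within range")
```

`itertools.combinations` emits tuples in lexicographic order when its input is sorted, so the enumeration index coincides with the ordinal number used to label the energy vectors.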
The second objective of the calculations was to find the values of $\varepsilon^*$ (see inequality (7)) for some sets of substances. At $M = 2$, a group of 40 hydrocarbons was selected. Figure 1 shows a scatter plot of the exact positions of these 40 substances in the solution space $Y$. If $\rho$, $w_1$, $w_2$ are the exact values of the density and of the mass fractions of carbon and hydrogen in a substance, then $y = (y_1, y_2) = (\rho w_1, \rho w_2)$ is the corresponding point in $Y$. Recall that $l = 1$ cm; therefore the components of the vector $y$ are measured in g/cm$^2$. The closed curve resembling a strongly flattened ellipse is the boundary of the set $(1, 0.1) + 2\hat\varepsilon \cdot \Pi$ for the case when the amplitude of the relative error of a single measurement is $\varepsilon = 10^{-3}$ (see formula (5)). The number 559 marks the point corresponding to isobutane (CH$_3$)$_2$CHCH$_3$, and the number 424 marks anthracene C$_{14}$H$_{10}$. The point $(1, 0.1)$ does not correspond to any substance and is chosen only as a convenient location for the center of the set $2\hat\varepsilon \cdot \Pi$. Figure 2 shows a fragment of figure 1 enlarged approximately 10 times.

Table 2 shows the following. For $M = 2$, 40 substances were used in the calculations; they form 780 different pairs $\{Z_p, Z_q\}$, $p < q$. With the maximum relative error $\varepsilon^* = 3 \cdot 10^{-2}$, 33 pairs are well distinguishable; they make up 4.23% of all pairs. With $\varepsilon^* = 10^{-2}$, 316 pairs are well distinguishable; they already account for 40.5% of all pairs. As $\varepsilon^*$ decreases further, the proportion of well-distinguishable pairs increases; however, even with $\varepsilon^* = 5 \cdot 10^{-4}$, a certain number of poorly distinguishable pairs remains. They include, for example, the pair of isomers C$_6$H$_{10}$ formed by the substances with numbers 565 and 563. To achieve good distinguishability of the substances of this pair, the (limiting) value $\varepsilon^* = 6.628 \cdot 10^{-5}$ is required. Both of these substances are marked in figure 2. For $M = 3$ and $M = 4$, we used 23 substances in the calculations; they form 253 pairs.
As in the case $M = 2$, the columns of table 2 for these cases indicate the number of well-distinguishable pairs for different values of $\varepsilon^*$. At $M = 5$, only one pair of substances was found in the database compiled by the author: the poisonous substances sarin (CH$_3$)$_3$O$_2$PF and soman (CH$_3$)$_5$C$_2$O$_2$PFH. This pair becomes well distinguishable when $\varepsilon^* < 5.87 \cdot 10^{-3}$.
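Counting well-distinguishable pairs can be sketched as follows. The sketch does not use the indicator function of the paper; for a single set of energies it applies a simpler sufficient condition that follows from the triangle inequality: if the right-hand side error is bounded by $\sqrt{M}\,\varepsilon$ and $\|A(y_p - y_q)\| > 2\sqrt{M}\,\varepsilon$, then the measured data cannot be consistent with both points. The matrix, the points and all names are hypothetical.

```python
import itertools
import math


def limiting_error(a, y_p, y_q):
    """Largest eps for which ||A (y_p - y_q)|| > 2 * sqrt(M) * eps can hold."""
    m = len(y_p)
    diff = [p - q for p, q in zip(y_p, y_q)]
    image = [sum(row[i] * diff[i] for i in range(m)) for row in a]
    return math.sqrt(sum(v * v for v in image)) / (2.0 * math.sqrt(m))


def count_distinguishable(a, points, eps):
    """Return (number of pairs separated at error level eps, number of all pairs)."""
    pairs = list(itertools.combinations(sorted(points), 2))
    good = [pair for pair in pairs
            if limiting_error(a, points[pair[0]], points[pair[1]]) > eps]
    return len(good), len(pairs)


# Hypothetical matrix of mass attenuation coefficients and hypothetical points
# y = (rho * w_C, rho * w_H) in g/cm^2 for three unnamed substances.
A = [[0.0636, 0.1263],
     [0.0445, 0.0875]]
POINTS = {"P": (0.60, 0.10), "Q": (0.55, 0.12), "R": (0.601, 0.1001)}

good, total = count_distinguishable(A, POINTS, eps=1e-4)
share = 100.0 * good / total  # percentage of separated pairs
```

The pair counts quoted in the text follow the same bookkeeping: 40 substances give `math.comb(40, 2)` pairs and 23 substances give `math.comb(23, 2)` pairs.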
In conclusion, we note that the numerical results obtained refer only to the sets of substances consisting of the groups of chemical elements considered here. First of all, these are numerous organic compounds. In practice, the relevant group of chemical elements may well be different, and calculations for it may give different results. The results can also differ noticeably from those obtained here when the energy interval in which the problem is considered is changed. Preliminary studies carried out for the same groups of chemical elements indicate that expanding the energy range considered here from [0.1, 20] MeV to [0.001, 20] MeV should significantly improve the results obtained in this study.