Title : A Note on the Substitution Reaction Network of Azines

A total of 13 different azines, derived by successive substitution of the ring −CH− groups of benzene by N atoms, are investigated. All the species are placed at different nodes of the substitution-reaction network diagram (Hasse diagram), following the substitution pattern. The ground-state ab initio total energies and the corresponding internal energies of all 13 species are computed. All these energies, along with the aromaticity index NICS(0) values taken from the literature, are noted to follow a partial ordering. The total energies, associated internal energies, and NICS(0) values are systematically analyzed by the "poset average" method, the cluster-expansion method, and the least-square method. We also utilize the cluster-expansion method and the "splinoid model" to predict the densities and refractive indices of six azines for which experimental data are unavailable. The results obtained from the two methods agree well with each other and with those obtained from standard software methods. The very good agreement between the computed results and those determined from our analysis supports the idea of considering "partial ordering" as a new chapter in stereochemistry. (doi: 10.5562/cca2321)


INTRODUCTION
During the last four decades, various studies on reaction networks 1 have contributed immensely to the area of "combinatorial chemistry". Broadly, these studies aim at two goals: first, to generate a family of chemical compounds by studying a "guided" set of reactions; and second, to systematically study the properties of the different chemical compounds situated at the nodes of a reaction network having a "precursor-product" type relationship. Among the many reaction networks, those involving degenerate rearrangements have attracted much attention from chemical graph theorists. Many researchers have made significant contributions to this field, but one of the pioneers was Balaban, who has also written a complete review article on this topic. 2 Apart from this, many biochemical reaction networks are cyclic in nature. There also exist many networks with an overall direction, which provides a mathematical insight of potential use. In this note, we focus mainly on such "directed" reaction networks. A nice example is a substitution reaction network, which considers progressive substitution of a fixed molecular skeleton at several locations, one after another, keeping all previous substitutions intact. For the present note, we illustrate this by taking the example of the substitution reaction network of chlorobenzene, 6 where an arrow in the Hasse diagram, directed from a structure ξ to another structure ζ having a "precursor-product" type relationship, represents a single minimal step of substitution. 6 In other words, each element of the poset P is placed at a node, or vertex, and two such nodes are connected by a directed line if there is no intermediate element. This Hasse diagram does not show a line for each partially ordered pair, but the full posetic relation follows via "transitive extension", where ξ ≼ ζ if either ξ = ζ or ξ is connected to ζ via a sequence of downward directed lines. 5,6
It is quite natural to think that different physical and chemical properties of the elements of a poset might be ordered in consonance with the ordering of the poset. 7-15 It is also worthwhile to mention here a special issue of MATCH Communications in Mathematical and Computer Chemistry and a book on partial order in environmental sciences and chemistry. 4,16 In this note, we focus on a poset where the ring carbons of benzene are substituted by nitrogen, resulting in the formation of different azabenzenes, or azines. Starting from benzene, a total of 13 members have been reported. The full poset is shown in Figure 1. Here we consider five different properties of the azines, namely, total electronic energy, internal energy, Nucleus-Independent Chemical Shift (NICS), density and refractive index. Of these five, the first three properties are obtained for each node of the poset via quantum chemical calculations. We use the UB3LYP/6-311+G(d,p) level of theory to compute the ab initio energies and related internal energies of each azine species in the poset shown in Figure 1. The NICS(0) values are taken from the work by Schleyer et al., where these values are calculated at the PW91/IGLOIII/B3LYP/6-311+G(d,p) level of theory. 17 For density and refractive index, the known experimental data for the first five members of the poset are used to estimate the unknown densities and refractive indices of the remaining members of the poset, details of which are discussed in the sections to follow.

METHODOLOGIES
The posetic structure, as manifested in the Hasse diagram, may be utilized to explicate properties and data through poset-sensitive correlations.

Poset Average Method
In the "poset-average" method 6 one takes any measurable property value X of an element ξ of the reaction network (in the interior of the Hasse diagram) as the average of the mean preceding X-values and the mean succeeding X-values. In other words, one first takes the average over those compounds ζ' directly leading to ξ in the network, then takes the average over those ζ'' directly following ξ, and finally averages these two averages. Therefore, it is written that 6
$$X(\xi) \approx \tfrac{1}{2}\left( \langle X' \rangle_{\xi} + \langle X'' \rangle_{\xi} \right) \qquad (1)$$
where $\langle X' \rangle_{\xi}$ is the mean of the X values for isomers immediately preceding ξ in the Hasse diagram, and $\langle X'' \rangle_{\xi}$ is the average for structures immediately following ξ. Thus the estimated "poset average" energy of 1,2-diazine from the energy of pyridine (E(1)) and the energies of 1,2,3- and 1,2,4-triazine (E(1,2,3) and E(1,2,4), respectively) is
$$E(1,2) \approx \tfrac{1}{2}\left\{ E(1) + \tfrac{1}{2}\left[ E(1,2,3) + E(1,2,4) \right] \right\} \qquad (2)$$
with all energies in a.u. The method is simple to apply, but it often suffers from a lack of sufficient experimental data for a property at all the nodes connected to the element to be interpolated. However, one can imagine extending this method to take into account the contributions from the next-nearest neighbors or beyond. In that case, the model might be applicable to cases where the immediate neighbor values are unknown. In the splinoid method, discussed later, one can utilize all available data of the full poset while maintaining full generality. On the other hand, the splinoid model can also be arranged in a particular way to give the poset-average method. That is to say, the poset-average method and the splinoid method represent the natural choices for the minimal and maximal amounts of data used within the splinoid fitting method. 11,14 In our present study, we apply the poset-average method to predict the total energy, internal energy and NICS(0) values for each node of the substitution reaction network for which computed data are available.
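The averaging rule described above can be illustrated with a minimal Python sketch. It assumes that the 13 azines can be generated as symmetry-distinct subsets of the six ring positions (labeled 0 to 5 here rather than 1 to 6); the helper names `canonical`, `build_hasse` and `poset_average`, and the purely additive toy property, are hypothetical and are not data or code from the present work.

```python
from itertools import combinations

N_SITES = 6  # ring positions of benzene, numbered 0..5 in this sketch

def canonical(sites):
    """Smallest image of a site set under ring rotations and reflections."""
    images = []
    for shift in range(N_SITES):
        for sign in (1, -1):
            images.append(tuple(sorted((sign * s + shift) % N_SITES for s in sites)))
    return min(images)

def build_hasse():
    """Return successor and predecessor maps of the azine substitution poset."""
    nodes = {canonical(c) for k in range(N_SITES + 1)
             for c in combinations(range(N_SITES), k)}
    succ = {n: set() for n in nodes}
    pred = {n: set() for n in nodes}
    for n in nodes:
        for s in range(N_SITES):
            if s not in n:
                child = canonical(n + (s,))  # one further CH -> N substitution
                succ[n].add(child)
                pred[child].add(n)
    return succ, pred

def poset_average(node, values, succ, pred):
    """Average of the mean over predecessors and the mean over successors."""
    before = [values[p] for p in pred[node]]
    after = [values[s] for s in succ[node]]
    return 0.5 * (sum(before) / len(before) + sum(after) / len(after))

succ, pred = build_hasse()
# Toy property, NOT data from the paper: strictly additive in the number of N atoms.
toy = {node: -16.0 * len(node) for node in succ}
diazine_12 = (0, 1)  # 1,2-diazine in zero-based site labels
estimate = poset_average(diazine_12, toy, succ, pred)
```

For a strictly additive property the estimate reproduces the input value at every interior node, so deviations found with real data measure the non-additive part of the property. The function applies only to interior nodes, since benzene has no predecessor and the hexa-substituted member has no successor.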

Posetic Cluster Expansion
Apart from the poset-average method, one can also interpolate any missing data by using the method of "cluster expansion". The origin of this method dates back even before molecular-structure ideas were clearly recognized. 18 In the modern chemical literature, this technique of characterizing molecular properties has been established as substructural cluster expansion for different properties, such as heats of formation. 19,20,21 In 1964, the idea of cluster expansion was framed by Gian-Carlo Rota in a systematic mathematical posetic framework. 22 A property X(ξ) for a poset member ξ was expanded in terms of another related "cluster" property x(ζ) for the preceding members ζ of ξ as
$$X(\xi) = \sum_{\zeta \preceq \xi} x(\zeta) \qquad (3)$$
where the x(ζ) are the fitting parameters. In the present article we write the general cluster expansion as
$$X(\xi) = \sum_{\zeta \preceq \xi} f(\zeta,\xi)\, x(\zeta) \qquad (4)$$
where X(ξ) is the scalar property X for structure ξ, and f(ζ,ξ) is the number of ways in which the configurational arrangements of ζ occur as substructures in a configuration C ∈ ξ. For our present case, the expansion for the first few members appears as
$$\begin{aligned} X(0) &= x(0) \\ X(1) &= x(0) + x(1) \\ X(1,2) &= x(0) + 2x(1) + x(1,2) \\ X(1,3) &= x(0) + 2x(1) + x(1,3) \\ X(1,4) &= x(0) + 2x(1) + x(1,4) \\ X(1,2,3) &= x(0) + 3x(1) + 2x(1,2) + x(1,3) + x(1,2,3) \\ &\;\;\vdots \end{aligned}$$
where, to label X(ξ) and x(ξ), we represent a member ξ of the poset by the substitution position numbers of the nitrogens in the ring (Figure 1). We illustrate this by taking the energy, internal energy and NICS(0) as the measurable scalar property X(ξ) of structure ξ; x(0) is the property of unsubstituted benzene, x(i) is the scalar property for substitution at the i-th place as shown in Figure 1, x(i, j) is the correction term arising from the interaction between the pair of sites i and j, x(i, j, k) is the higher-order three-site interaction, and so on. It is assumed that the correction terms due to many-site interactions diminish in size as the number of sites increases. As a matter of fact, the higher-order interactions might be negligible.
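One can invert the cluster expansion equations, which for the first few members gives
$$\begin{aligned} x(0) &= X(0) \\ x(1) &= X(1) - X(0) \\ x(1,2) &= X(1,2) - 2X(1) + X(0) \\ x(1,3) &= X(1,3) - 2X(1) + X(0) \\ x(1,4) &= X(1,4) - 2X(1) + X(0) \\ x(1,2,3) &= X(1,2,3) - 2X(1,2) - X(1,3) + 3X(1) - X(0) \\ &\;\;\vdots \end{aligned}$$
From the above set of equations, the x values can be determined using the X values of the less substituted molecules, which occupy positions of lower rank in the poset. Then, neglecting the higher-order x terms (three-site or higher-order interactions), one can estimate the X values of the more highly substituted members.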
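The counting of substructures and the rank-by-rank inversion can be sketched in Python as follows. This is only an illustration under stated assumptions: ring sites are labeled 0 to 5, the functions `canonical`, `substructure_counts`, `expand` and `invert` are hypothetical helper names, and the cluster terms in `toy_x` are invented numbers, not the parameters of Table 4a.

```python
from itertools import combinations

N_SITES = 6  # ring positions, numbered 0..5 in this sketch

def canonical(sites):
    """Smallest image of a site set under ring rotations and reflections."""
    images = []
    for shift in range(N_SITES):
        for sign in (1, -1):
            images.append(tuple(sorted((sign * s + shift) % N_SITES for s in sites)))
    return min(images)

def substructure_counts(xi):
    """f(zeta, xi): number of site subsets of xi equivalent to the pattern zeta."""
    counts = {}
    for k in range(len(xi) + 1):
        for sub in combinations(xi, k):
            z = canonical(sub)
            counts[z] = counts.get(z, 0) + 1
    return counts

def expand(xi, x):
    """X(xi) = sum of f(zeta, xi) * x(zeta); clusters absent from x count as zero."""
    return sum(f * x.get(z, 0.0) for z, f in substructure_counts(xi).items())

def invert(X):
    """Recover cluster terms x from property values X, lowest rank first."""
    x = {}
    for xi in sorted(X, key=len):
        counts = substructure_counts(xi)
        lower = sum(f * x[z] for z, f in counts.items() if z != xi)
        x[xi] = X[xi] - lower  # f(xi, xi) = 1
    return x

# Invented cluster terms (toy numbers): x(0), x(1), x(1,2), x(1,3), x(1,4).
toy_x = {(): 0.0, (0,): -16.0, (0, 1): 0.03, (0, 2): 0.01, (0, 3): 0.005}
# "Known" property values for benzene, pyridine and the three diazines.
known_X = {xi: expand(xi, toy_x) for xi in toy_x}
recovered = invert(known_X)
# Estimate for 1,2,4-triazine, neglecting three-site and higher terms.
triazine_124 = expand((0, 1, 3), recovered)
```

The triazine estimate uses only quantities obtained from the members of rank two or lower, which is the truncation at pairwise interactions described in the text.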

Splinoid Method
The splinoid method is another important interpolation method for predicting missing data at the nodes of a substitution reaction network (represented here as a Hasse diagram). In this method, each edge i→j of the Hasse diagram, which connects an n-substituted structure i to an (n+1)-substituted structure j, carries a real variable x_{i→j} ranging from 0 to 1 (x_{i→j} = 0 at i and x_{i→j} = 1 at j). Following the spline interpolation method, 9,10 we write the cubic spline polynomial for each x_{i→j} of the Hasse diagram as
$$f_{i\to j}(x_{i\to j}) = a_{i\to j} + b_{i\to j}\,x_{i\to j} + c_{i\to j}\,x_{i\to j}^{2} + d_{i\to j}\,x_{i\to j}^{3}$$
with a_{i→j}, b_{i→j}, c_{i→j}, and d_{i→j} as constants. Each node i of the poset is identified by a value α_i and a slope β_i. The splinoid fit is such that, first, each polynomial with endpoint i matches the value α_i at the nodes i ∈ P, with α_i for i ∈ K fixed to the known property values; second, the slope at each vertex i matches β_i, to ensure a smooth fit of the polynomials along the connected arrows of the Hasse diagram; and third, a relevant total "curvature" is minimized. Here K is the set of structures within the poset for which the experimental values of a property X are known. The unknown X values for the remaining chemical compounds in the network, which form a set U with vertices j ∉ K, can be estimated by applying the splinoid model to the vertices from K. In other words, the unknown values of X for the set U of chemical compounds can be determined in terms of the known values of X for the set K. Following the general splinoid model, we denote the adjacency matrix of the Hasse diagram by A and the oriented adjacency matrix by S = [S_ij]. We also define two diagonal matrices D and Δ in terms of the in-degree d_{→i} and the out-degree d_{i→} of each vertex i ∈ P. Furthermore, two sub-matrices of the unit matrix I, denoted U and K, are used to derive a new matrix M. 9,10 The matrix M and the two matrices U and K are used to compute the unknown values of X contained in the vector u from the known values in the vector k via the splinoid formula of Ref. 9. The point here is that the matrix UMU^T has to be invertible regardless of the number of available "known" data in the network, even when very few (≤2) known data are available. Certainly the performance and reliability of the method deteriorate as the number of known values decreases. Nevertheless, Klein et al. have shown that even for a large number of unknown values this deterioration can in practice remain small. 9 Also, unlike in the cluster-expansion method or the poset-average method, the coefficients a_{i→j}, b_{i→j}, c_{i→j}, and d_{i→j} appearing in the spline polynomials do not explicitly appear in the splinoid formula for u, but they are implicit in the derivation of this formula.
The coefficients can be determined following Refs. 9 and 10. Also, the formula takes the full topological structure of the reaction network into consideration, differing considerably from the other two methods, which essentially consider only the adjacent members immediately up or down in rank in the Hasse diagram.
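As an illustration of the ingredients named above, the following Python sketch builds an adjacency matrix, an oriented adjacency matrix, and the in- and out-degrees for the azine Hasse diagram. Several points are assumptions made only for this sketch: the sign convention S_ij = +1 for an edge i→j and −1 for the reverse direction, the serial node ordering (by number of nitrogens, then by site labels), and the helper names. The matrix M and the splinoid formula itself are not reproduced here; they are given in Refs. 9 and 10.

```python
from itertools import combinations

N_SITES = 6  # ring positions, numbered 0..5 in this sketch

def canonical(sites):
    """Smallest image of a site set under ring rotations and reflections."""
    images = []
    for shift in range(N_SITES):
        for sign in (1, -1):
            images.append(tuple(sorted((sign * s + shift) % N_SITES for s in sites)))
    return min(images)

def hasse_edges():
    """Return the ordered node list and the directed edges (i, j) meaning i -> j."""
    nodes = sorted({canonical(c) for k in range(N_SITES + 1)
                    for c in combinations(range(N_SITES), k)},
                   key=lambda n: (len(n), n))
    index = {n: i for i, n in enumerate(nodes)}
    edges = set()
    for n in nodes:
        for s in range(N_SITES):
            if s not in n:
                edges.add((index[n], index[canonical(n + (s,))]))
    return nodes, sorted(edges)

nodes, edges = hasse_edges()
size = len(nodes)
A = [[0] * size for _ in range(size)]  # symmetric adjacency matrix
S = [[0] * size for _ in range(size)]  # oriented adjacency matrix
for i, j in edges:
    A[i][j] = A[j][i] = 1
    S[i][j], S[j][i] = 1, -1  # assumed sign convention, see the text above
in_degree = [sum(1 for _, j in edges if j == v) for v in range(size)]
out_degree = [sum(1 for i, _ in edges if i == v) for v in range(size)]
```

With this ordering, node 0 is benzene and the last node is the fully substituted ring; the numbering of Figure 1S in the supporting material may differ.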

RESULTS AND DISCUSSION
The methods of "poset average" and "cluster expansion" described in the earlier sections have been applied to the singlet ground-state total energies, internal energies and NICS(0) values for all 13 members of the azine poset. However, given the availability of a sufficient amount of computed data for these three properties, we prefer a "least-square" cluster-expansion analysis over the "splinoid" fitting method. The total energies and the internal energies are calculated using the hybrid UB3LYP functional in a DFT framework with the 6-311+G(d,p) basis set in the Gaussian 09 program package, 23 while the NICS(0) values have been taken from Ref. 17.
In this work, we apply the cluster expansion to predict the X values of the n-site isomers while neglecting all the higher-order terms, that is, leaving out the terms beyond x(1) ≡ α, x(1,2) ≡ β1, x(1,3) ≡ β2 and x(1,4) ≡ β3, where α and the βs are abbreviated forms of the x parameters. Alternatively, one can write all the different X values in terms of x and then perform a least-square fit of the parameters to all the available X values. For the present azabenzene reaction network, the cluster expansion parameters are shown in Table 4a, and those from the least-square fitting are shown in Table 4b, for the various physically measurable properties. The results of the cluster expansion and the least-square fitting are shown in Tables 1, 2 and 3.
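The least-square variant can be sketched in Python as below. The sketch assumes property values referenced against benzene, so that each X is modeled as n·α plus β1, β2 and β3 weighted by the numbers of ortho, meta and para nitrogen pairs. The solver is a plain normal-equations fit written out for self-containment, the helper names are hypothetical, and `true_params` holds invented numbers rather than the values of Table 4b.

```python
from itertools import combinations

N_SITES = 6  # ring positions, numbered 0..5 in this sketch

def canonical(sites):
    """Smallest image of a site set under ring rotations and reflections."""
    images = []
    for shift in range(N_SITES):
        for sign in (1, -1):
            images.append(tuple(sorted((sign * s + shift) % N_SITES for s in sites)))
    return min(images)

def design_row(sites):
    """Counts multiplying alpha, beta1 (ortho), beta2 (meta) and beta3 (para)."""
    row = [len(sites), 0, 0, 0]
    for a, b in combinations(sites, 2):
        row[min((a - b) % N_SITES, (b - a) % N_SITES)] += 1  # ring distance 1, 2 or 3
    return row

def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def least_squares(rows, targets):
    """Solve the normal equations (F^T F) p = F^T X for the parameters p."""
    m = len(rows[0])
    FtF = [[sum(r[i] * r[j] for r in rows) for j in range(m)] for i in range(m)]
    FtX = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(m)]
    return solve(FtF, FtX)

# The 12 substituted members (benzene is the reference and is left out).
azines = sorted({canonical(c) for k in range(1, N_SITES + 1)
                 for c in combinations(range(N_SITES), k)},
                key=lambda n: (len(n), n))
true_params = [-16.0, 0.03, 0.01, 0.005]  # invented alpha, beta1, beta2, beta3
rows = [design_row(n) for n in azines]
targets = [sum(c * p for c, p in zip(r, true_params)) for r in rows]
fitted = least_squares(rows, targets)
```

Because the toy targets are generated exactly from the four-parameter model, the fit returns the input parameters; with computed energies the residuals would instead reflect the neglected three-site and higher terms.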
From Tables 1, 2 and 3 it is evident that the fitted values obtained using the methods of "poset average" and "cluster expansion" (either by inversion or by "least-square" fitting) are very close to the calculated values, and that the results obtained from each method are quite good. Moreover, from Figures 2a, 2b and 2c, one can visually compare the efficiency of the three methods by comparing their square deviations at each data point from the data obtained through the Gaussian calculations. However, from the RMSD values σ it is clear that the least-square fitting procedure is better than the other two methods for the total energy and the internal energy. On the other hand, the poset-average method gives a marginally better result than the least-square method in the case of the NICS(0) values. Another point to note is that the α term of the cluster expansion is the most important one, followed by the three β values. A more accurate result (assuming the calculated values to be the actual values) can be obtained by including higher-order terms, since some β values are larger than the standard deviation. At the same time, it is also apparent from the different β values that the pairwise interaction tends to decrease in size as the distance between the pair of sites increases. A rationale for using a fair amount of computational data is due here. The present work faces the major challenge that authentic experimental data are unavailable for the molecules under consideration. This is because some elements of the poset are either not yet synthesized or very unstable. Therefore, the available data sets for various properties of these structures are incomplete.
However, in an attempt to obtain the density and refractive index of all the elements of the poset, we rely on the available experimental data for the first five members of the poset and then apply the "cluster expansion" and the "splinoid model" to predict the density and refractive index of the other azines for which experimental data are not available. For the "cluster expansion", a small change from our prior discussion is that in this case the physical property values are not referenced against that of unsubstituted benzene, i.e., we adopt $X(\xi) = X_{\mathrm{cal}}(\xi)$. This leaves the cluster term as x(0) ≠ 0. For the "splinoid method", the required matrices A, S, D and Δ are constructed from the Hasse diagram in Figure 1. All the matrices are explicitly shown in the supporting material (Tables 1S, 2S, 3S and 4S) with a slightly different molecule-labeling system, depicted in Figure 1S of the supporting material. The results obtained from both methods are reported in Table 5. We also compare our results with ACD/Labs software 24 data from Refs. 25 and 26. The reported results indicate good agreement among the values predicted by the different methods, and at least for the density of one element (A(1,3,5)) the splinoid method gives a better value than the ACD data.

CONCLUSION
The present discussion can be extended beyond our chosen system owing to the generality and ubiquity of the ideas described and used here. As a matter of fact, the general idea of a periodic table can be viewed as a basis for the concept of a progressive reaction poset. Mendeleev's table of the elements can be considered an excellent example of such a periodic table. That is, chains in the poset can be thought of as closely resembling the relations down the columns of the table, also including the interconnection from an element in an A column to the element in the corresponding B column one row down in the table. Although, in the periodic table of elements, reactions from one element to the one below are purely hypothetical nuclear reactions, the chemical properties of the elements tend to follow a certain partial ordering. Such an idea was used in building up the "periodic table of alkanes" by Randić, 27-29 the "formula periodic table of benzenoids" by Dias, 30,31 and the "periodic table of all acyclic hydrocarbons" by Klein et al. 32,33 The ideas and techniques developed in the context of our substitution reaction posets could thence have somewhat wider applicability.
Although the poset-related cluster approaches have been very successful, they suffer from some limitations. First, the mathematical development of this approach is not very familiar to chemists; and second, the user's confidence in the success of each computational approach is based largely on the previous success of that approach. In fact, in many well-accepted works the expansion has been used for "low-order" applications 19-21 without identifying its posetic nature. Following the method established by Klein et al. for substitution reaction posets, we here provide a further illustration that the identification of the poset offers new methodology and seemingly new insight, which was shown to represent 34 a discrete analogue of the conventional Taylor series expansion for a continuous variable. It is clearly evident that the cluster expansion approach converges rapidly not only for the ab initio ground-state energies and associated total internal energies, but also for the aromaticity index (NICS(0)). The results obtained indicate an advantage in considering interpolative posetic methods, such as the cluster-expansion method or the splinoid method, for predicting unknown physical properties of the poset elements. This observation is in agreement with the previous studies made by Klein et al. for a few other such posets. 6,7,8,11 This certainly points towards the power and promise of this method for this type of substitution reaction network.
Supplementary Materials. Supporting information to the paper is enclosed with the electronic version of the article. These data can be found on the website of Croatica Chemica Acta (http://public.carnet.hr/ccacaa).
Table 1S. The adjacency matrix A used for the splinoid method for azines. We have used the molecule designators as shown in Figure 1S.

Gaussian output of Energies and Geometries of different Azines
A fairly general class of directed reaction networks may be viewed as a partially ordered set, or poset. More formally, a set P is called a partially ordered set when there exists a relation ≼, designated as a partial ordering, between pairs of elements ξ, ζ ∈ P satisfying three conditions: (1) ξ ≼ ξ; (2) if ξ ≼ ζ and ζ ≼ ξ, then ξ = ζ; and (3) transitivity, i.e., if ξ ≼ η and η ≼ ζ, then ξ ≼ ζ. A poset can be represented diagrammatically by the so-called Hasse diagram, drawn using nodes and line segments such that each node or vertex corresponds to an element ξ ∈ P and each line segment connects ξ to an immediate greater neighbor ζ ∈ P located below ξ. For further discussion on posets one can see Refs. 3-6.

Figure 1. Substitution reaction network of azines with each element labeled by the substitution position.

Figure 2. Plots of square deviations of estimated (a) total energy (in a.u.), (b) internal energy (in kcal/mol) and (c) NICS(0) values (in ppm) from those obtained using quantum chemical calculations of azines. The estimations are done using the poset average method (•), cluster expansion method (■) and least-square analysis (♦) for different members of the substitution reaction network.

Figure 1S. Substitution reaction network of azines. In this figure the nodes are numbered serially starting from the first member, benzene. This numbering scheme has been adopted to construct the matrices (Tables 1S-4S) used for the splinoid method.
Table 2S. The matrix S, used for the splinoid fitting for various properties of azines. We have used the molecule designators as shown in Figure 1S.
Table 3S. The diagonal matrix D, used for the splinoid method for azines. We have used the molecule designators as shown in Figure 1S.
Table 4S. The diagonal matrix Δ, used for the splinoid method for azines. We have used the molecule designators as shown in Figure 1S.
We define two sub-matrices of the unit matrix I by U (a |U|×|P| matrix with rows indexed by the elements of U) and K (a |K|×|P| matrix with rows indexed by the elements of K). In terms of these matrices, one can derive a new matrix M. 9,10

Calculated total energy values and internal energy values are shown in Table 1 and Table 2, respectively. Details of the computed energies and geometries are given in the supplementary material. All NICS(0) values are reported in Table 3. A point of note here is that all of these physical property values are referenced against that of unsubstituted benzene.

Table 1. Computed (at the UB3LYP/6-311+G(d,p) level) and fitted energies (in atomic units) for different azines. The notations E cal, E inversion, E least-sqr and ⟨E⟩ stand for the calculated energy and the energies obtained from the cluster expansion method, the least-square method and the poset-average method, respectively. All calculated total energy values are scaled against that of the unsubstituted one, with E cal = −232.311 a.u.

Table 2. Computed internal energies (at the UB3LYP/6-311+G(d,p) level and at 298.15 K) and fitted internal energies (in kcal/mol) for different azines. The notations U cal, U inversion, U least-sqr and ⟨U⟩ stand for the calculated internal energy and the internal energies obtained from the cluster expansion method, the least-square method and the poset-average method, respectively. All calculated internal energies are scaled against that of the unsubstituted one, with U cal = 65.621 kcal/mol.

Table 4a. Parameters obtained using the cluster expansion method, where the symbols E and U stand for the total energy and total internal energy, respectively.

Table 4b. Parameters obtained using the method of least-square fitting, where the symbols E and U stand for the total energy and total internal energy, respectively.

Table 5. Experimental and predicted densities and refractive indices of azines. The abbreviations CE(n) and S(n) stand for the cluster expansion and splinoid methods using n known values. Experimental data were collected from Refs. 25 and 26. a Datum used for splinoid fitting. b Not included in splinoid method calculations. c Reported specific gravity in Ref. 25.
Croat. Chem. Acta 86 (2013) 545.