Novel 2D TOMOCOMD-CARDD Descriptors: Atom-based Stochastic and non-Stochastic Bilinear Indices and their QSPR Applications

Novel atom-based molecular descriptors based on a bilinear map similar to those defined in linear algebra are presented. These molecular descriptors, called "local (atom, group and atom-type) and total (global) bilinear indices", are proposed here as a new molecular parametrization easily calculated from 2D molecular information. The proposed non-stochastic and stochastic molecular fingerprints try to match the molecular structure provided by the molecular topology by using the k th non-stochastic (Marrero-Ponce, Y. DOI: DO00017575) and stochastic graph-theoretic electronic-density matrices, $M^k$ and $S^k$, respectively. That is to say, the k th non-stochastic and stochastic bilinear indices are calculated using $M^k$ and $S^k$ as matrix operators of bilinear transformations. Moreover, chemical information is codified by using different pair combinations of atomic weightings (atomic mass, polarizability, van der Waals volume, and electronegativity). The prediction ability of the new molecular descriptors in Quantitative Structure-Property Relationships (QSPR) was tested by analysing regressions of these descriptors against six selected properties of octane isomers. The prediction ability was higher than that shown by other well-known sets of 2D/3D molecular descriptors. The results suggest that the present method gives a good estimation of these physicochemical properties for octanes. The approach described in the present report appears to be a promising method for finding quantitative models for the description of physicochemical and biological properties.


INTRODUCTION
Molecular descriptors (MDs) have received increasing attention from chemists in recent years. 1 In this connection, a large number of QSAR/QSPR studies that use MDs to predict the physicochemical and biological properties of molecules have been reported in the recent literature. MDs are numbers that characterize a specific aspect of molecular structure. The important feature common to all these MDs is the independence of their numerical values from the renumbering of atoms in a chemical structure. 3,4 [11,12] In fact, our research group has proposed several atom- and bond-based topological and topographic MDs. 13-19 On the other hand, González and co-workers have developed a new method based on Markov chain theory, which has been successfully employed in QSPR and QSAR studies. 20-23 More recently, one of the present authors (M-P.) introduced new sets of MDs. [26,27] These MDs are based on the calculation of quadratic and linear maps similar to those defined in linear algebra. 28 [37-40] In addition, these MDs have been extended to consider three-dimensional features of small/medium-sized molecules based on the trigonometric-3D-chirality-correction factor approach. 27,41 Finally, promising results have been found in the modeling of the interaction between drugs and the HIV Ψ-RNA packaging region in the field of bioinformatics using macromolecular indices. 42,43 An alternative formulation of our approach for the structural characterization of proteins was also carried out recently. 44,45

The main purpose of the present paper is to present new sets of MDs, namely non-stochastic and stochastic bilinear indices, and to establish their ability (both total and local) to describe molecular structure by correlating them with six selected physicochemical properties of octane isomers.

METHODOLOGY
In previous reports, we outlined outstanding features of the theory of 2D atom-based TOMOCOMD-CARDD descriptors. This method codifies the molecular structure by means of mathematical quadratic, linear and bilinear transformations. In order to calculate these algebraic maps for a molecule, the atom-based molecular vector and the non-stochastic and stochastic graph-theoretic electronic-density matrices must be constructed. [25-41] Such atom-adjacency relationships and chemical-information codification will be applied in the present report to generate a series of atom-based MDs (atom, group and atom-type as well as total bilinear indices) to be used in drug design and chemoinformatic studies.
Therefore, the structure of this section is as follows: 1) background on the atom-based molecular vector and on the non-stochastic and stochastic graph-theoretic electronic-density matrices is given in the next subsections (2.1 and 2.2, respectively), and 2) an outline of the mathematical definition of bilinear maps and a definition of our procedures are developed in subsections 2.3 and 2.4, respectively.
This approach allows us to encode organic molecules such as 3-mercapto-pyridine-4-carbaldehyde through the molecular vector $x = [\ldots, x_{S9}]$ (see also Table 1 for the molecular structure). This vector belongs to the product space $\Re^{9}$. However, diverse kinds of atomic weights (x) can be used for codifying information related to each atomic nucleus in the molecule. These atomic labels are chemically meaningful numbers such as atomic Log P, 46 surface contributions of polar atoms, 47 atomic molar refractivity, 48 atomic hybrid polarizabilities, 49 Gasteiger-Marsili atomic charge, 50 atomic masses (M), 51 the van der Waals volumes (V), 51 the atomic polarizabilities (P), 51 atomic electronegativity (E) in the Pauling scale 52 and so on.

Table 1
Now, if we are interested in codifying the chemical information by means of two different molecular vectors, for instance, $x = [x_1, \ldots, x_n]$ and $y = [y_1, \ldots, y_n]$, then different combinations of molecular vectors ($x \neq y$) are possible when a weighting scheme is used. In the present report, we characterized each atomic nucleus with the following parameters: atomic mass (M), 51 van der Waals volume (V), 51 atomic polarizability (P), 51 and atomic electronegativity (E) in the Pauling scale. 52 The values of these atomic labels are shown in Table 2. From this weighting scheme, six (or twelve if $x_M$-$y_V \neq x_V$-$y_M$) combinations (pairs) of molecular vectors ($x$, $y$; $x \neq y$) can be computed: $x_M$-$y_V$, $x_M$-$y_P$, $x_M$-$y_E$, $x_V$-$y_P$, $x_V$-$y_E$, and $x_P$-$y_E$. Here, we use the symbols $x_W$-$y_Z$, where the subscripts W and Z denote two atomic properties from our weighting scheme and the hyphen (-) expresses the combination (pair) of the two selected atom-label chemical properties. [54,55]

In this particular case we are not dealing with a simple graph but with a so-called pseudograph (G). Informally, a pseudograph is a graph that may have multiple edges between the same pair of vertices and loops at a single vertex. Formally, a pseudograph is a set V of vertices together with a set E of edges and a function f from E to {{u,v} | u,v in V} (the function f shows which pair of vertices is connected by which edge). An edge e is a loop if f(e) = {u} for some vertex u in V. 24,25,56

In earlier reports we introduced new molecular matrices that describe changes over time in the electronic distribution throughout the molecular backbone. [25-41] The coefficients ${}^{k}m_{ij}$ are the elements of the k th power of M(G), whose elements $m_{ij}$ are defined as follows: 24-41

$$
m_{ij} =
\begin{cases}
P_{ij} & \text{if } i \neq j \text{ and } \exists\, e_k \in E(G) \\
L_{ii} & \text{if } i = j \\
0 & \text{otherwise}
\end{cases}
\qquad (1)
$$

where E(G) represents the set of edges of G, $P_{ij}$ is the number of edges (bonds) between vertices (atomic nuclei) $v_i$ and $v_j$, and $L_{ii}$ is the number of loops in $v_i$.
The elements $m_{ij} = P_{ij}$ of such a matrix represent the number of chemical bonds between an atomic nucleus i and another nucleus j. The matrix $M^k$ provides the number of walks of length k that link every pair of vertices $v_i$ and $v_j$. In this sense, each edge in $M^1$ represents 2 electrons belonging to the covalent bond between atomic nuclei i and j; e.g., the entries of $M^1$ are equal to 1, 2 or 3 when single, double or triple bonds, respectively, appear between vertices $v_i$ and $v_j$. On the other hand, molecules containing aromatic rings with more than one canonical structure are represented by a pseudograph. This is the case for substituted aromatic compounds such as pyridine, naphthalene, quinoline, and so on, where the presence of pi (π) electrons is accounted for by means of loops at each atomic nucleus of the aromatic ring. Conversely, aromatic rings having only one canonical structure, such as furan, thiophene and pyrrole, are represented by a multigraph.

In order to illustrate the calculation of these matrices, let us consider the same molecule selected in the previous section. Table 1 depicts the molecular structure of this compound and its labeled molecular pseudograph. The zeroth (k = 0), first (k = 1), second (k = 2) and third (k = 3) powers of the non-stochastic graph-theoretic electronic-density matrix are also given in this table. As can be seen, the $M^k$ are graph-theoretic electronic-structure models, like an "extended Hückel theory (EHT) model". The $M^1$ matrix considers all valence-bond electrons (σ- and π-networks) in one step, and its powers (k = 0, 1, 2, 3, …) can be considered as interacting-electron chemical-network models in k steps. The complete model can be seen as an intermediate between the quantitative quantum-mechanical Schrödinger equation and classical chemical bonding ideas. 57

The present approach is based on a simple model for the intramolecular movement of all outer-shell electrons. Let us consider a hypothetical situation in which a set of atoms is free in space at an arbitrary initial time ($t_0$). At this time, the electrons are distributed around the atomic nuclei. Alternatively, these electrons can be distributed around cores in discrete intervals of time $t_k$. In this sense, an electron in an arbitrary atom i can move (step by step) to other atoms throughout the chemical-bonding network.
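As a minimal sketch of how such a matrix can be built and raised to powers, the following NumPy snippet constructs M for a small hydrogen-suppressed skeleton (C=C-C, as in propene) and computes its first powers. The function name `density_matrix`, the bond-list input format and the choice of molecule are illustrative assumptions; this is not the TOMOCOMD implementation, and it does not reproduce Table 1.

```python
import numpy as np

def density_matrix(n_atoms, bonds, loops=()):
    """Build M(G): off-diagonal entries are bond multiplicities (P_ij),
    diagonal entries count loops (L_ii), e.g. on aromatic atoms."""
    m = np.zeros((n_atoms, n_atoms), dtype=int)
    for i, j, order in bonds:
        m[i, j] = m[j, i] = order
    for i in loops:
        m[i, i] += 1
    return m

# Hypothetical example: heavy-atom skeleton C0=C1-C2, no aromatic loops.
M = density_matrix(3, [(0, 1, 2), (1, 2, 1)])

# k-th powers for k = 0..3; M^0 is the identity matrix.
powers = [np.linalg.matrix_power(M, k) for k in range(4)]
print(powers[2])  # [[4 0 2] [0 5 0] [2 0 1]]
```

Entry (i, j) of the k th power counts the walks of length k between atoms i and j, with each step weighted by the bond multiplicity.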
On the other hand, the k th stochastic graph-theoretic electronic-density matrix of G, $S^k$, can be directly obtained from $M^k$. Here, $S^k = [{}^{k}s_{ij}]$ is a square matrix of order n (n = number of atomic nuclei) and the elements ${}^{k}s_{ij}$ are defined as follows: 30,31,34,35

$$
{}^{k}s_{ij} = \frac{{}^{k}m_{ij}}{{}^{k}\mathrm{SUM}_{i}} = \frac{{}^{k}m_{ij}}{{}^{k}\delta_{i}} \qquad (2)
$$

where ${}^{k}m_{ij}$ are the elements of the k th power of M, and the sum of the i th row of $M^k$, ${}^{k}\mathrm{SUM}_i$, is named the k-order vertex degree of atom i, ${}^{k}\delta_i$. It should be remarked that the matrix $S^k$ in Eq. 2 has the property that the sum of the elements in each row is 1. An n×n matrix with nonnegative entries having this property is called a "stochastic matrix". 28 The elements ${}^{k}s_{ij}$ are the transition probabilities with which electrons move from atom i to atom j in the discrete time periods $t_k$. It should also be pointed out that the element ${}^{k}s_{ij}$ takes into consideration the molecular topology in k steps throughout the chemical-bonding (σ- and π-) network. In this sense, the ${}^{2}s_{ij}$ values can distinguish between hybrid states of atoms in bonds. For instance, the self-return probability of second order (${}^{2}s_{ii}$) [i.e., the probability with which an electron returns to the original atom at $t_2$] varies regularly according to the different hybrid states of atom i in the molecule; e.g., an electron will have a higher probability of returning to the sp C atom than to the sp² (or sp³) C atom [$p(C_{sp}) > p(C_{sp^2}) > p(C_{sp^2\,arom}) > p(C_{sp^3})$] (see Table 1 for more details). This is a logical result if the electronegativity scale of these hybrid states is taken into account.
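The row normalization described above can be sketched in a few lines of NumPy. The helper name `stochastic_matrix` and the three-atom C=C-C skeleton are assumptions made for illustration only; the error raised for an all-zero row is likewise a design choice of this sketch, since the text does not say how such a case is handled.

```python
import numpy as np

def stochastic_matrix(M, k):
    """Divide each row of M^k by its row sum (the k-order vertex degree)."""
    Mk = np.linalg.matrix_power(np.asarray(M), k).astype(float)
    delta = Mk.sum(axis=1, keepdims=True)  # k-order vertex degrees
    if np.any(delta == 0):
        raise ValueError("a row of M^k sums to zero and cannot be normalized")
    return Mk / delta

# Hypothetical heavy-atom skeleton C0=C1-C2 (propene-like).
M = np.array([[0, 2, 0],
              [2, 0, 1],
              [0, 1, 0]])

S1 = stochastic_matrix(M, 1)
S2 = stochastic_matrix(M, 2)
self_return = np.diag(S2)  # second-order self-return probabilities
print(S2.sum(axis=1))      # every row sums to 1
```

In this toy skeleton the terminal sp² carbon has a second-order self-return probability of 2/3 and the sp³ carbon 1/3, which is in line with the ordering discussed above.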

Mathematical Bilinear Forms: A Theoretical Framework
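In mathematics, a bilinear form in a real vector space is a mapping $b: V \times V \rightarrow \Re$ that is linear in both arguments. [59-63] That is, this function satisfies the following axioms for any scalar $\alpha$ and any choice of vectors $v, w, v_1, v_2, w_1, w_2 \in V$:

$$
\begin{aligned}
&\text{i. } b(\alpha v, w) = b(v, \alpha w) = \alpha\, b(v, w) \\
&\text{ii. } b(v_1 + v_2, w) = b(v_1, w) + b(v_2, w) \\
&\text{iii. } b(v, w_1 + w_2) = b(v, w_1) + b(v, w_2)
\end{aligned}
$$

Let V be a real vector space in $\Re^n$ ($V \subseteq \Re^n$) with basis set $\{e_1, \ldots, e_n\}$, in which the vectors x and y have the coordinates $(x_1, \ldots, x_n)$ and $(y_1, \ldots, y_n)$. Then

$$
b(x, y) = b\Big(\sum_{i=1}^{n} x_i e_i,\ \sum_{j=1}^{n} y_j e_j\Big) = \sum_{i=1}^{n}\sum_{j=1}^{n} x_i\, y_j\, b(e_i, e_j) \qquad (5)
$$

If we take the $a_{ij}$ as the n×n scalars $b(e_i, e_j)$, that is, $a_{ij} = b(e_i, e_j)$ for $i, j = 1, \ldots, n$, then

$$
b(x, y) = \sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij}\, x_i\, y_j \qquad (6)
$$

As can be seen, the defined equation for b may be written as the single matrix equation (see Eq. 7), with $A = [a_{ij}]$:

$$
b(x, y) = [X]^{T} A\, [Y] \qquad (7)
$$

For molecular vectors, n is the number of atoms in the molecule, and $x_1, \ldots, x_n$ and $y_1, \ldots, y_n$ are the coordinates or components of the molecular vectors x and y in a canonical basis set of $\Re^n$. The defined equations (9) and (10) for the k th non-stochastic and stochastic bilinear indices may also be written as the single matrix equations

$$
b_k(x, y) = [X]^{T} M^{k} [Y] \qquad (11)
$$

$$
{}^{s}b_k(x, y) = [X]^{T} S^{k} [Y] \qquad (12)
$$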
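A bilinear index of this kind is simply the product of a row vector, a matrix power and a column vector. The sketch below evaluates it for a three-atom C-C=O skeleton (acetaldehyde, hydrogens suppressed). The function name `bilinear_index` and the weight vectors `x` and `y` are placeholders chosen for easy arithmetic; they are not the atomic properties of Table 2.

```python
import numpy as np

def bilinear_index(M, x, y, k, stochastic=False):
    """Return [X]^T M^k [Y], or [X]^T S^k [Y] when stochastic=True."""
    Mk = np.linalg.matrix_power(np.asarray(M), k).astype(float)
    if stochastic:
        Mk = Mk / Mk.sum(axis=1, keepdims=True)  # row-normalize M^k
    return float(np.asarray(x) @ Mk @ np.asarray(y))

# Hypothetical skeleton C0-C1=O2 and placeholder atomic weights.
M = np.array([[0, 1, 0],
              [1, 0, 2],
              [0, 2, 0]])
x = [1.0, 2.0, 3.0]
y = [4.0, 5.0, 6.0]

b0 = bilinear_index(M, x, y, 0)                       # plain dot product x.y
b1_xy = bilinear_index(M, x, y, 1)
b1_yx = bilinear_index(M, y, x, 1)
sb1_xy = bilinear_index(M, x, y, 1, stochastic=True)
sb1_yx = bilinear_index(M, y, x, 1, stochastic=True)
print(b1_xy == b1_yx, sb1_xy == sb1_yx)  # True False
```

Because M is symmetric, swapping x and y leaves the non-stochastic index unchanged, whereas the row-normalized matrix is in general not symmetric, so the stochastic index depends on the order of the two vectors.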

Non-Stochastic and Stochastic Atom-Based Bilinear Indices: Local (Atomic, Group, and Atom-type) Definition
In the last decade, Randić 64 proposed a list of desirable attributes for an MD. This list can therefore be considered a methodological guide for the development of new topological indices (TIs). One of the most important criteria is the possibility of defining the descriptors locally. This attribute refers to the fact that the index can be calculated for the molecule as a whole but also over certain fragments of the structure itself.
Sometimes, the properties of a group of molecules are related more to a certain zone or fragment than to the molecule as a whole. In such cases, the global definition does not satisfy the structural requirements needed to obtain a good correlation in QSAR and QSPR studies. The local indices can be used in certain problems such as:
• Research on drugs, toxicants or, in general, any organic molecules with a common skeleton that is responsible for the activity or property under study.
• Studies of the reactivity of specific sites of a series of molecules that can undergo a chemical reaction or enzymatic metabolism.
• Studies of molecular properties such as spectroscopic measurements, which are obtained experimentally in a local way.
• Any general case where it is necessary to study not the molecule as a whole, but rather some local properties of certain fragments.
Therefore, in addition to the total bilinear indices computed for the whole molecule, a local-fragment (atomic, group or atom-type) formalism can be developed. These local indices, $b_{kL}(x, y)$ and ${}^{s}b_{kL}(x, y)$, are defined as follows:

$$
b_{kL}(x, y) = \sum_{i=1}^{n}\sum_{j=1}^{n} {}^{k}m_{ijL}\, x_i\, y_j \qquad (13)
$$

$$
{}^{s}b_{kL}(x, y) = \sum_{i=1}^{n}\sum_{j=1}^{n} {}^{k}s_{ijL}\, x_i\, y_j \qquad (14)
$$

where ${}^{k}m_{ijL}$ [${}^{k}s_{ijL}$] is the k th element of row i and column j of the local matrix $M^{k}_{L}$ [$S^{k}_{L}$]. This matrix is extracted from the $M^k$ [$S^k$] matrix and contains information referring to the pairs of vertices (atomic nuclei) of the specific molecular fragment and also of the molecular environment in k steps. The elements of the matrix $M^{k}_{L}$ [$S^{k}_{L}$] are defined as follows:

$$
{}^{k}m_{ijL}\ [{}^{k}s_{ijL}] =
\begin{cases}
{}^{k}m_{ij}\ [{}^{k}s_{ij}] & \text{if both } v_i \text{ and } v_j \text{ are atomic nuclei contained within the molecular fragment} \\
\tfrac{1}{2}\,{}^{k}m_{ij}\ [\tfrac{1}{2}\,{}^{k}s_{ij}] & \text{if } v_i \text{ or } v_j \text{ is an atomic nucleus contained within the molecular fragment but not both} \\
0 & \text{otherwise}
\end{cases}
\qquad (15)
$$

These local analogues can also be expressed in matrix form by the expressions:

$$
b_{kL}(x, y) = [X]^{T} M^{k}_{L} [Y] \qquad (16)
$$

$$
{}^{s}b_{kL}(x, y) = [X]^{T} S^{k}_{L} [Y] \qquad (17)
$$

It should be remarked that the scheme above follows the spirit of a Mulliken population analysis. 65 It should also be pointed out that for every partitioning of a molecule into Z molecular fragments there will be Z local molecular fragment matrices.
In this case, if a molecule is partitioned into Z molecular fragments, the matrix $M^k$ [$S^k$] can be correspondingly partitioned into Z local matrices $M^{k}_{L}$ [$S^{k}_{L}$], L = 1, …, Z.

In addition, the atom-type bilinear indices can also be calculated. In the same way as atom-type E-state values, 11 these novel local MDs provide much useful information for all data sets (including those with a common skeletal core as well as those with very diverse structures). For this reason the present method represents a significant advantage over traditional QSAR methods. The atom-type bilinear descriptors are calculated by adding the k th atomic bilinear indices for all atoms of the same type in the molecule. This atom-type index lends itself to use in a group-additive-type scheme in which an index appears for each atom type in the molecule. In the atom-type bilinear indices formalism, each atom in the molecule is classified into an atom type (fragment), such as -F, -OH, =O, -CH3, and so on. 11,66,67 That is to say, each atom in the molecule is categorized according to a valence-state classification scheme including the number of attached H-atoms. 11 The atom-type descriptors combine three important aspects of structural information: 1) collective electron and topological accessibility to the atoms of the same type (for each structural feature: atom or hybrid group such as -Cl, =O, -CH2-, etc.), 2) presence/absence of the atom type (structural features), and 3) count of the atoms in the atom-type sets.
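The local-fragment formalism can be sketched as a masking step applied to a power of M. The snippet below assumes the weighting implied by the Mulliken-like scheme (full weight when both atoms lie in the fragment, one half when exactly one does, zero otherwise); the one-half factor is an assumption of this sketch, chosen so that the local matrices of a partition add up to the full matrix. The helper `local_matrix`, the C-C=O skeleton, the partition and the weight vectors are all hypothetical.

```python
import numpy as np

def local_matrix(Mk, fragment):
    """Mask M^k for one fragment: weight 1 if both atoms are inside,
    1/2 if exactly one is inside, 0 if neither is (assumed scheme)."""
    inside = np.zeros(Mk.shape[0])
    inside[list(fragment)] = 1.0
    weight = (inside[:, None] + inside[None, :]) / 2.0
    return Mk * weight

# Hypothetical skeleton C0-C1=O2 with placeholder weights (not Table 2).
M = np.array([[0, 1, 0],
              [1, 0, 2],
              [0, 2, 0]], dtype=float)
x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])

k = 1
Mk = np.linalg.matrix_power(M, k)
fragments = [{0}, {1, 2}]  # methyl carbon | carbonyl group

local_mats = [local_matrix(Mk, f) for f in fragments]
local_indices = [float(x @ ML @ y) for ML in local_mats]
total_index = float(x @ Mk @ y)
print(local_indices, total_index)  # [6.5, 60.5] 67.0
```

When every atom belongs to exactly one fragment, the fragment weights for each matrix entry add up to one, so the local indices sum to the total index.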

Sample Calculation
It is useful to perform a calculation on a molecule to illustrate the effect of structure on the atomic and global bilinear index values. For this we use the 3-mercapto-pyridine-4-carbaldehyde molecule. The labeled (atom-numbered) molecular structure of this chemical and its non-stochastic and stochastic (atom-level, group and atom-type as well as total) atom-based bilinear indices are shown in Table 3.

COMPUTATIONAL STRATEGIES
All computations were carried out on a Pentium 4 (2.0 GHz) PC. The TOMOCOMD program for Windows, a package developed in our laboratory, was used to compute the molecular descriptors for the data set of compounds. This software is an interactive program for molecular design and bioinformatic research. 68 It is composed of four subprograms; each one of them allows both drawing the structures

DATA SETS
The data sets for this study were taken from Consonni et al., 69 because the data on the physicochemical properties of octane isomers have been carefully selected for testing MDs. [72-76] In this sense, we analyzed the quality of the obtained QSPR models to describe the boiling point (BP), motor octane number (MON), heat of vaporization (HV), molar volume (MV), entropy (S), and heat of formation (∆fH) of the octane isomers.
The use of octanes as a very suitable data set for testing TIs has been advocated by Randić and Trinajstić. 77,78 This selection is recommended because most of the physicochemical properties commonly studied in QSPR analyses with TIs are interrelated for data sets of compounds with different molecular weights, for instance for alkanes with two to nine carbon atoms. These correlations are not necessarily observed when the same indices are used in isomeric data sets of compounds, such as the octane data set. In addition, these properties are hardly interrelated when octanes are used as a data set. 74 On the other hand, all TIs are designed to increase (gradually) with increasing molecular weight. Thus, if we carried out the present study using a series of compounds having different molecular weights, we would find "false" interrelations between the indices because of an overestimation of the size effects inherent to these descriptors. 71 The same is also valid when the QSPR model is to be obtained. It is not difficult to find "good" linear correlations between TIs and physicochemical properties of alkanes in data sets with great size variability. 71 In fact, the simple use of the number of vertices in the molecular graph produced correlation coefficients greater than 0.97 for most of the physicochemical properties of C2-C9 alkanes studied by Needham et al. 79 However, when data sets of isomeric compounds are considered, descriptors that typically give high correlation coefficients for molecules of different sizes no longer show such good linear correlations. In conclusion, if a newly proposed MD is not able to model the variation of at least one property of octane isomers, then it probably does not contain any useful molecular information.
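The size argument above can be illustrated with a small sketch. It evaluates $[X]^T M^k [Y]$ for three hydrogen-suppressed octane isomers using unit weights on both vectors, which is a deliberate simplification (the method itself uses two different atomic properties), so the values reduce to walk counts. The bond lists and the helper `adjacency` are written for this illustration only.

```python
import numpy as np

def adjacency(n_atoms, bonds):
    """Adjacency matrix of a hydrogen-suppressed alkane (single bonds only)."""
    m = np.zeros((n_atoms, n_atoms), dtype=int)
    for i, j in bonds:
        m[i, j] = m[j, i] = 1
    return m

# Carbon skeletons of three C8H18 isomers (atoms numbered 0..7).
isomers = {
    "n-octane": [(i, i + 1) for i in range(7)],
    "2-methylheptane": [(i, i + 1) for i in range(6)] + [(1, 7)],
    "2,2,4-trimethylpentane": [(0, 1), (1, 2), (2, 3), (3, 4),
                               (1, 5), (1, 6), (3, 7)],
}

ones = np.ones(8)  # unit weights: a simplification, not the Table 2 values
table = {}
for name, bonds in isomers.items():
    M = adjacency(8, bonds)
    table[name] = [int(ones @ np.linalg.matrix_power(M, k) @ ones)
                   for k in range(3)]
    print(name, table[name])
```

With unit weights, the k = 0 and k = 1 values (8 and 14) only count atoms and bonds and are identical for all isomers, while the k = 2 values (26, 28 and 34) already depend on branching. This is the kind of size-independent variation that an isomeric data set is meant to probe.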

STATISTICAL METHOD
The k th total and local atom-based bilinear indices were used as molecular descriptors for deriving QSPRs. One of the difficulties with a large number of descriptors is deciding which ones will provide the best regressions. [82-85] Genetic algorithms (GAs) are a class of algorithms inspired by the process of natural evolution, in which species having a high fitness under some conditions can prevail and survive to the next generation; the best species can be adapted by crossover and/or mutation in the search for better individuals. [82-87] The BuildQSAR 88 software was employed to perform variable selection and QSPR modeling. The mutation probability was specified as 35%. The length of the equations was set to three terms and a constant. The population size was established as 100. The GA with an initial population size of 100 rapidly converged (200 generations) and reached an optimal QSAR model in a reasonable number of GA generations. The search for the best model can be processed in terms of the highest correlation coefficient (R) or F-test value (Fisher ratio's p-level [p(F)]), and the lowest standard deviation (s). 88 The quality of the models was also determined by examining the Leave-One-Out (LOO) cross-validation (CV) statistics ($q^2$, $s_{cv}$). 89
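The leave-one-out statistic mentioned above can be sketched as follows. This is a generic illustration using the common definition $q^2 = 1 - \mathrm{PRESS}/\sum_i (y_i - \bar{y})^2$ and ordinary least squares from NumPy; whether BuildQSAR uses exactly this formula is an assumption, and the descriptor matrix and property values are synthetic, not data from this study.

```python
import numpy as np

def fit_mlr(X, y):
    """Least-squares fit of y = X a + c; returns [a_1, ..., a_p, c]."""
    A = np.column_stack([X, np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def loo_q2(X, y):
    """Leave-one-out cross-validated q^2 = 1 - PRESS / SS_total."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    press = 0.0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i        # drop compound i
        coef = fit_mlr(X[keep], y[keep])
        pred = np.append(X[i], 1.0) @ coef   # predict the left-out compound
        press += (y[i] - pred) ** 2
    return 1.0 - press / np.sum((y - y.mean()) ** 2)

# Synthetic example: 18 compounds, three descriptors plus a constant.
rng = np.random.default_rng(0)
X = rng.normal(size=(18, 3))
y_exact = X @ np.array([1.5, -2.0, 0.5]) + 4.0
y_noisy = y_exact + rng.normal(scale=0.5, size=18)

q2_exact = loo_q2(X, y_exact)
q2_noisy = loo_q2(X, y_noisy)
print(q2_exact, q2_noisy)
```

A perfectly linear property gives $q^2$ essentially equal to 1, and any noise lowers it. The genetic-algorithm search itself is not reproduced here.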

QSPR APPLICATION OF PHYSICOCHEMICAL PROPERTIES OF OCTANE ISOMERS
The decisive criterion of quality for any MD is its ability to describe structure-related properties of molecules. In order to illustrate the applicability of the novel TOMOCOMD-CARDD descriptors, we developed QSPR models to describe six physicochemical properties of octane isomers. [72-76] Specifically, to evaluate the quality of the models based on our novel atom-based MDs we have taken as references: 1) the models published by Randić 72-74 based on diverse TIs such as the Wiener matrix invariants, 2) the equation published by Diudea 76 based on the SP indices, and 3) the best models obtained from a set consisting of the topological (69), WHIM (99), and GETAWAY (197) descriptors. 69

The best linear models found using non-stochastic and stochastic total and atom-type bilinear indices are presented in Table 4, together with the statistical information for the best regressions with 1, 2, and 3 MDs published so far 69,71-76 for each selected property of octane isomers.

Table 4
As can be seen from the statistical parameters of the regression equations in Table 4, all of the physicochemical properties have significant differences with the preceding models obtained by applying the selection procedure to the set given by GETAWAY descriptors plus WHIM and topological indices. However, these properties were described better with our approach than with several TIs.
According to the QSPR results obtained, it is possible to conclude that the novel MDs encode useful molecular information different from that of previously proposed descriptors. Moreover, they are quite diverse among themselves, being able to describe well the variation of different properties of octanes.

CONCLUSIONS
The approach described in this report appears to be a promising method for finding quantitative models for the description of physical, thermodynamic, or biological properties. The novel MDs proposed here have been shown to have some interesting features, such as: i. Their functional definitions are based on well-known and accepted algorithms and formulas in mathematics. These novel atom-based molecular descriptors are based on a bilinear map similar to those defined in linear algebra. The atom- and group-level as well as atom-type formalism will make it possible to expedite the investigation of molecular mechanisms and the rational design of molecules at the local level.
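Consider the vector set $\{e_1, \ldots, e_n\}$, a basis set of $\Re^n$. This basis set permits us to write in unambiguous form any vectors x and y of V, where $(x_1, \ldots, x_n)$ and $(y_1, \ldots, y_n)$ are the coordinates of the vectors x and y, respectively. That is to say,

$$
x = \sum_{i=1}^{n} x_i\, e_i \qquad (3)
$$

$$
y = \sum_{j=1}^{n} y_j\, e_j \qquad (4)
$$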
In Eq. 7, [Y] is a column vector (an n×1 matrix) of the coordinates of y in a basis set of $\Re^n$, and $[X]^T$ (a 1×n matrix) is the transpose of [X], where [X] is a column vector (an n×1 matrix) of the coordinates of x in the same basis of $\Re^n$.

Finally, we introduce the formal definition of a symmetric bilinear form. Let V be a real vector space and b be a bilinear function in V×V. The bilinear function b is called symmetric if $b(x, y) = b(y, x)$ for all $x, y \in V$.

Non-Stochastic and Stochastic Atom-Based Bilinear Indices: Total (Global) Definition

The k th non-stochastic and stochastic bilinear indices for a molecule, $b_k(x, y)$ and ${}^{s}b_k(x, y)$, are computed from the k th non-stochastic and stochastic graph-theoretic electronic-density matrices, $M^k$ and $S^k$, as shown in Eqs. 9 and 10, respectively:

$$
b_k(x, y) = \sum_{i=1}^{n}\sum_{j=1}^{n} {}^{k}m_{ij}\, x_i\, y_j \qquad (9)
$$

$$
{}^{s}b_k(x, y) = \sum_{i=1}^{n}\sum_{j=1}^{n} {}^{k}s_{ij}\, x_i\, y_j \qquad (10)
$$

In the corresponding matrix form, [Y] is a column vector (an n×1 matrix) of the coordinates of y in the canonical basis set of $\Re^n$, and $[X]^T$ is the transpose of [X], where [X] is a column vector (an n×1 matrix) of the coordinates of x in the canonical basis of $\Re^n$. Therefore, if we use the canonical basis set, the coordinates [$(x_1, \ldots, x_n)$ and $(y_1, \ldots, y_n)$] of any molecular vectors (x and y) coincide with the components of those vectors. For that reason, those coordinates can be considered as weights (atomic labels) of the vertices of the molecular pseudograph, because the components of the molecular vectors are values of some atomic property that characterizes each kind of atomic nucleus in the molecule.

It should be remarked that non-stochastic and stochastic bilinear indices are symmetric and non-symmetric bilinear forms, respectively. Therefore, if M and V from the weighting scheme are used as atomic weights to compute these MDs, two different sets of stochastic bilinear indices, ${}^{\text{M-V}\,s}b_{k}^{H}(x, y)$ and ${}^{\text{V-M}\,s}b_{k}^{H}(x, y)$ [because $x_M$-$y_V \neq x_V$-$y_M$], can be obtained, but only one group of non-stochastic bilinear indices [${}^{\text{M-V}}b_{k}^{H}(x, y) = {}^{\text{V-M}}b_{k}^{H}(x, y)$] can be calculated.

The fragment-based analogues of these descriptors are termed local non-stochastic and stochastic bilinear indices.
Accordingly, the k th power of matrix M [S] is exactly the sum of the k th powers of the Z local matrices. In this way, the total non-stochastic and stochastic bilinear indices are the sum of the non-stochastic and stochastic bilinear indices, respectively, of the Z molecular fragments:

$$
b_k(x, y) = \sum_{L=1}^{Z} b_{kL}(x, y) \qquad (18)
$$

$$
{}^{s}b_k(x, y) = \sum_{L=1}^{Z} {}^{s}b_{kL}(x, y) \qquad (19)
$$

Atom, group and atom-type bilinear fingerprints are specific cases of local bilinear indices. Atomic bilinear indices are computed for each atom i in the molecule and contain electronic and topological structural information from all other atoms within the structure. The atom-level bilinear index values for the common scaffold atoms can be used directly as variables in seeking a QSPR/QSAR model, as long as these atoms are numbered in the same way in all molecules in the database.

Finally, these local MDs can be calculated for a chemical (or functional) group in the molecule, such as heteroatoms (O, N and S in all valence states and including the number of attached H-atoms), hydrogen bonding (H-bonding) to heteroatoms (O, N and S in all valence states), halogen atoms (F, Cl, Br and I), all aliphatic carbon chains (several atom types), all aromatic atoms (aromatic rings), and so on. The group-level bilinear indices are the sum of the individual atom-level bilinear indices for a particular group of atoms. For all data set structures, the k th group-based bilinear indices provide important information for QSAR/QSPR studies.

(drawing mode) and calculating molecular 2D/3D descriptors (calculation mode). The modules are named CARDD (Computed-Aided 'Rational' Drug Design), CAMPS (Computed-Aided Modeling in Protein Science), CANAR (Computed-Aided Nucleic Acid Research) and CABPD (Computed-Aided Bio-Polymers Docking). In the present report, we outline salient features of only one of these subprograms, CARDD, and of the calculation of non-stochastic and stochastic 2D atom-based bilinear indices.

The main steps for the application of the present method in QSAR/QSPR and drug design can be summarized briefly in the following algorithm:

1) Draw the molecular structure for each molecule in the data set, using the software drawing mode. This procedure is performed by selecting the active atomic symbol belonging to the different groups in the periodic table of the elements.

2) Use appropriate weights in order to differentiate the atoms in the molecule. The weights used in this work are those previously proposed for the calculation of the DRAGON descriptors, 51,69,70 i.e., atomic mass (M), atomic polarizability (P), van der Waals atomic volume (V), plus the atomic electronegativity in the Pauling scale (E). The values of these atomic labels are shown in Table 2. 51,52,69,70

3) Compute the total and local (atomic, group and atom-type) non-stochastic and stochastic bilinear indices. This can be carried out in the software calculation mode, where one can select the atomic properties and the descriptor family before calculating the molecular indices. The software generates a table in which the rows correspond to the compounds and the columns correspond to the atom-based (both total and local) bilinear maps or other MD families implemented in the program.

4) Find a QSPR/QSAR equation by using several multivariate analytical techniques, such as multilinear regression analysis (MRA), neural networks, linear discrimination analysis, and so on. That is to say, we can find a quantitative relation between a property P and the bilinear fingerprints having, for instance, the following appearance:

$$
P = a_0 b_0(x, y) + a_1 b_1(x, y) + a_2 b_2(x, y) + \ldots + a_k b_k(x, y) + c \qquad (20)
$$

where P is the measured property, $b_k(x, y)$ are the k th non-stochastic total bilinear indices, and the $a_k$'s are the coefficients obtained by the MRA.

5) Test the robustness and predictive power of the QSPR/QSAR equation by using internal cross-validation techniques.

The atom-based TOMOCOMD-CARDD MDs computed in this study were the following:

i) k th (k = 15) total (global) non-stochastic bilinear indices not considering and considering H-atoms in the molecule [$b_k(x, y)$ and $b_k^{H}(x, y)$, respectively].

ii) k th (k = 15) total (global) stochastic bilinear indices considering H-atoms in the molecule [${}^{s}b_k^{H}(x, y)$].

iii) k th (k = 15) group (methyl group, -CH3) non-stochastic and stochastic bilinear indices considering H-atoms in the molecule [$b_{kL(CH3)}^{H}(x, y)$ and ${}^{s}b_{kL(CH3)}^{H}(x, y)$, respectively]. These local descriptors are calculated taking into account only one of the three types of carbon-hydrogen bonds ($C_{primary}$-H) present in the octane data set.
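Step 3 of the algorithm above, one table row of descriptors per compound, can be sketched as follows. The snippet enumerates the six unordered property pairs for the (symmetric) non-stochastic indices and the twelve ordered pairs for the (non-symmetric) stochastic ones. The function `descriptor_row`, the column-name format, the C-C=O skeleton and especially the numbers in `weights` are placeholders for illustration; they are not the Table 2 values and not TOMOCOMD output.

```python
from itertools import combinations, permutations
import numpy as np

# Placeholder per-atom weight vectors for a C0-C1=O2 skeleton.
# These numbers are hypothetical, NOT the Table 2 values.
weights = {
    "M": np.array([1.0, 1.0, 1.3]),
    "V": np.array([1.0, 1.0, 0.7]),
    "P": np.array([1.0, 1.0, 0.5]),
    "E": np.array([1.0, 1.0, 1.4]),
}
M = np.array([[0, 1, 0],
              [1, 0, 2],
              [0, 2, 0]], dtype=float)

def descriptor_row(M, weights, k_max, stochastic=False):
    """One table row: an index for every property pair and every k <= k_max."""
    # Ordered pairs are needed only when the form is not symmetric.
    pairs = permutations(weights, 2) if stochastic else combinations(weights, 2)
    prefix = "sb" if stochastic else "b"
    row = {}
    for w, z in pairs:
        for k in range(k_max + 1):
            Mk = np.linalg.matrix_power(M, k)
            if stochastic:
                Mk = Mk / Mk.sum(axis=1, keepdims=True)  # row-normalize
            row[f"{w}-{z} {prefix}{k}"] = float(weights[w] @ Mk @ weights[z])
    return row

non_stochastic = descriptor_row(M, weights, k_max=3)
stochastic = descriptor_row(M, weights, k_max=3, stochastic=True)
print(len(non_stochastic), len(stochastic))  # 24 48
```

Stacking such rows for every compound gives the compounds-by-descriptors table from which a regression such as Eq. 20 can then be fitted.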
were well described by atom-based bilinear indices. In this table we can observe that the statistical parameters for the models obtained with atom-level bilinear indices to describe the heat of vaporization (HV) (Eqs. 25 and 26), molar volume (MV) (Eqs. 27 and 28) and heat of formation (∆fH) (Eqs. 31 and 32) of octanes are better than those taken from the literature using 2D and 3D MDs. The second physicochemical property, MV, is well described exclusively by the atom-based bilinear indices (both non-stochastic and stochastic ones). In addition, the QSPR model obtained (Eq. 32) for the description of ∆fH using only one stochastic bilinear index [${}^{\text{V-E}\,s}b_{5}^{H}(x, y)$] showed better statistical results and predictive power than the equations developed from 2D and 3D MDs (and their combination). It should be pointed out that, among the models based on the atom-level bilinear indices, both regressions for the motor octane number (MON) (Eqs. 23 and 24) are better than or similar to the models published so far. Only the models found by us to describe boiling point (BP) (Eqs. 21 and 22) and entropy (S) (Eqs. 29 and 30)