A study on anti-malaria drugs using degree-based topological indices through QSPR analysis



Introduction
Malaria is a life-threatening disease found worldwide. It is caused by Plasmodium parasites transmitted through the bite of infected female Anopheles mosquitoes. Five parasite species affect humans, of which two, P. falciparum and P. vivax, pose the greatest threat. P. falciparum is the deadliest and the most prevalent on the African continent, while P. vivax dominates in most countries outside sub-Saharan Africa. Malaria is an acute febrile illness: fever, headache and chills usually appear 10 to 15 days after the mosquito bite. If the symptoms are not treated promptly, the disease can become fatal within a day [1].
In 2020, nearly half of the world's population was at risk of contracting malaria. Population groups at higher risk include infants, young children, pregnant women, and people with low immunity.
Malaria causes considerable mortality and morbidity and has become a major global health problem. As resistant parasite strains emerge, therapeutic options become limited, which has contributed to the wide spread of malaria. Addressing this potential public health emergency calls for designing new drugs, providing single-dose cures, and introducing novel mechanisms of action. New excipients may also be included in medicines to make them effective against new variants of the strain. With the available genomic techniques, the study of parasite biology can be advanced to support the development of new therapies.
In recent years, various advanced drug interventions have been reported. These focus on the discovery of antimalarial agents using the latest scientific and technological advances. There are many antimalarial targets, including proteases, Plasmodium sugar, farnesyltransferase and DNA replication. The World malaria report shows an increase in cases from 2019 to 2020 and almost 69,000 more deaths in 2020 than in 2019 [2].
Testing molecular properties in a laboratory consumes considerable time and money. This cost can be reduced by using quantitative structure-activity relationship models, which forecast the properties of chemical compounds from topological descriptors. Topological indices have found significant applications in mathematical chemistry [3,4], such as isomer discrimination, chirality, molecular complexity, drug design, database selection, and QSAR/QSPR/QSTR studies in recent years [5].
The main objectives of this study are: (i) QSPR analysis of the properties of the nine anti-malaria drugs using regression models based on topological indices; (ii) computation of topological indices for the considered drugs; and (iii) comparison of the topological indices through their correlation coefficients with a few physical properties and through statistical parameters.
Distinguishing drug-like from non-drug-like molecules is necessary to reduce the cost and time lost to failed drug development. Various approaches screen chemical databases against biological targets in the search for new potential leads. QSAR modelling is a significant approach in drug discovery that correlates the structure of a molecule with its biological activity. The methods fall into 2D (topological indices) and 3D (structure-based) approaches; the 2D approach requires less calculation time and is therefore used for preliminary screening in drug development. Academia, industry, and research institutions across the globe widely use the 2D approach.
The principal step is the selection of the right descriptors. Using only a few descriptors gives results that are easier to understand and interpret, reduces the risk of overfitting from redundant descriptors, and yields fast and cost-effective models.
A topological index translates a molecular structure into a structural descriptor expressed as a numerical value. Topological indices find applications in drug design wherever QSAR/QSPR/QSTR studies are involved. The QSAR/QSPR methods are based on the assumption that the activity or property of a chemical compound, such as drug binding to DNA or a toxic effect, is related to its structure through a certain mathematical algorithm.
For the chemical compounds under study, a series of parameters called chemical descriptors is computed. Then, an algorithm is sought that yields values reasonably close to the experimental values. The final step is to check whether the obtained algorithm is capable of predicting the activity/property values.
QSAR/QSPR/QSTR models are used to predict the association between the molecular structure and its activity/property/toxicity. Over the years, many algorithms have been proposed and applied in QSAR/QSPR studies. The model framework includes the representation of the molecular structure as a graph, the calculation of molecular descriptors (graph invariants), and a linear regression method. The model is validated through statistical parameters (r and r^2). The same approach is employed in this work, and the statistical parameters show significant results.
Recently, Adnan et al. carried out a QSPR analysis of anti-tuberculosis drugs in which 15 drugs were considered for the regression analysis of 6 physicochemical properties. Eleven topological indices were computed, and the correlation analysis showed a high positive correlation with all 6 physicochemical properties. Shanmukha et al. carried out a QSPR analysis of 21 breast cancer drugs. Eleven topological indices were studied, and a linear model was fitted for each index. In 2020, Sigarreta [6,7] provided new tools for obtaining, in a unified way, inequalities involving many different topological indices. Jose obtained new optimal bounds on the variable Zagreb indices, the variable sum-connectivity index, the variable geometric-arithmetic index and the variable inverse sum indeg index. Recently, Jose computed new lower and upper optimal bounds for general (exponential) indices of a graph. In the same direction, new inequalities involving some well-known topological indices, such as the generalized atom-bond connectivity index and the generalized second Zagreb index, were shown, and some extremal problems for their corresponding exponential indices were solved.
A topological index is a numerical representation of the arrangement of a molecule of a compound. The Wiener index, proposed by Harold Wiener in 1947, was one of the first topological indices; Wiener showed that it correlates well with the boiling points of alkanes. At present, the indices introduced by various researchers are divided into three categories, namely degree-based, neighborhood degree-based and distance-based topological indices [8][9][10]. QSAR was introduced sixty years ago to model the biological activities of molecules and was later extended to the modelling of physicochemical properties of molecules, leading to QSPR studies. Initially, the focus was on the chemical interpretation of topological indices [11,12]; it has since extended to modelling the structural characteristics of molecules together with their biological activity, leading to complex modelling of compounds in QSAR/QSPR/QSTR studies [13][14][15][16][17][18][19][20][21][22][23][24][25].
This model does not require any laboratory equipment to perform the analysis. It saves considerable time and money, and the results obtained from the QSPR model are compared with the actual values for further analysis.
A graph G(V, E) with vertex set V(G) and edge set E(G) is connected if there exists a path between every pair of vertices in G.
The graphs used in this work are simple graphs, that is, they have no loops and no multiple edges. The number of edges incident to a vertex $v$ is the degree of the vertex $v$, denoted by $d_v$. For graph terminology and notation, refer to [26,27]. Some of the topological descriptors used in this work are given below.
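The earliest topological indices are the first and second Zagreb indices. They have proved useful in estimating the total π-electron energy of molecules. These indices, described by Gutman and Polansky [28] in 1986, are defined as

$$M_1(G) = \sum_{uv \in E(G)} (d_u + d_v), \tag{1}$$

$$M_2(G) = \sum_{uv \in E(G)} d_u d_v, \tag{2}$$

where the sums run over all edges $uv$ of G and $d_u$ denotes the degree of the vertex $u$.
The harmonic index was proposed by Fajtlowicz [29] and is defined as

$$H(G) = \sum_{uv \in E(G)} \frac{2}{d_u + d_v}. \tag{3}$$

The forgotten index was first defined by Furtula and Gutman [30] in 2015. It became popular because its predictive performance is analogous to that of the original Zagreb index, and it is defined as

$$F(G) = \sum_{uv \in E(G)} (d_u^2 + d_v^2). \tag{4}$$

A novel graph invariant called the SS index of a graph was proposed by Zhao et al. [31] and is stated as

$$SS(G) = \sum_{uv \in E(G)} \sqrt{\frac{d_u d_v}{d_u + d_v}}. \tag{5}$$

Ranjini et al. [32] introduced redefined versions of the second and third Zagreb indices, defined as

$$ReZG_2(G) = \sum_{uv \in E(G)} \frac{d_u d_v}{d_u + d_v}, \tag{6}$$

$$ReZG_3(G) = \sum_{uv \in E(G)} d_u d_v (d_u + d_v). \tag{7}$$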
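As an illustration of the definitions above, the following minimal Python sketch computes the seven degree-based indices from an edge list. The function names and the toy graph (a path on four vertices) are assumptions made for this example and are not taken from the study; the formulas follow the standard edge-sum definitions of these indices.

```python
from collections import Counter
from math import sqrt


def vertex_degrees(edges):
    """Return a mapping vertex -> degree for a simple graph given as an edge list."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def topological_indices(edges):
    """Sum each degree-based index over all edges uv of the graph."""
    deg = vertex_degrees(edges)
    idx = dict.fromkeys(["M1", "M2", "H", "F", "SS", "ReZG2", "ReZG3"], 0.0)
    for u, v in edges:
        du, dv = deg[u], deg[v]
        s, p = du + dv, du * dv          # degree sum and degree product of the edge
        idx["M1"] += s                   # first Zagreb index
        idx["M2"] += p                   # second Zagreb index
        idx["H"] += 2 / s                # harmonic index
        idx["F"] += du ** 2 + dv ** 2    # forgotten index
        idx["SS"] += sqrt(p / s)         # SS index
        idx["ReZG2"] += p / s            # redefined second Zagreb index
        idx["ReZG3"] += p * s            # redefined third Zagreb index
    return idx


# Hypothetical toy graph: a path on four vertices (not one of the nine drugs).
path4 = [(1, 2), (2, 3), (3, 4)]
result = topological_indices(path4)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

For a real drug, the edge list would be read off the molecular graph, with one entry per bond between the atoms that are kept as vertices.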

Materials and methods
In this study, anti-malaria drugs are modelled as simple graphs. The topological indices of each drug's structure are computed using vertex partitioning, edge partitioning and computational techniques.
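Edge partitioning can be sketched as follows: the edges are grouped by the degrees of their end vertices, and an index is then obtained by weighting each class by its size. The helper names and the toy graph (a six-membered ring with one pendant vertex) are assumptions for illustration only, not the authors' implementation.

```python
from collections import Counter


def edge_partition(edges):
    """Count edges by the (sorted) pair of end-vertex degrees."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    partition = Counter()
    for u, v in edges:
        partition[tuple(sorted((deg[u], deg[v])))] += 1
    return partition


def index_from_partition(partition, edge_weight):
    """Evaluate an index as sum over classes of (class size) * edge_weight(du, dv)."""
    return sum(count * edge_weight(du, dv) for (du, dv), count in partition.items())


# Hypothetical toy graph: a six-membered ring (vertices 1..6) with a pendant vertex 7.
ring_with_pendant = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1), (1, 7)]

partition = edge_partition(ring_with_pendant)
m1 = index_from_partition(partition, lambda du, dv: du + dv)   # first Zagreb index
m2 = index_from_partition(partition, lambda du, dv: du * dv)   # second Zagreb index
print(dict(partition), m1, m2)
```

Because every index used here depends only on the degrees of the two end vertices, the partition table is enough to evaluate all of them without revisiting the graph.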

Results and discussion
In this work, degree-based topological indices are computed for the drugs used in the treatment of malaria. The QSPR analysis of the computed indices is discussed, and it is observed that these indices are highly correlated with the physicochemical properties of the drugs. Nine drugs used in malaria treatment are considered for the analysis, viz., Chloroquine, Amodiaquine, Mefloquine, Piperaquine, Primaquine, Lumefantrine, Atovaquone, Pyrimethamine and Doxycycline. The molecular structures of these drugs are shown in Figure 1. Each structure is modelled as a graph in which the atoms are the vertices and the bonds are the edges, as represented in Figure 2.

Regression models
Six physical properties of the nine drugs are studied, viz., boiling point, enthalpy, flash point, molar refraction, molar volume and polarizability. The linear regression model used in this work is

$$P = A + b\,(TI), \tag{8}$$

where P is the physical property of the drug, TI is the topological index, A is the constant and b is the regression coefficient (slope).
Using this equation, a regression model is defined for each of the topological indices considered above.
The physical properties of the anti-malaria drugs are treated as dependent variables and the topological indices of the molecular graphs of the 9 drugs as independent variables. The linear regression models are fitted using SPSS software, where the constants A and b in the regression equation (8) are calculated from the training set in Tables 1 and 2.
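The fit of a single-predictor model of the form P = A + b(TI) can be sketched with ordinary least squares as below. This is a plain-Python stand-in for what a statistics package computes, not the SPSS procedure itself; the function name and the placeholder numbers are assumptions and are not values from Tables 1 and 2.

```python
def fit_line(ti_values, property_values):
    """Ordinary least squares for P = A + b * TI; returns (A, b)."""
    n = len(ti_values)
    mean_ti = sum(ti_values) / n
    mean_p = sum(property_values) / n
    # Sum of squared deviations of TI and cross-deviations of TI with P.
    sxx = sum((x - mean_ti) ** 2 for x in ti_values)
    sxy = sum((x - mean_ti) * (y - mean_p)
              for x, y in zip(ti_values, property_values))
    b = sxy / sxx              # slope (regression coefficient)
    A = mean_p - b * mean_ti   # intercept (constant)
    return A, b


# Placeholder data lying exactly on P = 5 + 2 * TI (illustrative only).
ti = [10, 20, 30, 40]
prop = [25, 45, 65, 85]

A, b = fit_line(ti, prop)
print(f"P = {A:.3f} + {b:.3f} * TI")
```

In the study's setting, the same call would be repeated for every pair of topological index and physical property, giving one fitted line per pair.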

Comparison of topological indices with correlation coefficients of few physical properties and statistical parameters
Table 1 presents the six physical properties of the anti-malaria drugs considered in the study. The degree-based topological indices defined above are computed and tabulated in Table 2. The correlations of these indices with all six physical properties are tabulated in Table 3. The indices are highly correlated with most of the physical properties, and the highest correlation coefficient (0.978) is that of the second Zagreb index with enthalpy. The correlation coefficients of the topological indices with the physical properties are presented graphically in Figure 3. The statistical parameters studied for the QSPR analysis of all seven topological indices are the sample size (N), the constant (A), the slope (b), the correlation coefficient (r), the proportion of variance of the dependent variable explained by the model (r^2), and the p-value. The p-value of each term tests the null hypothesis that the corresponding coefficient equals zero; a high p-value indicates that changes in the predictor are not related to changes in the response. The F value arises from testing the null hypothesis that all the regression coefficients are zero, in which case the model has no predictive capability. This test compares the fitted model with a model that has no predictor variables in order to decide whether the coefficients improve the model.
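The statistical parameters named above can be computed for a single-predictor model as in the following sketch. It uses the textbook formulas for simple linear regression (Pearson r, r^2, the F statistic with 1 and N-2 degrees of freedom, and the standard error of estimate); the function name and the placeholder data are assumptions for illustration and do not reproduce any table in this study. The p-value is omitted because it requires the F distribution, which a statistics package would supply.

```python
from math import sqrt


def regression_statistics(x, y):
    """Return N, A, b, r, r2, F and the standard error of estimate for y = A + b*x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    A = mean_y - b * mean_x
    r = sxy / sqrt(sxx * syy)                       # Pearson correlation coefficient
    sse = sum((yi - (A + b * xi)) ** 2 for xi, yi in zip(x, y))  # residual sum of squares
    ssr = syy - sse                                 # sum of squares explained by the model
    # F compares the fitted model with the intercept-only model (1 and n-2 d.o.f.).
    F = float("inf") if sse == 0 else ssr / (sse / (n - 2))
    se = sqrt(sse / (n - 2))                        # standard error of estimate
    return {"N": n, "A": A, "b": b, "r": r, "r2": r * r, "F": F, "se": se}


# Placeholder data (illustrative only, not from the paper's tables).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

stats = regression_statistics(x, y)
for name, value in stats.items():
    print(name, round(value, 4))
```

A large F together with r close to 1 indicates that the index explains most of the variation in the property, which is the pattern the correlation tables are meant to reveal.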
Tables 4-10 present the statistical parameters, namely the number of drugs considered, the constant, the regression coefficient, the correlation coefficient, the coefficient of determination, Fisher's statistic and the significance value, denoted by N, A, b, r, r^2, F and p respectively, for all the considered topological indices and physical properties. Table 11 gives the standard error of estimate for the physical properties of the drugs. Tables 12-17 compare the actual and computed values of all the physical properties of the anti-malaria drugs. Figure 3 gives a graphical representation of the physical properties against the topological indices.
This work focuses on the drugs used for the treatment of malaria, a disease transmitted by mosquitoes. The analysis may help chemists and pharmaceutical industry researchers to design novel drugs with various excipients, guided by the index values computed here. The topological indices reported in this article may also inform the design of drugs for other ailments. The correlation coefficients may help chemists to choose a suitable composition, based on high correlation values, when forming a new drug for novel ailments. As future work, new tools for obtaining inequalities involving various topological indices in a unified way may be developed, and the extremal values of the exponential versions of the considered indices may be computed.

Figure 3. Correlation between physical properties of drugs and topological indices: (a) boiling point on TI; (b) enthalpy on TI; (c) flash point on TI; (d) molar refraction on TI; (e) molar volume on TI; (f) polarizability on TI.

Table 1. Physical properties of drugs used for the treatment of malaria.

Table 2. The values of the topological indices of the drugs.

Table 3. Correlation coefficients of physical properties of drugs.

Table 4. Statistical parameters for the linear QSPR model for M1(G).

Table 5. Statistical parameters for the linear QSPR model for M2(G).

Table 6. Statistical parameters for the linear QSPR model for H(G).

Table 7. Statistical parameters for the linear QSPR model for F(G).

Table 8. Statistical parameters for the linear QSPR model for SS(G).

Table 9. Statistical parameters for the linear QSPR model for ReZG2(G).

Table 10. Statistical parameters for the linear QSPR model for ReZG3(G).

Table 11. Standard error of estimate for physical properties of drugs.

Table 12. Comparison of actual and computed values for boiling point from regression models of TI.

Table 13. Comparison of actual and computed values for enthalpy from regression models of TI.

Table 14. Comparison of actual and computed values for flash point from regression models of TI.

Table 15. Comparison of actual and computed values for molar refraction from regression models of TI.

Table 16. Comparison of actual and computed values for molar volume from regression models of TI.

Table 17. Comparison of actual and computed values for polarizability from regression models of TI.