QSAR study on inhibition of E. coli by sulfonamides

The paper describes a QSAR study on the inhibition of E. coli by sulfonamides using distance-based topological indices. The set of sulfonamides consists of 39 derivatives with substitution at the 2-, 3- and 4-positions, some of them di-substituted. Multiple linear regression analysis indicated that a combination of distance-based topological indices with ad hoc molecular descriptors and indicator parameters yields a statistically significant model for the inhibitory activity (log1/C) against E. coli. The predictive potential of the model is assessed by cross-validation parameters as well as by a variety of other statistics. The results indicate that a positive hydrophobic term (logP) is not involved in the inhibition process, suggesting that the binding of the sulfonamides to the active center does not depend on hydrophobic interactions. The final selection of potential sulfonamides is made by molecular modeling.


Introduction
Bacteria have been one of the most studied test systems. The current database of Hansch 1 indicates that out of 1775 QSARs on a single organism, 649 are for bacteria, and that bacteria vary enormously in their response to treatment with various organic compounds acting as drugs. It is worth mentioning that the antibacterial activity of sulfa-drugs of similar structure is due to their action as 4-aminobenzoic acid antagonists. These drugs inhibit the incorporation 1,2 of p-aminobenzoic acid into folic acid by folate synthetase. The folate synthetase incorporates the drug sulfamethoxazole into pterin, thus completely blocking folate synthesis 2,3. A plethora of literature exists on the action of sulfa-drugs on all kinds of bacteria. p-Aminobenzoic acid is a component of folic acid and is needed by bacteria for their survival and multiplication.
Carbonic anhydrases (CAs) are inhibited by sulfonamides, a class of sulfa-drugs. The ring-substituted benzenesulfonamides containing -SO2NH2 groups have similar activities. In the case of sulfonamides, the inhibition of CA is caused mainly by the binding (coordination) of the SO2NH− anion to the Zn2+ of the enzyme, mimicking the bicarbonate anion in the transition state 4. Furthermore, we have shown that inhibition of carbonic anhydrase by sulfonamides and the Schiff bases derived from them can be modeled excellently using distance-based topological indices. Earlier, in many cases hydrophobic parameters were used for this purpose.
Recently 1 Hansch stated that "over the year we have been impressed by the great importance of hydrophobic effects in chemical-biological interactions as brought out by quantitative structure-activity relationships (QSAR's), … it seems timely to examine those instances where hydrophobic terms are not significant".
Prompted by the above, and in continuation of our earlier work [4][5][6][7][8][9][10][11][12][13][14][15], we have undertaken the present study, in which we carry out a QSAR analysis of the inhibition of E. coli by sulfonamides using a set of molecular descriptors consisting of some distance-based topological indices together with some ad hoc molecular descriptors. Another objective of our study was to investigate the need (if any) for a hydrophobic parameter (logP) in modeling the inhibition of E. coli by sulfonamides. We have used earlier data 1 for this purpose (Figure 1, Table 1).

ARKAT
The topological indices used for the QSAR analysis were the Wiener 16, Szeged [17][18][19], first-order molecular connectivity 20 and Balaban 21 indices. The ad hoc molecular descriptors used were molar refractivity (MR), molar volume (MV), parachor (Pc), refractive index (η), surface tension (ST), density (d) and polarizability (α), in addition to logP and HE (hydration energy). Preliminary statistical analysis indicated the need for some indicator parameters to obtain statistically significant models. We have, therefore, used six indicator parameters, I_P1 to I_P6, accounting for the presence/absence of halogen substitution (X), nitro-substitution, disubstitution, substitution at the ortho-position, substitution at the meta-position and substitution at the para-position, respectively. These indicator parameters are dummy parameters and assume only two values: one (for presence) and zero (for absence). [23][24] The predictive ability of the model is discussed on the basis of the predictive correlation coefficient. We have separated a set of potential inhibitors of E. coli, and finally we have aimed at the most appropriate model using molecular modeling.
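The indicator (dummy) parameters described above can be illustrated with a minimal Python sketch. The function name `indicator_parameters`, the input format, and the mapping of ring positions (2/6 = ortho, 3/5 = meta, 4 = para) are assumptions made for illustration; the labels X, I_NO2, I_DS, I_O, I_M and I_P follow the caption of Table 1. Only bare halogen atoms are counted as halogen substitution here, which is also an assumption.

```python
def indicator_parameters(substituents):
    """Return the six 0/1 indicator parameters for one sulfonamide.

    substituents: list of (ring_position, group) tuples,
    e.g. [(2, "Me"), (4, "NO2")]. Hypothetical input format.
    """
    halogens = {"F", "Cl", "Br", "I"}  # assumption: bare halogen atoms only
    positions = [pos for pos, _ in substituents]
    groups = [grp for _, grp in substituents]
    return {
        "X": int(any(g in halogens for g in groups)),     # halogen present
        "I_NO2": int("NO2" in groups),                    # nitro group present
        "I_DS": int(len(substituents) == 2),              # di-substituted ring
        "I_O": int(any(p in (2, 6) for p in positions)),  # ortho substitution
        "I_M": int(any(p in (3, 5) for p in positions)),  # meta substitution
        "I_P": int(4 in positions),                       # para substitution
    }


# 2-Me, 4-NO2 derivative: nitro, di-substituted, ortho and para flags are set.
print(indicator_parameters([(2, "Me"), (4, "NO2")]))
```

Each compound then contributes one row of zeros and ones to the regression matrix alongside its topological and physicochemical descriptors.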

Results and Discussion
The set of 39 sulfonamides and their adopted inhibition (MIC) values against E. coli, expressed as log1/C, are presented in Table 1, which shows that log1/C is highly influenced by the substitution on the aromatic nucleus. The log1/C value is lowest for 2,3-di-Me substitution. The data presented in Table 1 also … (1). Eq. (1) shows that 3-OC2H5, 3-OMe, 2-Me, 5-Cl, and 2-Me, 6-Cl substitution does not change the activity (log1/C) of the parent (unsubstituted) sulfonamide. Thus, the electronic effect of these substituents mimics that of hydrogen. Similarly, the substituents 3,5-di-Cl, 2-Me, 4-NO2, 2-Cl on the one hand and the substituents 3-Cl, 2-OMe, 4-Cl, 2-Cl, 4-NO2 on the other hand have an identical effect on the activity. Likewise, the three pairs (i) 4-CN; 4-NO2, (ii) 4-CF3; 3,5-di-Cl and (iii) 4-NH2; 2-OC2H5 each have an analogous effect on the activity.
Because of this degeneracy in the activity, it became essential to examine the degeneracy in the molecular descriptors as well. The calculated values of the topological indices W, ¹χ, J and Sz are recorded in Table 2. [26][27] However, in spite of their degeneracy, they can be successfully used in drug modeling 28,29. Comparison of the observed activity with the corresponding topology of the sulfonamides shows that topology alone is not responsible for the variation in activity. The same is found to be the case with the hydrophobic (logP) and other parameters (Table 3) used for modeling the activity.
The intercorrelation among the molecular descriptors, including the topological indices, and their correlation with the activity (Table 4) show that all topological indices except J are highly mutually correlated, while this is not the case with the other physicochemical parameters used. Furthermore, the data presented in Table 4 show that none of the molecular descriptors, including the topological indices, correlates well with the activity (log1/C). From this we conclude that no single-variable model is capable of modeling the activity and that the descriptors referred to can be combined to obtain a statistically significant multiparametric model. Also, models containing two or more topological indices (except J) may suffer from defects due to collinearity 22-24. However, such cases are dealt with by Randić 30, and we will use his recommendations to analyze them.
Initial regression analysis indicated that, out of the 12 molecular descriptors used, Sz in combination with physicochemical descriptors plays a dominating role in modeling the activity. However, statistically significant models are obtained only when three descriptors are used, and the quality of the model keeps improving with higher-parametric modeling (Table 5). The tri-parametric model containing the three descriptors Sz, MV and d is found as below:

$$\log 1/C = 7.5491 + 0.0016(\pm 3.4783\times10^{-4})\,\mathrm{Sz} - 0.0281(\pm 0.0076)\,\mathrm{MV} + 0.9825(\pm 0.3415)\,d \qquad (2)$$

n = 39, Se = 0.3550, R = 0.7074, $R^2_A$ = 0.4576, F = 11.686

The Szeged index (Sz) is the modification of the Wiener index (W) for cyclic (cycle-containing) compounds. Its positive coefficient indicates that the presence of an aromatic nucleus is essential for the exhibition of the activity. The same is the case with the density parameter d. However, the negative coefficient of MV indicates that the activity decreases with increasing MV. The molar volume (MV) is one of the important polarizability parameters; thus we can say that polarizability plays a negative role in the exhibition of the activity.
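As a consistency check on the statistics quoted for Eq. (2), the adjusted R² and the F-ratio can be recomputed from R, n and the number of descriptors k. The sketch below assumes the standard textbook definitions; the paper does not state which formulas were used, and the function names are chosen here for illustration.

```python
def adjusted_r2(r, n, k):
    """Adjusted R^2 for a model with k descriptors fitted to n compounds."""
    r2 = r * r
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


def f_ratio(r, n, k):
    """Overall F-ratio of the regression (k and n - k - 1 degrees of freedom)."""
    r2 = r * r
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


# Eq. (2): n = 39 compounds, k = 3 descriptors (Sz, MV, d), R = 0.7074
print(round(adjusted_r2(0.7074, 39, 3), 4))  # 0.4576
print(round(f_ratio(0.7074, 39, 3), 3))      # 11.686
```

Under these assumed definitions both values agree with those reported for Eq. (2).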
When the indicator parameter I_NO2 is added to the above model (Eq. 2), its statistics are significantly improved. The resulting tetra-parametric model containing Sz, MV, d and I_NO2 is:

$$\log 1/C = 8.6451 + 0.0025(\pm 5.6663\times10^{-\ldots})\,\mathrm{Sz} - 0.0408(\pm 0.0095)\,\mathrm{MV} + 1.2998(\pm 0.3620)\,d - 0.5351(\pm 0.2620)\,I_{\mathrm{NO_2}} \qquad (3)$$

n = 39, Se = 0.3400, R = 0.7450, $R^2_A$ = 0.5027, F = 10.602

The negative coefficient of the indicator parameter I_NO2 indicates that the presence of an NO2 group is not favorable for the exhibition of the activity. The physical significance of the remaining three parameters (Sz, MV and d) is the same as discussed for the model expressed by eq (2).

Successive regression analysis indicates that a penta-parametric model containing Sz, MV, d, I_NO2 and ST yields still better results:

$$\log 1/C = 5.6033 + 0.0029(\pm 5.3481\times10^{-4})\,\mathrm{Sz} - 0.0359(\pm 0.0089)\,\mathrm{MV} + 0.6914(\pm 0.3991)\,d - 1.0246(\pm 0.2997)\,I_{\mathrm{NO_2}} + 0.0404(\pm 0.0148)\,\mathrm{ST} \qquad (4)$$

n = 39, Se = 0.3117, R = 0.7981, $R^2_A$ = 0.5820, F = 11.581

The physical significance attached to Sz, MV, d and I_NO2 is the same as discussed for the model expressed by equation 3. The positive coefficient of ST indicates that the activity increases with increasing ST. In our recent publication 31 we stated that surface tension (ST) can be considered as an inverse steric parameter, which is thus responsible for the improved statistics of the above model.

The quality of the above model is significantly improved by the addition of the indicator parameter I_P. This hexa-parametric model containing Sz, MV, d, I_NO2, ST and I_P is found as below:

$$\log 1/C = 0.2971 + 0.0035(\pm 6.1001\times10^{-\ldots})\,\mathrm{Sz} - 0.0394(\pm 0.0087)\,\mathrm{MV} + 0.6326(\pm 0.3821)\,d - 1.2293(\pm 0.3147)\,I_{\mathrm{NO_2}} + 0.0533(\pm 0.0154)\,\mathrm{ST} - 0.2669(\pm 0.1283)\,I_{\mathrm{P}} \qquad (5)$$

n = 39, Se = 0.2971, R = 0.8248, $R^2_A$ = 0.62003, F = 11.345

The physical significance attached to Sz, MV, d, I_NO2 and ST is the same as discussed above. The added indicator parameter I_P has a negative coefficient. This parameter accounts for the presence of substituents at the para-position. Thus, its negative coefficient indicates that substitution at the para-position is not favorable for the exhibition of the activity.
Further analysis of equation 5 indicates that compounds 1, 33, 38 and 39 gave large residuals, i.e., large differences between observed and calculated log1/C. They can therefore be considered outliers and removed from the regression procedure. When we did so, a marked improvement in the statistics was observed: the correlation coefficient increased from 0.8248 to 0.9110. Furthermore, this improved model requires fewer correlating parameters. It is a penta-parametric model containing Sz, MV, I_NO2, ST and I_P and is found as below:

$$\log 1/C = 4.0682 + 0.0037(\pm 4.3912\times10^{-\ldots})\,\mathrm{Sz} - 0.0355(\pm 0.0062)\,\mathrm{MV} + 0.0663(\pm 0.0096)\,\mathrm{ST} - 1.3108(\pm 0.2282)\,I_{\mathrm{NO_2}} - 0.1817(\pm 0.0936)\,I_{\mathrm{P}} \qquad (6)$$

n = 35, Se = 0.2127, R = 0.9110, $R^2_A$ = 0.8005, F = 28.288

The physical significance of the parameters contained in the model is the same as discussed above. There is no significant improvement in the statistics when the parameter d is added to the above equation. Instead, the resulting six-parametric model suffers from the defect that the coefficient of d (0.0219) is smaller than its standard deviation (0.3051). [31][32][33]

In order to confirm our findings we have compared the log1/C values calculated from equations 5 and 6 with the observed values of log1/C. This comparison is shown in Table 6 and demonstrated in Figures 2 and 3. The residual, that is, the difference between observed and calculated log1/C, indicates that the model expressed by equation 6 is the best for modeling log1/C. Further evidence in our favour is obtained by calculating the predictive correlation coefficients (R_pred) using Figures 2 and 3, which come out to be 0.6802 and 0.8298 respectively, supporting the conclusion that the model expressed by eq (6) is the best model.

In order to investigate the role of the hydrophobic parameter (logP) in modeling the antibacterial activity of the sulfonamides against E. coli, we have used logP as one of the correlating parameters. In the majority of cases we obtained models in which the coefficient of the logP term was considerably smaller than its standard deviation. [31][32][33]
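The acceptance criterion applied above, a coefficient smaller in magnitude than its standard deviation marks a term as statistically unsupported, can be expressed as a short Python sketch. The helper name `weak_terms` is hypothetical. The first four (coefficient, standard deviation) pairs are those reported for Eq. (6); the pair for d is the one quoted for the six-parametric variant. The Sz term is left out because the exponent of its standard deviation is not legible in the source.

```python
def weak_terms(terms):
    """Return the names of terms whose |coefficient| is below its standard deviation."""
    return [name for name, (coef, sd) in terms.items() if abs(coef) < sd]


# (coefficient, standard deviation) pairs as reported in the text
reported = {
    "MV": (-0.0355, 0.0062),
    "ST": (0.0663, 0.0096),
    "I_NO2": (-1.3108, 0.2282),
    "I_P": (-0.1817, 0.0936),
    "d": (0.0219, 0.3051),  # from the six-parametric model with d added
}

print(weak_terms(reported))  # ['d']
```

Only d fails the check, which matches the decision to drop it from the final model.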

Conclusions
From the results and discussion above we conclude that distance-based topological indices can be used successfully for modeling the inhibition of E. coli by sulfonamides, and that for the present set of sulfonamides the Szeged index (Sz) is the most prominent. This index (Sz) also yields statistically significant models in combination with other molecular descriptors. As in earlier cases [4][5][6][7][8][9][10][11][12][13][14][15], here too a model containing five descriptors yielded excellent results. The consistent increase in the $R^2_A$ value as we pass from bi- to penta-parametric models supports this conclusion.

Experimental Section
Antibacterial activity against E. coli. The set of 39 sulfonamides used in the present study is given in Table 1. The MIC values, expressed as log1/C, for the inhibition of E. coli by these sulfonamides were taken from the literature 1.

Topological indices.
All the topological indices used were calculated on the hydrogen-suppressed graph, obtained by deleting all carbon-hydrogen as well as heteroatom-hydrogen bonds from the structure of the sulfonamides. The calculation of these indices is well documented in the literature, and therefore the details are not given here. However, the final expressions for the calculation of these indices are given below.

Wiener index (W)
The Wiener index W = W(G) of G is defined 16 as the half-sum of the elements of the distance matrix:

$$W = W(G) = \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N} (D)_{ij}$$

where $(D)_{ij}$ is the ij-th element of the distance matrix, which denotes the shortest graph-theoretical distance between sites i and j of G, and N is the number of vertices of G.
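The half-sum definition of the Wiener index can be illustrated with a minimal Python sketch. The adjacency-list representation and the helper names (`distances_from`, `wiener_index`, `chain`, `ring`) are assumptions made for illustration; distances are obtained by breadth-first search on the hydrogen-suppressed graph, which is assumed to be connected.

```python
from collections import deque


def distances_from(adj, source):
    """Shortest path lengths (in bonds) from source to every vertex, by BFS."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def wiener_index(adj):
    """Half-sum of all entries of the distance matrix."""
    total = sum(sum(distances_from(adj, u).values()) for u in adj)
    return total // 2


def chain(n):
    """Path graph on n vertices (e.g. the carbon skeleton of an n-alkane)."""
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}


def ring(n):
    """Cycle graph on n vertices (e.g. a six-membered ring for n = 6)."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}


print(wiener_index(chain(4)))  # 10
print(wiener_index(ring(6)))   # 27
```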

Szeged index (Sz)
The Szeged index, Sz = Sz(G), is calculated [17][18][19] according to the following expression:

$$Sz = Sz(G) = \sum_{e = uv} n_u\, n_v$$

where the sum runs over all edges e = uv of G, $n_u$ is the number of vertices lying closer to u than to v, and the meaning of $n_v$ is analogous. Vertices equidistant from both ends of an edge e = uv are not taken into account.
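The edge-wise counting in the Szeged index can be sketched in Python as follows. The graph representation and the helper names are assumptions made for illustration, and the graph is assumed to be connected. The five-membered ring shows the effect of ignoring equidistant vertices: for every edge one vertex is equally far from both ends and is not counted.

```python
from collections import deque


def distances_from(adj, source):
    """Shortest path lengths (in bonds) from source to every vertex, by BFS."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def szeged_index(adj):
    """Sum of n_u * n_v over all edges uv; equidistant vertices are ignored."""
    dist = {u: distances_from(adj, u) for u in adj}
    total = 0
    for u in adj:
        for v in adj[u]:
            if u < v:  # visit each undirected edge once
                n_u = sum(1 for w in adj if dist[u][w] < dist[v][w])
                n_v = sum(1 for w in adj if dist[v][w] < dist[u][w])
                total += n_u * n_v
    return total


def chain(n):
    """Path graph on n vertices."""
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}


def ring(n):
    """Cycle graph on n vertices."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}


print(szeged_index(chain(4)))  # 10 (equals the Wiener index for an acyclic graph)
print(szeged_index(ring(6)))   # 54
print(szeged_index(ring(5)))   # 20
```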

Balaban index (J)
The Balaban index J = J(G) of G is defined 21 as:

$$J = \frac{M}{\mu + 1}\sum_{\text{bonds}} (d_i\, d_j)^{-0.5}$$

where M is the number of bonds in G, µ is the cyclomatic number of G, and $d_i$ (i = 1, 2, 3, …, N; N is the number of vertices in G) is the distance sum. The cyclomatic number µ = µ(G) of a cyclic graph G is equal to the minimum number of edges that must be erased from G in order to transform it into the related acyclic graph. In the case of a monocyclic graph µ = 1; otherwise it is calculated by means of the following expression:

$$\mu = M - N + 1$$

Physicochemical parameters. Various physicochemical parameters (descriptors), viz. MR (molar refraction), MV (molar volume), Pc (parachor), η (refractive index), ST (surface tension), d (density), α (polarizability) and logP (logarithm of the octanol/water partition coefficient), were calculated using ACD Labs software. 35 The expressions for the calculation of these parameters are available in the literature.
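Returning to the Balaban index defined above, the expression can be sketched in Python as follows. The graph representation and helper names are assumptions made for illustration; the graph is assumed to be connected, so that µ = M − N + 1 holds, and the distance sum of a vertex is taken as the sum of its distances to all other vertices.

```python
from collections import deque


def distances_from(adj, source):
    """Shortest path lengths (in bonds) from source to every vertex, by BFS."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def balaban_index(adj):
    """J = M / (mu + 1) * sum over bonds ij of (d_i * d_j) ** -0.5."""
    n = len(adj)                                    # N, number of vertices
    m = sum(len(nbrs) for nbrs in adj.values()) // 2  # M, number of bonds
    mu = m - n + 1                                  # cyclomatic number
    dsum = {u: sum(distances_from(adj, u).values()) for u in adj}
    bond_sum = sum(
        (dsum[u] * dsum[v]) ** -0.5
        for u in adj
        for v in adj[u]
        if u < v  # visit each bond once
    )
    return m / (mu + 1) * bond_sum


def chain(n):
    """Path graph on n vertices."""
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}


def ring(n):
    """Cycle graph on n vertices."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}


print(round(balaban_index(chain(4)), 4))  # 1.9747
print(round(balaban_index(ring(6)), 4))   # 2.0
```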
Regression analysis. We have used the maximum $R^2$ method 22-24 and adopted step-wise regression for obtaining statistically significant models.

Figure 1. General structure of sulfonamides used in the present study.

Table 1. Structural details and assumed indicator parameters for the sulfonamides used in the present study. X = 1 if halogen substitution, I_NO2 = 1 if NO2 substitution, I_DS = 1 if di-substitution, I_O = 1 if substitution at the ortho position, I_M = 1 if substitution at the meta position, I_P = 1 if substitution at the para position.

Table 2. Calculated values of distance-based topological indices used in the present study

Table 3. Hydrophobic (logP) and other physicochemical parameters used in the present study

Table 5. Regression parameters and quality of correlation of the proposed models

Table 6. Comparison of observed and calculated antibacterial activity against E. coli using different models. *Data point not incorporated in calculations.