Unified Markov model for the rational design and synthesis of safer drugs. Predicting multiple drug side effects

a Applied Chemistry Research Center, Central University of “Las Villas”, Santa Clara, 54830, Cuba
b Chemical Bioactives Center, Central University of “Las Villas”, Santa Clara, 54830, Cuba
c Department of Organic Chemistry, Faculty of Pharmacy, University of Santiago de Compostela, Santiago de Compostela, 15782, Spain
d Department of Pharmacy and Pharmaceutical Technology, University of Valencia, Valencia, Spain
e Universitat Rostock, FB Chemie, Albert-Einstein-Str. 3a, D 18059 Rostock, Germany


INTRODUCTION
The spectrum of undesirable effects of a chemical substance can be wide and poorly defined. In therapy, a drug typically produces numerous effects, but only one of them is generally sought as the main objective of the treatment; almost all the others are considered side effects of that drug for that therapeutic indication. Very few doctors believe that a drug, however trivial its actions, can be exempt from producing toxic effects.
The use of terms like "safe" or "inoffensive" may cause unnecessary misunderstandings between the regulatory bodies, the medical profession, and drug consumers, which translates into unjustified trust in, and expectations about, the safety of drugs on the part of the public. On the other hand, alarmist accounts of the consequences of drug side effects affirm that thousands of patients die unnecessarily because of them. As almost always happens, the truth lies somewhere between the two extremes, but it is not known accurately, or even approximately, where (Goodman and Gilman, 2001).
A prominent goal of the pharmaceutical industry is the development of new drugs that avoid the more serious side effects. Consequently, novel paradigms for drug discovery and synthesis have been introduced recently, based on the availability of large chemical libraries and robotic systems for bioassays. Such high-throughput biochemical assay systems allow for the synthesis and testing of hundreds of compounds each day (Lutz et al., 1996).
In recent years, the pharmaceutical industry has reoriented its research strategies to give more attention to mathematical methods that permit the "rational" selection or design and synthesis of novel compounds with the desired biological properties (Briggs et al., 1996; Wess, 1996). In this sense, Quantitative Structure-Toxicity Relationships (QSTR) are used as predictive tools for a preliminary evaluation of the hazard of chemical compounds by means of computer-aided mathematical models (Cronin, 1998; Lewis, 1992; Cronin and Dearden, 1995). These mathematical models represent an alternative to the "real" world of assaying chemical compounds on live organisms in the laboratory to determine their toxicological properties. They avoid expensive, time-consuming and, in many cases, animal-aggressive bioassays, which are now carried out only after preliminary predictions with computational models (Dearden et al., 1995; Roberts, 1987).
In general, González M.P. et al. have recently discussed how QSTR mathematical models can be applied to congeneric and non-congeneric data sets of compounds. The former permit the understanding of the specific biological mechanism of toxic action for structurally related molecules, as well as the identification of the different toxicological power of groups or substituents in such chemicals. On the other hand, the use of QSTR models for non-congeneric data sets permits the generalization of such mechanisms to structurally diverse compounds, as well as the identification of possible toxicophores of different structural nature (González et al., 2004a, b, c; Morales et al., 2004).
Markov models, in turn, are well-known mathematical tools for characterizing the structure of biomolecules. They have been used for analyzing biological sequence data and for finding new genes from open reading frames (Vorodovsky et al., 1994, 1995). Another use of these mathematical models is database searching and multiple sequence alignment of protein families and protein domains. Protein turn types and sub-cellular locations have been successfully predicted (Krogh et al., 1994; Chou, 1997; Yuan, 1999; Hua and Sun, 2001). Hubbard and Park (1995) used amino acid sequence-based hidden Markov models to predict secondary structures. In this sense, Krogh et al. (1994) have also proposed a hidden Markov model architecture. In addition, Markov stochastic processes have been used for protein fold recognition (Di Francesco et al., 1999). This approach can also be used for the prediction of protein signal sequences (Chou, 2001, 2002). Other seminal works relate to the application of Markov chain (MCH) theory to proteomics and bioinformatics. Chou applied Markov models to predict beta turns and their types, as well as protein cleavage sites by HIV protease (Chou, 1993). However, Markov models have seldom been used to develop QSTR studies and predict drug side effects.
In this connection, our group has introduced elsewhere a physically meaningful mathematical approach based on Markov models (MARkovian CHemicals IN SIlico DEsign: MARCH-INSIDE) that encodes information on molecular backbones, with several applications in mathematical biology. It allowed us to introduce matrix invariants such as stochastic entropies and spectral moments for the study of molecular properties. Specifically, the stochastic spectral moments introduced by our group have been widely used for small-molecule QSAR problems, including the design of flukicidal, anticancer, and antihypertensive drugs. Applications to macromolecules have been restricted to the field of RNA, without applications to proteins (González-Díaz et al., 2002a, b, 2003a, b, c, 2004a). In addition, the entropy-like molecular descriptors have demonstrated flexibility in many mathematical biology problems, such as the estimation of anticoccidial activity, modeling the interaction between drugs and HIV-packaging-region RNA, and predicting protein and virus activity (González-Díaz et al., 2003d, 2004b, c, d; Ramos de A., 2004a, b). In the field of QSTR, our group has reported the first model to predict chemically induced agranulocytosis by small-to-medium sized drug-like molecules (González-Díaz et al., 2003e).
However, in spite of the several QSTR studies reported, most drug side effects have not been seriously studied. Unfortunately, the more than 1 500 molecular descriptors reported have not been applied to the study of drug side effects and, moreover, have widely dispersed theoretical definitions and, sometimes, a poorly established physical meaning. Consequently, it becomes a forefront problem to apply molecular descriptors to the study of drug side effects while representing them in a unified mathematical framework that gives better opportunities for physicochemical interpretation (Todeschini and Consonni, 2000). In the current paper we attempt to develop a more rigorous physicochemical interpretation of the MARCH-INSIDE descriptors in thermodynamic terms, which allows us to contrast these descriptors with topologic, flexibility, and quadratic molecular descriptors (Kubinyi et al., 1990). This new interpretation allows us to build a molecular thermodynamic basis in free energy terms (Villa et al., 2003) for predicting how likely a given drug is to cause a specific side effect with respect to other side effects. This approach is able to take into consideration not only the molecular structure of the drug but also the specific system the drug affects. In particular, it makes it possible to correlate more than one property at a time (in our case, drug side effects), which is an advantage over most molecular descriptors, which permit the correlation of no more than one property at a time. This advantage may be appropriately used in preliminary biological, pharmacological, or toxicological studies, especially for comparative studies in the early stages of drug development. In the present work, 19 drug side effects were modeled using a non-congeneric data set of 270 cases represented by 178 drugs of diverse molecular structure.

Markov Thermodynamics for drug-target step-by-step interaction
Consider a hypothetical situation in which a drug molecule is free in space at an arbitrary initial time ($t_0$). It is then interesting to develop a simple stochastic model for a step-by-step interaction, over time, between the atoms of the drug molecule and a molecular receptor during the induction of a side effect. For the sake of simplicity, we consider from now on a general structureless receptor, that is, a receptor model whose chemical structure is not taken into consideration. The initial free energy of the drug-receptor interaction ($^{0}g_{j}$) is a state function, so a reversible interaction process may be separated into several elemental interactions between the j-th atom and the receptor (Villa et al., 2003). Afterwards, the interaction continues, and we have to define the free energy of interaction between the j-th atom and the receptor given that the i-th atom has interacted at the previous time $t_k$ ($^{k}g_{ij}$). In particular, immediately after the first interaction ($t_0 = 0$), an interaction $^{1}g_{ij}$ takes place at time $t_1 = 1$, and so on. One can therefore suppose that the atoms begin their interaction with the structureless molecular receptor by binding to it at discrete time intervals $t_k$. However, there are several alternative ways in which such a step-by-step binding process may occur. Figure 1 illustrates this idea.
[Figure 1: schemes of the step-by-step binding process, with steps labelled Step 0 to Step 3]

The free energy $^{0}g_{j}$ is considered here as a function of the absolute temperature (T) of the system and the local equilibrium constant of interaction between the j-th atom and the receptor ($^{0}k_{j}$) (Villa et al., 2003). Additionally, the energy $^{1}g_{ij}$ can be defined by analogy in terms of $^{1}k_{ij}$:

$$^{0}g_{j} = -RT \ln {}^{0}k_{j} \qquad (1)$$

$$^{1}g_{ij} = -RT \ln {}^{1}k_{ij} \qquad (2)$$

The present approach to drug-receptor interaction has two main drawbacks. The first is the difficulty of defining the constants. In this work, we address this first problem by estimating $^{0}k_{j}$ as the rate of occurrence ($n_j$) of the j-th atom in molecules inducing the effect under study by molecule-receptor interaction, with respect to the number of atoms in the molecule (n). With respect to $^{1}k_{ij}$, we must take into consideration that once the j-th atom has interacted, the preferred candidates for the next interaction are the i-th atoms bound to j by a chemical bond. Both constants can then be written as:

$$^{0}k_{j} = \frac{n_j}{n} = e^{-{}^{0}g_{j}/RT} \qquad (3)$$

$$^{1}k_{ij} = \alpha_{ij}\,\frac{n_j}{n} = e^{-{}^{1}g_{ij}/RT} \qquad (4)$$

where $\alpha_{ij}$ are the elements of the atom adjacency matrix; $n_j$, n, $^{0}g_{j}$, and $^{1}g_{ij}$ have been defined in the paragraph above; and R is the universal gas constant.
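The link between a rate of occurrence and a free energy can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: it assumes the standard thermodynamic relation g = -RT ln k between a free energy and an equilibrium constant, and the values of `n_j`, `n`, and the temperature are hypothetical, chosen only for illustration.

```python
import math

R = 8.314  # universal gas constant in J/(mol*K)


def local_constant(n_j, n):
    """Estimate 0k_j as the rate of occurrence n_j relative to the number of atoms n."""
    return n_j / n


def free_energy(k, temperature=298.15):
    """Free energy in J/mol, assuming g = -R*T*ln(k) (an assumption of this sketch)."""
    return -R * temperature * math.log(k)


# Hypothetical values: rate of occurrence 0.5 for an atom in a 5-atom molecule.
k = local_constant(n_j=0.5, n=5)
g = free_energy(k)
print(f"0k_j = {k:.3f}, 0g_j = {g:.1f} J/mol")
```

Under this assumption, a constant below 1 gives a positive free energy, and exponentiating -g/RT recovers the constant.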
The second problem relates to the description of the interaction process at later times $t_k > t_1$. Here, a Markov chain model (MCH) (González-Díaz et al., 2002a, b, 2003a, b, c, 2004a) enables a simple calculation of the probabilities with which the drug-receptor interaction takes place over time until the studied effect is achieved. In this work we focus on drug side effects. As depicted in Figure 1, this model deals with the calculation of the probabilities ($^{k}\pi_{ij}$) with which any arbitrary j-th atom of the molecule binds to the structureless molecular receptor given that another i-th atom has been bound before, along discrete time periods $t_k$ (k = 1, 2, 3, …; k = 1 in grey, k = 2 in blue, and k = 3 in red) throughout the chemical bonding system.
The procedure described here takes the atoms of the molecule as the states of the MCH. We can build the corresponding absolute initial probability vector $^{A}\pi_{0}$ and the stochastic matrix $^{1}\Pi$, whose elements are $^{A}\pi_{0}(j)$ and $^{1}\pi_{ij}$, respectively. The elements $^{A}\pi_{0}(j)$ of the vector $^{A}\pi_{0}$ are the absolute probabilities with which the j-th atom interacts with the receptor at the initial time with respect to any atom in the molecule:

$$^{A}\pi_{0}(j) = \frac{{}^{0}k_{j}}{\sum_{a=1}^{n} {}^{0}k_{a}} = \frac{n_j}{\sum_{a=1}^{n} n_a} \qquad (5)$$

where a runs over all the atoms in the molecule, including the j-th, and $n_a$ is the rate of occurrence of any atom a, including the j-th with value $n_j$. On the other hand, the 1-step drug-target interaction stochastic matrix $^{1}\Pi$ is built as a square table of order n, where n is the number of atoms in the molecule. The elements ($^{1}\pi_{ij}$) of this matrix are the binding probabilities with which a j-th atom binds to a structureless molecular receptor given that another i-th atom has interacted before, at time $t_1 = 1$ (considering $t_0 = 0$):

$$^{1}\pi_{ij} = \frac{{}^{1}k_{ij}}{\sum_{m=1}^{\delta+1} {}^{1}k_{im}} = \frac{\alpha_{ij}\, n_j}{\sum_{m=1}^{\delta+1} \alpha_{im}\, n_m} \qquad (6)$$

where δ is the valence of the j-th atom. The method arranges all the $^{A}\pi_{0}(j)$ values in a vector ($^{A}\pi_{0}$) and all the $^{1}\pi_{ij}$ values as a square table ($^{1}\Pi$) of dimension n x n. The calculation of $^{A}\pi_{0}$ and $^{1}\Pi$ is illustrated in Figure 2.
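The construction of the initial vector and the 1-step matrix can be sketched in a few lines. This is only an illustration under stated assumptions: the rates of occurrence below are hypothetical numbers, not values from the study, and the adjacency matrix is taken with ones on the diagonal so that each row is normalised over the atom itself and its δ bonded neighbours (an assumption of this sketch). The helper names `initial_vector` and `one_step_matrix` are likewise invented for the example.

```python
def initial_vector(rates):
    """Absolute initial probabilities: each rate divided by the sum of all rates."""
    total = sum(rates)
    return [r / total for r in rates]


def one_step_matrix(adjacency, rates):
    """Row-normalised matrix whose entry (i, j) is proportional to alpha_ij * n_j."""
    n = len(rates)
    matrix = []
    for i in range(n):
        weights = [adjacency[i][j] * rates[j] for j in range(n)]
        row_sum = sum(weights)
        matrix.append([w / row_sum for w in weights])
    return matrix


# Chloroform (CHCl3), atom order: C, H, Cl, Cl, Cl.
rates = [0.4, 0.1, 0.2, 0.2, 0.2]  # hypothetical rates of occurrence
adjacency = [  # bonds plus self-loops (assumed)
    [1, 1, 1, 1, 1],
    [1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 0, 0, 1, 0],
    [1, 0, 0, 0, 1],
]

pi0 = initial_vector(rates)
Pi1 = one_step_matrix(adjacency, rates)
for row in Pi1:
    print(["%.3f" % value for value in row])
```

Every row of `Pi1` sums to 1, as required of a stochastic matrix, and entries for non-bonded atom pairs (for example H and Cl) are zero.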
By using both $^{A}\pi_{0}$ and $^{1}\Pi$ together with the Chapman-Kolmogorov equations, one can describe the further evolution of the system, determining the average constant of interaction between the j-th atom and the receptor at later times. Summing all the constants of interaction for each atom, we can derive the stochastic absolute probability of interaction ($^{A}\pi_{k}$) between the drug and the receptor at a specific time:

$$^{A}\pi_{k} = {}^{A}\pi_{0} \cdot {}^{k}\Pi \cdot u = {}^{A}\pi_{0} \cdot \left({}^{1}\Pi\right)^{k} \cdot u \qquad (7)$$

where $^{A}\pi_{0}$ is a 1 x n vector whose elements are the $^{A}\pi_{0}(j)$ probabilities for the n atoms in the molecule, $^{k}\Pi$ are the k-th natural powers of the $^{1}\Pi$ matrix, and u is a unitary vector. As the sum of $^{A}\pi_{k}(j)$ over all atoms in the molecule is always equal to 1, the atoms were grouped into sets or classes ($s_r$) in order to describe local aspects of molecular structure: $s_0$ = CSat = saturated carbon atoms; $s_1$ = CInst = unsaturated carbon atoms; $s_2$ = Hal = halogens; $s_3$ = Het = heteroatoms; and $s_4$ = HX = hydrogens bonded to heteroatoms.
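The propagation of the initial vector through powers of the 1-step matrix, followed by summation over an atom class, can be sketched as follows. This is a hedged illustration rather than the MARCH-INSIDE code: the three-atom fragment, its probabilities, and the function names are hypothetical, and the class labels simply reuse the set names given in the text.

```python
def vec_mat(vector, matrix):
    """Multiply a row vector by a square matrix."""
    size = len(matrix[0])
    return [sum(vector[i] * matrix[i][j] for i in range(len(vector)))
            for j in range(size)]


def absolute_probabilities(pi0, Pi1, k):
    """Per-atom probabilities after k steps: pi0 multiplied k times by Pi1."""
    current = list(pi0)
    for _ in range(k):
        current = vec_mat(current, Pi1)
    return current


def class_probability(probabilities, atom_classes, wanted):
    """Sum the per-atom probabilities over the atoms belonging to one class."""
    return sum(p for p, c in zip(probabilities, atom_classes) if c == wanted)


# Hypothetical three-atom fragment C-O-H with made-up probabilities.
atom_classes = ["CSat", "Het", "HX"]
pi0 = [0.5, 0.3, 0.2]
Pi1 = [
    [0.6, 0.4, 0.0],
    [0.3, 0.5, 0.2],
    [0.0, 0.7, 0.3],
]

p1 = absolute_probabilities(pi0, Pi1, 1)
p2 = absolute_probabilities(pi0, Pi1, 2)
het_k1 = class_probability(p1, atom_classes, "Het")
print("Het class probability at k = 1:", round(het_k1, 4))
```

Because each row of the matrix sums to 1, the per-atom probabilities keep summing to 1 at every step, which is why only the class sums carry discriminating information.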
Such a model is stochastic per se (probabilistic step-by-step atom-receptor interaction in time) but also considers molecular connectivity (the step-by-step union of atoms in space throughout the chemical bonding system). The selection of a Markov chain process (Gnedenko, 1978; Freund and Poschel, 2000) is not arbitrary. Since atom interactions are assumed not to depend on previous atom interactions, an MCH-based model of a stochastic step-by-step drug-target interaction fits the main characteristic of an MCH (the memoryless property). This implies that the probability of occurrence of an event (atom union) does not depend on the history of the process. In other words, such a model does not depend on atom unions at earlier times.

Data set methodology
The data set was made up of series of the most frequently used drugs that produce side effects in different human organ systems. These drugs have been extensively tested in the clinic, and the reported side effects were obtained from pharmacovigilance studies. The use of marketed drugs in the data set confers high confidence in the reported side effects. The set of drugs was extracted from a report of drug side effects listed in the literature (Garcia and Horga de la Parte, 1994). The data set comprised 19 different drug side effects grouped into 8 affected biological systems for 178 structurally diverse drugs (see Figure 2), giving 270 cases in total. All side effect groups were statistically represented, each having at least 7 drugs, in order to obtain a balanced training series.

Statistical analysis
As a continuation of the previous sections, we can attempt to develop a simple linear QSAR using the MARCH-INSIDE methodology, as defined previously, with the general formula:

$$SE_x = b_0 + \sum_{r}\sum_{k} b_{S_r,k}\; {}^{A}\pi_{k}(S_r) \qquad (8)$$

Here, the $^{A}\pi_{k}(S_r)$ act as the local molecular descriptors, $S_r$ being the atom sets mentioned above. We selected Linear Discriminant Analysis (LDA) (Van Waterbeemd, 1995; Kowalski and Wold, 1982) to fit the classification functions. The model deals with the classification of a set of compounds with diverse side effects. A dummy variable ($SE_x$) codifies the side effect studied. This variable indicates either the presence ($SE_x$ = 1) or absence ($SE_x$ = -1) of the side effect studied. In equation (8), the b terms represent the coefficients of the classification function, determined by the least squares method as implemented in the LDA module of the STATISTICA 6.0 software package (STATISTICA, 2001). Forward stepwise was fixed as the strategy for variable selection (Van Waterbeemd, 1995; Kowalski and Wold, 1982).
The quality of the LDA models was determined by examining Wilks' U statistic, the Fisher ratio (F), and the p-level (p). We also inspected the percentage of good classification and the ratios between the number of cases and both the variables in the equation and the variables explored, in order to avoid over-fitting or chance correlation. Validation of the model was corroborated by re-substitution of cases in four predicting series (González-Díaz et al., 2003b).
Clustering of compounds was carried out after performing a canonical analysis using the algorithms implemented in the advanced options for LDA in STATISTICA 6.0. This analysis outputs the scores of every case for successive canonical roots, which are orthogonal centred equations explaining decreasing amounts of variance. Consequently, we can plot the scores for each compound in a Cartesian coordinate system and, using a symbol code, visually explore the possible formation of clusters (González-Díaz et al., 2003b).
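One way to organise a validation with four predicting series is sketched below. The text does not state how cases were assigned to the series, so the interleaved assignment used here, along with the function name `four_series`, is purely an assumption for illustration: each case is held out in exactly one predicting series and used for training in the other three.

```python
def four_series(case_ids, n_series=4):
    """Return (training, predicting) pairs; each case is predicted exactly once."""
    series = [case_ids[i::n_series] for i in range(n_series)]
    splits = []
    for held_out in series:
        held = set(held_out)
        training = [c for c in case_ids if c not in held]
        splits.append((training, held_out))
    return splits


# Ten hypothetical case identifiers.
cases = list(range(10))
splits = four_series(cases)
for training, predicting in splits:
    print("train:", training, "predict:", predicting)
```

Averaging the percentage of good classification over the predicting sets then gives an estimate of predictability, and over the training sets an estimate of robustness.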

Back-Projection Analysis and MARCH-INSIDE
In order to calculate the total atom contribution to a specific side effect in the current approach, we make use of the decomposition of total molecular descriptors into local descriptors. More specifically, we decompose the total molecular descriptors into atomic descriptors for each atom in the molecule. For example, the molecular descriptors of chloroform may be decomposed into the contributions of its carbon, hydrogen, and chlorine atoms. Afterwards, the values of the atomic descriptor for each atom are substituted into the QSTR equation, giving the contribution of the atom to the specific side effect, where zones shown in gray (red) are those that have a low (high) contribution to the specific side effect. Only the zones that contribute to the specific side effect were quantified. Estrada and González have explained this procedure in detail for bond spectral moments (Cabrera, 2002). The method, called Back-Projection Analysis (BPA), is general for any molecular descriptor defined a priori as a sum of local descriptors, at least for linear QSTR/QSARs (Stief, 2003). The main importance of BPA is that it offers a clear and direct interpretation of results in structural terms. Here we adapt a BPA approach to the MARCH-INSIDE and LDA methodology. The present study is aimed at the selection of novel drug candidates for synthesis. We therefore select the different structural synthetic blocks of the molecules as molecular regions for the BPA. As LDA predicts the probability of occurrence of the side effect, we preferred to standardize all of the contributions in order to express them as the percentage of activity that each group accounts for.
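The standardisation of fragment contributions into percentages can be sketched as follows. This is a minimal illustration, not the procedure as implemented by the authors: the per-atom contributions and the fragment definitions are hypothetical, and treating negative fragment totals as non-contributing is an assumption based on the statement that only contributing zones were quantified.

```python
def fragment_percentages(atom_contributions, fragments):
    """Express each fragment's positive contribution as a percentage of the total."""
    totals = {
        name: sum(atom_contributions[i] for i in atom_indices)
        for name, atom_indices in fragments.items()
    }
    # Assumption: fragments with a net negative contribution count as zero.
    positive = {name: max(value, 0.0) for name, value in totals.items()}
    grand_total = sum(positive.values())
    if grand_total == 0:
        raise ValueError("no fragment contributes to the side effect")
    return {name: 100.0 * value / grand_total for name, value in positive.items()}


# Hypothetical per-atom contributions obtained from a linear QSTR equation.
atom_contributions = [0.30, 0.25, 0.10, -0.05, 0.15]
fragments = {"ring A": [0, 1], "ring B": [2, 3], "side chain": [4]}

percentages = fragment_percentages(atom_contributions, fragments)
for name, value in percentages.items():
    print(f"{name}: {value:.1f}%")
```

With these made-up numbers the three fragments account for about 73.3%, 6.7%, and 20.0% of the contribution, which is the kind of breakdown a back-projection graphic would display.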

Mathematical model
Equation (7) constitutes, in mathematical terms, a vector-matrix-vector form. A panoply of such forms has long been used in QSAR studies. For instance, the first molecular descriptor defined in a chemical context, the Wiener index W (equation 9), is a quadratic form (Todeschini and Consonni, 2000). In addition, several other classic indices, such as the Zagreb indices $M_1$ (equation 10) and $M_2$ (equation 11), the Harary number H (equation 12), the Randic invariant χ (equation 13), the valence connectivity index $\chi^{v}$ (equation 14), the Balaban index J (equation 15), the MTI index (equation 16), and the Moreau-Broto autocorrelation $ATS_d$ (equation 17), to mention just a few examples, may all be expressed as quadratic forms, linear forms or, in general, vector-matrix-vector forms (Todeschini and Consonni, 2000). Unfortunately, many of them lack a direct physical interpretation. The same lack of physical sense can be detected for the recent quadratic indices $q_k(X)$ (equation 18) […] where D, A, $D^{-k}$, m B, M, and S are matrices related to distance, atom adjacency, sparse, pseudograph matrices and others. On the other hand, u, v, v', v'', and w are the unitary, vertex degree, Randic atom degree, valence degree, and atom weight (electronegativity) vectors, respectively. All the vectors and matrices used in expressions (9) to (20) have been exhaustively explained in the literature and the references cited therein (Todeschini and Consonni, 2000; Marrero-Ponce et al., 2004a, b; Marrero-Ponce 2004a, b).
In the present work we propose to call all these molecular indices, except equation (20), deterministic vector-matrix-vector forms, as opposed to stochastic forms. The stochastic form, equation (20), very recently introduced by Marrero-Ponce (2004b), unfortunately lacks physical sense. By contrast, the main advantage of our stochastic forms is the possibility of deriving average thermodynamic parameters that depend on the probability of the states of the MCH, which gives them a clearer physicochemical sense than classic vector-matrix-vector forms. Specifically, this work introduces for the first time a Markov form to calculate thermodynamic parameters of the drug-target interaction process, considering time, chemical structure, and the biological system (including drug side effects) in a unified scheme.
Another advantage of the present stochastic vector-matrix-vector forms with respect to the Marrero-Ponce forms, which are derived from a multigraph, is that it was not necessary to consider different rates of occurrence for atoms of the same element with different configurations; for example, sp3, sp2, and sp carbons were all assigned the same atom weight (here, the rate of occurrence of carbon atoms) for a specific side effect (Marrero-Ponce et al., 2004a, b; Marrero-Ponce 2004a, b). This was possible because the $p_{ij}$ values clearly distinguish among these atoms through their different connectivity (see Figure 3). It is clear from Figure 2 that atoms with different connectivity or configuration will have a different probability of binding to the structureless molecular receptor in spite of having the same rate of occurrence. These are the reasons for selecting our stochastic forms in this work instead of others.
Once a representative and balanced training series has been selected, it can be used to fit the classification functions. The models were subjected to the principle of parsimony: for each of the 19 side effects studied, we chose a function with high statistical significance but with as few terms $b_{S_r,k} \times {}^{A}\pi_{k}(S_r)$ as possible. In order to derive a classification function that permits the classification of drugs as positive (presence of the side effect) or negative (absence of the side effect), we used LDA with the stochastic absolute probabilities of interaction $^{A}\pi_{k}(S_r)$ as independent variables. The classification models obtained for each side effect studied are given in Table 1, together with the statistical parameters of the LDA, the results of validating the current model by re-substitution of cases in four predicting series, and the percentages of good classification for each model.

Table 1. Overall train accuracy, cross-validation (CV) predictability, and models for different drug side effects.

Side effect                     Train   CV     Model
Gastrointestinal manifestations
Constipation or ileus (CoI)     90.9    97.7   CoI = -4.78 + 19…

In the models, U is Wilks' statistic and F is the Fisher ratio. Wilks' U statistic is the standard statistic used to denote the statistical significance of the discriminatory power of the current model (González-Díaz et al., 2002a; Franke, 1984). The results displayed in Table 2 support the robustness and predictability of the mathematical models obtained. In order to simplify the equations for the purposes of interpretation and graphical representation, we performed a canonical analysis (Van Waterbeemd, 1995) for the gastrointestinal manifestations group of side effects, with the sole purpose of illustrating the capability of the equations obtained to condense more than two side effect groups into a single simple equation (root function) and to discriminate between several side effect groups. The main root obtained (Root 1) proved to be a simple equation centered at 0. This canonical root presented an eigenvalue of 1.32 and an acceptable regression coefficient of 0.75, which is statistically significant (p-level < 0.05), together with a chi-squared statistic of 51.28.
Looking for similarities with other descriptors, we can contrast our stochastic vector-matrix-vector forms $^{A}\pi_{k}(S_r)$ with Toropov's optimization of correlation weights of local graph invariants (OCWLI), the so-called flexible descriptors (Toropov and Toropova, 2001, 2002, 2003; Toropov and Benfenati, 2004) (not to be confused with flexibility descriptors). Both kinds of descriptors take more than one parameter into consideration. Flexible descriptors use abstract parameters (weights) that can be optimized according to the objectives pursued. The parameters used by our molecular descriptors, on the other hand, cannot be optimized but have a direct physicochemical interpretation, an aspect analyzed in the preceding paragraphs of the methods section.

Back-projection analysis
Finally, we applied a BPA in order to carry out a physical interpretation of the models obtained in structural terms. BPA graphics were developed for two ulcerogenic drugs (piroxicam and droxicam). As explained in the Methodology section, zones shown in gray (red) are those that have a low (high) contribution to the specific side effect.
The ulcerogenic ability of non-steroidal anti-inflammatory drugs (NSAIDs) is generally due to the inhibition of the synthesis of prostaglandins E1 and E2, both of which depress gastric secretion and dilate the vessels of the intestinal mucosa; this inhibition could promote gastric secretion and cause gastric vasoconstriction. This effect could lead to an ischemic necrosis that would cause a loss of gastric mucosa able to degenerate into chronic gastric ulceration.
A hypothesis based on the structure-activity relationships for indomethacin and several NSAIDs proposes a receptor for these drugs consisting of two non-coplanar hydrophobic regions and a cationic core. Figure 4 shows the proposed receptor. This receptor consists essentially of an extensive flat surface, a hole to accommodate a group outside the plane (for example, an aromatic ring), and a cationic core in charge of associating with the acid anion (or a protonated amine) (Gund and Shen, 1977).
In both cases (piroxicam and droxicam), the back-projection analysis gives results consistent with this receptor model: the three fused rings of droxicam and the benzothiazine core of piroxicam make a significant contribution to the ulcerogenic ability (66.68% and 55.2%, respectively). In both cases the pyridine ring shows a contribution higher than 23%, which also agrees with the receptor model. In general, the molecular regions implicated in the interaction with the receptor show a high contribution to the toxicity of droxicam and piroxicam (92.5% and 78.5%, respectively). The same receptor model corresponds to the active center of prostaglandin cyclooxygenase (COX), the enzyme required for the biotransformation of arachidonic acid to prostaglandins, whose inhibition causes the ulcerogenic effects mentioned above by blocking the synthesis of prostaglandins E1 and E2 (Gund and Shen, 1977).
Additionally, the conformational geometry of both molecules was optimized using the PM3 semi-empirical method implemented in HyperChem Release® 7.03 for Windows® (HyperChem, 2002), supporting the conformational affinity of these drugs for the COX receptor. This point may be confirmed in Figure 6. The equation obtained for this drug side effect (PoHU = -10.96 + 27.29 $^{A}\pi_{2}$(CInst) + 29.54 $^{A}\pi_{1}$(Het) - 5.35 $^{A}\pi_{5}$(HX)) shows a positive contribution of unsaturated carbons and heteroatoms and a negative contribution of hydrogens bonded to a heteroatom. The latter is probably due to the formation of intramolecular hydrogen bonds, which could decrease the probability of union of the molecule to the molecular receptor. This tentative interpretation is supported, in structural terms, by the results obtained in the BPA.
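The PoHU classification function quoted above can be evaluated directly once the three descriptors are known. In the sketch below the coefficients come from the equation in the text, whereas the descriptor values are hypothetical and the rule of calling a score above zero "positive" is an assumption made for illustration (the study itself reports posterior probabilities).

```python
def pohu_score(pi2_cinst, pi1_het, pi5_hx):
    """Evaluate the PoHU function with the coefficients quoted in the text."""
    return -10.96 + 27.29 * pi2_cinst + 29.54 * pi1_het - 5.35 * pi5_hx


def predicts_ulceration(score, threshold=0.0):
    """Assumed decision rule: a score above the threshold means the effect is predicted."""
    return score > threshold


# Hypothetical descriptor values for an illustrative molecule.
score = pohu_score(pi2_cinst=0.30, pi1_het=0.20, pi5_hx=0.05)
print(f"PoHU = {score:.4f}, predicted positive: {predicts_ulceration(score)}")
```

The signs of the coefficients reproduce the interpretation given in the text: larger unsaturated-carbon and heteroatom probabilities raise the score, while a larger HX probability lowers it.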

CONCLUDING REMARKS
The fusion of high-throughput screening and QSAR/QSTR techniques (González-Díaz et al., 2002a, b, 2003a, b, c, d, e, 2004a, b, c, d; Ramos de A., 2004a, b), in an attempt to develop new drugs that avoid the more serious side effects and to minimize costs in terms of time and financial, human, and animal resources, is becoming a viable alternative for the rational design, massive screening, and synthesis of novel compounds. The results described here show that the MARCH-INSIDE methodology encodes information on molecular backbones, with several applications in mathematical biology. Specifically, the stochastic absolute probabilities of interaction $^{A}\pi_{k}(S_r)$ are able to provide a direct physicochemical interpretation of the step-by-step drug-target interaction, taking into consideration not only the molecular structure of the drug but also the specific system the drug affects. In particular, this molecular descriptor makes it possible to correlate more than one property at a time (in our case, drug side effects) while having a more rigorous physicochemical interpretation in thermodynamic terms. This gives the present descriptors an advantage over most molecular descriptors, which correlate no more than one property at a time (Cabrera and Bermejo, 2004). This advantage may be appropriately used in preliminary biological, pharmacological, or toxicological studies and in the synthesis of new drugs, especially for comparative studies in the early stages of drug development.

Figure 3. Definition and calculation of the $^{1}\Pi$ matrix for a specific compound in three particular cases of side effects. The element symbol is used to denote the value of the rate of recurrence [i.e., Cl represents the rate of recurrence ($n_{Cl}$) of the chlorine atom for the specific side effect]. Thr: Thromboembolism, Pat: Pancreatitis, PhD: Photodermatitis.

Figure 4. Model for the prostaglandin synthetase cyclooxygenation site proposed by Gund and Shen (Gund and Shen, 1977).

Figure 5. Back-projection graphic for two drugs classified as able to induce peptic or hemorrhagic ulceration (piroxicam and droxicam). P = posterior probability of producing hemorrhagic or peptic ulceration.

Figure 6. Conformational geometry optimization of droxicam and piroxicam using the PM3 semi-empirical calculation method.

QSTR: Quantitative structure-toxicity relationships
MCH: Markov chains
HIV: Human immunodeficiency virus
MARCH-INSIDE: Markovian chemicals in silico design
QSAR: Quantitative structure-activity relationships
RNA: Ribonucleic acid
LDA: Linear discriminant analysis
BPA: Back-projection analysis
OCWLI: Optimization of correlation weights of local graph invariants

Table 2. Cross-validation (CV) predictability and robustness for different drug side effects.

75.8 78.8 72.7 77.3 84.0 72.0 80.0 79.2 78.8 Dermal Manifestations Predictability Robustness
* % of good classification based on posterior probabilities for four different training and predicting sets; predictability refers to compounds within predicting sets and robustness to compounds within training ones.