QSAR model based on the TOPS-MODE approach used to predict chromosomal aberrations in bioactive phenolic compounds

This communication describes the in silico characterization of bioactive substances that are constituents of functional foods or nutraceuticals. The aim of this work is to show the potential of the TOPS-MODE approach as a chemoinformatic method to study structure/clastogenic activity relationships (chromosomal aberrations) and to identify structural alerts related to genotoxicity. QSAR results were analyzed for several classes of phenolic compounds (flavonoids, phenolic acids and coumarins), using the software packages STATISTICA and MODESLAB and a mathematical model that encodes topological information at a substructural level. The criteria for maximum clastogenicity were methoxy and hydroxyl polysubstitution (methoxy > hydroxyl) and the polarity of the substituents; this was observed for all the analyzed subclasses. These results are supported by the percentage of good classification obtained for the external databases used, by the identification of physicochemical descriptors of polarity (μ1 and μ2) and by the calculation of fragment contributions. It can be concluded that QSAR methods, and in particular the topological approach used here, may constitute a predictive tool for the design and evaluation of bioactive components of functional foods or nutraceuticals.


Introduction
The increasing advance of industrialization and changing lifestyles have altered work environments and dietary habits. It is estimated that more than 70,000 chemicals exist worldwide and that about 1,000 new substances appear annually.
Exposure to these compounds, either during their development or production, can sometimes have adverse effects on the health of workers. Also, their subsequent use as additives, novel foods, nutraceuticals or drugs can potentially affect consumers. These effects may not appear as immediate and apparent injuries, but can take years to manifest. Some of these substances may be genetically active, with the capacity to interact with genetic material and cause DNA damage (1). This damage is commonly measured by single- or double-strand breaks and chromosomal aberrations (2,3). Chromosomal aberrations are structural changes, easily observable in the metaphase of the cell cycle (clastogenic processes), which are caused by DNA strand breaks that are unrepaired or improperly repaired (4). Together with mutations, these aberrations are considered endpoints of oxidative damage to DNA (1).
There is a relationship between exposure to genotoxic substances (whether occupational, accidental or due to lifestyle) and increased cancer risk. Exposure to potential genotoxic agents, both physical and chemical, can produce, depending on the type of DNA damage induced, chromosomal abnormalities such as chromosomal aberrations, which makes these agents "clastogens" (4). The genotoxicity that some bioactive compounds can cause, such as phenolic compounds derived from plants and foods, is an area of current interest. Some phenolic compounds present in commonly consumed vegetables, which have been studied for their antioxidant activity, have also shown in vitro pro-oxidant and clastogenic activity (5).
Bioactive compounds are mainly used in the development and characterization of "healthy foods". They have attracted special interest since the development of new concepts such as functional foods, owing to the claim that their regular consumption provides potential preventive effects as health protectors or promoters (6).
Functional foods can occur naturally, but can also be modified by adding bioactive compounds, which must have been previously tested in different experimental systems. Given the high structural variability of phenolic compounds in nature, structure-activity relationship studies based on chemoinformatic methods (QSAR, Quantitative Structure-Activity Relationships) can provide a low-cost alternative for predicting possible chromosomal aberrations in response to exposure to such agents. Therefore, the aim of this work is to show the potential of the TOPS-MODE approach as a chemoinformatic method to study structure/clastogenic activity relationships (chromosomal aberrations) and to identify structural alerts related to genotoxicity. The objectives are: i) to analyse the theoretical foundation of QSAR methodologies, in particular the TOPS-MODE approach, and ii) to show evidence of virtual screening using this approach and a validated clastogenicity model, which encodes substructural topological information from molecular descriptors based on graph theory.

Methods
The current study focused on natural phenolic compounds present in foods, which formed a series for external prediction. The screening followed the sequence shown in Figure 1.

Preparation of databases (DB) of chemical compounds with reported pro-oxidant activity.
The search for compounds with pro-oxidant activity was based on reports in electronic sources of scientific information, using the keywords: pro-oxidant (+ food, + chronic diseases, + oxidative damage), structure-activity relationship, functional food, healthy foods, antioxidant, oxidative stress and QSAR study. The compounds were selected according to the following criteria: a) Inclusion: the existence of scientific reports, regardless of the method used to determine the activity and of the experimental system; a chemical entity identified by a CAS number. b) Exclusion: chemical entities such as radicals, ions, salts, macromolecules and/or isomers.

Development of QSAR study using the TOPS-MODE approach.
All the chemical compounds selected in the different DB (flavonoids, cinnamic acids and coumarins) were drawn in ChemDraw version 10.0 (7), from which the SMILES string of each compound was obtained. Molecular descriptors (spectral moments) were calculated using the software MODESLAB for Windows, available at http://www.modeslab.com (8).
The structural alerts were identified by generating new congeners of each of the studied families, representing structures with pro-oxidant activity capable of damaging DNA. The contributions of the fragments were calculated for a sample of compounds belonging to the identified alerts, following the TOPS-MODE "fragment contributions" procedure implemented in MODESLAB.

Results and Discussion
QSAR studies and TOPS-MODE approach. QSAR methods are based on the premise that the physical, physicochemical, chemical and biological properties of organic compounds depend ultimately on their molecular structure. This philosophy makes it possible to predict the activities of new bioactive candidates, reducing the cost of experimental techniques (7). These methods can be applied in different sciences, among them food science, whenever it is important to reach one of the two main objectives of such studies: a) to provide a way to estimate the studied activity/property of new compounds with acceptable accuracy and b) to obtain a structural interpretation of the studied activity.
Generally, three basic components characterize these studies: i) the descriptors of molecular structure; ii) the property or biological activity; and iii) the statistical technique used to establish the relationship. TOPS-MODE is a molecular design approach with a graph-theoretical basis. It is phenomenological in nature: quantitative relationships are derived from sample data that are processed statistically. This makes it a general method, which requires no knowledge of the mechanisms involved in a process in order to describe it. The approach operates on the adjacency matrix between the bonds of the molecule, excluding hydrogen atoms. Among the topological descriptors used are the so-called spectral moments (µk) of this matrix, which were proposed by Estrada et al. (9). Some ideas on spectral moments have been generalized and extended to biomolecules by Estrada.
The theoretical background of the spectral moments of the bond adjacency matrix has been described in many papers (9).
The spectral moment of order k, denoted µk, is defined as the trace of the k-th power of the bond adjacency matrix E, where the trace is the sum of the elements on the main diagonal. One advantage is that µk can be expressed as a linear combination of the numbers of occurrences of certain fragments (subgraphs) in the molecule. MODESLAB facilitates the calculation of global spectral moments, allowing any weight to be selected for the different bonds in the molecule, while local spectral moments for single bonds or for a defined fragment of the molecule can be calculated in the same way after selecting the bond or fragment. The results are written to a file, ready to be handled by statistical packages such as STATISTICA (10).
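The definition above can be illustrated with a minimal sketch in plain Python. It is not MODESLAB's implementation: the function names, the `weights` parameter and the three-bond example skeleton are hypothetical, and the matrix is left unweighted (zero diagonal) unless weights are supplied. Local moments are taken here as the diagonal entries of the k-th power of E summed over the selected bonds, which is an assumption about how the "local" calculation is carried out.

```python
from itertools import combinations


def bond_adjacency_matrix(bonds, weights=None):
    """Build the bond adjacency matrix E of a hydrogen-suppressed graph.

    bonds   : list of (atom_i, atom_j) tuples, one per bond.
    weights : optional bond weights placed on the main diagonal
              (hypothetical stand-in for the bond properties a user selects).
    """
    n = len(bonds)
    E = [[0.0] * n for _ in range(n)]
    for a, b in combinations(range(n), 2):
        # Two bonds are adjacent when they share an atom.
        if set(bonds[a]) & set(bonds[b]):
            E[a][b] = E[b][a] = 1.0
    if weights is not None:
        for i, w in enumerate(weights):
            E[i][i] = float(w)
    return E


def _matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def _matrix_power(E, order):
    n = len(E)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(order):
        power = _matmul(power, E)
    return power


def spectral_moments(E, max_order):
    """Return [mu_0, ..., mu_max_order], where mu_k = trace(E^k)."""
    return [sum(row[i] for i, row in enumerate(_matrix_power(E, k)))
            for k in range(max_order + 1)]


def local_spectral_moment(E, order, bond_indices):
    """Sum of the diagonal entries of E^order over the selected bonds."""
    power = _matrix_power(E, order)
    return sum(power[i][i] for i in bond_indices)


# Hypothetical example: a central atom 0 bonded to atoms 1, 2 and 3.
skeleton = [(0, 1), (0, 2), (0, 3)]
E = bond_adjacency_matrix(skeleton)
print(spectral_moments(E, 3))  # [3.0, 0.0, 6.0, 6.0]
```

For this unweighted three-bond skeleton, µ0 equals the number of bonds (3) and µ2 equals twice the number of adjacent bond pairs (2 × 3 = 6), which is a small instance of the fragment-count expansion mentioned in the text.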
In the current work, the structure-clastogenic activity relationship model proposed by Estrada et al. (11), which has been internally and externally validated (11-13), was selected (Equation 1). The clastogenic activity (AC) model was built from 372 organic compounds, including known carcinogens, drugs, food additives, agrochemicals, cosmetic materials, medicinal products and household materials (11). In Equation 1, the letter Ω indicates that the corresponding variable in brackets was orthogonalized with respect to the rest of the variables included in the model. The classification model is reported together with the statistical parameters of the linear discriminant analysis, where λ is Wilks' statistic, D² is the squared Mahalanobis distance and F is the Fisher ratio (Wilks' λ = 0.629; F(14, 194) = 8.148; D² = 2.353; p < 0.0000). Three external DB were formed to validate the model externally (Equation 1). These DB have been published (14,15) and a new report was also produced (Table 1).
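As a rough consistency check on the reported statistics, the following sketch uses the standard two-group relation between Wilks' λ and the Fisher ratio, F = ((1 − λ)/λ) · (df2/df1). It assumes that the reported F carries 14 and 194 degrees of freedom; the function name is hypothetical, and because λ is reported to only three decimals the agreement can only be approximate.

```python
def f_from_wilks_lambda(wilks_lambda, df_num, df_den):
    """Fisher ratio implied by Wilks' lambda in a two-group discriminant analysis.

    df_num : numerator degrees of freedom (number of variables in the model).
    df_den : denominator degrees of freedom.
    """
    if not 0.0 < wilks_lambda <= 1.0:
        raise ValueError("Wilks' lambda must lie in (0, 1]")
    return (1.0 - wilks_lambda) / wilks_lambda * df_den / df_num


# Values as reported in the text; the degrees of freedom are an assumed reading.
f_value = f_from_wilks_lambda(0.629, 14, 194)
print(round(f_value, 2))  # 8.17
```

The value obtained is close to the reported F = 8.148, the small difference being compatible with rounding of λ.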

Structural alerts generation and bond contribution
The main foods and beverages rich in bioactive phenolic compounds are tea, wine, coffee, grape, apple, strawberry, artichoke, broccoli, carob and cocoa (16,17). These compounds can be divided, according to their basic structure, into at least ten different classes. The most widely described are the flavonoids (18), which account for two-thirds of dietary phenols (19). Flavonoids consist of two phenyl rings (A and B) joined through a pyran ring (heterocyclic ring C) (Figure 2). We analyzed the functional group effect for flavonoid subclasses with reported pro-oxidant activity and compared them with 12 "non-reported" flavonoids and 16 "designed compounds" (14). Benzoic acid derivatives are phenolic acids with reported pro-oxidant activity (15). To identify the structural alerts, different aspects were analyzed: a) the effect of the number of hydroxyl groups on the benzene ring, b) the effect of the number of methoxy groups on the ring, c) the effect of the position of the hydroxyl and methoxy groups on the ring (ortho, meta and para), d) the effect of isolated or clustered hydroxyl and methoxy groups, and e) the effect of replacing hydroxyl with methoxy substituents on the benzene ring. The main structural feature of the benzoic acid derivatives associated with maximum clastogenicity was found to be the number of hydroxyl and methoxy groups. Methoxy groups further increase clastogenic activity regardless of how close together or far apart they are.
Coumarins are also found in fruits, green tea and other plants (20). Natural coumarins made up the third DB prepared (21). As shown in Table 1, most of these compounds were predicted to be inactive. The virtual screening performed with the three external DB (flavonoids, benzoic acid derivatives and coumarin derivatives) was validated through the percentage of good classification (14,15). In addition, the analyzed alerts were corroborated by the calculation of fragment contributions, an example of which is shown in Table 2. Methoxy and hydroxyl substituents contributed positively to the activity, as corroborated by the fragment contribution analysis. As noted above, the contribution of methoxy groups is quantitatively higher than that of hydroxyl groups.
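The percentage of good classification used for external validation can be sketched as follows. This is an illustrative helper, not code from the study: the function name and the label vectors are hypothetical, and the coding (1 for active, -1 for inactive) simply follows the convention of the Table 1 footnote.

```python
def percent_good_classification(observed, predicted):
    """Percentage of compounds whose predicted class matches the observed class."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if not observed:
        raise ValueError("at least one compound is required")
    hits = sum(1 for o, p in zip(observed, predicted) if o == p)
    return 100.0 * hits / len(observed)


# Hypothetical labels for eight compounds (1 = active, -1 = inactive).
observed = [1, 1, -1, -1, 1, -1, -1, 1]
predicted = [1, 1, -1, 1, 1, -1, -1, -1]
print(percent_good_classification(observed, predicted))  # 75.0
```

In this made-up example six of the eight compounds are assigned to their observed class, giving 75% good classification.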

Conclusions
Regarding the possible clastogenic activity of different compounds described as pro-oxidants, the main structural feature associated with clastogenicity is the number of hydroxyl and methoxy groups present in the structure. These structural modifications represent an indicator of toxicity and also a good strategy for the design of new derivatives that do not exhibit this activity. This study provides a useful tool to better understand the properties of natural substances in food. It is therefore also helpful in the development of functional foods or nutraceuticals and in drug design.

Figure 1. Methodology used for the QSAR studies.
It was possible to study the contribution of different fragments at a substructural level, allowing the structural alerts to be defined for the flavonoids, phenolic acids and natural coumarins. These are low-molecular-weight compounds with the chemical skeletons (C6-C3-C6), (C6-C1) and (C6-C3), respectively (Figure 2).

Figure 2. Numbering system of the flavonoids, phenolic acids and coumarins.
the bond contributions. The criteria for maximal clastogenicity of pro-oxidant flavonoids were identified as: a) the presence of the 3-hydroxyl group in the C ring, b) hydroxyl groups on the B ring, principally in ortho position and clustered, c) the presence of methoxy and phenyl groups, d) the presence of the 4-carbonyl group in the C ring and e) the absence of the 2,3-double bond in the C ring (14).

Table 1. Classification obtained from the virtual screening.
a Classification generated by LDA (Linear Discriminant Analysis), where G_2: 1 corresponds to the group of active compounds and G_1: -1 to the group of inactive compounds; b posterior probability.
Methoxy and hydroxyl groups favor activity of the molecule (OCH3 > OH). The presence of a methoxy group at positions 5 and 9 of the coumarin ring (the compounds bergapten and xanthotoxin, respectively) is an important structural factor. The presence of alkyl chains inactivates the molecule, mainly owing to vinylic and methyl groups (the compounds oroselone and columbianetin, respectively).

Table 2. Representative examples of the bond contributions obtained from the TOPS-MODE classification model.