Validation of Structures of Novel Eudesmane Sesquiterpenes Using Scatter Plots

Aim: This study explores the potential of scatter plots as a tool for validating structures proposed for novel eudesmane sesquiterpenes. Methodology: Substituents on the skeletons of several eudesmane compounds were coded and plotted against the 13C chemical shift values for each carbon position on the skeleton (C1-C15).


INTRODUCTION
Sesquiterpenes arise from numerous biogenetic pathways and therefore display several types of carbon skeleton. This makes the elucidation of their structures very challenging. The biological activities exhibited by sesquiterpenes (which include insect growth regulators as well as antifeedant, antifungal, antitumor and antibacterial compounds) make relating their structures to function even more imperative. The current study focuses on eudesmane-type compounds, which are among the most representative skeletons of sesquiterpenes. This class of compounds has been the subject of numerous phytochemical, pharmacological and synthetic studies [1][2].
The structure of any natural product is conventionally divisible into three sub-units: (i) the skeletal atoms; (ii) heteroatoms directly bonded to the skeletal atoms, or unsaturations between them; and (iii) secondary carbon chains, usually bound to a skeletal atom through an ester or ether linkage [3]. Procedures for identifying the skeleton and substructures present in a compound have been described previously [4][5][6][7]. Artificial neural network (ANN) methods have been reported to give fast and accurate results for the identification of skeletons and for assigning unknown compounds among distinct fingerprints (skeletons) of aporphine alkaloids [8]. In a previous work, we showed that Generalized Regression Neural Networks (GRNN) could predict substituent types and positions on eudesmane-type sesquiterpenes [9]. When the chemical shift values proposed for each of the fifteen (15) carbon positions on the eudesmane skeleton are used as input for the GRNN, this procedure can be used to validate the structures of novel eudesmanes. In the current work, we use scatter plots to determine the 13C chemical shift ranges (for the 15 carbon atoms on the eudesmane sesquiterpene skeleton shown in Fig. 1) over which different substituent types may exist. We discuss their potential application in validating structures proposed for natural products, using eudesmane sesquiterpenes as reference.

METHODOLOGY
The structural (skeletal) 13C data, substituents and stereochemical information of 325 of the 350 compounds reviewed and published by Oliveira et al. [1] were used in this study. Twenty-five of these compounds were left out owing to their structural complexity. This information can be extracted from data on eudesmane sesquiterpenes published in the literature by separating the 13C values of the skeletal carbons from those of the substituents.
Each substituent type (on first encounter) was assigned three numeric codes. These codes serve to identify the substituent while also taking into account its possible stereochemistry (α or β) at various positions of the skeleton in other compounds.

Fig. 1. The eudesmane skeleton
Carbon positions without substituents were assigned a code of 0, while α and β positions without substituent(s) were assigned codes of 1 and 2 respectively. For example, an OH group was given a code of 3, an α-OH a code of 4 and a β-OH a code of 5. (The different substituent types and the corresponding codes assigned to them are shown in Appendix 1.) Thereafter, 30 columns were prepared on an Excel sheet, containing alternately the 13C chemical shift data for each of the 15 positions (C1-C15) on the eudesmane skeleton for all 325 compounds and the corresponding codes for the substituents attached to that position in each compound. For each carbon position on the skeleton (C1-C15), a scatter plot of the substituent codes against the 13C chemical shift values was drawn. From this, the range of chemical shift values over which each substituent type may be found was determined for each carbon position. Where multiple substituent types are possible within a particular chemical shift range, the probability (as a percentage) that a given substituent occupies the position was determined relative to the total number of points within the range.
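The coding and range-finding steps described above can be sketched in Python as follows. This is an illustrative sketch only, not the spreadsheet procedure actually used: the class and function names are hypothetical, the shift values are placeholders rather than data from the 325 compounds, and the assumption that each newly encountered substituent receives the next three consecutive codes (so that a second substituent type would start at 6) is inferred from the OH example, since the full code list is in Appendix 1.

```python
from collections import Counter, defaultdict


class SubstituentCoder:
    """Give each substituent type three consecutive codes on first encounter."""

    def __init__(self):
        self.base = {}      # substituent name -> code with unspecified stereochemistry
        self.next_base = 3  # codes 0, 1 and 2 are reserved for unsubstituted positions

    def code(self, name=None, stereo=None):
        # Offset 0 = stereochemistry not specified, 1 = alpha, 2 = beta.
        offset = {None: 0, "alpha": 1, "beta": 2}[stereo]
        if name is None:           # no substituent: codes 0, 1 or 2
            return offset
        if name not in self.base:  # first encounter: reserve three codes (assumption)
            self.base[name] = self.next_base
            self.next_base += 3
        return self.base[name] + offset


def shift_ranges(points):
    """Return {code: (lowest shift, highest shift)} for one carbon position."""
    grouped = defaultdict(list)
    for shift, code in points:
        grouped[code].append(shift)
    return {code: (min(v), max(v)) for code, v in grouped.items()}


def percentages_in_window(points, low, high):
    """Percentage of points per code among those with low <= shift <= high."""
    counts = Counter(code for shift, code in points if low <= shift <= high)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {code: 100.0 * n / total for code, n in counts.items()}


# Placeholder (shift in ppm, code) pairs for ONE carbon position; not real data.
coder = SubstituentCoder()
points = [
    (38.5, coder.code()),
    (40.1, coder.code()),
    (75.0, coder.code("OH", "alpha")),
    (78.2, coder.code("OH", "beta")),
    (79.0, coder.code("OH", "beta")),
]
print(shift_ranges(points))
print(percentages_in_window(points, 70.0, 80.0))
```

Each `(shift, code)` pair corresponds to one point on the scatter plot for a given carbon position, so the same two functions would be applied once per position, C1 through C15.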

Oliveira et al. [1] described the use of two component programs (TIPCARB and PICKUP) of the SISTEMAT system in the search for heuristic rules (practical rules obtained from the experience of specialists, or originating from programs which perform "learning from machine" routines, and aimed at solving a specific problem). TIPCARB can determine which carbon atom is present at each position on a skeleton, whether or not that carbon atom is substituted, and the kind of substituent. After the positions and types of substituents attached to each carbon atom have been defined, the fragments, denominated substructures, are coded in the PICKUP program, which searches the database for the 13C chemical shift range of the carbons in the substructure. The authors then used the PICKUP program to determine several chemical shift ranges that characterize substructures present in eudesmanes.
A summary of the substituent types that may be found over different 13C ranges for each of the fifteen (15) positions on the eudesmane skeleton, as obtained from the scatter plots, is presented in Table 1. A substituent assigned to a position on the skeleton of a novel eudesmane compound would be deemed wrong when the 13C NMR chemical shift value assigned to that position does not fall within the chemical shift range obtained for that substituent type from the scatter plot. The probability values (given in parentheses) indicate the likelihood that the substituent would be found at that position on the skeleton of an unknown compound. The accuracy of the carbon ranges in validating the substituents proposed for each carbon position in an unknown eudesmane compound depends on how well the skeletal and/or substituent types of that compound are represented among the compounds used in plotting the graphs.
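The validation criterion stated above amounts to a range lookup per carbon position, which might be sketched as follows. The function name, the table layout and all numerical ranges are hypothetical placeholders chosen for illustration; they are not the values of Table 1. Substituent codes with no recorded range are reported separately, reflecting the point that the check is only as reliable as the reference data are representative.

```python
def flag_positions(proposed, table):
    """Flag positions whose proposed shift/substituent pair fails the range check.

    proposed: {position: (13C shift in ppm, substituent code)}
    table:    {position: {substituent code: (low ppm, high ppm)}}
    """
    flagged = []
    for position, (shift, code) in proposed.items():
        ranges = table.get(position, {})
        if code not in ranges:
            # The reference data contain no example of this substituent here.
            flagged.append((position, "no reference range"))
            continue
        low, high = ranges[code]
        if not (low <= shift <= high):
            flagged.append((position, "shift outside range"))
    return flagged


# Placeholder ranges (NOT taken from Table 1), keyed by position and code.
table = {
    "C1": {0: (35.0, 42.0), 5: (74.0, 80.0)},
    "C4": {0: (30.0, 40.0)},
}

# Hypothetical proposed assignments for a novel compound.
proposed = {
    "C1": (78.5, 5),  # beta-OH, inside the placeholder range
    "C4": (72.0, 0),  # unsubstituted, but the shift is far too high
    "C6": (70.0, 4),  # position absent from the placeholder table
}

for position, reason in flag_positions(proposed, table):
    print(position, reason)
```

A flagged position does not by itself prove an assignment wrong; it marks the assignment as one that should be re-examined against the spectral evidence.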

CONCLUSION
Given a sufficiently broad database of 13C chemical shift values for the carbon atoms of the eudesmane skeleton, scatter plots may be a useful complementary tool in the structure elucidation of this class of compounds.