Molecular Markers to Access Genetic Diversity of Castor Bean: Current Status and Prospects for Breeding Purposes



Introduction
The spurge family (Euphorbiaceae) is one of the most diverse and numerous clades of the angiosperms, including several species of great economic importance, such as rubber tree (Hevea brasiliensis) and cassava (Manihot esculenta), and some oilseed crops, such as candlenut (Aleurites moluccana), physic nut (Jatropha curcas) and castor bean (Ricinus communis). Castor bean, the single member of the African genus Ricinus (subfamily Acalyphoideae), shows wide variation in vegetative traits such as leaf and stem color, number and size of leaf lobes, and presence of wax covering the stem (Popova & Moshkin, 1986; Savy-Filho, 2005; Webster, 1994; see Fig. 1). Depending on the environmental conditions, even the growth habit may vary, although the shrubby form is the most common (Webster, 1994). However, the most conspicuous variability concerns reproductive characters, such as color, shape and size of seeds, number of flowers per raceme, peduncle length and fruit dehiscence (Figs. 1 and 2), as described by Popova & Moshkin (1986).
Castor oil, which has a long history of use for medicinal purposes (see Gaginella et al., 1998), has been considered a promising raw material for the production of renewable energy in tropical countries. Besides, castor bean has been traditionally cultivated for the production of lubricants and paints (see Berman et al., 2011; Ogunniyi, 2006; Scholz & Silva, 2008). Mainly in semi-arid regions, a xerophyte-like species such as castor bean can be grown in areas with greater farming limitations that are not suitable for other crops (Ogunniyi, 2006). Furthermore, the biodiesel derived from castor oil has several advantages over other vegetable oils due to the presence of 5% more oxygen, low levels of residual phosphorus and carbon, high cetane number, solubility in alcohol and absence of aromatic hydrocarbons (Ogunniyi, 2006; Scholz & Silva, 2008). The high viscosity of castor oil, which is due to its high percentage of ricinoleic acid (a hydroxycarboxylic acid), is a limiting factor for the use of pure castor bean biodiesel in engines (Pinzi et al., 2009). However, this biodiesel can be blended with petrodiesel for use in regions with severe winters. This is a highly recommended procedure because of the low freezing point and the lubricant power afforded by castor oil, as well as all other advantages associated with the use of renewable energy resources (see Berman et al., 2011; Demirbas, 2007; Ogunniyi, 2006; Pinzi et al., 2009; Singh, 2011).

Fig. 1. Different raceme types observed in castor bean accessions held by Embrapa Algodão (Brazil). Inflorescences of a cultivar ('BRS Nordestina') and a dwarf lineage ('CSRD-2') are shown in (a) and (b), respectively. Observe in (b) a pistillate flower with red stigmas in the upper left corner and a multi-staminate flower in the lower right corner. A large raceme, typical of the cultivar 'BRS Energia', is shown in (c). Racemes with long peduncles of two accessions are shown in (d) and (e), respectively. Compact raceme characteristic of a castor bean subspontaneous population from northeastern Brazil (Buíque, PE) (f), and of the cultivar 'BRS Paraguaçu' (g). Spineless fruits of the lineage 'BRA 10740' in (h).
The development of new cultivars with traits of interest and adapted to specific microclimates is only possible when knowledge about the extant genetic diversity of the species is available (Gepts, 2004). Despite the recent publication of the castor bean genome (Chan et al., 2010), little is known about the actual genetic diversity of this species. Genetic diversity analyses of castor bean germplasm collections worldwide have shown low levels of variability and a lack of geographically structured genetic populations, regardless of the marker system used (e.g. Allan et al., 2008; Foster et al., 2010; Qiu et al., 2010). Thus, the remarkable phenotypic variation observed in castor bean does not seem to reflect a high genetic diversity, similar to what has been reported for physic nut, in which variations in epigenetic mechanisms may play a more important role in the diversity of the species than genetic variability per se (Yi et al., 2010). In this work, we review the current status of genetic diversity analysis in castor bean. Moreover, we present the results of our data mining efforts to screen for genomic simple sequence repeat (SSR) primers, in addition to the previously reported expressed sequence tag-SSR (EST-SSR) sequences. We genotyped castor bean accessions with inter simple sequence repeat (ISSR) primers from the University of British Columbia (UBC) set and with primer combinations from the amplified fragment length polymorphism (AFLP) Starter Primer Kit (Invitrogen, Carlsbad, USA). Furthermore, we tested the characterization of the distribution of large microsatellite clusters along the castor bean chromosomes by means of fluorescent in situ hybridization (FISH). Our results, in addition to data compiled from the literature, should be useful for breeding programs, providing information about genetic diversity and tools for genetic mapping in this important crop.

Diversity analyses with molecular markers in castor bean
Several molecular markers are available for germplasm characterization and identification of cultivated plant varieties. The profile analysis of multilocus DNA markers, also called DNA fingerprinting, is a potential source of informative marker bands, which allows a reliable differentiation among cultivars (Tanya et al., 2011), wild populations (Andrade et al., 2009), species and even related genera (Simon et al., 2007). Additionally, molecular markers are very stable, in contrast to morphological characters, which may be influenced by environmental factors and show continuous variation and high plasticity (Weising et al., 2005).
Unlike other important oilseed crops, such as oil palm (Elaeis guineensis), soybean (Glycine max) and sunflower (Helianthus annuus), and some Euphorbiaceae species, such as cassava and rubber tree, castor bean diversity is still poorly characterized by means of molecular marker systems (see Billotte et al., 2010; Feng et al., 2009; Sayama et al., 2011; Sraphet et al., 2011; Talia et al., 2010). In fact, the species had been overlooked until the late 2000s, when analyses of the genetic diversity of germplasm collections were first published (see Allan et al., 2008). However, castor bean was the first member of the Euphorbiaceae family to have its whole genome published (Chan et al., 2010), a fact that will be of great importance for characterizing the genetic base of the species.

Genetic diversity characterization with dominant markers
AFLP, ISSR and random amplified polymorphic DNA (RAPD) are among the most widely used marker systems in DNA fingerprinting. Although these markers are dominant, so that allelic types cannot be readily distinguished in their output data, several features have made them quite widespread, such as low cost and the possibility of generating a large number of informative marker bands in a short time. Besides, no prior knowledge about the DNA sequences of the studied organism is needed when these kinds of molecular markers are used (Weising et al., 2005).
As mentioned above, just a few analyses have been carried out using dominant markers to assess polymorphisms among castor bean accessions. Despite the great potential of ISSR and AFLP in characterizing the genetic diversity of several crops (Weising et al., 2005), these powerful marker systems have been underused in genetic diversity analyses of this species. To the best of our knowledge, the only study in which AFLP markers were used to describe the genetic diversity of the species was performed by Allan et al. (2008). In a preliminary application of 16 AFLP primer combinations, these authors reported low levels of variability among 14 castor bean genotypes from different regions of the world. Thereafter, the authors selected the three most polymorphic primer combinations and applied them to a larger number of accessions (41 in total), which indicated weak geographic structure among germplasm collections from the five continents. These results were quite similar to those obtained with genomic SSR markers in the same work (discussed below), and this was the first indication of a narrower genetic base than first thought for the species. However, due to the small number of marker bands generated (only 119), the low polymorphism sampled by Allan et al. (2008) could underestimate the actual level of information that this marker system might reach within the species.
Differences in the polymorphism levels of AFLP markers have been reported for many other crops (Weising et al., 2005). The average percentage of polymorphic markers obtained by Tatikonda et al. (2009) for physic nut, for instance, was higher than that found by Pamidimarri et al. (2010) (88.2% and 61.2%, respectively). Additionally, even when the same primer combination (E-ACA + M-CAT) was used, different polymorphism levels were obtained [82.8% by Tatikonda et al. (2009) and 68.1% by Pamidimarri et al. (2010)]. This may have occurred due to differences in genetic diversity levels between the two sampled germplasm collections.
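The percentages of polymorphic markers quoted above are usually derived from a binary band matrix (1 = band present, 0 = band absent). The following Python lines are a minimal sketch of that calculation, assuming the common convention that a band is polymorphic when it is present in some accessions and absent in others; the function name and the toy scores are hypothetical and are not data from any of the cited studies.

```python
from typing import Sequence


def percent_polymorphic(band_matrix: Sequence[Sequence[int]]) -> float:
    """Return the percentage of polymorphic bands in a 0/1 band matrix.

    Rows are accessions and columns are marker bands (1 = present,
    0 = absent). A band counts as polymorphic when both states occur
    among the accessions.
    """
    if not band_matrix or not band_matrix[0]:
        raise ValueError("band_matrix needs at least one accession and one band")
    n_bands = len(band_matrix[0])
    polymorphic = 0
    for j in range(n_bands):
        states = {row[j] for row in band_matrix}  # distinct scores for band j
        if len(states) > 1:
            polymorphic += 1
    return 100.0 * polymorphic / n_bands


# Hypothetical scores for four accessions and five bands (illustration only).
toy_scores = [
    [1, 1, 0, 1, 1],
    [1, 0, 0, 1, 1],
    [1, 1, 1, 1, 0],
    [1, 0, 0, 1, 1],
]
print(percent_polymorphic(toy_scores))  # 60.0 (three of five bands vary)
```

Because the result depends on which accessions are scored, the same primer combination can yield different percentages in different germplasm collections, as in the physic nut example.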
The lack of geographic genetic structure first indicated with AFLP markers has also been observed by other research groups. Analyzing 32 castor bean lines from different countries using an association between RAPD markers and quantitative phenotypic traits (volume and weight of seeds, root length, time of germination and first flowering), Milani et al. (2009) obtained a certain degree of convergence between the clusters resulting from these two different approaches. Once again, accessions from different origins were grouped together, confirming the previously reported lack of geographically structured clusters among castor bean genotypes.

Afterwards, Gajera et al. (2010) published a wider analysis with these low-cost dominant markers, in which 200 RAPD primers and 21 ISSR primers were tested for generation of informative characters among Indian castor bean lines; 30 and five of them, respectively, were then selected for further polymorphism screening using 22 genotypes. As in the previous RAPD analysis, the authors found a remarkable level of polymorphism with both marker systems, in particular with RAPD analysis, in which 80.1% of the 256 marker bands were polymorphic. However, the lower variability obtained with ISSR markers by Gajera et al. (2010) may have been due to the smaller number of primers used (only five) compared to the RAPD approach. In general, ISSR markers tend to be more polymorphic because of their target sites in the genome: microsatellite sequences are known as one of the most variable and widespread types of repetitive DNA (Edwards et al., 1991; Weising et al., 2005). Thus, these results obtained by amplifying ISSR markers, which are a powerful tool that has been widely used to detect polymorphism either among crop cultivars or among wild populations of plants (Reddy et al., 2002), may not reveal the real polymorphism level in castor bean. For walnut (Juglans regia), for instance, Christopoulos et al. (2010) found a higher level of polymorphism with ISSR markers (82.8%) than the value reported by Nicese et al. (1998) for RAPD markers (25%).
Therefore, in order to increase the repertoire of available ISSR markers for diversity analysis of castor bean, we tested 60 primers from the UBC set (Table 1) for amplification among three genotypes of the species ('BRS Nordestina', 'BRS Paraguaçu' and 'Epaba 81'), using the protocol described by Bornet & Branchard (2001). Cycle intervals and annealing temperatures for PCR were as described by Amorim (2009). Extraction and purification of genomic DNA followed the methodologies described by Weising et al. (2005; CTAB protocol I) and Michaels et al. (1994), with minor modifications. Our results revealed a preference for amplification of regions with AG/CT repeats with the sampled primers and annealing temperatures used (Table 1). Most primers directed to AT microsatellite repeats did not amplify any DNA fragment under the PCR conditions referred to here, although there is a high density of these regions in the castor bean genome (as presented below). Possibly, the relatively high annealing temperatures [see Gajera et al. (2010) and Tanya et al. (2011)], which were used to increase the PCR stringency, affected the amplification capability of the primers. However, the higher specificity of the DNA amplification provided by this measure supports the validity of the generated markers and the reproducibility of the results.
Additionally, we carried out amplification tests with all 64 primer combinations from the previously cited AFLP kit (Table 2) with genomic DNA from the same castor bean genotypes used in the ISSR assay, following the protocol recommended by the manufacturer. All combinations involving the primers MseI-CAA, MseI-CAG, MseI-CAT and MseI-CTA successfully amplified fragments among the accessions used and can be employed in further analyses for genotyping and characterization of castor bean germplasm collections, complementing the range of markers available for genetic diversity studies in the species. On the other hand, the primer MseI-CTG only worked when used with the primer EcoRI-AAC. These novel AFLP markers, which can be used to assess polymorphisms among and within castor bean germplasm collections, should be of great help during the process of genetic improvement of the species.

Primers that either successfully amplified fragments (+) or not (-) using the annealing temperatures (Ta) indicated by Amorim (2009) for cowpea (Vigna unguiculata) are indicated.

Genome sequencing and the use of co-dominant markers
As mentioned above, sequencing of the castor bean genome [...] markers for characterizing the genetic variability because of their capability to distinguish allelic types, providing valuable information about the heterozygosity state of a given species. However, there are factors that may restrict the use of these markers, such as the high cost and the time required to make the DNA sequences available (Weising et al., 2005).
Among the co-dominant marker systems most used in evaluating plant diversity are microsatellite markers (or SSRs) and single nucleotide polymorphisms (SNPs). While the former have been widely employed since their introduction in the early 1990s (see Morgante & Olivieri, 1993), SNP markers are becoming more popular as information about the genomes of plant species increases (e.g. Amar et al., 2011; Dong et al., 2010; Li et al., 2010).
In a worldwide germplasm characterization, Foster et al. (2010) evaluated the genetic diversity among 488 castor bean accessions from 45 countries using 48 SNPs, observing a molecular variance far higher within populations (74%) than among populations (22%) and countries (4%). These results also indicated a very weak geographic structure among castor bean populations, confirming previous results obtained with dominant markers (Allan et al., 2008; Milani et al., 2009).
Even within a smaller geographic range, among 188 castor bean accessions from 13 wild populations from Florida (USA), distribution patterns of SNP alleles were not clear and indicated extensive homogenization either due to high gene flow or because of multiple introductions (see Foster et al., 2010). Despite the great number and the wide distribution of sampled germplasm collections and the better marker coverage compared to the previous report (Allan et al., 2008), the genomic coverage of the 48 SNPs described by Foster et al. (2010) was considerably lower than the coverage of the soybean genome achieved by Li et al. (2010), who used 554 SNPs and 303 accessions. Chan et al. (2010) estimated that more than half of the castor bean DNA consists of repetitive sequences, and SSR motifs are supposed to be widely spread through the genome of the species. In recent years, microsatellite markers have been increasingly employed to characterize genetic diversity within castor bean germplasm collections (Allan et al., 2008; Bajay et al., 2009, 2011; Qiu et al., 2010), although there is still no estimate of the extent of SSRs in the whole genome of the species. Qiu et al. (2010), analyzing microsatellite repeats associated with expressed sequence tags (ESTs), reported a higher density of SSRs (excluding mono-repeats) in castor bean genic sequences (1/5.0 kbp) than the average described for other crops, such as maize (Zea mays) with 1/8.1 kbp, tomato (Solanum lycopersicum) with 1/11.1 kbp and cotton (Gossypium hirsutum) with 1/20.0 kbp (Cardle et al., 2000). Qiu et al. (2010) suggested that such a high SSR density in ESTs may be associated with the small genome size of castor bean (~350 Mbp; Chan et al., 2010).
In order to search for the occurrence and distribution of microsatellite repeats across the whole genome of castor bean, we ran the SciRoKo software (Kofler et al., 2007) on the genome assembly available at http://castorbean.jcvi.org/downloads.php. Excluding 18,718 mononucleotide repeats (ca. 97% comprising poly-A SSRs), more than 95,000 SSR sequences were revealed in the analysis (Table 3), with one microsatellite occurring every 18.4 kbp, a density far below that described for EST-SSRs by Qiu et al. (2010), as has been reported for plants in general (see Morgante et al., 2002). [...] were quite different, with the major proportion of the repeats constituted by trinucleotide motifs (61.06%), followed by di-repeats (32.02%), tetra-repeats (3.63%), hexa-repeats (2.28%) and penta-repeats (1.01%). This higher percentage of trinucleotide repeats (as well as the higher proportion of hexa-repeats than penta-repeats) in EST-SSRs might be related to the function of this type of repetitive DNA within transcribed regions, and it is in agreement with results reported for other plants (Morgante et al., 2002). Because their unit length is a multiple of three, tri-repeats and hexa-repeats may change in the number of repetitions without affecting the reading frame of the gene (Metzgar et al., 2000). The general occurrence pattern of specific microsatellite motifs also diverged between EST-SSRs and genomic SSRs (Table 3). While Qiu et al. (2010) observed the AG-based dinucleotide repeats as the most frequent in the EST-SSRs (22.29%), our results showed that the AT motif was the most abundant in genomic SSR sequences of castor bean (20.23%), as described by Morgante et al. (2002) for Arabidopsis thaliana. Likewise, we found that AAT was the most prominent tri-repeat motif within genomic microsatellites (13.39%), whereas the AAG motif was more frequent in transcribed regions (14.35%; Qiu et al., 2010).
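To make the idea of a genome-wide SSR scan and of an SSR density (one microsatellite per N kbp) more concrete, a minimal Python sketch is given below. It is not the SciRoKo algorithm: it only detects perfect repeats with a regular expression, and the function names, the default `min_length` of 15 nucleotides and the toy sequence are assumptions made for illustration. Mononucleotide runs are excluded, in line with the counts discussed above.

```python
import re


def is_primitive(motif: str) -> bool:
    """True when the motif is not itself a repetition of a shorter unit."""
    n = len(motif)
    return not any(
        n % k == 0 and motif[:k] * (n // k) == motif for k in range(1, n)
    )


def find_ssrs(sequence: str, min_unit: int = 2, max_unit: int = 6,
              min_length: int = 15):
    """Return (start, motif, repeat_count) for perfect SSRs in a sequence.

    Units of min_unit..max_unit nucleotides are searched; an SSR must span
    at least min_length nucleotides. Homopolymer runs and motifs that are
    copies of a shorter unit (e.g. ATAT) are skipped.
    """
    sequence = sequence.upper()
    hits = []
    for unit in range(min_unit, max_unit + 1):
        min_repeats = -(-min_length // unit)  # ceiling division
        extra = max(min_repeats - 1, 1)       # copies required after the first
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit, extra))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // unit))
    return sorted(hits)


def kbp_per_ssr(sequence_length_bp: int, n_ssrs: int) -> float:
    """Average spacing between SSRs, in kbp (the '1/N kbp' density)."""
    if n_ssrs == 0:
        raise ValueError("no SSRs found, density is undefined")
    return sequence_length_bp / n_ssrs / 1000.0


# Toy sequence (illustration only): an (AT)10 block, an (AAG)6 block and a
# poly-A run that is ignored as a mononucleotide repeat.
toy = "GCGTAC" + "AT" * 10 + "CCGGTTAAGC" + "AAG" * 6 + "TTGCA" + "A" * 20 + "GC"
ssrs = find_ssrs(toy)
print(ssrs)  # [(6, 'AT', 10), (36, 'AAG', 6)]
print(kbp_per_ssr(len(toy), len(ssrs)))
```

Raising `min_length` (for example to 30) keeps only the longer repeat blocks, at the cost of a lower number of reported loci.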

SSR motifs are also abundant in heterochromatic regions, which are quite difficult to sequence because of the extremely repetitive nature of this class of chromatin. In this case, cytogenetic tools such as fluorescent in situ hybridization (FISH) with SSR-like probes on mitotic chromosomes may be helpful in characterizing the distribution and polymorphisms of large microsatellite repeats along the chromosome set (Cuadrado & Jouve, 2010). Cuadrado & Jouve (2007), for instance, analyzing the distribution pattern of trinucleotide repeats in barley (Hordeum vulgare) chromosomes by means of FISH, observed a preference of these large microsatellite clusters for heterochromatic regions, except for the ACT-based probe. Regarding the distribution of heterochromatin in castor bean chromosomes, large heterochromatic blocks have been reported, including all pericentromeric regions (Jelenkovic & Harrington, 1973; Paris et al., 1978; Vasconcelos et al., 2010), in which SSR sequences may be important constituents. Thus, in the present work an in situ hybridization was performed with the synthetic oligonucleotide (TGA)6 as probe, aiming to test the potential of large microsatellite clusters for characterizing castor bean accessions. Cell preparations, image documentation and FISH conditions followed Vasconcelos et al. (2010); probe preparation was done according to Cuadrado & Jouve (2007).
In contrast to what was observed for barley chromosomes, hybridization signals of the TGA-based probe, which is directed to the same target as the (CAT)5 oligonucleotide used by Cuadrado & Jouve (2007), were observed in all chromosomes, mostly associated with GC-rich heterochromatin [evidenced by chromomycin A3 (CMA)] (Fig. 3). While chromosome E presented a non-heterochromatic site of the sampled repeat, chromosomes B and D were the only ones without a pericentromeric site (Fig. 3). The successful hybridization of the oligonucleotide to castor bean chromosomes suggests that FISH may be a very useful approach for analyzing microsatellite distribution in the genome of the species.
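The statement that a (TGA)n probe and a (CAT)n oligonucleotide are directed to the same target follows from the fact that a tandem repeat can be read in any frame and on either strand. A small Python sketch of this reasoning is shown below; the helper names are hypothetical, and choosing the alphabetically smallest form as the representative of a motif class is only a convention assumed here.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(motif: str) -> str:
    """Return the reverse complement of a DNA motif."""
    return motif.upper().translate(_COMPLEMENT)[::-1]


def canonical_motif(motif: str) -> str:
    """Return one representative for all equivalent forms of a repeat unit.

    Equivalent forms are the rotations of the motif (different reading
    frames of the same tandem array) and the rotations of its reverse
    complement (the same array read on the opposite strand). The
    alphabetically smallest form is returned.
    """
    motif = motif.upper()
    candidates = []
    for strand in (motif, reverse_complement(motif)):
        candidates.extend(strand[i:] + strand[:i] for i in range(len(strand)))
    return min(candidates)


print(canonical_motif("TGA"))  # ATC
print(canonical_motif("CAT"))  # ATC, the same class as TGA
print(canonical_motif("ACT"))  # ACT, a different class
print(canonical_motif("CT"))   # AG, i.e. AG/CT repeats form one class
```

Under this convention TGA and CAT collapse to the same class, whereas ACT, the probe that behaved differently in barley, belongs to another one.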
In the characterization of castor bean germplasm collections with SSR markers, all studies conducted so far have given results congruent with other marker systems (Allan et al., 2008; Bajay et al., 2009, 2011; Qiu et al., 2010). As mentioned above, sampling 41 genotypes from 35 countries, Allan et al. (2008) first indicated a relatively narrow genetic diversity in the species by using only nine genomic SSR markers and three AFLP primer combinations. Although SSR markers yielded more polymorphism than AFLPs in the same analysis, both marker systems led to similar molecular variance indexes (Allan et al., 2008). Subsequently, Bajay et al. (2009, 2011) developed and tested a total of 23 SSR markers from a microsatellite-enriched library in two analyses of genetic diversity within two Brazilian castor bean germplasm collections. Similarly to previous results, these two studies revealed relatively low heterozygosity levels among castor bean genotypes. After searching for SSR markers derived from the ESTs of castor bean, Qiu et al. (2010) selected 118 primer pairs (out of 379) that were used to estimate relationships among 24 accessions. The proportion of polymorphic amplicons (41.1%) generated in the analysis can be considered satisfactory, taking into account that Raji et al. (2009) observed 50.6% polymorphism using EST-SSR markers to analyze genetic diversity in cassava, a crop with a more recent history of cultivation than castor bean. Likewise, the polymorphism information content (PIC) and heterozygosity values observed for castor bean EST-SSRs were relatively high and quite similar to those in cassava (Raji et al., 2009). In contrast to the results observed by Allan et al. (2008) and Foster et al. (2010), some degree of geographically structured clustering was observed among the accessions used by Qiu et al. (2010), although the authors recognized that the small number of sampled genotypes may have affected the results.
There is a great difference in availability between EST-SSRs and genomic SSRs for evaluating the extant genetic diversity in castor bean accessions. Although genomic microsatellites are less frequent, mutations in them are less likely to have deleterious effects than mutations in EST-SSRs, a fact that makes genomic SSRs more prone to polymorphism (see Kalia et al., 2011; Varshney et al., 2005; Weising et al., 2005). Thus, in order to provide a wider range of genomic microsatellite markers for castor bean genotyping, we performed data mining on the whole genome of the species by running the online software WebSat to locate SSR motifs (minimum size of 30 nucleotides, excluding all microsatellites composed of mono-repeats, either simple or compound) and to design primers using default settings.
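For readers less familiar with the two indexes mentioned above, the following Python sketch shows how expected heterozygosity and PIC are commonly computed from the allele frequencies of a single SSR locus. It assumes the usual textbook definitions (expected heterozygosity as 1 minus the sum of squared allele frequencies, and the PIC formula of Botstein et al., 1980); the cited studies may have used software with slightly different estimators, and the allele sizes in the example are hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import Iterable, List


def allele_frequencies(alleles: Iterable) -> List[float]:
    """Return the relative frequency of each distinct allele."""
    counts = Counter(alleles)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one allele is required")
    return [count / total for count in counts.values()]


def expected_heterozygosity(freqs: List[float]) -> float:
    """He = 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs: List[float]) -> float:
    """PIC = 1 - sum(p_i^2) - sum over pairs i<j of 2 * p_i^2 * p_j^2."""
    homozygous_term = sum(p * p for p in freqs)
    pair_term = sum(2.0 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygous_term - pair_term


# Hypothetical allele sizes (bp) scored at one locus in eight gametes.
toy_alleles = [150, 150, 150, 154, 154, 158, 150, 154]
freqs = allele_frequencies(toy_alleles)
print(expected_heterozygosity(freqs))
print(pic(freqs))
```

For a locus with two equally frequent alleles these definitions give an expected heterozygosity of 0.5 and a PIC of 0.375, which illustrates why PIC is always the lower of the two values.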
Covering more than 11 Mbp of the castor bean genome (approximately 3%), a total of 134 primer pairs were designed here (Table 4). Despite the low genomic coverage, especially if compared to the work carried out by Cavagnaro et al. (2010), in which the whole genome of cucumber (Cucumis sativus) was scanned, the analyzed fraction of the genome was close to the value obtained by Qiu et al. (2010) for genic sequences (13.68 Mbp, approximately 4% of the genome). Moreover, our stringent criteria for selection of microsatellites sharply reduced the final number of annotated primer pairs. Without the adopted restriction of 30 nucleotides and using default parameters of the software, the total number of scored microsatellites increased from 134 to 696 (data not shown). Therefore, due to the higher number of repetitions of the targeted microsatellites, these molecular markers may be more liable to polymorphisms than the smaller SSRs.

Table 4. SSR primer pairs obtained through data mining in a fragment of the castor bean genome sequence. Sequences, melting temperature and estimated allele sizes (EAS) of primer pairs (F: forward; R: reverse) are indicated in the panel.

Conclusion
Despite the recent efforts to characterize castor bean germplasm collections, there are relatively few molecular markers available. Curiously, the use of widespread and low-cost anonymous markers, such as RAPD and ISSR, in genetic diversity analyses of the species is still problematic and insufficient. Even the powerful and reliable AFLP marker system has been poorly used to describe the extant polymorphism in castor bean germplasm collections. There is therefore still a need to select robust molecular markers able to distinguish accessions and/or to be associated with phenotypic traits of interest, such as oil production and resistance to abiotic stress and pathogens. Thus, our results, in addition to data compiled from the literature, should be useful for breeding programs by providing important information about the genetic diversity of this important crop. Furthermore, our efforts in describing novel molecular markers should help the development of the first genetic map for castor bean.
