APPLICATIONS OF MOLECULAR MARKERS AND GENOMIC TOOLS IN PEARL MILLET

This review summarizes the main molecular markers and their applications in pearl millet, together with the main findings from its reference genome. Molecular markers, unlike morphological and biochemical markers, are highly polymorphic and neutral. Their great reliability comes from the fact that they directly concern the DNA. They have been widely used in pearl millet, ranging from low- and medium-throughput to high-throughput markers, targeting specific regions or characterizing germplasm at the genome level. Many studies relate to mapping using different populations and have identified QTLs linked to important agronomic traits (flowering time, tolerance to drought and to downy mildew, phosphorus absorption, iron content...). Studies have also been conducted on the domestication syndrome and showed the importance of gene flow from wild millets to cultivated varieties. Genotyping-by-Sequencing, a rapid, cost-effective, reduced-representation sequencing method, has been used to assess genetic diversity, population structure, linkage disequilibrium (LD) and heterotic pool formation in pearl millet. A draft genome sequence that can serve as a reference for further development of genomics-assisted breeding is now available. It is an important milestone in generating genomic resources for pearl millet. Annotation of 24,000 genes indicates an enrichment of wax biosynthesis genes, providing potential genetic mechanisms for heat and drought tolerance. Although molecular markers are widely applied to millet, genetic and genomic resources are still limited compared to other important cereals. However, the availability of a collection of inbred lines representative of the germplasm and of a reference genome offers new perspectives for the improvement of pearl millet.


ISSN: 2320-5407
Int. J. Adv. Res. 9(06), 681-690

reproducible research material. Various research institutes have developed molecular markers that have made it possible to build up important genetic resources and to establish a reference genome for millet. This article reviews the main genetic markers, their applications and the results obtained on this species, followed by a summary presentation of the pearl millet genome.

Main molecular markers
Molecular markers are highly polymorphic, neutral, reliable and available in almost unlimited numbers. They are independent of the environment and of the developmental stage of the plant. A molecular marker is a polymorphic locus whose genotype provides information on the genotype of neighboring loci. Allelic variability at the marker locus should ideally have no effects other than those that determine its genotype. Depending on the detection method and the throughput, molecular markers can be divided into three groups: low-throughput markers based on restriction fragment length polymorphism (RFLP); medium-throughput markers based on PCR (RAPD, AFLP, SSR...); and high-throughput markers based on amplicon or restriction fragment polymorphism (DArT) or on nucleotide sequence polymorphism (SNP). Apart from SSRs, low- and medium-throughput markers have been little used since the advent of high-throughput sequencing. We review them here because high-throughput next-generation sequencing (NGS) tools are not available everywhere.

RFLP
If the DNA of two individuals differs at restriction sites, its digestion by restriction enzymes generates fragments of different lengths. The fragments are separated by electrophoresis, denatured, then transferred and fixed to a membrane. A restriction polymorphism is revealed by hybridization with a labeled complementary probe, followed by autoradiography; each probe-enzyme pair is a marker. RFLPs are co-dominant and reproducible markers. However, their detection is long and laborious, and the technique is now rarely used.
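The principle behind RFLP can be illustrated with a small in silico digest. The sketch below is only an illustration under stated assumptions: the two allele sequences are invented, the helper `digest` is hypothetical, and the cut is placed at the start of each recognition site rather than at the enzyme's true cleavage offset. EcoRI is used only because its recognition sequence (GAATTC) is well known; the source does not name the enzymes used for RFLP in millet.

```python
import re

def digest(sequence: str, site: str) -> list[int]:
    """Return fragment lengths after cutting `sequence` at each occurrence of `site`.

    Simplification: the cut is placed at the first base of the recognition
    site; real enzymes cut at a fixed offset within or near the site.
    """
    cut_positions = [m.start() for m in re.finditer(re.escape(site), sequence)]
    bounds = [0] + cut_positions + [len(sequence)]
    # Drop zero-length fragments (e.g. when a site sits at position 0)
    return [end - start for start, end in zip(bounds, bounds[1:]) if end > start]

ECORI = "GAATTC"  # EcoRI recognition sequence

# Hypothetical alleles: allele_b carries a point mutation (A -> G)
# that destroys the second recognition site.
allele_a = "ATGC" * 3 + ECORI + "TTAGC" * 4 + ECORI + "CCGTA" * 2
allele_b = allele_a.replace(ECORI + "CCGTA", "GAGTTC" + "CCGTA")

print(digest(allele_a, ECORI))  # three fragments
print(digest(allele_b, ECORI))  # two fragments: one site is lost
```

Losing a single site merges two fragments into one longer fragment, which is the length difference a probe would reveal on the membrane.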

RAPD
RAPD (Random Amplified Polymorphic DNA) makes it possible to detect polymorphism in the absence of information on the amplified sequences. Amplification is carried out from a primer of random sequence that hybridizes with the sequences complementary to it. If two hybridization sites are close to each other and in opposite orientation, PCR amplifies the segment between them. If the DNA of two individuals differs at the hybridization sites, a presence/absence polymorphism of bands can be revealed on an electrophoresis gel. Although these dominant markers can detect polymorphic loci, they are neither locus-specific nor reproducible.

AFLP
AFLP (Amplified Fragment Length Polymorphism) is based on the joint detection of restriction polymorphisms and of polymorphisms at the hybridization sites of arbitrary primer extensions. The first step is to cleave the DNA with two different enzymes and ligate the fragments to complementary adapters. PCR amplification is then performed using primers homologous to the adapter sequences and extended at the 3' end by a few arbitrary nucleotides. The primers are oriented so that only fragments having a different restriction site at each end are amplified. The bands are visualized by acrylamide gel electrophoresis. AFLPs are dominant and not locus-specific. Although they are anonymous, their reproducibility and sensitivity are high. However, the method is laborious and is not automatable.

SSR
Simple Sequence Repeats, or microsatellites, are tandemly repeated short nucleotide motifs distributed throughout the genome. Their polymorphism is based on variation in the number of repeat units. Microsatellites are highly polymorphic, codominant and locus-specific; the technique is automatable and does not require a large amount of DNA. These advantages mean that, despite a high detection cost, these markers are still widely used.
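The idea that SSR alleles differ in the number of repeat units can be sketched with a simple regular-expression scan. This is a minimal illustration, not a production SSR finder: the function `find_ssrs`, its thresholds and the two allele sequences are assumptions introduced here.

```python
import re

def find_ssrs(sequence: str, min_repeats: int = 4, max_motif: int = 3):
    """Yield (start, motif, repeat_count) for tandem repeats of short motifs.

    The lazy quantifier tries the shortest motif first, so "GAGAGAGA"
    is reported as the motif "GA" rather than "GAGA".
    """
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif, min_repeats - 1))
    for match in pattern.finditer(sequence):
        motif = match.group(1)
        yield match.start(), motif, len(match.group(0)) // len(motif)

# Hypothetical alleles at one locus: same flanking sequences,
# different numbers of the GA repeat unit.
left_flank, right_flank = "TTCGACT", "CCATGTA"
allele_1 = left_flank + "GA" * 6 + right_flank
allele_2 = left_flank + "GA" * 9 + right_flank

for name, allele in (("allele_1", allele_1), ("allele_2", allele_2)):
    print(name, list(find_ssrs(allele)), "amplicon length:", len(allele))
```

Because the flanking sequences are conserved, primers placed in them amplify both alleles, and the difference in repeat number shows up as a difference in amplicon length, which is why the marker is codominant and locus-specific.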

DArT
Diversity Arrays Technology is based on the selective amplification of a subset of amplicons obtained by digesting genomic DNA with a pair of restriction enzymes. The fragments are labeled and hybridized to a DNA chip on which a collection of previously identified polymorphic amplicons is fixed. If an individual produces a particular amplicon, it hybridizes to its complement on the chip and gives a positive signal. If the amplicon is absent, no hybridization signal is obtained at this position. The technique is reproducible and does not require a large amount of DNA, but the markers are dominant.

SNP
Single Nucleotide Polymorphisms have proven to be the markers of choice in genetics and plant breeding with the emergence of high-throughput sequencing technologies and platforms, accompanied by significant cost reductions.

Definition and characteristics in plants
An SNP is a single-nucleotide difference between alleles at a locus. SNPs are the most important source of variability between individuals at the molecular level. The polymorphism can be a substitution (transition or transversion) or an insertion/deletion (InDel). When two nucleotides change, or when an InDel affects a few nucleotides, the term "simple nucleotide polymorphism" is used [3]. Most of the variability in quantitative traits is related to SNPs. In theory, the polymorphism can involve any of the four nucleotides. In practice, SNPs are generally biallelic and the variants occur at different frequencies [4]. The limited information content of SNPs due to their biallelic nature can be compensated by their higher frequency [5]. In coding sequences, SNPs can be synonymous and leave the amino acid sequence unchanged. If they are non-synonymous, they change the amino acid composition and can produce significant polymorphisms. The frequency of an SNP is expressed as the frequency of the minor allele: an SNP whose minor allele C has a frequency of 0.40 means that 40% of the alleles in the population are C, against 60% for the more common allele.
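The minor allele frequency described above can be computed directly from genotype calls. The following is a minimal sketch under stated assumptions: the ten diploid genotypes are invented, and the function name `minor_allele_frequency` is introduced here for illustration only.

```python
from collections import Counter

def minor_allele_frequency(genotypes: list[str]) -> tuple[str, float]:
    """Return (minor_allele, frequency) for a biallelic SNP.

    `genotypes` holds diploid calls such as "CT"; every call
    contributes two alleles to the count.
    """
    counts = Counter(allele for call in genotypes for allele in call)
    if len(counts) != 2:
        raise ValueError("expected exactly two alleles at a biallelic SNP")
    (_, n_major), (minor, n_minor) = counts.most_common(2)
    return minor, n_minor / (n_major + n_minor)

# Ten hypothetical plants: 4 TT, 4 CT and 2 CC
calls = ["TT"] * 4 + ["CT"] * 4 + ["CC"] * 2

allele, maf = minor_allele_frequency(calls)
print(allele, maf)  # 8 C alleles out of 20 alleles in total
```

Note that the frequency is counted over alleles, not over plants: here six of the ten plants carry at least one C, yet the C allele frequency is 8/20.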

Applications
The increasing importance given to SNPs comes from their applications in mapping and in the characterization of populations. It is assumed that most quantitative traits are due to SNPs not yet identified. Their abundance and the rapid evolution of genotyping technologies make it possible to generate new genetic maps or to saturate existing ones. As mentioned earlier, the limited information associated with their biallelic nature is compensated by their high frequency; thus, a map of 700 to 900 SNPs was equivalent to a map of 300 to 400 SSRs [6].

Genotyping technologies
SNP genotyping consists of discriminating the alleles at a given locus according to which of the four nucleotides is present. A large number of techniques and platforms are available [7]. Most technologies are based on hybridization, enzymatic methods or the physical properties of DNA. Those based on the hybridization of DNA probes complementary to SNP sites include dynamic allele-specific hybridization, molecular beacons and biochips. The principle of a biochip is to capture DNA on a solid surface, hybridize it and visualize the signal under a microscope. Enzymatic methods follow two main approaches: in the first, a primer hybridizes immediately upstream of the SNP and a DNA polymerase extends it with a complementary labeled ddNTP; in the second, an allele-specific primer corresponding to the variant is extended by PCR [7]. In technologies based on the physical properties of DNA, alleles are discriminated by the melting temperature and the conformation of single-stranded DNA. These include single-strand conformation polymorphism and temperature gradient gel electrophoresis. There are many other technologies and platforms, including genotyping-by-sequencing (GBS), which uses restriction enzymes and has the advantage of combining SNP discovery with the genotyping of individuals.

Main applications in pearl millet
Important molecular genetic resources have been obtained on millet. The main applications are reviewed here.

Genetic map
Several genetic maps have been established, mainly from populations derived by selfing or from recombinant inbred lines (RILs) [8]. The first map, built with RFLPs, consisted of markers spread over seven linkage groups (LG) [9]. The analysis of nineteen genotypes with 200 DNA probes from the F2 progeny highlighted the very high genetic variability of millet. This map comprised 181 loci and covered 303 cM. A second map improved on the first through the combined use of 353 RFLP markers and 65 SSR markers [10]. The number of mapped loci increased to 242, with a coverage of 473 cM. This was followed by a more saturated genetic map built from recombinant inbred lines, consisting of 258 DArT markers and 63 SSR markers, for a total of 321 loci [11]. Although this map was longer (1148 cM), the number of markers obtained per population was lower. The discovery of microsatellites in expressed regions has made it possible to develop EST-SSR markers simply and directly from EST databases. This approach allows markers to be selected according to the physiological or biochemical functions expressed. A consensus map was obtained using a combination of 99 EST-SSR markers on F7 populations of recombinant lines [12]. This study improved coverage and filled gaps in previous maps.
The emergence of new high-throughput sequencing technologies represents an important step in genetic mapping. GBS was developed in this context and has proven its effectiveness in important crops such as corn and barley [13], [14].

In millet, GBS with the PstI-MspI combination of restriction enzymes yielded 3,321 SNP markers that were used to establish a more saturated map from a population of 93 F2 individuals derived from a cross between the parental lines SOSAT-IBL197 and PS202-14 [8]; 314 SNP markers distributed evenly over seven linkage groups (LG) made it possible to build a map covering a genetic distance of 640 cM, with an average interval of 2.1 cM between markers. To bridge the gap between this map and the previous ones, 19 SSR markers were analyzed on the parents of the population; four of them were polymorphic and made it possible to match four LG with the map of Qi et al. [10]. GBS also produced the most saturated genetic map obtained in millet [15]. In this map, developed using 150 recombinant lines, 99% of the intervals between neighboring markers are shorter than 5 cM. It was compared with the map of Rajaram et al. [12]: 191 markers were correctly aligned and 16 were concordant but located at different sites. The existence of a reference genome for millet offers the possibility of aligning the 4900 SNPs identified. Compared with the map obtained by Moumouni et al. [8], the distances between markers were greater on some LG. The differences between the map of [15] and the others can be explained by its use of the Kosambi distance, whereas the other maps are based on the Haldane distance.
The adaptability of millet to various constraints makes it a model species for tolerance studies. A collection called PMiGAP (Pearl Millet inbred Germplasm Association Panel), comprising 346 lines, was created from about 1000 cultivars, accessions and parents of mapping populations from Africa and Asia. This panel offers new perspectives for QTL mapping. Linkage mapping with experimental populations tracks the transmission of QTL alleles and markers over generations. The segregation of markers in the population is related to the observed phenotypic variation: a phenotypic difference between marker genotypes suggests that alleles are segregating at a QTL linked to this marker. Advances in high-throughput SNP genotyping make association mapping a valuable tool that is increasingly used in plants [16]. It is an alternative or a complement to linkage mapping. It can be divided into two categories: candidate-gene association mapping, and genome-wide association studies (GWAS), which test for statistically significant associations between phenotypes and SNPs across the entire genome. Here we present some examples of the mapping of QTLs of great agronomic importance in millet.
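The effect of the choice between the Kosambi and Haldane distances, mentioned above for the comparison of maps, can be made concrete with the standard mapping functions, which convert a recombination fraction r into a map distance. The sketch below uses the textbook formulas; the function names and the example values of r are illustrative assumptions, not data from the cited maps.

```python
import math

def haldane_cM(r: float) -> float:
    """Haldane map distance in cM (assumes no crossover interference)."""
    return -50.0 * math.log(1.0 - 2.0 * r)

def kosambi_cM(r: float) -> float:
    """Kosambi map distance in cM (allows for crossover interference)."""
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))

# Illustrative recombination fractions (must satisfy 0 <= r < 0.5)
for r in (0.01, 0.05, 0.10, 0.20, 0.30):
    print(f"r = {r:.2f}  Haldane = {haldane_cM(r):6.2f} cM  "
          f"Kosambi = {kosambi_cM(r):6.2f} cM")
```

For the same recombination fraction, the Haldane function always gives a distance at least as large as the Kosambi function, and the gap widens as r increases. Maps built with different functions are therefore not directly comparable in length, even when they rest on the same recombination data.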

Flowering time
The duration from sowing to flowering is an important factor in the genotype-environment interaction for yield [17]. Early flowering allows millet to complete its cycle under water stress. A phenotype-genotype association methodology was developed to identify genes involved in the variation of this trait using SSR and AFLP markers; this study associated the PHYC and MADS11 genes with flowering time [18]. These associations have been validated by QTL analyses, which suggest that polymorphisms within the PHYC gene may explain the variation in flowering time. Another study on the MADS11 block also identified a PgMADS11 polymorphism associated with the variation of this trait [19]. The mapping of agronomic traits, including flowering time, in a population of recombinant lines detected six flowering-related QTLs on five chromosomes; a major QTL on LG3 was common to flowering time and plant height [20]. In addition, one map showed a QTL related to both flowering and resistance to pyriculariosis (blast) [15].

Downy mildew tolerance
Downy mildew, caused by Sclerospora graminicola, causes significant damage to millet. Yield losses can be as high as 80%, with a decrease in the quality of grain and fodder [21]. Tolerance to downy mildew is quantitatively inherited [22]. Major-effect resistance QTLs were detected on LG1, LG6 and LG7 against strains from India, on LG4 against strains from Niger and Nigeria, and on LG2, LG6 and LG7 against pathogens from Senegal [23]. Some of these QTLs were consistently identified in repeated screenings, but none was effective against all strains. Nine putative QTLs, one of which was common to the eight pathogen populations from Africa and Asia, were detected [24]. Mapping based on SSR markers in an F8 population of recombinant lines revealed five QTLs with a broad effect on tolerance to three isolates of the pathogen from India, at the seedling stage and in a controlled environment [21]. Marker-assisted introgression of resistance saves time compared with the conventional method. A combination of ISSR and SCAR markers was used to assist in improving the parental lines of the hybrid HHB 67 for resistance to downy mildew [17]. This combination of markers was also analyzed in a collection of millet; the SCAR-ISSR band 863 was present in all seven resistant lines and absent from all five susceptible lines [25]. Although linkage mapping has identified a number of important QTLs in millet, the method is limited by its resolution. Association mapping, based on linkage disequilibrium (LD), considers more alleles and gives greater resolution. Thus, a GWAS was conducted with 77 genotypes to identify SNP markers related to resistance to downy mildew, but the results remain to be validated [26].

Rust and pyriculariosis tolerance
Mapping with DArT and SSR markers in an F7 population of recombinant lines was performed to identify QTLs related to resistance to rust caused by Puccinia substriata var. indica [11]. The map was used to identify a QTL on LG1 believed to confer stable rust resistance in India. Another map has been validated for its usefulness in mapping QTLs related to resistance to pyriculariosis, or blast (Pyricularia grisea (Cke.) Sacc.) [15].

Drought tolerance
Drought tolerance is a complex trait strongly influenced by the environment and controlled by many genes [27]. Physio-morphological traits include changes in root and vegetative characteristics for efficient water absorption and conservation in the plant. The most significant factors include stomatal conductance, stay green, root density and depth, cuticle roughness and thickness, osmotic adjustment and the accumulation of stress proteins. Pearl millet has a tillering potential that allows it to compensate for an early water deficit. However, it is difficult to assess drought tolerance accurately. Several studies have focused on the harvest index (HI). A high HI has been shown to be related to the ability of tolerant varieties to conserve water during the vegetative phase and use it for grain filling; a QTL associated with this phenotype was identified and introgressed by backcrossing [28]. QTLs related to delayed senescence, an SNP in a gene encoding acetyl-CoA carboxylase linked to yield and harvest index, and an InDel in a gene linked to a chlorophyll synthesis protein and significantly associated with stay green and with yield under water stress were mapped [29]. A QTL related to reduced transpiration at the vegetative stage co-mapped with a QTL for terminal drought tolerance [29]. Under saline conditions, drought tolerance has been observed to be linked to reduced salt absorption and accumulation in leaves [30]. A study aiming to identify the genetic components involved in early drought tolerance was conducted on 188 West African lines phenotyped under early water stress and under non-limiting irrigation [31]. A GWAS carried out with 392,493 SNPs generated by GBS identified QTLs controlling biomass production under early water stress and stay green. Genes involved in the synthesis of sirohaem and of waxes co-localize with these two QTLs, respectively [31].
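The single-marker logic that underlies a GWAS such as the one described above can be sketched as a regression of the phenotype on the allele dosage at one SNP. This is a deliberately simplified illustration: the dosages and biomass values are invented, the function `marker_regression` is hypothetical, and real analyses additionally correct for population structure, kinship and multiple testing, which are not shown here.

```python
def marker_regression(dosages: list[int], phenotypes: list[float]) -> tuple[float, float]:
    """Least-squares regression of phenotype on allele dosage at one SNP.

    Returns (slope, r_squared): the slope estimates the additive effect of
    one copy of the counted allele, and r_squared the share of phenotypic
    variance explained by the marker.
    """
    n = len(dosages)
    mean_x = sum(dosages) / n
    mean_y = sum(phenotypes) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    syy = sum((y - mean_y) ** 2 for y in phenotypes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, phenotypes))
    if sxx == 0 or syy == 0:
        raise ValueError("marker and phenotype must both vary")
    return sxy / sxx, (sxy * sxy) / (sxx * syy)

# Hypothetical data: copies of the alternative allele (0, 1 or 2)
# and biomass under early water stress for six lines.
dosages = [0, 0, 1, 1, 2, 2]
biomass = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]

slope, r2 = marker_regression(dosages, biomass)
print(f"additive effect = {slope:.2f}, variance explained = {r2:.3f}")
```

In a genome-wide scan the same test would be repeated for every SNP, and only markers passing a stringent significance threshold would be retained as candidate QTLs.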

Phosphorus absorption
In West Africa, pearl millet is grown on soils that are often very low in phosphorus. Nine DArT markers were found to be significantly associated with phosphorus-related traits [32]. Some polymorphisms were shared between traits related to phosphorus absorption efficiency and yield. However, these markers need to be validated to determine their applicability in improvement programs.

Iron and zinc contents
In a population of recombinant lines derived from a cross in which one parent has high iron and zinc levels, DArT and SSR markers were used to map QTLs related to the content of these elements in millet [20]. The analysis revealed 11 QTLs related to iron content in the grain and eight related to zinc. Three high-impact QTLs for these elements co-mapped in the population. Favorable alleles are transmitted by the male parent, which has the highest levels of iron and zinc. Growing pearl millet varieties enriched with these elements will be valuable in areas where rural populations face malnutrition.

Genetic diversity
Accurate assessment of diversity is of great importance for analyzing the genetic variability of cultivars, introducing genes of interest, and studying phylogeny or population structure. Having evolved under environmental constraints, millet has a high level of polymorphism, both between and within accessions. Before the emergence of molecular markers, methods based on pedigree data and on morphological or biochemical characteristics were used in diversity studies in millet [33]. Markers such as RFLP [35], AFLP [36], RAPD [37], SSR [38], [39], [40] and a combination of DArT/SSR [11] have been used in several diversity studies. To assess the genetic diversity and adaptation of a collection of millet from Senegal, previously used polymorphic SSRs served for neutral characterization, followed by adaptive characterization through genotyping of the PgMADS11 and PgPHYC genes involved in flowering [41]. The analysis shows a great diversity and a differentiation into two pools, early-flowering and late-flowering cultivars, consistent with the cultivation zones. The assessment of adaptive diversity shows that an SNP in the PgPHYC gene and an InDel in the PgMADS11 gene are specifically associated with the observed differences in flowering time.
Genome-wide characterization of germplasm is necessary for a better understanding of millet genomic resources. Accordingly, the genetic diversity of 248 cultivars from Senegal and 252 accessions from a global collection was analyzed by GBS with the PstI-MspI pair of restriction enzymes, which allowed the identification of 83,875 SNPs [42]. Senegalese cultivars had the highest levels of genetic diversity, while accessions from Southern Africa and Asia had the lowest (Figure 1). A clear structuring of the population was observed between the varieties of Senegal and the world collection on the one hand, and between countries within the world collection on the other. A weak population structure was noted within cultivars from Senegal (Figure 2). In addition, the decay of linkage disequilibrium (LD) was very rapid. The rapid decay of LD and the lack of structuring along the agro-ecological zones in Senegalese millet open up possibilities for association mapping. A comparison with the genomes of Setaria italica and Sorghum bicolor indicates large regions of synteny and large-scale rearrangements in millet evolution (Hu et al., 2015).
Another application of GBS generated 82,112 SNPs that were also used to analyze genetic diversity, population structure and LD in a collection of millet, following the alignment of reads on the reference genome [43]. The study reveals a structuring into six groups (with genetic differentiation within them) and a more rapid decay of LD in the West African subpopulation.
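The linkage disequilibrium whose decay is discussed above is commonly summarized by the squared correlation r² between the alleles at two SNPs. The sketch below is an illustration under stated assumptions: it takes phased two-locus haplotypes (a reasonable simplification for inbred lines, but not what raw GBS genotypes provide), and the function `ld_r2` and the example haplotypes are introduced here for illustration only.

```python
def ld_r2(haplotypes: list[str]) -> float:
    """r^2 between two biallelic SNPs from phased two-locus haplotypes.

    Each haplotype is a two-character string, e.g. "AB" or "ab": the first
    character is the allele at SNP 1, the second the allele at SNP 2.
    """
    n = len(haplotypes)
    ref_1, ref_2 = haplotypes[0][0], haplotypes[0][1]
    p_1 = sum(h[0] == ref_1 for h in haplotypes) / n
    p_2 = sum(h[1] == ref_2 for h in haplotypes) / n
    p_12 = sum(h == ref_1 + ref_2 for h in haplotypes) / n
    d = p_12 - p_1 * p_2  # disequilibrium coefficient D
    denominator = p_1 * (1 - p_1) * p_2 * (1 - p_2)
    if denominator == 0:
        raise ValueError("both SNPs must be polymorphic")
    return d * d / denominator

# Complete LD: the allele at SNP 1 fully predicts the allele at SNP 2
print(ld_r2(["AB"] * 5 + ["ab"] * 5))
# Linkage equilibrium: all four haplotypes are equally frequent
print(ld_r2(["AB", "Ab", "aB", "ab"]))
```

Computing r² for many SNP pairs and plotting it against the physical distance between the SNPs gives an LD decay curve; a rapid decay, as reported for Senegalese millet, means that an associated marker lies close to the causal variant, which favors fine-scale association mapping.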

Other applications
Heterosis has been studied extensively in many species. In millet, SSR markers were used to identify heterotic groups in a collection of hybrid parental lines, with the reference genotype Tift23 D2B1-P1-P5 as control [44]. These markers distinguished two groups, which should be refined by increasing the number of crosses and by evaluating the resulting hybrids in more representative locations.
The origin and the areas of domestication of millet have been the subject of much investigation. Whole-genome sequencing (WGS) data from 221 accessions, consisting of wild forms and traditional varieties representative of the geographical diversity of millet, together with the modelling of allele flows, have confirmed the West African origin of the species and its domestication south of the Sahara about 4900 years ago [45]. The study shows that the millet varieties grown around the world derive from common ancestors originating in the central Sahel. During their diffusion, the cultivated varieties would have hybridized with local wild millets, leading to the great genetic diversity observed in the western and eastern areas of the Sahel. The hypothesis of gene flow from wild populations during domestication in these regions was confirmed by the identification of 15 genomic regions that underwent adaptive introgression. Previous studies had hypothesized that Senegal and the eastern Sahel could constitute centres of domestication of millet [42], [46].
Dual-purpose millet (grain and fodder) is of increasing interest to research. SSRs and GBS have been used on hybrid varieties to identify markers related to forage quality [47]. Between two successive cuts, an increase in crude protein content and in the in vitro digestibility of organic matter was observed [49].

Conclusion:-
The objective of this review was to synthesize the resources obtained with molecular markers in pearl millet. To this end, we presented the main genetic linkage maps established. Pearl millet evolved in Africa and Asia under difficult environmental conditions, to which it has become adapted. However, it still faces many biotic and abiotic constraints that limit its production of grain and quality fodder. Many QTLs related to agronomic traits of interest have been mapped. Genomic tools have been used to characterize the domestication syndrome of millet south of the Sahara and have shown the importance of gene introgression from wild populations into cultivated varieties. Molecular markers have also provided a more accurate assessment of millet genetic diversity and phylogenetic relationships. The strong population structure observed in wild accessions reflects a genetic diversity that deserves to be further exploited. The availability of a reference genome is an important resource and offers immense prospects for improving this important cereal.